According to Dongcha Beating monitoring, Zhang Chi, a former ByteDance Seed team engineer and now an assistant professor at Peking University, revealed on the "Into Asia" podcast that ByteDance takes about half a year to complete a full large-model training cycle (pre-training plus post-training), while Google is rumored to need only three months. He believes this gap in iteration speed is a core reason Chinese companies find it hard to catch up. Zhang Chi spent about a year at ByteDance, where the math team he belonged to was more research-oriented; he described its focus as "more for publicity," in contrast to the pre-training and post-training teams responsible for model delivery.
Zhang Chi described a benchmaxxing (score-boosting) culture within Seed: team leaders are evaluated on the benchmarks they are responsible for, so everyone tries to boost their scores. However, he said, "this cannot translate into a good user experience in practice." On paper, he noted, the models of large Chinese companies can match the cutting-edge models in the United States, but in practice they are "not good enough." Seed aimed to be globally leading, but, in his words, "Unfortunately, I don't think we have caught up," and even the goal of being the top in the country "has not been achieved." By the end of 2024, Seed considered itself on par with GPT-4o; then DeepSeek was released, and the team realized the gap still existed. At that point, the entire team urgently shifted its focus to reinforcement learning.
