
Former ByteDance Engineer: ByteDance's Training Cycle Takes Half a Year, While Google Reportedly Needs Only Three Months

According to Dongcha Beating's monitoring, Zhang Chi, a former engineer on ByteDance's Seed team and now an assistant professor at Peking University, revealed on the "Into Asia" podcast that ByteDance takes about half a year to complete a large-scale model training run (pre-training plus post-training), while Google is rumored to need only three months. He believes this gap in iteration speed is a core reason Chinese companies find it hard to catch up. Zhang Chi spent about a year at ByteDance, where the math team he belonged to was more research-oriented. He described the team's work as being "more for publicity," in contrast to the pre-training and post-training teams responsible for delivering models.

Zhang Chi described a benchmaxxing (score-boosting) culture within Seed: team leaders are evaluated on the benchmarks they own, so everyone tries to push their scores up. However, he said, "this cannot translate into a good user experience in practice." On paper, he noted, the models of large Chinese companies can match the frontier models in the United States, but in actual use they are "not good enough." Seed aimed to be globally leading, but, as he put it, "Unfortunately, I don't think we have caught up," while even the goal of being the best in the country "has not been achieved." By the end of 2024, Seed considered itself on par with GPT-4o; then DeepSeek was released, and the team realized the gap still existed. At that point, the entire team urgently shifted its focus to reinforcement learning.
