Yao Shunyu Reveals Internal Development Process of Claude 3 Series for the First Time: Coding Lead Over GPT Was Purely Serendipitous

According to Data Observation Beating's monitoring, Yao Shunyu, a former Anthropic research scientist now at Google DeepMind, revealed the internal development process of Claude 3.7 for the first time on the podcast "Language is the World." After joining Anthropic in October 2024, he was assigned to a team called Horizon, which at the time had only 10 to 11 people covering every aspect of reinforcement learning. Claude 3.7 took four to five months in total from the start of research to final release: the first two to three months went to algorithm and data research, and the last two months to training and infrastructure setup.

Anthropic did not initially plan to bet on coding ability. Yao Shunyu disclosed that Claude 3 outperformed GPT-4 at coding for a purely technical reason that he cannot discuss publicly, and that it stemmed from a bottom-up effort by one particular team. The overwhelmingly positive feedback on Twitter after Claude 3's release validated this advantage, which led Anthropic's management to elevate coding ability to a company-wide strategic focus. He believes Anthropic was able to pivot quickly in this direction because Jared Kaplan and Sam McCandlish, who hold the company's top technical posts, are also co-founders, so they can both lead on the technical front and make executive decisions together. In his view, OpenAI could not do the same: Ilya Sutskever may have had that authority during his tenure, but he left once he lost decision-making power. At the time, Anthropic had almost no product sense. It released two versions of Claude 3.5 within six months under the same name, and the two were told apart only, and awkwardly, by the unofficial outside nickname "3.6."

Note: Two researchers in the AI field share the same Pinyin name and are easily confused. The interviewee in this article, Yao Shunyu, earned his undergraduate degree in Physics from Tsinghua University and a Ph.D. in Theoretical Physics from Stanford. He joined Anthropic in 2024, worked on reinforcement learning research for Claude 3.7 and the Claude 4 series, and moved to Google DeepMind in September 2025. The other Yao Shunyu earned his bachelor's degree from Tsinghua's Yao Class and a Ph.D. in Computer Science from Princeton. He proposed the Tree of Thoughts and ReAct frameworks, previously worked as a researcher at OpenAI, and became Chief AI Scientist at Tencent in December 2025. The two were in the same class year at Tsinghua.
