According to Dynamic Beating monitoring, Y Combinator partner Diana Hu argued on X that the next frontier lies not in simply scaling up parameter counts but in building a thin software layer on top of base models, one that lets the AI write its own problem-solving rules as code, the way a programmer would (an "executable world model"). The AI can then keep testing, revising, and simplifying that code based on runtime results, with no need for expensive fine-tuning of the large model itself.
This gradient-free approach to learning through code corroborates the heuristic learning paradigm proposed last month by Weng Jiyi, a core member of OpenAI. In traditional reinforcement learning, an AI needs thousands of trial-and-error cycles to learn a task, forcing experience into the black box of a neural network, a process that consumes vast amounts of energy and is prone to forgetting. In Weng Jiyi's experiment, no parameters of the large model were adjusted; relying solely on the model to write Python code, find bugs, and set rules, the AI mastered the Atari game Breakout. This suggests that knowledge can be stored in a human-readable, testable body of code rather than in inscrutable neural network weights.
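The loop described above (write code, run it, read the result, keep what works) can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions, not Weng Jiyi's setup: the paddle game, the `CANDIDATE_SOURCES` list (a hypothetical stand-in for proposals a language model would write), and every function name here are invented for the example. No model weights appear anywhere; the only thing that changes is which piece of source text is kept.

```python
# Hypothetical stand-in for a language model's proposals. A real system would
# generate these from a prompt that includes earlier scores and error messages.
CANDIDATE_SOURCES = [
    "def policy(paddle_x, ball_x):\n    return 0\n",                # never move
    "def policy(paddle_x, ball_x):\n    return 1\n",                # always move right
    "def policy(paddle_x, ball_x):\n    return ball_x - paddle\n",  # buggy: undefined name
    "def policy(paddle_x, ball_x):\n"
    "    return (ball_x > paddle_x) - (ball_x < paddle_x)\n",       # step toward the ball
]


def load_policy(source):
    """Turn policy source text into a callable. The 'knowledge' is the text itself."""
    namespace = {}
    exec(source, namespace)
    return namespace["policy"]


def run_episode(policy, ball_x, width=10, height=12):
    """Toy paddle game: the ball falls straight down in column ball_x.

    Returns 1 if the paddle is under the ball when it lands, else 0.
    """
    paddle_x = width // 2
    for _ in range(height):
        move = max(-1, min(1, int(policy(paddle_x, ball_x))))  # clamp to -1, 0, +1
        paddle_x = max(0, min(width - 1, paddle_x + move))
    return int(paddle_x == ball_x)


def evaluate(source, width=10):
    """Score a candidate over every ball column; report any runtime error."""
    try:
        policy = load_policy(source)
        return sum(run_episode(policy, x, width) for x in range(width)), None
    except Exception as exc:  # runtime feedback a model could use to repair the code
        return 0, repr(exc)


def search(sources):
    """Keep the best-scoring source text. No gradients, no weight updates."""
    best_source, best_score, log = None, -1, []
    for source in sources:
        score, error = evaluate(source)
        log.append((source, score, error))
        if score > best_score:
            best_source, best_score = source, score
    return best_source, best_score, log


if __name__ == "__main__":
    best_source, best_score, log = search(CANDIDATE_SOURCES)
    print(best_score)
    print(best_source)
```

In this toy, the candidate that steps toward the ball is the one retained, and the buggy candidate surfaces as a readable error message rather than a silent failure, which is the kind of feedback a model could use on its next attempt.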
In the view of YC co-founder Paul Graham, the cycle of writing code, validating it, and compressing it closely resembles a scientist's daily research. Large models do not need to rewire their brains; like scientists, they can formulate hypotheses about a new environment as code, run that code as a validation experiment, and distill the most concise rules that solve the problem. The search for the simplest program is also the ultimate standard by which the ARC-AGI benchmark measures the efficiency of artificial general intelligence.
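The "hypothesize, validate, distill" cycle can be illustrated with a minimal sketch. Everything in it is an assumption made for the example: the `OBSERVATIONS` data, the `HYPOTHESES` list (standing in for programs a model might propose), and the use of source length as a crude proxy for simplicity. It is not how ARC-AGI scores submissions.

```python
# Observed (state, next_state) pairs from a hypothetical environment.
OBSERVATIONS = [(0, 1), (1, 3), (2, 5), (5, 11)]

# Hypothetical candidate world models, written as Python expressions in x.
HYPOTHESES = [
    "x + 1",                                                   # wrong rule
    "2 * x + 1",                                               # concise and correct
    "x + x + 1 + 0 * x",                                       # correct but padded
    "1 if x == 0 else 3 if x == 1 else 5 if x == 2 else 11",   # memorized lookup
]


def is_consistent(expr, observations):
    """Validation experiment: does the hypothesis reproduce every observation?"""
    fn = eval("lambda x: " + expr)
    return all(fn(x) == y for x, y in observations)


def simplest_consistent(hypotheses, observations):
    """Distillation: among hypotheses that survive, keep the shortest source."""
    survivors = [h for h in hypotheses if is_consistent(h, observations)]
    return min(survivors, key=len) if survivors else None


if __name__ == "__main__":
    print(simplest_consistent(HYPOTHESES, OBSERVATIONS))
```

Three of the four hypotheses fit the data, but only the shortest is kept; the memorized lookup passes validation yet loses on conciseness, which is the point of the compression step.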
The key payoff is that gradient-free learning benefits directly from capability gains in the underlying large model. As the underlying model becomes smarter, the code and strategies the agent writes grow many times stronger. Building on Richard Sutton's famous Bitter Lesson, gradient-free code learning is sketching out a new S-curve. As the coding ability of large models surges, AI self-evolution is raising the curtain on the next paradigm of artificial intelligence.
