World's first AI-authored pre-training framework open-sourced: Tsinghua University and MindWall AI jointly launch ForgeTrain

According to Perceptual Beating monitoring, MindWall AI and the Tsinghua NLP Lab have jointly open-sourced ForgeTrain in the OpenBMB community, billed as the world's first AI-written, production-grade pre-training framework for large models. They also released MiniCPM5-1B, a small edge-side model trained with ForgeTrain. As the first demonstration of a closed "AI creating AI" engineering loop, ForgeTrain outperformed Nvidia's Megatron on the same hardware and achieved a 10% pre-training speedup on Huawei Ascend. MiniCPM5-1B, meanwhile, topped the Artificial Analysis leaderboard for open-weight small models.

To let AI build the underlying pre-training infrastructure autonomously, MindWall AI proposed a software programming paradigm it calls "Forge Engineering": instead of maintaining a universal framework compatible with all hardware and tasks, it uses AI's low-cost code generation to forge dedicated code for a specific model and hardware target on the spot. ForgeTrain itself is built in three stages. First, key data is collected from existing pre-training frameworks to form a test harness. Next, an automated closed loop iteratively generates framework code until it is binary-consistent with the reference. Finally, the restrictions are removed so that the generated framework can surpass the reference implementation. This automated evolution corresponds to the L3 to L4 stages of AI creating AI.
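The first two stages can be illustrated with a toy sketch in Python. This is not ForgeTrain's code: the names `build_harness`, `passes`, and `forge` are hypothetical, the "framework" is reduced to a single function over a list of floats, and the candidates are hand-written stand-ins for AI-generated code. It only shows what checking a candidate for bit-for-bit agreement with recorded reference outputs could look like.

```python
import struct
from typing import Callable, Iterable, List, Optional, Sequence, Tuple

# Hypothetical stand-in for one framework step: floats in, floats out.
Step = Callable[[Sequence[float]], List[float]]
Harness = List[Tuple[Sequence[float], bytes]]


def to_bytes(values: Sequence[float]) -> bytes:
    """Pack values as little-endian float32 so the comparison is bit-for-bit."""
    return struct.pack(f"<{len(values)}f", *values)


def build_harness(reference: Step, inputs: Iterable[Sequence[float]]) -> Harness:
    """Stage 1 (sketch): record the reference outputs on fixed inputs."""
    return [(x, to_bytes(reference(x))) for x in inputs]


def passes(candidate: Step, harness: Harness) -> bool:
    """A candidate passes only if every output matches the recorded bytes."""
    return all(to_bytes(candidate(x)) == expected for x, expected in harness)


def forge(candidates: Iterable[Step], harness: Harness) -> Optional[Step]:
    """Stage 2 (sketch): try candidates in turn until one is binary-consistent."""
    for candidate in candidates:
        if passes(candidate, harness):
            return candidate
    return None


def reference(xs: Sequence[float]) -> List[float]:
    return [x * 0.5 for x in xs]


def drifting_candidate(xs: Sequence[float]) -> List[float]:
    return [x * 0.5 + 1e-3 for x in xs]  # numerically close, but not identical


def exact_candidate(xs: Sequence[float]) -> List[float]:
    return [x / 2 for x in xs]  # different code, identical bits


harness = build_harness(reference, [[1.0, 2.5, -3.25], [0.0, 8.0]])
accepted = forge([drifting_candidate, exact_candidate], harness)
print(accepted.__name__ if accepted else "no candidate accepted")
```

Comparing packed bytes rather than using a tolerance is what makes the check "binary-consistent": a candidate that is merely close to the reference is rejected.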

As the first model produced with ForgeTrain, MiniCPM5-1B has 1.08 billion parameters and a core architecture based on the standard LlamaForCausalLM design, which substantially lowers the barrier to downstream integration and inference deployment. In the Artificial Analysis evaluation, the model scored 18 points, surpassing the 2B-scale Qwen3.5-2B (16 points) and leading Qwen3.5-0.8B (11 points) and LFM2.5-1.2B-Thinking (8 points). The model supports deployment formats such as MLX 4-bit and GGUF Q4_K_M, and its weights take up only about 0.5GB after INT4 quantization. It natively supports a long-text context of 131,072 tokens and hybrid dual-mode reasoning controlled by enable_thinking. Building on this very low hardware overhead, OpenBMB also open-sourced MiniCPM Desk Pet, a desktop widget companion app that runs fully offline, responds in real time to coding activity in development tools such as Cursor, and supports LoRA persona switching.
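The 0.5GB figure is consistent with simple arithmetic on the parameter count. The sketch below assumes exactly 4 bits per weight and decimal gigabytes, and uses a hypothetical helper `weight_size_gb`. Real quantized files also store per-block scales and may keep some tensors at higher precision, so actual file sizes can be somewhat larger.

```python
def weight_size_gb(num_params: int, bits_per_weight: float) -> float:
    """Raw weight storage in decimal gigabytes, ignoring scales and metadata."""
    return num_params * bits_per_weight / 8 / 1e9


params = 1_080_000_000  # 1.08 billion parameters, as stated in the article

int4_gb = weight_size_gb(params, 4)   # 4-bit weights
bf16_gb = weight_size_gb(params, 16)  # 16-bit weights, for comparison

print(f"INT4: {int4_gb:.2f} GB, 16-bit: {bf16_gb:.2f} GB")
# prints "INT4: 0.54 GB, 16-bit: 2.16 GB"
```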
