According to 1M AI News, Rakuten Group today announced Rakuten AI 3.0, billed as "Japan's Largest High-Performance AI Model" and released as open source under the Apache 2.0 license. The model adopts a MoE (Mixture of Experts) architecture with 671B total parameters, 37B activated parameters per token, and a 128K context window. Optimized for Japanese, it has outperformed GPT-4o on various Japanese benchmark tests.
The model is a result of the GENIAC project, jointly promoted by the Ministry of Economy, Trade and Industry and the New Energy and Industrial Technology Development Organization (NEDO), through which the Japanese government provided partial support for training compute. In its announcement, Rakuten described the base model only as "making full use of the best results from the open-source community," without naming it.
The community promptly examined the model files released on Hugging Face and found that the config.json clearly states `model_type: deepseek_v3` and `architectures: DeepseekV3ForCausalLM`, and that the 671B total parameters, 37B activated parameters, and 128K context exactly match DeepSeek V3. This indicates the model is based on DeepSeek V3, fine-tuned with Japanese data.
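The kind of check the community performed is straightforward: a Hugging Face checkpoint's config.json declares the architecture class that `transformers` will load, so reading those two fields reveals the base model family regardless of what the release notes say. A minimal sketch, using an inline JSON snippet containing only the two fields reported above (a real check would read the downloaded config.json file):

```python
import json

# Illustrative config fragment with the fields the community cited;
# a full config.json contains many more entries (layer counts, etc.).
config_text = """
{
  "model_type": "deepseek_v3",
  "architectures": ["DeepseekV3ForCausalLM"]
}
"""

config = json.loads(config_text)
print(config["model_type"])        # architecture family identifier
print(config["architectures"][0])  # model class transformers would instantiate
```

Because fine-tuning does not change the architecture declaration, these fields survive even when a model is retrained and rebranded, which is why they are a reliable fingerprint of the base model.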
