Caltech Open Sources True 1-bit Model Bonsai: 8B Parameters Only 1.15GB, Running at 44 Tokens/s on iPhone

According to 1M AI News, PrismML, an AI lab co-founded by Caltech mathematician Babak Hassibi, has emerged from stealth mode and open-sourced the 1-bit Bonsai series of large language models. The flagship model, 1-bit Bonsai 8B, has 8.2 billion parameters and a memory footprint of only 1.15 GB, roughly 14x smaller than a comparable 16-bit model (around 16 GB). The weights are licensed under Apache 2.0 and available for download on HuggingFace, along with two smaller models: 4B (0.5 GB) and 1.7B (0.24 GB).
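The compression figure can be checked with simple arithmetic. The sketch below is illustrative only: it assumes a parameter count of 8.2 billion (as the "8B" name implies), decimal gigabytes, and weights-only storage, none of which the article spells out. The function and variable names are made up for this example.

```python
def footprint_gb(num_params: float, bits_per_weight: float) -> float:
    """Weights-only storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9

PARAMS = 8.2e9        # assumption: parameter count implied by the "8B" name
REPORTED_GB = 1.15    # footprint reported in the article

fp16_gb = footprint_gb(PARAMS, 16)               # 16-bit baseline
ideal_1bit_gb = footprint_gb(PARAMS, 1)          # exactly 1 bit per weight
compression = fp16_gb / REPORTED_GB              # baseline vs. reported size
effective_bits = REPORTED_GB * 1e9 * 8 / PARAMS  # bits per weight implied by 1.15 GB

print(f"16-bit baseline: {fp16_gb:.2f} GB")
print(f"ideal 1-bit:     {ideal_1bit_gb:.3f} GB")
print(f"compression:     {compression:.1f}x")
print(f"effective bits:  {effective_bits:.2f} per weight")
```

Under these assumptions the baseline comes to about 16.4 GB and the ratio to about 14x, in line with the reported numbers. The implied storage is slightly above 1 bit per weight; the article does not break down what accounts for the difference.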

Bonsai 8B is an end-to-end true 1-bit model: the embedding layer, attention layers, MLP layers, and output head are all represented using only +1 or -1 weights, with no higher-precision components patched in. PrismML claims that its reasoning and language understanding performance on standard benchmarks is on par with 16-bit full-precision models. The core compression mathematics was developed by the team at Caltech over several years; the intellectual property belongs to Caltech, and PrismML is the exclusive licensee. The model was trained on Google TPU v4 hardware.
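To illustrate why a weight restricted to +1 or -1 needs only one bit of storage, here is a minimal sketch that packs eight such weights into each byte. It is not PrismML's storage format, which the article does not describe; the helper names `pack_signs` and `unpack_signs` and the bit order are assumptions chosen for this example.

```python
def pack_signs(weights):
    """Pack a sequence of +1/-1 weights into bytes, 8 weights per byte.

    Bit value 1 encodes +1 and bit value 0 encodes -1 (an arbitrary choice).
    """
    packed = bytearray((len(weights) + 7) // 8)
    for i, w in enumerate(weights):
        if w not in (1, -1):
            raise ValueError("weights must be +1 or -1")
        if w == 1:
            packed[i // 8] |= 1 << (i % 8)
    return bytes(packed)

def unpack_signs(packed, count):
    """Recover the first `count` +1/-1 weights from packed bytes."""
    return [1 if (packed[i // 8] >> (i % 8)) & 1 else -1 for i in range(count)]

weights = [1, -1, -1, 1, 1, 1, -1, 1, -1, 1]
packed = pack_signs(weights)
restored = unpack_signs(packed, len(weights))
print(len(weights), "weights stored in", len(packed), "bytes")
```

The ten weights here fit in two bytes and round-trip without loss, whereas a 16-bit representation of the same weights would take twenty bytes.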

Measured speeds: 136 tokens/s on an M4 Pro Mac, 440 tokens/s on an RTX 4090, and approximately 44 tokens/s on an iPhone 17 Pro Max, whereas a standard 16-bit 8B model cannot fit on any iPhone. Energy consumption is roughly 4 to 5 times lower than that of the 16-bit model. PrismML points out that current hardware is not designed for 1-bit inference, so the speed and energy-efficiency gains come mainly from the reduced memory footprint. If future hardware designed specifically for 1-bit computation (which needs only addition and subtraction, no multiplication) emerges, efficiency could improve by a further order of magnitude.
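The remark about addition and subtraction can be made concrete: when every weight is +1 or -1, each term of a dot product is either the activation itself or its negation, so no multiplication is needed. The following is a toy sketch of that idea in plain Python, not PrismML's kernel; `sign_dot` and `sign_matvec` are hypothetical names.

```python
def sign_dot(signs, activations):
    """Dot product of a +1/-1 weight row with activations, using only add/subtract."""
    total = 0.0
    for s, x in zip(signs, activations):
        if s == 1:
            total += x   # weight +1: add the activation
        else:
            total -= x   # weight -1: subtract the activation
    return total

def sign_matvec(rows, activations):
    """Matrix-vector product where every matrix entry is +1 or -1."""
    return [sign_dot(row, activations) for row in rows]

rows = [[1, -1, 1],
        [-1, -1, 1]]
x = [0.5, 2.0, -1.0]
y = sign_matvec(rows, x)
print(y)
```

Real implementations would operate on packed bits and vectorized instructions rather than Python loops; the point here is only that the arithmetic reduces to signed accumulation.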

PrismML has raised $16.25 million in SAFE and seed financing from Khosla Ventures, Cerberus Capital, and Caltech. Vinod Khosla, founder of Khosla Ventures, described the work as "not a small iteration, but a major technological breakthrough, a mathematical breakthrough, not just another small model".
