BlockBeats News, March 17th: Stablecoin issuer Tether announced that its AI platform QVAC Fabric has launched what it describes as the world's first cross-platform LoRA fine-tuning framework for Microsoft's BitNet (1-bit LLM) architecture, enabling billion-parameter language models to be trained and run for inference on ordinary hardware, including laptops, consumer-grade GPUs, and smartphones.
According to the announcement, the framework significantly lowers the VRAM and compute requirements for AI model training, supporting Intel, AMD, and Apple Silicon chips as well as a range of mobile GPUs (such as Qualcomm Adreno, Arm Mali, and Apple Bionic).
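Tether has not published QVAC Fabric's API, but the core technique the announcement combines, LoRA adapters trained on top of a frozen, ternary-quantized ("1-bit") base layer, can be sketched in a few lines of PyTorch. Everything below (the class name, quantization recipe, and hyperparameters) is illustrative, not Tether's or Microsoft's actual code:

```python
import torch
import torch.nn as nn

class BitLinearLoRA(nn.Module):
    """Frozen ternary base weight plus a trainable LoRA adapter.

    Illustrative sketch only: real BitNet kernels pack weights into
    compact bit codes and fuse the matmul; here the ternary values sit
    in a plain int8 tensor for readability.
    """

    def __init__(self, in_features, out_features, rank=8, alpha=16.0):
        super().__init__()
        # Quantize a base weight to {-1, 0, +1} with a per-tensor scale,
        # following the BitNet b1.58 absmean recipe; buffers receive no
        # gradients, so the base model stays frozen.
        w = torch.randn(out_features, in_features)
        scale = w.abs().mean()
        self.register_buffer(
            "w_ternary", torch.clamp(torch.round(w / scale), -1, 1).to(torch.int8)
        )
        self.register_buffer("scale", scale)
        # LoRA: delta_W = (alpha / rank) * B @ A, with B zero-initialized
        # so training starts from the unmodified base model.
        self.lora_A = nn.Parameter(torch.randn(rank, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_features, rank))
        self.scaling = alpha / rank

    def forward(self, x):
        base = x @ (self.w_ternary.float() * self.scale).t()  # frozen 1-bit path
        delta = (x @ self.lora_A.t()) @ self.lora_B.t() * self.scaling
        return base + delta

# Fine-tuning touches only the two small adapter matrices:
layer = BitLinearLoRA(512, 512)
optimizer = torch.optim.AdamW([layer.lora_A, layer.lora_B], lr=1e-3)
```

Because only lora_A and lora_B receive gradients, optimizer state scales with the adapter rank rather than the full model, which is what makes on-device fine-tuning of billion-parameter models plausible.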
In testing, a BitNet model with roughly 125 million parameters can be fine-tuned on a Samsung S25 in about 10 minutes; a 1 billion parameter model takes about 1 hour 18 minutes on a Samsung S25 and about 1 hour 45 minutes on an iPhone 16, and the team has also successfully fine-tuned a 1.3 billion parameter model on an iPhone 16.
On the performance side, BitNet inference on mobile GPUs runs 2 to 11 times faster than on the CPU, and tests show that BitNet-1B's memory footprint during inference and fine-tuning can be up to 77.8% lower than that of an equivalent 16-bit model.
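The 77.8% figure is Tether's reported test result; a back-of-envelope calculation (assumed figures, in Python) shows where a saving of that order comes from, and why it lands below the naive weight-only ratio:

```python
# Weight storage for a 1-billion-parameter model at different precisions.
# The split between weights and everything else is an assumption for
# illustration; Tether's 77.8% figure measures the full footprint.
params = 1_000_000_000

fp16_gb = params * 16 / 8 / 1e9      # 2.00 GB at 16 bits per parameter
bitnet_gb = params * 1.58 / 8 / 1e9  # ~0.20 GB; the "1-bit" BitNet family
                                     # actually uses ~1.58 bits (ternary weights)

print(f"fp16 weights:   {fp16_gb:.2f} GB")
print(f"BitNet weights: {bitnet_gb:.2f} GB "
      f"({1 - bitnet_gb / fp16_gb:.1%} smaller)")
# Weights alone shrink by roughly 90%, but activations, the KV cache, and
# LoRA/optimizer state remain in higher precision, pulling the end-to-end
# saving down toward the reported 77.8%.
```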
Tether CEO Paolo Ardoino said the technology aims to reduce dependence on large-scale cloud computing and specialized AI hardware, allowing model training to be completed entirely on local devices and laying the groundwork for new paradigms such as decentralized AI and federated learning.
