
Tinygrad claims GLM 5.2 can reach 120 tok/s on a $150,000 configuration of two interconnected Blackwell machines

BlockBeats News, June 21: GPU seller Tinygrad announced that, according to reliable sources, the GLM 5.2 model can reach an inference speed of 120 tokens per second on two interconnected Blackwell-architecture tinybox units.


The configuration is priced at $150,000 and can be purchased as either two standard tinybox units or a single tinybox Pro unit; Tinygrad says both options deliver the stated performance. Tinygrad pitches private deployment as a "one-time purchase, never pay for cloud fees" proposition, competing directly with on-demand cloud inference services.


The GLM team has not officially confirmed the claim, and Tinygrad has not disclosed further technical details.

