
$1,500 Trains a 1B Base Model from Scratch: Sapient Open-Sources HRM-Text, a Hierarchical Reasoning Architecture

According to Perfekto Beating's monitoring, Sapient Intelligence has open-sourced HRM-Text, a 1-billion-parameter text-generation base model. It is a pre-trained-only model built on the Hierarchical Reasoning Model (HRM) architecture. By building latent-space reasoning into the core of the architecture, it cuts the computational cost of pre-training a base model by a factor of 130 to 600.

Notably, HRM-Text was pre-trained on only 40 billion (40B) structured tokens, roughly one-thousandth of the data used for comparable conventional models. According to official tests, training the 1B version from scratch takes about 46 hours on two 8-GPU H100 servers at a cost of about $1,472, while the 0.6B version needs only a single node for 50 hours at a hardware cost of about $800. The full engineering framework, including data extraction, sequence packing, and PyTorch distributed training, has been open-sourced alongside the model.
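The two cost figures can be cross-checked with simple arithmetic. The sketch below converts each run into GPU-hours and derives the hourly rate implied by the quoted cost. It assumes the single node used for the 0.6B run also has 8 GPUs, which the report does not state explicitly. The function names are illustrative only.

```python
def gpu_hours(nodes: int, gpus_per_node: int, hours: float) -> float:
    """Total GPU-hours consumed by a training run."""
    return nodes * gpus_per_node * hours


def implied_rate(total_cost: float, nodes: int, gpus_per_node: int, hours: float) -> float:
    """Cost per GPU-hour implied by a quoted total cost."""
    return total_cost / gpu_hours(nodes, gpus_per_node, hours)


# 1B run: two 8-GPU H100 servers for about 46 hours, about $1,472 (from the report).
hours_1b = gpu_hours(nodes=2, gpus_per_node=8, hours=46)
rate_1b = implied_rate(1472, nodes=2, gpus_per_node=8, hours=46)

# 0.6B run: one node for 50 hours, about $800.
# ASSUMPTION: the single node has 8 GPUs, like the servers in the 1B run.
hours_06b = gpu_hours(nodes=1, gpus_per_node=8, hours=50)
rate_06b = implied_rate(800, nodes=1, gpus_per_node=8, hours=50)

print(f"1B:   {hours_1b} GPU-hours at ${rate_1b:.2f} per GPU-hour")    # 736 GPU-hours at $2.00
print(f"0.6B: {hours_06b} GPU-hours at ${rate_06b:.2f} per GPU-hour")  # 400 GPU-hours at $2.00
```

Under that assumption, both quoted costs correspond to the same rate of about $2 per H100 GPU-hour, so the two figures are mutually consistent.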

The cost reduction rests on a dual-timescale recurrent design. The model contains two sets of Transformer modules: a fast (lower-level) set and a slow (higher-level) set. The two iterate alternately over the same batch of inputs and exchange information by summing their states. This lets the model deepen its computation dynamically by running more iterations while the physical parameter count stays fixed.
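As a rough illustration of the idea, the toy sketch below alternates a fast and a slow module over the same input and mixes their states by summation. It is not Sapient's implementation: the blocks are elementwise tanh layers standing in for Transformer modules, and the schedule (several fast steps per slow step) and all names are assumptions made for the example.

```python
import math


def make_block(weight: float, bias: float):
    """Toy stand-in for a Transformer module: a fixed elementwise tanh layer."""
    def block(state):
        return [math.tanh(weight * x + bias) for x in state]
    return block


def add(*vectors):
    """Elementwise sum of equally sized vectors (the 'state summation')."""
    return [sum(values) for values in zip(*vectors)]


def hrm_forward(inputs, fast_block, slow_block, slow_steps, fast_steps_per_slow):
    """Alternate fast and slow updates; return the slow state and the block-call count."""
    z_fast = [0.0] * len(inputs)
    z_slow = [0.0] * len(inputs)
    block_calls = 0
    for _ in range(slow_steps):
        for _ in range(fast_steps_per_slow):
            # The fast module sees its own state, the slow state, and the input, summed.
            z_fast = fast_block(add(z_fast, z_slow, inputs))
            block_calls += 1
        # The slow module updates once from its own state plus the latest fast state.
        z_slow = slow_block(add(z_slow, z_fast))
        block_calls += 1
    return z_slow, block_calls


# The same two blocks (same parameters) are reused for every iteration.
fast = make_block(weight=0.9, bias=0.1)   # hypothetical parameters
slow = make_block(weight=0.5, bias=-0.2)  # hypothetical parameters

shallow_out, shallow_calls = hrm_forward([0.5, -0.25], fast, slow, 2, 3)
deep_out, deep_calls = hrm_forward([0.5, -0.25], fast, slow, 4, 3)
print(shallow_calls, deep_calls)  # 8 16
```

Doubling the number of slow steps doubles the effective depth (8 versus 16 block applications), while the parameters stay those of the two blocks.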

The steep drop in the cost barrier to pre-training means that many modeling ideas previously shelved because of their compute requirements can now be validated cheaply. Note that this release contains only unaligned, purely pre-trained weights: the model can only perform prefix continuation and cannot be used directly as a question-answering assistant.



