According to monitoring by Dongcha Beating, Nous Research has open-sourced Lighthouse Attention, a long-context pre-training mechanism. When processing 512K-length text on a single B200 GPU, its attention computation is about 17 times faster than the standard mechanism, and it delivers a 1.4 to 1.7 times end-to-end training speedup at a 98K context length.
The standard attention mechanism computes pairwise relationships between all tokens, so the compute required grows quadratically as the text gets longer. Lighthouse Attention takes a different approach: screen first, then refine. It quickly skims compressed summaries of the text at several levels of granularity, scores them to pick out the core segments, concatenates those segments into a much shorter sequence, and then processes that sequence directly with the efficient FlashAttention operator. Because the selection logic is fully decoupled from the kernel, developers neither have to write low-level kernel code nor add extra training objectives.
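To make the "screen first, then refine" idea concrete, below is a minimal single-level sketch in PyTorch. It is an illustration under assumptions, not Nous Research's implementation: the real mechanism works over multiple summary levels, and the function name, mean-pooling summaries, block size, and top-k value here are all placeholders. Context blocks are summarized by pooling, the summaries are scored against the queries, and dense attention runs only over the tokens of the top-scoring blocks, with PyTorch's built-in scaled_dot_product_attention standing in for the FlashAttention kernel.

```python
import torch
import torch.nn.functional as F

def select_then_attend(q, k, v, block_size=64, top_k=4):
    """Sketch of screen-first-then-refine attention (single level, non-causal).

    q: (heads, q_len, dim) queries for the current chunk
    k, v: (heads, kv_len, dim) full-context keys/values
    """
    heads, kv_len, dim = k.shape
    n_blocks = kv_len // block_size

    # 1) Compressed summaries: one pooled key per context block.
    k_blocks = k[:, : n_blocks * block_size].reshape(heads, n_blocks, block_size, dim)
    summaries = k_blocks.mean(dim=2)                                  # (heads, n_blocks, dim)

    # 2) Score the summaries against the (mean) query and keep the best blocks.
    q_mean = q.mean(dim=1, keepdim=True)                              # (heads, 1, dim)
    scores = (q_mean @ summaries.transpose(-1, -2)).squeeze(1)        # (heads, n_blocks)
    top_idx = scores.topk(min(top_k, n_blocks), dim=-1).indices       # (heads, top_k)

    # 3) Gather the selected blocks' tokens and run standard dense attention on
    #    the much shorter sequence; in the real system this step would call an
    #    optimized FlashAttention kernel.
    token_idx = (top_idx.unsqueeze(-1) * block_size +
                 torch.arange(block_size)).reshape(heads, -1)         # (heads, top_k*block)
    gather = token_idx.unsqueeze(-1).expand(-1, -1, dim)
    k_sel = torch.gather(k, 1, gather)
    v_sel = torch.gather(v, 1, gather)
    return F.scaled_dot_product_attention(q, k_sel, v_sel)

# Toy usage: 8 heads, 4K-token context, 128-token query chunk, 64-dim heads.
q = torch.randn(8, 128, 64)
k = torch.randn(8, 4096, 64)
v = torch.randn(8, 4096, 64)
out = select_then_attend(q, k, v)        # (8, 128, 64)
```

Because the selection happens entirely in this kind of framework-level code, the dense kernel it hands off to never needs to know that screening took place, which is what lets the approach reuse FlashAttention unchanged.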
Past acceleration schemes that took a similar approach often came with a side effect: once a model grows used to skimming, it can lose its original ability to read word by word. To avoid this pitfall, the research team runs most of pre-training in the accelerated mode and switches back to traditional full attention only for a short adaptation phase at the very end. In tests on a 5.3-billion-parameter model trained on 500 billion tokens, the model trained this way not only cut training time significantly but ultimately matched or even outperformed a baseline trained with the traditional method throughout.
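A rough sketch of that two-phase schedule is shown below. Everything in it is an assumption for illustration, including the helper names (set_attention_mode, train_step) and the length of the final full-attention window; the released recipe may divide the phases differently.

```python
# Two-phase schedule: sparse attention for most of training, then a brief
# full-attention adaptation window at the end.

def set_attention_mode(model, mode):
    # Placeholder: a real implementation would flip every attention layer
    # between the sparse "select-then-attend" path and dense full attention.
    model["attention_mode"] = mode

def train_step(model, batch):
    pass  # placeholder for forward/backward/optimizer update

model = {"attention_mode": "sparse"}
total_steps = 10_000
full_attn_steps = 200  # short adaptation window at the very end (assumed value)

for step in range(total_steps):
    mode = "sparse" if step < total_steps - full_attn_steps else "full"
    set_attention_mode(model, mode)
    train_step(model, batch=None)
```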
