According to DynaVis Beating, Datadog has released Toto 2, an open-source family of time series forecasting models spanning five versions from 4 million to 2.5 billion parameters. Toto 2 is the first foundation-model family in the time series domain to validate a scaling law: performance improves steadily as the parameter count grows, with no sign of saturation even at 2.5 billion parameters. Until now, the time series domain had lacked the kind of scaling law seen in large language models, where simply increasing model size yields consistent improvement.
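A scaling law of this kind means loss falls as a power of parameter count, which shows up as a straight line in log-log space. The sketch below illustrates the idea with synthetic numbers; the data points and the exponent are purely hypothetical, not Toto 2's published results.

```python
import numpy as np

# Hypothetical (parameter count, validation loss) pairs, purely illustrative.
# A scaling law predicts loss ~ a * N^(-b), i.e. a straight line in log-log space.
params = np.array([4e6, 22e6, 313e6, 1e9, 2.5e9])  # the five Toto 2 sizes
loss = 2.0 * params ** -0.05  # synthetic losses generated from a known power law

# Fit log(loss) = log(a) - b * log(N) by least squares.
slope, intercept = np.polyfit(np.log(params), np.log(loss), 1)
b, a = -slope, np.exp(intercept)

print(f"fitted exponent b = {b:.3f}, prefactor a = {a:.3f}")
# "No sign of saturation" means real measurements would keep tracking this
# line across the whole 4M-2.5B range instead of flattening out at the top.
```

If the largest model sat above the fitted line, that would be the saturation signal the article says is absent.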
The family comes in five sizes (4 million, 22 million, 313 million, 1 billion, and 2.5 billion parameters), all open-sourced under the Apache 2.0 license. Toto 2 ranks first on three major forecasting benchmarks: BOOM, GIFT-Eval, and TIME. Beyond the accuracy gains, the new models introduce a contiguous patch-masking mechanism that replaces the original autoregressive generation with single-pass forward prediction. The change significantly accelerates inference: the 313-million-parameter version matches the latency of Chronos-2, a comparable 120-million-parameter model.
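The speedup comes from the decoding pattern, not the model itself: autoregressive forecasting needs one forward pass per predicted step, while single-pass prediction emits the whole horizon in one call. A minimal sketch with a stand-in forecaster (a hypothetical toy that predicts the context mean, not Toto 2's architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_model(context: np.ndarray, horizon: int) -> np.ndarray:
    """Stand-in forecaster: predicts the context mean for every step.
    Purely illustrative, not Toto 2's actual model."""
    return np.full(horizon, context.mean())

history = rng.normal(size=64)
H = 16  # forecast horizon

# Autoregressive decoding: one step per forward pass, each prediction
# appended to the context before the next call. Cost: H model calls.
ctx = history.copy()
ar_forecast = []
for _ in range(H):
    step = toy_model(ctx, 1)[0]
    ar_forecast.append(step)
    ctx = np.append(ctx, step)

# Single-pass prediction (what the masking scheme enables): the whole
# horizon comes out of one forward call. Cost: 1 model call.
sp_forecast = toy_model(history, H)

print(len(ar_forecast), sp_forecast.shape)
```

With a real transformer, each autoregressive step is a full forward pass, so collapsing H calls into one is where the reported latency gain comes from.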
Cross-domain generalization is another highlight of this release. Toto 2's pre-training corpus consists solely of system-monitoring metrics and synthetic data, without any public general-purpose time series data, yet it still tops general-purpose forecasting leaderboards spanning a wide range of domains. The new models are also more parameter-efficient: the 22-million-parameter version outperforms the original Toto 1.0 on all core tests with roughly one-seventh the parameters.
