According to Dynasty Beating Monitoring, the DeepSeek V4 technical report discloses for the first time the core infrastructure behind Agent post-training and large-scale evaluation: DSec (DeepSeek Elastic Compute), a production-grade elastic compute sandbox.
Reinforcement learning for today's large models requires trial-and-error environments at very large scale. The report reveals that, in production, a single DSec cluster can concurrently schedule tens of thousands of sandbox instances. The system is implemented in Rust and integrates with DeepSeek's in-house 3FS distributed file system. Through on-demand loading, it overcomes the cold-start performance bottleneck of launching large numbers of sandboxes.
In terms of developer experience, DSec unifies function calls, containers, microVMs, and full VMs behind a single Python SDK, so switching among the four execution backends requires changing only one parameter. To address task preemption, a common problem in compute clusters, DSec introduces a global trajectory log: when a task resumes, the system "fast-forwards" by replaying the cached results of commands that have already run. This lets training resume quickly from the point of interruption and avoids the non-idempotent errors that re-executing those commands would cause.
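
The "fast-forward" idea can be illustrated with a minimal Python sketch. This is not the DSec SDK, whose API the report summary does not describe. The class name `ReplaySandbox`, the JSON-lines log format, and the `executor` callback are all hypothetical, chosen only to show how replaying cached results avoids running a side-effecting command twice after a preemption.

```python
import json
import os
import tempfile
from typing import Callable, List, Tuple


class ReplaySandbox:
    """Hypothetical wrapper (not the DSec API) that records every command
    result in an append-only log and replays it after a restart."""

    def __init__(self, log_path: str, executor: Callable[[str], str]):
        self.log_path = log_path
        self.executor = executor  # actually runs a command (assumed interface)
        self.cursor = 0           # position of the next command in the log
        self.entries: List[dict] = []
        if os.path.exists(log_path):
            # Resuming: load results recorded before the interruption.
            with open(log_path) as f:
                self.entries = [json.loads(line) for line in f if line.strip()]

    def run(self, command: str) -> str:
        if self.cursor < len(self.entries):
            # Fast-forward: return the cached result without re-executing.
            entry = self.entries[self.cursor]
            if entry["command"] != command:
                raise RuntimeError("trajectory diverged from the recorded log")
            self.cursor += 1
            return entry["result"]
        # Past the end of the log: execute for real and record the result.
        result = self.executor(command)
        entry = {"command": command, "result": result}
        with open(self.log_path, "a") as f:
            f.write(json.dumps(entry) + "\n")
        self.entries.append(entry)
        self.cursor += 1
        return result


def demo() -> Tuple[List[str], List[str], List[str]]:
    calls: List[str] = []

    def executor(command: str) -> str:
        # Stand-in for a non-idempotent command: each real call has a side effect.
        calls.append(command)
        return f"output #{len(calls)} of {command}"

    with tempfile.TemporaryDirectory() as tmp:
        log_path = os.path.join(tmp, "trajectory.jsonl")
        first = ReplaySandbox(log_path, executor)
        before = [first.run("install deps"), first.run("append row")]
        # Simulated preemption: the task restarts from its first step.
        resumed = ReplaySandbox(log_path, executor)
        after = [resumed.run(c) for c in ("install deps", "append row", "run tests")]
    return before, after, calls
```

In `demo()`, the resumed task reissues its first two commands but receives their logged results, so the executor runs only three times in total rather than five. A real system would also need to restore sandbox state such as the filesystem, which this sketch leaves out.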
