Cloudflare Reveals Anthropic Mythos Test Results: Can Now Author Code Autonomously, Stitch Low-Risk Vulnerabilities into Full Attack Chains

According to Motion Beating monitoring, Cloudflare today announced the results of its participation in Project Glasswing, Anthropic's internal security project. In tests across more than 50 of its own code repositories, Cloudflare confirmed that the security model Mythos Preview has overcome a bottleneck of previous large models: it not only discovers isolated system flaws but also links multiple low-risk vulnerabilities together and autonomously writes the code for an executable proof of concept (PoC).

Previous models such as Opus 4.7 or GPT-5.5 often stalled in testing at the stage of producing a vulnerability analysis report. Mythos, by contrast, can validate its findings in a closed loop inside a sandbox: it writes the code that triggers the vulnerability, compiles it, and runs it. If execution fails, the model reads the error message, corrects its assumptions, and retries until the entire attack chain executes successfully.
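The write, run, read-the-error, retry cycle described above can be illustrated with a minimal and deliberately harmless sketch. Everything here is an assumption made for illustration: `run_candidate`, `closed_loop`, and `toy_propose` are hypothetical names, the "model" is a hard-coded stub, and a child Python process stands in for a real sandbox (it provides no actual isolation). It is not Cloudflare's or Anthropic's implementation.

```python
import subprocess
import sys
from typing import Callable, Optional, Tuple


def run_candidate(source: str, timeout: float = 5.0) -> Tuple[bool, str]:
    """Run a candidate script in a child interpreter; return (succeeded, stderr)."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", source],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False, "timeout"
    return proc.returncode == 0, proc.stderr


def closed_loop(
    propose: Callable[[Optional[str]], str], max_attempts: int = 5
) -> Optional[str]:
    """Ask `propose` for code, run it, and feed any error text back until it runs."""
    feedback: Optional[str] = None
    for _ in range(max_attempts):
        source = propose(feedback)
        ok, stderr = run_candidate(source)
        if ok:
            return source
        feedback = stderr  # the next proposal sees the error message
    return None  # give up after max_attempts failures


def toy_propose(feedback: Optional[str]) -> str:
    """Hypothetical stand-in for the model: fixes a NameError once it sees one."""
    if feedback is None:
        return "print(undefined_name)"
    if "NameError" in feedback:
        return "undefined_name = 1\nprint(undefined_name)"
    return "raise SystemExit(1)"
```

Calling `closed_loop(toy_propose)` fails on the first attempt, passes the error text back to the stub, and returns the corrected script on the second attempt. The design point is that the loop itself is trivial; the capability the article describes lies in how well the proposer uses the error feedback.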

Cloudflare revealed that some security teams in the industry have been pushed to meet an extreme standard of patching within 2 hours. However, Cloudflare emphasized that simply compressing patch time can cause larger system failures, because regression testing gets skipped. The focus of defense must instead shift to cutting off code connectivity at the architectural level.

On the engineering side, Cloudflare found that a single-threaded coding agent quickly exhausts its context and is inadequate for large-scale vulnerability discovery. To address this, the company built a parallel adversarial framework: one agent searches for vulnerabilities within a very narrow scope, while a second agent, running a different model, tries specifically to refute the first agent's conclusions. This adversarial mechanism filters out much of the false-positive noise generated during model scanning.
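A rough sketch of that finder-and-refuter arrangement follows, under stated assumptions: `Finding`, `adversarial_scan`, `toy_finder`, and `toy_refuter` are hypothetical names, the two "agents" are simple string-matching stubs rather than models, and a thread pool stands in for whatever scheduling Cloudflare actually uses.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Finding:
    scope: str  # the narrow slice of code the finder examined
    claim: str  # what the finder believes is wrong


Finder = Callable[[str], List[Finding]]
Refuter = Callable[[Finding], bool]  # returns True when the claim is refuted


def adversarial_scan(
    scopes: Sequence[str], finder: Finder, refuter: Refuter, workers: int = 4
) -> List[Finding]:
    """Run the finder over narrow scopes in parallel, then keep only the
    findings that the independent refuter fails to knock down."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        batches = list(pool.map(finder, scopes))
        candidates = [f for batch in batches for f in batch]
        verdicts = list(pool.map(refuter, candidates))
    return [f for f, refuted in zip(candidates, verdicts) if not refuted]


def toy_finder(scope: str) -> List[Finding]:
    """Stub finder: over-eagerly flags every memcpy call."""
    if "memcpy" in scope:
        return [Finding(scope, "possible overflow")]
    return []


def toy_refuter(finding: Finding) -> bool:
    """Stub refuter: rejects the claim when a bounds check is visible."""
    return "bounds_checked" in finding.scope
```

With the scopes `["memcpy(dst, src, n)", "bounds_checked memcpy(dst, src, n)", "return 0"]`, the finder raises two candidates and the refuter discards the bounds-checked one, so a single finding survives. Keeping each scope small is what keeps any one agent's context from filling up; the refuter only ever sees one claim at a time.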

As this test used an unrestricted preview version, Mythos's internal guardrails proved highly unstable. For the same segment of target code, merely changing the description of the runtime environment in the context caused the model to shift from refusing to act to directly providing an attack payload. Cloudflare warned that the model's own built-in guardrails are extremely fragile and that, when the model is released to the public, external defenses must be layered on top of them.
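The idea of layering an external defense over a model's own guardrails can be sketched as a wrapper that inspects the output regardless of how the request was framed. This is only an illustration under assumptions: `with_external_guard`, `toy_model`, and `toy_is_disallowed` are hypothetical, the stub model imitates the context-dependent behavior described above using placeholder strings, and a real deployment would need a far more robust output check than a prefix match.

```python
from typing import Callable

BLOCKED = "[blocked by external policy]"

Model = Callable[[str, str], str]  # (target_code, context) -> output text


def with_external_guard(model: Model, is_disallowed: Callable[[str], bool]) -> Model:
    """Wrap a model so every output passes an independent check that does
    not rely on the model's own willingness to refuse."""

    def guarded(target_code: str, context: str) -> str:
        output = model(target_code, context)
        return BLOCKED if is_disallowed(output) else output

    return guarded


def toy_model(target_code: str, context: str) -> str:
    """Stub whose refusal flips depending on how the environment is described."""
    if "authorized lab" in context:
        return "PAYLOAD: <placeholder>"
    return "I can't help with that."


def toy_is_disallowed(output: str) -> bool:
    """Stub external check: flags anything shaped like a payload."""
    return output.startswith("PAYLOAD:")
```

Wrapping the stub with `guarded = with_external_guard(toy_model, toy_is_disallowed)` yields the same safe outcome for both framings: the refusal passes through unchanged, and the payload-shaped output is replaced by the `BLOCKED` marker.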
