Anthropic says its new model can autonomously find and weaponize zero-days across major software. If that claim broadly holds up, the real story is not just that AI can hack. It is that exploit development may be becoming cheap, autonomous, and accessible faster than defenders can adapt.
Anthropic’s announcement of Claude Mythos Preview should be read with both seriousness and caution. Seriousness, because if the company’s own assessment is broadly right, this is a threshold event in cybersecurity. Caution, because the public cannot yet independently verify most of the claims. Anthropic says more than 99 percent of the vulnerabilities it found remain unpatched, so details are being withheld as part of a coordinated disclosure process.
Even with that caveat, the report is striking. Anthropic says Mythos autonomously found previously unknown vulnerabilities in every major operating system and browser it was directed at, including a 27-year-old bug in OpenBSD and a 16-year-old flaw in FFmpeg. It says the model wrote complex exploits, including a four-vulnerability browser chain, a kernel privilege escalation based on race conditions, and a 20-gadget ROP chain for remote root on FreeBSD. And it says the earlier Opus 4.6 model scored near zero on the same exploit-development benchmark where Mythos succeeded 181 times.
But the most important claim in the report may be the simplest one: engineers with no security background could let the model run overnight and wake up to working exploits.
From Scarcity to Scale
That is what makes this feel like more than just another model upgrade. Discovery alone would be a big deal. Discovery plus weaponization plus autonomy plus accessibility is something else. It suggests that exploit development may be moving from artisanal work done by a relatively small class of elite researchers into a more industrial process driven by compute, tool use, and persistent iteration.
If that is right, then the story is not simply that AI can help with security research. It is that the economics of offensive cyber may be changing. For years, one constraint on attackers was scarcity: scarce talent, scarce time, and scarce ability to turn obscure bugs into reliable compromise. A model that can search, test, fail, adapt, and keep going for hours changes that. The bottleneck shifts from finding bugs to triaging them, fixing them, validating patches, and deploying updates fast enough to matter.
Three Underappreciated Implications
First, technical debt starts to look like exploit debt. Old code was not safe simply because it survived. In many cases it was just underexamined. A 27-year-old OpenBSD bug or a 16-year-old FFmpeg bug is not only an impressive anecdote. It is evidence that there may be a large inventory of vulnerabilities sitting in mature codebases that were previously too expensive for humans to find and weaponize at scale.
Second, the unit of risk may increasingly be the exploit chain, not the individual CVE. Anthropic highlights a four-bug browser chain for a reason. Real compromise often comes from composition, where several modest weaknesses become dangerous when linked together. As AI systems get better at chaining, risk models centered on isolated vulnerabilities may start to understate real-world exposure.
Third, Anthropic says these cyber capabilities were not the result of explicitly training Mythos for offensive hacking, but emerged from broader gains in reasoning and coding. If true, that matters far beyond this single model. It suggests that governance based only on training intent is too narrow. Dangerous capabilities may emerge as a side effect of general model improvement.
Glasswing: Buying Time
Anthropic’s response, Project Glasswing, should be understood against that backdrop. The company is limiting access to a small set of critical industry partners and open source developers so defenders can patch systems before the model is more widely available. That is a cautious and well-considered step. It also signals something bigger: frontier AI labs are becoming part of the cybersecurity governance stack. Anthropic is not just releasing a product. It is acting, at least in part, like a coordinated disclosure broker, a security clearinghouse, and a temporary gatekeeper for a potentially destabilizing capability.
Glasswing should be seen for what it is: an attempt to buy time.
Anthropic predicts that, in the long run, models like Mythos will benefit defenders more than attackers. That may happen, especially for the best-resourced organizations. Big browser vendors, hyperscalers, and major software companies can use models to audit code, shrink bug density, and speed up remediation. But that optimistic story rests on a hidden assumption: that defensive diffusion will outrun offensive diffusion.
That is far from guaranteed.
Four Clocks
Cybersecurity now has at least four clocks running at once. There is a capability clock, where model performance improves in months and sometimes jumps suddenly. There is a disclosure clock, where vendors reproduce issues, coordinate fixes, and prepare releases over weeks or months. There is a deployment clock, where real organizations roll out patches over months, quarters, or longer. And there is a diffusion clock, where frontier techniques spread into open models, specialized fine-tunes, reusable agentic workflows, and determined adversarial hands.
Glasswing may slow that fourth clock. It does not stop it.
The Sufficiency Threshold
The crucial point is that open or adversarial systems do not need to match Mythos at the frontier to create serious risk. In cybersecurity, the danger threshold is not parity. It is sufficiency. A weaker model can still be highly dangerous if it is good enough to find exploitable flaws in common targets, adapt known exploit patterns to new environments, chain medium-severity issues into practical compromise, and lower the skill required for mid-tier operators. The relevant question is not, “When will open source equal Mythos?” It is, “When will open and specialized systems become good enough for serious offensive use?”
That threshold may arrive much sooner than full parity.
There is also a structural asymmetry in how safety measures work. Frontier labs can impose guardrails, rate limits, monitoring, account controls, and selective access. Threat actors, once they possess a capable enough model or workflow, can remove all of that friction. In that sense, alignment is a tax on the compliant, not on the determined. A somewhat weaker but unrestricted system can be more operationally dangerous than a stronger model behind heavy controls.
And the capability most likely to spread fastest may not even be the weights. It may be the workflow. Anthropic’s report hints that a large part of the jump comes from autonomous scaffolding: tool use, iterative debugging, search over hypotheses, and the ability to persist through failed attempts. Once that recipe becomes available (and it will), many different models may become far more useful than their benchmark rankings would suggest.
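To make "scaffolding" a bit more concrete, here is a deliberately toy sketch of that kind of loop in Python. Nothing in it comes from Anthropic’s report: propose_candidate and run_checks are hypothetical stand-ins for a model call and a sandboxed test harness, and the "task" is a trivial guessing game. The point is only the shape of the loop: propose a hypothesis, test it, record the feedback, and keep going within a budget.

```python
# A minimal, illustrative sketch of a "propose, test, learn" agent loop.
# propose_candidate and run_checks are hypothetical stand-ins; in a real
# scaffold they would wrap a model call and a sandboxed build-and-test harness.

import random
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Attempt:
    candidate: str
    passed: bool
    feedback: str


@dataclass
class History:
    attempts: list = field(default_factory=list)


def propose_candidate(history: History) -> str:
    """Hypothetical stand-in for a model call: pick the next hypothesis,
    conditioned on every earlier failure recorded in history."""
    tried = {a.candidate for a in history.attempts}
    options = [f"hypothesis-{i}" for i in range(10)]
    untried = [o for o in options if o not in tried]
    return random.choice(untried) if untried else random.choice(options)


def run_checks(candidate: str) -> Attempt:
    """Hypothetical stand-in for a sandboxed harness: run the candidate and
    return structured feedback instead of a bare pass/fail."""
    passed = candidate == "hypothesis-7"  # toy oracle, for illustration only
    feedback = "all checks green" if passed else f"{candidate}: checks failed"
    return Attempt(candidate, passed, feedback)


def agent_loop(budget: int = 50) -> Optional[Attempt]:
    history = History()
    for _ in range(budget):                 # persistence: keep going after failures
        attempt = run_checks(propose_candidate(history))
        history.attempts.append(attempt)    # feedback accumulates across attempts
        if attempt.passed:
            return attempt
    return None


if __name__ == "__main__":
    result = agent_loop()
    print(result.feedback if result else "budget exhausted")
```

Scaffolding like this is largely model-agnostic, which is part of why it diffuses so easily: swap a different model in behind propose_candidate and the rest of the recipe still works.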
Attack Scales Like Software. Defense Scales Like Bureaucracy.
Exploit generation is becoming global and centralized, while patching remains local and fragmented.
One powerful workflow can be applied across countless targets. Remediation, by contrast, is organization-specific. Every environment has compatibility risk, change-management friction, staffing limits, and legacy dependencies. Attack scales like software. Defense still scales like bureaucracy.
That makes the likely outcome uneven. The top of the market may get safer. The long tail may get worse. Major vendors with mature release engineering and strong security teams may use AI to pull ahead. But small vendors, underfunded open source projects, hospitals, municipalities, embedded systems, industrial environments, and companies trapped on aging software may fall further behind. A long-term defender advantage may be true for the top slice of the ecosystem and false for the internet overall.
Beyond Shift-Left
So where does this leave the old shift-left instinct? It is still right, but it is no longer enough. If exploit discovery and weaponization are becoming cheap, defenders need to improve across the whole lifecycle. They need better secure development, AI-assisted code review, and threat modeling on the left. They need memory-safe languages, stronger sandboxing, safer parsers, and reduced blast radius in the middle. They need faster detection, rollback, auto-update, and recovery on the right. And they need to fund and maintain the open source foundations that sit upstream of nearly everything.
Above all, they need patch velocity.
In a world of autonomous exploit generation, release engineering becomes security engineering.
The organizations that win are not merely the ones that can find vulnerabilities. They are the ones that can fix them, test them, backport them, and deploy them at machine speed without breaking production.
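As a rough illustration of what "machine speed" could look like, the sketch below wires patch application, testing, and promotion into a single automated gate. It is not anything described in Anthropic’s report: the repository path and patch queue are hypothetical assumptions, and a real pipeline would add staged rollout, canaries, and rollback.

```python
# A minimal sketch of "release engineering as security engineering":
# every incoming fix is applied, tested, and either promoted or reverted
# automatically. Paths are illustrative assumptions.

import subprocess
from pathlib import Path

REPO = Path("/srv/checkouts/product")    # hypothetical working checkout
PATCH_QUEUE = Path("/srv/patch-queue")   # hypothetical queue of .patch files


def run(cmd: list[str]) -> bool:
    """Run a command inside the checkout and report success."""
    return subprocess.run(cmd, cwd=REPO).returncode == 0


def gate(patch: Path) -> bool:
    """Apply one patch, run the test suite, and keep or revert it."""
    if not run(["git", "apply", "--check", str(patch)]):
        return False                                  # does not apply cleanly
    if not run(["git", "apply", str(patch)]):
        return False
    if run(["pytest", "-q"]):                         # stand-in for the full suite
        return run(["git", "commit", "-am", f"auto-apply {patch.name}"])
    run(["git", "checkout", "--", "."])               # revert on test failure
    return False


if __name__ == "__main__":
    for patch in sorted(PATCH_QUEUE.glob("*.patch")):
        print(f"{patch.name}: {'promoted' if gate(patch) else 'rejected'}")
```

Getting a loop like this to run continuously, across real production constraints and thousands of heterogeneous environments, is where most organizations will struggle.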
That is the real challenge hidden inside Anthropic’s announcement. If Mythos marks a step change in offensive capability, then the defender advantage will not emerge automatically from the existence of a stronger model. It will depend on whether this temporary lead is converted into permanent reductions in vulnerability density and exploitability before the methods themselves diffuse.
So yes, Anthropic may be right about the destination. But its own report points to a harder near-term truth. Project Glasswing is not a solution. It is a shrinking patch window. If defenders use that window to eliminate bug classes, harden the most exposed software, and radically speed up remediation, this moment may eventually look like the start of a safer era. If they do not, Mythos may be remembered less as a tool for defense than as the moment exploit scarcity ended.
Disclaimer: These are my personal thoughts and do not reflect the views of my current employer or any previous employers.