Today's signal
Anthropic built an AI model so dangerous the company refused to release it. Claude Mythos Preview can autonomously find and exploit zero-day vulnerabilities in every major operating system and web browser. Instead of shipping it, Anthropic called Apple, Google, Microsoft, and nine other tech giants to form Project Glasswing: a coordinated effort to patch the internet before attackers build something similar.
Why it matters
This is the first time an AI company has voluntarily withheld a model because its offensive cybersecurity capabilities were too powerful. Mythos Preview found thousands of previously unknown vulnerabilities, including bugs that survived 27 years of human review and millions of automated security scans. Its predecessor, Claude Opus 4.6, could barely turn a discovered bug into a working exploit; Mythos does it 72% of the time. The capabilities weren't deliberately trained. They emerged as a side effect of making the model better at coding and reasoning. That's the part that should worry everyone: the next frontier model from any lab could wake up with the same abilities.
The take
Anthropic just proved something the cybersecurity industry has feared for years: AI doesn't need to be trained to hack. It just needs to be trained to code really, really well. The responsible move here, building a coalition of defenders before releasing the model, is smart. But it's also a temporary advantage. If these capabilities emerged by accident in one lab, they'll emerge in others. The real question isn't whether AI can break the internet. It's whether we can patch it faster than the next model ships.
The number
72.4%: the rate at which Claude Mythos Preview converts discovered vulnerabilities into working exploits. Its predecessor managed close to 0%. That jump didn't come from security training. It came from getting better at code.
👉 Read the full breakdown: How Anthropic Built an AI Too Dangerous to Release — and What It Found