Wednesday, April 08, 2026

They Built a Hacking Machine and Called It Safety


Anthropic just unveiled an AI that can find 27-year-old software bugs, break into every major web browser, and chain Linux kernel vulnerabilities into full root access. Then they said the quiet part out loud: they're not releasing it to the public.

The name is Claude Mythos Preview. The project is called Glasswing. The press release is a masterclass in the genre of "we did the scary thing responsibly."

Let me translate.

Anthropic spent the last several weeks running their new model against real software. Not toy examples. Not capture-the-flag exercises. Real operating systems, real network stacks, real code running on real machines somewhere in the world right now.

The model found thousands of previously unknown vulnerabilities. It found a 27-year-old bug sitting in OpenBSD. It found a 16-year-old hole in FFmpeg, which is embedded in practically every piece of media software on the planet. It chained multiple Linux kernel vulnerabilities together and climbed to root.

Then it wrote the exploit code. Autonomously. Without a human in the loop.

That last part deserves to sit in its own paragraph.

No one typed instructions at each step. The model did the reconnaissance, identified the weakness, built the attack, and delivered it. The only human decision required was what target to point it at.

Anthropic is calling this a win for defense. Their argument is that finding bugs before the bad guys do is better than finding them after. This is correct as far as it goes.

The problem is where it stops going.

The model exists now. The capabilities exist now. The knowledge of how to build this kind of system exists now, and Anthropic just published a 15,000-word technical blog post explaining its scaffold in detail. That's not a disclosure policy. That's a curriculum.

The tech press will spend the next two weeks debating whether Anthropic is a hero or a villain. That's the wrong frame entirely.

The right frame: we are watching the infrastructure of digital security become a competitive market, and Anthropic just rang the opening bell.

I've been inside this game. I ran software at scale when the internet was still figuring out what it was. I watched what happened when cryptography moved from university labs to corporate products to commodity toolkits. The pattern is always the same. A capability that requires genius to build requires a lot less genius to use once it's been built. And the time between "genius builds it" and "anyone can use it" keeps shrinking.

What Anthropic has demonstrated is that the automation of sophisticated cyberattacks is no longer a theoretical concern. It is an engineering project with a known recipe.

Their solution is access controls. A gated program, partners vetted by Anthropic, model usage credits disbursed to the right people. Cisco. Microsoft. Google. JPMorganChase. The Linux Foundation.

These are not small names. These are the people who already control the pipes.

And that is exactly the problem.

The communities who will absorb the cost of a more dangerous internet are not at that table. The rural hospital running Windows XP because the IT budget was zero. The municipal utility whose SCADA system predates the iPhone. The small nonprofit news organization hosting its content on infrastructure it barely understands. The school district whose cybersecurity plan is a laminated sheet in the front office.

Anthropic committed $100 million in model usage credits to Project Glasswing. They donated $2.5 million to open-source security foundations. This is real money. It is also not the money that gets a part-time IT contractor into a room with a frontier AI model that can find bugs in the software running your water treatment plant.

The defense will get stronger at the top. The gaps will get wider everywhere else.

There's a researcher at ETH Zurich named Lennart Maschmeyer who makes the optimistic case in a new paper. His argument is that AI is better at detection than deception, better at defense than offense, and that the gap between the two actually widens as the stakes rise. Nation-states hitting sophisticated targets will find AI makes attacks harder, not easier, because deception is hard and detection is where AI excels.

This is a reasonable argument. I want it to be true.

But Maschmeyer's framework relies on defenders adopting AI automation at roughly the same pace as attackers. He even names the condition explicitly: "vulnerabilities need to be patched." And the pace of patching, he notes, is "still often determined by bureaucracy and fragmented responsibilities."

That's the Front Range in two words. Fragmented. Responsibilities.

Colorado has over 270 municipalities. It has hospitals in places where the nearest urban IT support is two hours away. It has agricultural cooperatives whose water management software was written when Windows 95 was current. The gap between what Anthropic's Glasswing partners can do and what a Weld County irrigation district can do is not a gap that $100 million in enterprise usage credits will close.

The optimistic scenario requires everyone to upgrade at once. The real scenario is that the organizations with resources get AI-powered defenses, and the organizations without resources get AI-powered attackers pointed at them by whoever gets access next.

Anthropic says they won't release Mythos Preview to the general public. This is not a permanent condition. It is a posture. The posture will change when competition demands it, when a regulator blinks, when a foreign government releases their own version and the argument for restraint evaporates overnight.

I have been in the rooms where these decisions get made. I know what happens when a capability exists and a market reward exists. The two eventually find each other.

The question is not whether this tool gets loose. It's whether the communities without lobbyists and usage credits and SVPs of Security will have anything to hold against it when it does.

Right now the answer is no.

That should make you furious. It makes me furious. The infrastructure of civic digital life is about to face a machine that can find the holes in it faster than anyone can patch them, and the people designing the access controls are the same people who already own the network.

The glasswing butterfly, for which this project is named, has transparent wings. You can see right through it.

Anthropic chose that name because they think it means clarity and resilience.

I think it means we should be watching very carefully what moves through it.


------------------------------------------------------------------------------------------------------------------


**CLAIMS AND SOURCES**

1. Claude Mythos Preview identified a 27-year-old vulnerability in OpenBSD's TCP SACK implementation.

Source: https://red.anthropic.com/2026/mythos-preview/?utm_source=tldrai

Source: https://ftp.openbsd.org/pub/OpenBSD/patches/7.8/common/025_sack.patch.sig

2. Claude Mythos Preview identified a 16-year-old vulnerability in FFmpeg, used by a vast range of media software worldwide.

Source: https://red.anthropic.com/2026/mythos-preview/?utm_source=tldrai

Source: https://code.ffmpeg.org/FFmpeg/FFmpeg/pulls/22499/files

3. Claude Mythos Preview autonomously discovered and fully exploited a 17-year-old remote code execution vulnerability in FreeBSD (CVE-2026-4747) without any human involvement in discovery or exploitation.

Source: https://red.anthropic.com/2026/mythos-preview/?utm_source=tldrai

Source: https://nvd.nist.gov/vuln/detail/CVE-2026-4747

4. Mythos Preview chained multiple Linux kernel vulnerabilities together to achieve local privilege escalation to root.

Source: https://red.anthropic.com/2026/mythos-preview/?utm_source=tldrai

Source: https://github.com/torvalds/linux/commit/e2f78c7ec1655fedd945366151ba54fcb9580508

5. Mythos Preview identified vulnerabilities in every major web browser and autonomously constructed JIT heap spray exploits.

Source: https://red.anthropic.com/2026/mythos-preview/?utm_source=tldrai

6. Anthropic committed $100 million in model usage credits to Project Glasswing.

Source: https://www.anthropic.com/glasswing

7. Anthropic donated $2.5 million to Alpha-Omega and OpenSSF as part of Project Glasswing.

Source: https://www.anthropic.com/glasswing

8. Project Glasswing partners include Amazon Web Services, Cisco, Microsoft, Google, JPMorganChase, the Linux Foundation, CrowdStrike, and Palo Alto Networks.

Source: https://www.anthropic.com/glasswing

9. Anthropic does not plan to make Claude Mythos Preview generally available.

Source: https://www.anthropic.com/glasswing

10. A Chinese government hacking group used Anthropic's Claude to automate a cyberattack that compromised several targets in 2025.

Source: https://www.lawfaremedia.org/article/the-ai-revolution-in-cyber-conflict (Lennart Maschmeyer, Lawfare, April 8, 2026)

11. Researcher Lennart Maschmeyer (ETH Zurich) argues that AI is better suited to cyber defense than offense because it excels at detection but struggles with deception, and that defense advantages widen at higher-stakes targets.

Source: https://www.lawfaremedia.org/article/the-ai-revolution-in-cyber-conflict

12. Maschmeyer's framework identifies patch speed as a critical condition for defenders to benefit: "the pace of patching is still often determined by bureaucracy and fragmented responsibilities."

Source: https://www.lawfaremedia.org/article/the-ai-revolution-in-cyber-conflict
