Anthropic announced the launch of Project Glasswing, a cybersecurity initiative that pairs its unreleased model, Claude Mythos Preview, with a coalition of twelve major tech and finance companies in an effort to find and patch software vulnerabilities across critical infrastructure before hackers can exploit them. At the moment, the company does not plan to make the model available to the public.
Newton Cheng, Frontier Red Team Cyber Lead at Anthropic, told VentureBeat:
“We do not plan to make Claude Mythos Preview generally available due to its cybersecurity capabilities. However, given the rate of AI progress, it will not be long before such capabilities proliferate, potentially beyond actors who are committed to deploying them safely. The fallout — for economies, public safety, and national security — could be severe.”
Anthropic says it has already identified thousands of high-severity zero-day vulnerabilities in every major operating system and web browser using the model including a 27-year-old vulnerability in OpenBSD, a security-focused operating system, and a 16-year-old vulnerability in FFmpeg, a very commonly used video encoding and decoding library.
Project Glasswing launch partners include Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorgan Chase, the Linux Foundation, Microsoft, Nvidia, and Palo Alto Networks, and Anthropic says it has extended invites to more than 40 other organizations that build or maintain critical software.
Is this top level extortion? (You might want to pay for this model and see what it can do before we release it…) Or a company preemptively taking accountability for the impact its product might have on the world and making a genuine responsible attempt at mitigating the risks?
I'd like to believe it's the latter, but forgive my skepticism. The AI industry has so far earned it.

