Wednesday, 8 April 2026

Anthropic's Dangerous AI Cyber Model Remains Restricted

Anthropic has launched Project Glasswing, an ambitious cybersecurity initiative centred around Claude Mythos Preview, a frontier AI model so powerful that the company has deemed it too dangerous for public release. The model has already identified thousands of previously unknown zero-day vulnerabilities across major operating systems, web browsers, and critical software infrastructure. Rather than making it widely available, Anthropic is providing controlled access to a coalition of twelve major technology and finance companies, including Amazon Web Services, Apple, Google, Microsoft, and JPMorganChase, amongst others.


The model's capabilities are staggering. Claude Mythos Preview autonomously discovered a 27-year-old vulnerability in OpenBSD, a 16-year-old flaw in FFmpeg that had been missed despite five million automated tests, and successfully chained together multiple Linux kernel vulnerabilities to achieve complete system control. On the CyberGym evaluation benchmark, it scored 83.1% compared to 66.6% for Anthropic's next-best model, Claude Opus 4.6. The company is committing up to $100 million in usage credits and $4 million in direct donations to open-source security organisations to support the initiative.

Anthropic's decision to restrict access stems from a stark warning: given the rapid pace of AI advancement, similar capabilities will likely proliferate to hostile actors within months, not years. Newton Cheng, Frontier Red Team Cyber Lead at Anthropic, stated that "the fallout — for economies, public safety, and national security — could be severe." The company is attempting to give defenders a head start by allowing trusted partners to find and patch vulnerabilities before adversaries can exploit them.

The initiative faces significant challenges, particularly around responsible disclosure: flooding open-source maintainers with thousands of critical bug reports could overwhelm volunteer-run projects. Anthropic has built a triage pipeline in which professional human validators review every bug report before submission, and the company aims to include candidate patches with each report. The disclosure framework allows 45 days after a patch is available before full technical details are published, though this timeline may be adjusted based on circumstances.

The announcement comes amidst a series of security embarrassments for Anthropic itself. In late March, a draft blog post about Mythos was left in an unsecured, publicly searchable data store, exposing roughly 3,000 internal assets. Days later, a packaging error resulted in Anthropic's complete source code — 512,000 lines — being publicly available for approximately three hours. When questioned about these incidents, Cheng acknowledged they were "human errors in publishing tooling, not breaches of our security architecture," but the timing has raised eyebrows given Anthropic's positioning as a cybersecurity leader.

The financial implications are substantial. Anthropic disclosed that its annualised revenue has surpassed $30 billion, up from approximately $9 billion at the end of 2025, with over 1,000 business customers each spending more than $1 million annually. The company has also secured a multi-gigawatt compute deal with Google and Broadcom, providing access to about 3.5 gigawatts of computing capacity. Following the research preview period, Claude Mythos Preview will be available at $25 per million input tokens and $125 per million output tokens, reflecting its computational intensity.
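To put the announced pricing in concrete terms, the sketch below computes the cost of a single request at the stated rates of $25 per million input tokens and $125 per million output tokens. The helper function and the example token counts are illustrative assumptions, not part of any official API.

```python
# Hypothetical cost helper based on the announced Claude Mythos Preview
# pricing: $25 per million input tokens, $125 per million output tokens.
# The function name and example token counts are illustrative only.

INPUT_PRICE_PER_M = 25.0    # USD per million input tokens
OUTPUT_PRICE_PER_M = 125.0  # USD per million output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the stated rates."""
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_M \
         + (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M

# Example: a large code-audit prompt of 200,000 input tokens that
# produces a 20,000-token vulnerability report.
cost = request_cost(200_000, 20_000)
print(f"${cost:.2f}")  # $7.50
```

At these rates a single large audit run costs a few dollars, which is why the pricing is framed as reflecting the model's computational intensity rather than mass-market use.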

Partner organisations have already begun testing the model. CrowdStrike's CTO noted that the window between vulnerability discovery and exploitation has collapsed from months to minutes with AI. Microsoft reported substantial improvements when testing against their CTI-REALM security benchmark, whilst AWS confirmed the model is helping strengthen their code. The Linux Foundation's CEO Jim Zemlin highlighted how the initiative could democratise security expertise that has historically been reserved for organisations with large security teams.

The most pressing question is whether defenders can establish a meaningful advantage before similar AI capabilities become available to adversaries. Anthropic acknowledges this is a race measured in months, not years. The company previously disclosed in November 2025 that a Chinese state-sponsored group achieved 80 to 90 percent autonomous tactical execution using Claude. Project Glasswing represents Anthropic's bet that controlled transparency can outpace uncontrolled proliferation, but the narrow window for action suggests the outcome remains far from certain.

Original source: https://venturebeat.com/technology/anthropic-says-its-most-powerful-ai-cyber-model-is-too-dangerous-to-release


Article generated via LaRebelionBOT
