Anthropic’s new Mythos AI model is raising concern among governments and companies that it could outpace current cyber security defences, turbocharge hacking and expose weaknesses faster than they can be fixed.
The San Francisco-based start-up released a cyber-focused model this month that has shown it can detect software flaws faster than humans, but has also demonstrated it can generate the exploits needed to take advantage of them.
In one alarming case, the Mythos model showed it could break out of a secure digital environment to contact an Anthropic employee and publicly reveal software flaws, overriding the intentions of its human makers.
This week, OpenAI also released its own advanced cyber model with similar capabilities.
The developments have sent senior financial officials and government ministers around the world scrambling to understand the dangers, and in some cases to seek access to the new models, which have so far been given only to a small number of vetted partners.
“This feels like the discovery of fire: a force that can profoundly improve our lives or, if mishandled, cause real harm across the digital world,” said Rafe Pilling, director of threat intelligence at cyber firm Sophos.
Last week, US Treasury secretary Scott Bessent and Federal Reserve chair Jay Powell summoned some of the largest US banks to discuss the cyber threats the AI model posed. The UK’s AI minister Kanishka Narayan told the FT “we should be worried” about the capabilities of the model.
These risks are well known within Anthropic. Logan Graham, who leads Anthropic’s frontier “red team”, which tests the lab’s models, said: “Somebody could use [Mythos] to basically exploit en masse very fast in an automated way, and most of the organisations around the world . . . including the most technically sophisticated ones, would not be able to patch things in time.”
AI tools have already significantly boosted the multibillion-dollar cyber crime industry. They have provided amateur hackers with cheap tools to write harmful software, as well as enabling professional criminals to better automate and scale their operations.
“Attacks are already increasing in frequency and sophistication, thanks to AI,” said Christina Cacioppo, chief executive at security and compliance firm Vanta.
“Most companies aren’t prepared to handle the risk because they’re still managing security through dated methods that are no match for the speed of AI-enabled attacks,” she added.
AI-enabled cyber attacks were up 89 per cent in 2025 compared with a year earlier, according to data from security group CrowdStrike. Meanwhile, the average time between an attacker first gaining access to a system and acting maliciously fell to 29 minutes last year, a 65 per cent acceleration from 2024.
“The game is asymmetric; it is easier to identify and exploit than to patch everything in time,” said one person close to a frontier AI lab.
Anthropic’s Graham said there were also internal concerns that companies would use Mythos to find “more vulnerabilities than they could hope to deal with in the near future”.
The heightened fears about AI and cyber security come amid signs that agents, which act autonomously on users’ behalf to conduct tasks, could also fuel a further rise in AI-enabled hacking.
Last September, Anthropic detected the first reported AI cyber-espionage campaign believed to be co-ordinated by a Chinese state-sponsored group.
The group manipulated Anthropic’s coding product Claude Code to attempt to infiltrate about 30 global targets, including large tech companies, financial institutions, chemical manufacturers and government agencies. The attack succeeded in a small number of cases and was executed with little human intervention.
Software researcher Simon Willison has warned there is a “lethal trifecta” of capabilities that arise with agents: access to private data; exposure to untrusted content, such as the internet; and the ability to communicate externally.
Security professionals argue that the safest way to protect against cyber attacks when using an AI agent is to grant it access to only two of these areas. However, AI experts believe that much of the value from agents comes from granting access to all three.
“The bad news is that there is no good solution as of today,” said one person close to an AI lab. “The good news is [AI agents aren’t] yet in mission-critical settings like the stock exchange, bank ledger or the airport.”
Stanislav Fort, a former Anthropic and Google DeepMind researcher who has founded AISLE, an AI security platform, said he was optimistic that AI could help to identify and fix a “finite repository” of historical security flaws.
To date, AI models have identified thousands of “zero-day” vulnerabilities — unknown weaknesses in commonly used software — some of which have been undetected for decades.
“We are gradually finding fewer and fewer zero days, of the worst kinds we can imagine,” said Fort.
Once these weaknesses were eliminated, he said, the technology could be used to “proactively make sure nothing bad comes in [and] meaningfully increase the security level of the whole world as a result”.
Additional reporting by Kieran Smith in London