UK gov's Mythos AI tests help separate cybersecurity threat from hype

Technology | April 14, 2026

New model is the first AI system to complete a difficult multistep infiltration challenge.

Last week, Anthropic announced it was restricting the initial release of its Mythos Preview model to "a limited group of critical industry partners," giving them time to prepare for a model that it said is "strikingly capable at computer security tasks." Now, the UK government's AI Security Institute (AISI) has published an initial evaluation of the model's cyberattack capabilities that adds some independent public verification to those Anthropic reports.

AISI's findings show that Mythos isn't significantly different from other recent frontier models in tests of individual cybersecurity-related tasks. But Mythos could set itself apart from previous models through its ability to effectively chain these tasks into the multistep series of attacks necessary to fully infiltrate some systems.

"The Last Ones" finally falls

AISI has been putting various AI models through specially designed Capture the Flag challenges since early 2023, when GPT-3.5 Turbo struggled to complete any of the group's relatively low-level "Apprentice" tasks. Since then, the performance of subsequent models has risen steadily, to the point where Mythos Preview can complete north of 85 percent of those same Apprentice-level CTF tasks.

Source: Ars Technica
