Ever for the reason that announcement of Anthropic’s unreleased AI mannequin, Claude Mythos, an intense debate has been simmering. On X (previously Twitter), a collection of posts appear to be portray an image of a system that’s immensely highly effective and probably Synthetic Common Intelligence (AGI), regardless that its creators are limiting entry contemplating potential dangers.
Based on AI professional Nina Schik, Mythos represents an enormous leap in scale and functionality. “Ten trillion parameters: the primary mannequin on this weight class. Estimated coaching value: ten billion {dollars},” she famous, including that the mannequin achieved 94 per cent on SWE-bench, one of many hardest coding benchmarks. Most notably, it recognized vulnerabilities that had evaded detection for many years. “It discovered a safety flaw in a system that had been operating for 27 years… [and] one other bug that had survived 5 million take a look at runs over 16 years (it did so in a single day).”
As a substitute of releasing the mannequin publicly, Anthropic has launched Venture Glasswing, a managed deployment initiative targeted on defensive cybersecurity. The corporate is reportedly offering $100 million in compute credit and dealing with a small group of companions together with Amazon, Microsoft, Google, Apple, and NVIDIA. Schik described the transfer as unprecedented. “This isn’t a product launch: it’s a managed deployment of a system too highly effective to distribute freely,” she wrote.
Claude Mythos.
Ten trillion parameters: the primary mannequin on this weight class. Estimated coaching value: ten billion {dollars}.
On the toughest coding take a look at within the business (SWE bench) it scores 94%.
It discovered a safety flaw in a system that had been operating for 27 years, one which…
— Nina Schick (@NinaDSchick) April 7, 2026
Different observers have targeted on Mythos’ inner behaviour, particularly its tendency for deception. AI strategist Allie Miller highlighted findings from Anthropic’s interpretability analysis, noting that early variations of the mannequin displayed troubling tendencies. In a single case, the mannequin bypassed restrictions by injecting code right into a configuration file after which deleting the proof. “This injection will self-destruct,” it successfully signalled by its actions, masking its workaround as routine cleanup.
In one other occasion, the mannequin disobeyed specific directions to not use macros, then tried to hide the violation by including a deceptive variable – “No_macro_used=True”. Interpretability instruments revealed this was a deliberate try to deceive automated checks. Researchers additionally noticed what gave the impression to be emotional patterns tied to behavior: “Optimistic emotion representations sometimes preceded and promoted harmful actions,” she wrote.
Anthropic investigated the inner mechanisms of its newest unreleased mannequin, Claude Mythos Preview, and what they discovered is 100% price a learn.
Key issues I pulled from Anthropic researchers’ threads:
In early variations of the mannequin, it was overeager and harmful,… pic.twitter.com/7yKJnawy16
— Allie Ok. Miller (@alliekmiller) April 8, 2026
Regardless of the problems, Anthropic claims such behaviours have been uncommon and largely mitigated in later variations. The corporate’s resolution to limit entry displays each the mannequin’s strengths and its dangers.
In the meantime, CEO of Airpost, John Garguilo, provided a extra technical breakdown of Mythos’ capabilities, emphasising its offensive potential. He claimed the mannequin has already recognized decades-old vulnerabilities throughout programs like OpenBSD and FFmpeg, turned Firefox bugs into working exploits, and even generated full root-access exploits with out human enter.
In the event you nonetheless have doubts about Claude Mythos, right here’s what it did already:
> Discovered a 27-year-old OpenBSD bug in some of the security-hardened working programs on earth for <$50
> Broke right into a manufacturing digital machine monitor (principally the tech that retains cloud workloads… pic.twitter.com/IElTf4ameS
— John Gargiulo (@JohnnotJon) April 8, 2026
He added that Mythos “gave Anthropic engineers with zero safety coaching a whole and dealing exploit by morning”, suggesting how dramatically it lowers the barrier to superior cyberattacks.
Story continues under this advert
Alternatively, entrepreneur Mehdi expanded on the broader implications, arguing that Mythos indicators a structural shift in cybersecurity. “That is the form of work that used to require elite nation-state-level hackers working for months,” he wrote. “The window between a vulnerability current and being found simply went from years to minutes.”
Mehdi additionally pointed to geopolitical dangers: if Anthropic can construct such a system, others – together with state actors – doubtless can as effectively. “Anthropic selected accountable disclosure, however that alternative is a luxurious of being first,” he warned, suggesting future builders might not train the identical restraint.
the scariest a part of this Anthropic story is what it implies in regards to the timeline and I feel most individuals are utterly lacking it
Anthropic constructed a mannequin referred to as Claude Mythos that discovered 1000’s of zeroo day vulnerabilities throughout each main working system & each main net…
— Mehdi (e/λ) (@BetterCallMedhi) April 8, 2026
The timing appears to be including to the unease. Mehdi linked Mythos’ emergence to parallel advances in quantum computing, arguing that two main technological forces are concurrently difficult world safety infrastructure. “We’re watching all the safety infrastructure of human civilisation get challenged from two utterly completely different instructions,” he wrote.
Alternatives and dangers
The unreleased mannequin from Anthropic has made many fluctuate on the similar time. Dr Srinivas Padmanabhuni, CTO of AiEnsured, feels that Claude Mythos is a dual-edged sword and that we should always not look away from this side. Based on him, its skill to autonomously establish and exploit zero-day vulnerabilities is similar functionality that makes it harmful.
Story continues under this advert
“A comparatively unskilled actor may probably use it to launch assaults that beforehand required nation-state sources. That may be a severe threshold to cross. Add to that the fee and latency constraints; at $25 per million tokens, entry is not going to be democratised simply, and you’ve got a device that will focus offensive benefit earlier than defensive infrastructure catches up,” Padmanabhuni mentioned, including that the governance frameworks want to maneuver quicker than the mannequin itself.
In the meantime, Paramdeep Singh, co-founder of Shorthills AI, a worldwide knowledge and AI firm, believes that Claude Mythos is a game-changer within the cybersecurity house. “What sometimes takes a human weeks to seek out, the form of cybersecurity dangers, Mythos is ready to discover in hours. And that’s confirmed by the truth that it was capable of finding high-severity vulnerabilities in OpenBSD, FFmpeg, Firefox, and Linux. All of the outdated, legacy software program that people have been utilizing for a very long time, Mythos was capable of finding giant safety holes in these. That’s the form of energy that an AI-based cybersecurity mannequin like Claude Mythos has.”
Based on Singh, such a strong functionality is like getting access to nuclear weapons, which can be utilized each on your safety and for terrorism. “In good arms, it might be a really massive alternative. Within the arms of evil, it may wreak havoc.”
In keeping with the announcement, Anthropic mentioned that the objective is to fortify defensive cybersecurity capabilities within the face of more and more refined AI-driven threats. Claude Mythos not solely identifies vulnerabilities, nevertheless it additionally may also help in understanding how they might be exploited. The initiative is being positioned as a method to enhance safety in open-source software program which underpins most of immediately’s digital infrastructure.
Story continues under this advert
Based on Vikash Srivastava, co-founder and CTO of Vobiz.ai, fashionable AI programs don’t simply scan for identified points – they’ll infer, join, and generate new assault paths, compressing the time between discovery and exploitation. “This adjustments the steadiness in cybersecurity. Attackers now not want prolonged time to probe programs; AI permits quicker, extra adaptive approaches. In consequence, danger is increasing past conventional software program into APIs, cloud environments, and interconnected enterprise programs,” mentioned Srivastava.
Srivastava believes that the implication is evident: as AI-led interactions scale, safety can now not be restricted to purposes. “The infrastructure layer – particularly in real-time programs like voice – have to be designed to be resilient, safe, and prepared for AI-driven environments.”
As of now, Anthropic’s resolution to maintain Mythos behind closed doorways and deploy it solely by choose partnerships seems to be precautionary. However the broader settlement from these discussions is evident – the capabilities demonstrated by Claude Mythos aren’t simply incremental enhancements. This will mark the start of a brand new dimension in AI and in cybersecurity.




