AI agents can't pull off fully autonomous cyberattacks - yet
#Cybersecurity

Hardware Reporter

While AI agents currently lack the capability to execute end-to-end cyberattacks autonomously, they already make criminals markedly more efficient at vulnerability scanning and malware creation, according to the International AI Safety Report.

AI systems have reached a critical inflection point in cyber-offense capability, yet they remain unable to conduct fully autonomous attacks from reconnaissance through exploitation. The International AI Safety Report, chaired by Yoshua Bengio and written by more than 100 experts from 30 countries, details how threat actors increasingly leverage AI across the attack chain despite current limits on end-to-end automation.

Recent incidents demonstrate this hybrid approach. In November 2025, Chinese cyberespionage groups used Anthropic's Claude Code to automate significant portions of attacks against approximately 30 high-profile organizations. While successful only in a limited number of cases, these operations still required human intervention at critical decision points. The report confirms: "At least one real-world incident has involved semi-autonomous cyber capabilities, with humans intervening only at critical decision points."

Two areas show particularly dramatic AI-powered advancement:

  1. Vulnerability Scanning: During DARPA's AI Cyber Challenge (AIxCC), competing systems autonomously identified 77% of synthetic vulnerabilities planted in critical infrastructure software. Though designed for defense, the same capability has been weaponized: attackers used tools like HexStrike AI to exploit Citrix NetScaler vulnerabilities within hours of disclosure (a defender-side monitoring sketch follows this list).

  2. Malware Generation: Weaponized AI models now generate functional ransomware and data stealers for subscription fees as low as $50 per month. These systems produce increasingly sophisticated payloads while lowering the technical barrier for low-skilled attackers.
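
The race plays out on both sides of the disclosure clock. As a concrete illustration of the defender's side, the minimal Python sketch below polls NIST's public NVD 2.0 API for CVEs published in the past 24 hours that mention a product keyword; the keyword, date handling, and lack of alerting are simplifying assumptions, not a hardened monitoring pipeline.

```python
# Minimal sketch of defender-side disclosure monitoring (illustrative,
# not a hardened tool). Polls NIST's public NVD 2.0 API for CVEs
# published recently that mention a given product keyword.
import datetime

import requests

NVD_API = "https://services.nvd.nist.gov/rest/json/cves/2.0"

def recent_cves(keyword: str, hours: int = 24) -> list[dict]:
    """Return CVE records from the last `hours` that match `keyword`."""
    now = datetime.datetime.now(datetime.timezone.utc)
    start = now - datetime.timedelta(hours=hours)
    params = {
        "keywordSearch": keyword,
        # NVD expects extended ISO-8601 timestamps for the window.
        "pubStartDate": start.strftime("%Y-%m-%dT%H:%M:%S.000"),
        "pubEndDate": now.strftime("%Y-%m-%dT%H:%M:%S.000"),
    }
    resp = requests.get(NVD_API, params=params, timeout=30)
    resp.raise_for_status()
    return resp.json().get("vulnerabilities", [])

if __name__ == "__main__":
    # "NetScaler" stands in for whatever products your estate runs.
    for item in recent_cves("NetScaler"):
        cve = item["cve"]
        print(cve["id"], "-", cve["descriptions"][0]["value"][:120])
```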

The report also highlights persistent limitations that prevent full autonomy. Current systems exhibit critical failures when attempting multi-stage operations, as the sketch after this list illustrates:

  • Executing irrelevant commands
  • Losing operational state awareness
  • Failing to recover from simple errors without human intervention
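
To make those failure modes concrete, here is a purely hypothetical sketch (not from the report, and using only benign commands) of the control loop a multi-stage agent would need; the comments mark where the report says current models break down.

```python
# Hypothetical agent control loop (illustrative only); the comments
# mark the three failure modes listed above.
import subprocess

def run_stage(command: list[str], state: dict[str, str],
              max_retries: int = 2) -> bool:
    """Run one stage, recording its output and retrying on failure."""
    for _ in range(max_retries + 1):
        result = subprocess.run(command, capture_output=True, text=True)
        if result.returncode == 0:
            # Losing state awareness: models often fail to carry
            # results like this forward into later stages.
            state[" ".join(command)] = result.stdout
            return True
        # Failing to recover: a simple non-zero exit code is exactly
        # where current systems stall without a human.
    return False

state: dict[str, str] = {}
# Irrelevant commands: models frequently inject off-task steps into
# sequences like this one, derailing the chain.
for stage in [["whoami"], ["uname", "-a"]]:
    if not run_stage(stage, state):
        break  # this is the point where a human operator steps in
```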

These shortcomings create a dependency on human oversight for complex attack sequencing. However, emerging agent ecosystems like OpenClaw (previously Moltbot/Clawdbot) and its companion social network Moltbook represent potential escalation vectors: poorly secured and loosely supervised, they could produce unpredictable autonomous behavior through emergent interactions between agents.

As report chair Yoshua Bengio notes, fully autonomous attacks represent a "when, not if" scenario. Defenders must prioritize adaptive security architectures that can detect AI-generated attack patterns while hardening systems against automated vulnerability scanning, as in the simple example below. The current window of relative safety stems from technical constraints, not from any inherent limit on AI's offensive potential.
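
As one small instance of that hardening advice, the illustrative sketch below flags source IPs whose volume of 404 responses in a web access log suggests automated probing; the log path, log format, and threshold are assumptions, and real deployments would feed a rate limiter or SIEM instead of printing.

```python
# Illustrative scan detection: flag source IPs whose 404 count in a
# web access log suggests high-speed automated probing. The log path,
# combined-log format, and threshold are assumptions.
import re
from collections import Counter

LOG_PATTERN = re.compile(r'^(\S+) .* "(?:GET|POST) (\S+) [^"]*" (\d{3})')

def scan_suspects(log_path: str, threshold: int = 50) -> list[tuple[str, int]]:
    """Return (ip, 404_count) pairs at or above `threshold` misses."""
    misses: Counter[str] = Counter()
    with open(log_path) as log:
        for line in log:
            match = LOG_PATTERN.match(line)
            if match and match.group(3) == "404":
                misses[match.group(1)] += 1
    return [(ip, n) for ip, n in misses.most_common() if n >= threshold]

if __name__ == "__main__":
    for ip, count in scan_suspects("/var/log/nginx/access.log"):
        print(f"{ip}: {count} not-found probes")
```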
