The gap between a vulnerability becoming public and attackers weaponizing it has shrunk from weeks to roughly a day. With AI tools now discovering and exploiting flaws at machine speed, defenders are shifting attention from raw patch counts to a harder question: which exposures can actually be used against us, and would our controls catch them?

For most of its history, vulnerability management quietly depended on a luxury nobody talked about: time. A flaw would get disclosed, and somewhere between that disclosure and the first working exploit there was a buffer of weeks or months. That buffer is what made the standard playbook function. You triaged by severity, scheduled the fix into the next change window, validated it, and moved on. Remove the buffer, and the playbook stops working, even if your team is doing everything right.
That is roughly what has happened over the past two years, and the data backing it up is hard to argue with.
The discovery side went exponential
The clearest signal comes from how fast vulnerabilities are now being found. In its May 2026 update, Anthropic reported that it and roughly 50 partners used a gated frontier model to surface more than 10,000 high- or critical-severity vulnerabilities in widely used software in a single month. Pointed at Firefox, the same gated model reportedly produced 181 working exploits where the previous generation managed two. One of the bugs it turned up in OpenBSD had apparently gone unnoticed for 27 years.
Whatever you think of vendor-reported numbers, the direction is unambiguous. Work that used to require rare, specialized talent now runs at the speed and scale of a compute budget. And finding bugs is only half of it. A February 2026 AWS threat-intelligence writeup described an intrusion campaign that needed no zero-days at all, just weak credentials, automated through a custom tooling server that ran offensive utilities autonomously against hundreds of devices across dozens of countries. The expertise barrier that used to slow attackers down is eroding from both ends.
Time-to-exploit is the number that actually matters
Defenders have long tracked time-to-exploit (TTE): the gap between a CVE going public and its first confirmed exploitation in the wild. For years that number sat comfortably in the weeks-to-months range, which is exactly why scheduled patching worked. Recent measurements put it closer to a day for the fastest-moving flaws. When the offense operates in hours and remediation operates in weeks, the breach lands in the space between, almost by default.
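To make the arithmetic concrete, here is a minimal sketch of how TTE and the resulting exposure window might be computed. The records, identifiers, and the `exposure_window_days` helper are illustrative assumptions, not real CVEs or published measurements.

```python
from datetime import date
from statistics import median

# Hypothetical records: (identifier, public disclosure, first confirmed exploitation).
# Made-up values for illustration only.
RECORDS = [
    ("EXAMPLE-0001", date(2026, 1, 5), date(2026, 1, 6)),
    ("EXAMPLE-0002", date(2026, 2, 10), date(2026, 2, 24)),
    ("EXAMPLE-0003", date(2026, 3, 1), date(2026, 3, 1)),
]


def time_to_exploit_days(disclosed: date, exploited: date) -> int:
    """Days from public disclosure to first confirmed exploitation.

    A negative value means exploitation came before disclosure.
    """
    return (exploited - disclosed).days


def exposure_window_days(tte_days: int, remediation_days: int) -> int:
    """Days a flaw is being exploited in the wild but is not yet fixed.

    Both inputs are counted from the disclosure date. Exploitation that
    predates disclosure is treated as starting on day zero.
    """
    return max(0, remediation_days - max(0, tte_days))


def median_tte(records) -> float:
    """Median TTE in days across a set of records."""
    return median(time_to_exploit_days(d, e) for _, d, e in records)


if __name__ == "__main__":
    print("median TTE (days):", median_tte(RECORDS))
    # A one-day TTE against a three-week patch cycle leaves 20 exposed days.
    print("exposure window:", exposure_window_days(1, 21))
```

The second function is the point of the exercise: when TTE exceeds the remediation time the window is zero, and once TTE collapses toward a day, nearly the whole patch cycle becomes exposed time.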
Verizon's 2026 Data Breach Investigations Report reflects the pressure. It ties roughly a third of initial-access activity to vulnerability exploitation and notes the trend is climbing, in part because AI coding assistants now put exploit-building and tool-porting within reach of attackers who never had those skills before. More uncomfortably, the same report shows remediation moving the wrong way: median fix time for known-exploited vulnerabilities rising year over year, and the share of organizations fully patching them falling. Even strong performers close only a minority of known-exploited bugs in the first week after detection, a figure that has barely improved despite sustained spending.

"Just patch faster" runs into physics
The instinctive fix, echoed by regulators and boards alike, is to compress patch timelines toward same-day for critical issues. The intent is right. The mechanics are stubborn. Patches still have to clear regression testing, wait for change windows, collect approvals, and respect uptime and compliance commitments. Yanking production offline to outrun an exploit is not a fix; it is a self-inflicted outage with extra steps.
This is where a lot of practitioners have landed: telling teams to patch faster does not change the underlying constraints, and the gap between disclosure volume and remediation capacity keeps widening. The median organization is now patching meaningfully more known-exploited vulnerabilities per year than it was, and that count predates the flood of AI-discovered flaws still working through disclosure pipelines.
Severity scores stopped scaling
The older model assumed a manageable trickle of criticals. Score them with CVSS, fix the worst first, repeat. That logic breaks when hundreds or thousands of high-severity disclosures arrive in a quarter. A backlog where nearly everything is rated a 9 or a 10 is not a prioritized list; it is noise wearing a number.
The more useful questions are harder and more specific. Is this flaw actually reachable in our environment? Would our existing controls already block it? Does it chain into anything that touches assets we care about? Severity scores answer none of those. They describe the vulnerability in the abstract, not the risk to a particular organization with a particular stack of defenses.
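One way to picture the difference is a triage function that looks at those three questions before it ever looks at the score. This is a rough sketch under stated assumptions: the `Exposure` fields, tier names, and ordering are hypothetical, and in practice the boolean answers would have to come from actual testing rather than guesswork.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exposure:
    """Hypothetical record for one finding in one environment."""
    cve_id: str
    cvss: float
    reachable: bool                 # can an attacker reach it here?
    blocked_by_controls: bool       # do existing controls already stop it?
    chains_to_critical_asset: bool  # does it lead toward assets we care about?


# Lower number means handle sooner. Tier names are illustrative.
TIER_ORDER = {"urgent": 0, "elevated": 1, "scheduled": 2, "backlog": 3}


def triage(e: Exposure) -> str:
    """Assign a tier from environmental context, ignoring CVSS."""
    if not e.reachable:
        return "backlog"
    if e.blocked_by_controls:
        return "scheduled"
    if e.chains_to_critical_asset:
        return "urgent"
    return "elevated"


def prioritize(exposures):
    """Sort by tier first; CVSS only breaks ties inside a tier."""
    return sorted(exposures, key=lambda e: (TIER_ORDER[triage(e)], -e.cvss))


# Three of these share the same 9.8 score yet land in different tiers.
EXPOSURES = [
    Exposure("EX-1", 9.8, reachable=False, blocked_by_controls=False,
             chains_to_critical_asset=False),
    Exposure("EX-2", 9.8, reachable=True, blocked_by_controls=True,
             chains_to_critical_asset=True),
    Exposure("EX-3", 9.8, reachable=True, blocked_by_controls=False,
             chains_to_critical_asset=True),
    Exposure("EX-4", 7.5, reachable=True, blocked_by_controls=False,
             chains_to_critical_asset=False),
]

if __name__ == "__main__":
    for e in prioritize(EXPOSURES):
        print(e.cve_id, e.cvss, triage(e))
```

In this toy ordering a reachable, unblocked 7.5 outranks two 9.8s, which is exactly the kind of result a severity-only queue can never produce.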
The strategy shift: validate, don't just enumerate
This is the reasoning behind the growing interest in Breach and Attack Simulation (BAS) and the broader category Gartner now describes as Adversarial Exposure Validation, which folds together control effectiveness ("are my defenses actually working?") with business context ("which assets are reachable and worth protecting?").
The core idea is to stop reasoning about vulnerabilities in theory and start testing them in place. BAS takes real adversary techniques and safely runs them against your live prevention and detection stack. It is not a scan and not a paper mapping; it is an exercise that shows what your tools block, what they merely detect, and what slips through untouched. Practically, that does three things a vulnerability list cannot:
- It separates theoretical risk from real risk. A flaw your WAF, IPS, and EDR already neutralize is a fundamentally different problem from one that walks straight in. Knowing which is which stops every CVE from being treated as a five-alarm fire.
- It validates controls you already bought. Many enterprises run dozens of security products with overlapping, drifting policies. Testing whether they actually fire as configured surfaces the gaps hiding between them.
- It buys time to patch safely. If you can prove a critical asset is already covered by hardened controls, the patch can move through normal change control instead of an emergency push. If it is not covered, you mitigate first and patch on a sane schedule.
Paired with autonomous penetration testing, which tries to chain exposures from an initial foothold toward high-value targets, validation starts to answer the two questions that matter together: can they get in, and would we notice.
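The block, detect, and miss distinction described above, and the patching decision that follows from it, can be sketched in a few lines. The result tuples, technique labels, and track names below are assumptions made for illustration; they do not reflect any particular BAS product's output format.

```python
from collections import Counter
from enum import Enum


class Outcome(Enum):
    BLOCKED = "blocked"    # prevention control stopped the technique
    DETECTED = "detected"  # not stopped, but an alert fired
    MISSED = "missed"      # neither stopped nor alerted


def classify(prevented: bool, alerted: bool) -> Outcome:
    """Reduce one simulated technique's result to a single outcome."""
    if prevented:
        return Outcome.BLOCKED
    if alerted:
        return Outcome.DETECTED
    return Outcome.MISSED


# Hypothetical results for one asset: (technique label, prevented, alerted).
RESULTS = [
    ("credential-brute-force", True, True),
    ("public-app-exploit", False, True),
    ("lateral-movement", False, False),
]


def summarize(results) -> Counter:
    """Count outcomes across all simulated techniques."""
    return Counter(classify(p, a) for _, p, a in results)


def patch_track(results) -> str:
    """Pick a remediation track for the asset from its test results."""
    if not results:
        return "untested"  # no evidence either way
    if all(classify(p, a) is Outcome.BLOCKED for _, p, a in results):
        return "normal change control"
    return "mitigate first, then patch on schedule"


if __name__ == "__main__":
    print(summarize(RESULTS))
    print(patch_track(RESULTS))
```

Note that an empty result set returns "untested" rather than defaulting to the relaxed track, since the absence of a test is not evidence of coverage.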
Machine-speed offense needs machine-speed validation
There is a catch worth being honest about. If adversaries are operating autonomously, a validation cycle that takes a human analyst a week is stale before it finishes. That is pushing vendors toward agentic approaches that assemble and run simulations in minutes rather than days.
It also raises a real safety concern. As Picus CTO Volkan Erturk has cautioned, pointing a raw generative model at "go write me an exploit" can produce live malware or hallucinated techniques no real group uses, neither of which you want detonating in production or shaping defenses against attacks that do not exist. The more defensible pattern keeps the model in a coordination role rather than a creation one: a multi-agent system reads a fresh threat report, gathers and validates intelligence, and maps the adversary's known techniques onto a curated, pre-vetted library of safe test components, rather than inventing payloads from scratch. The output is a scoped, runnable simulation assembled quickly, with humans reviewing exceptions instead of hand-driving every step.
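The coordination-not-creation pattern is easier to see in miniature. The sketch below assumes a hypothetical pre-vetted library keyed by technique label; the labels and component names are invented for illustration, and the intelligence-gathering agents are reduced to a plain list of reported techniques.

```python
# Hypothetical curated library: technique label -> pre-vetted safe test component.
VETTED_LIBRARY = {
    "credential-brute-force": "sim-auth-lockout-check",
    "public-app-exploit": "sim-benign-waf-probe",
    "lateral-movement": "sim-segmentation-check",
}


def assemble_simulation(reported_techniques, library=VETTED_LIBRARY):
    """Map reported techniques onto vetted components only.

    Returns (plan, needs_review). Anything without a vetted component is
    routed to a human rather than generated on the fly.
    """
    plan, needs_review = [], []
    seen = set()
    for technique in reported_techniques:
        if technique in seen:
            continue  # ignore duplicates, keep first-seen order
        seen.add(technique)
        component = library.get(technique)
        if component is None:
            needs_review.append(technique)
        else:
            plan.append(component)
    return plan, needs_review


if __name__ == "__main__":
    # Techniques as they might be extracted from a fresh threat report.
    reported = [
        "credential-brute-force",
        "novel-firmware-implant",
        "public-app-exploit",
        "credential-brute-force",
    ]
    plan, needs_review = assemble_simulation(reported)
    print("run:", plan)
    print("human review:", needs_review)
```

The design choice that matters is the `None` branch: an unknown technique produces a review item, never a freshly written payload.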

Where this leaves defenders
None of this retires patching. Remediation is still the thing that actually removes risk, and nothing here changes that. What changes is the role patching plays. When flaws are discovered by the thousands and weaponized in hours, patch cadence alone cannot be the whole defense, because it physically cannot keep pace with disclosure volume.
What does scale with the threat is validation: continuously confirming what your controls stop, proving what is genuinely exploitable against you, and spending scarce remediation effort only where it changes the outcome. The practical takeaway for security leaders is less about adopting any single product and more about reframing the question. Stop asking only "what is vulnerable?" and start asking "what is exploitable against us right now, and would we catch it?" The first question generates a backlog. The second one generates a plan.
For teams wanting to ground this in current data, the Verizon DBIR and CISA's Known Exploited Vulnerabilities catalog are the most useful starting points, both for understanding how fast exploitation is moving and for pressure-testing whether your own remediation timelines are keeping up. The honest answer for most organizations is that they are not, and that gap is exactly where exposure validation is meant to earn its budget.
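As a starting point for that pressure test, the two remediation figures discussed earlier (median fix time and the share closed in the first week) are simple to compute from your own records. The ticket tuples below are hypothetical; a real version would pull from your tracker and restrict the list to entries that appear in the KEV catalog, a step this sketch leaves out.

```python
from datetime import date
from statistics import median

# Hypothetical tickets: (id, detected on, fixed on or None if still open).
TICKETS = [
    ("TICKET-1", date(2026, 4, 1), date(2026, 4, 4)),   # fixed in 3 days
    ("TICKET-2", date(2026, 4, 1), date(2026, 4, 13)),  # fixed in 12 days
    ("TICKET-3", date(2026, 4, 2), date(2026, 5, 2)),   # fixed in 30 days
    ("TICKET-4", date(2026, 4, 10), None),              # still open
]


def days_to_fix(tickets):
    """Fix durations in days for tickets that have been closed."""
    return [(fixed - found).days for _, found, fixed in tickets if fixed is not None]


def median_days_to_fix(tickets):
    """Median fix time over closed tickets, or None if nothing is closed."""
    durations = days_to_fix(tickets)
    return median(durations) if durations else None


def first_week_closure_rate(tickets) -> float:
    """Share of all tickets, open ones included, fixed within 7 days."""
    if not tickets:
        return 0.0
    closed_fast = sum(1 for d in days_to_fix(tickets) if d <= 7)
    return closed_fast / len(tickets)


if __name__ == "__main__":
    print("median days to fix:", median_days_to_fix(TICKETS))
    print("closed in first week:", first_week_closure_rate(TICKETS))
```

Open tickets stay in the denominator of the closure rate on purpose; dropping them would flatter the number.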
