CVE-2025-10263: A Critical TLBI Race Hits Nearly Every Modern Arm Core, From Neoverse N1 to NVIDIA's Olympus


Hardware Reporter

A critical privilege escalation flaw rooted in TLB invalidation timing affects an enormous swath of Arm cores, from datacenter Neoverse parts to phone-class Cortex-A76 designs, and even NVIDIA's brand-new Olympus cores in the Vera CPU. Here is what the bug actually does, why the mitigation costs you extra barriers, and which silicon in your rack needs patching.

Arm published Security Advisory CVE-2025-10263 today, and the affected-parts list reads like a census of every relevant Arm core shipped in the last seven years. The flaw was assigned a CVE number last year but kept under embargo until now, with Linux kernel patches landing the same day. The short version: under a specific timing condition during a memory permission change, completion of an affected memory access may not actually be guaranteed by the completion of a TLB Invalidate (TLBI) instruction. That gap is enough to write to resources owned by a higher exception level, which is the textbook definition of privilege escalation.

If you run Arm servers, build homelab clusters on Ampere or Graviton-class silicon, or just care about what your SBC fleet is exposed to, this one is worth understanding properly rather than waving off as "apply updates."

What the bug actually is

To see why this matters, you need to understand the relationship between the TLB, page permissions, and the barriers that are supposed to keep them in sync.

The Translation Lookaside Buffer (TLB) caches virtual-to-physical address translations along with their permission attributes (read, write, execute, and which privilege levels may use them). When the operating system changes a mapping, say revoking write access to a page or moving a page from EL0-accessible to EL1-only, it updates the page tables in memory and then issues a TLBI to evict the now-stale cached entry. The kernel follows that with a DSB (Data Synchronization Barrier), which does not complete until the invalidation has, so later instructions can rely on the old entry being gone.
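To make the caching relationship concrete, here is a minimal toy model in Python. It is a software analogy only: the ToyMMU class and its method names are invented for illustration and do not model any real core's pipeline or timing. It shows why a permission change in the page table has no effect on a cached translation until that entry is invalidated.

```python
# Toy model only: real TLBs are hardware structures, and the class and
# method names here are hypothetical, invented for this illustration.
class ToyMMU:
    def __init__(self):
        self.page_table = {}   # authoritative mapping: page -> permissions
        self.tlb = {}          # cached copies of page_table entries

    def map_page(self, page, perms):
        # Update the page table only; the TLB is deliberately left untouched.
        self.page_table[page] = frozenset(perms)

    def can_write(self, page):
        # A TLB hit uses the cached permissions without rereading the page table.
        if page not in self.tlb:
            self.tlb[page] = self.page_table[page]
        return "w" in self.tlb[page]

    def tlbi(self, page):
        # Evict the cached entry so the next access rereads the page table.
        self.tlb.pop(page, None)


mmu = ToyMMU()
mmu.map_page(0x1000, {"r", "w"})
assert mmu.can_write(0x1000)       # fills the TLB with the read-write entry

mmu.map_page(0x1000, {"r"})        # the kernel revokes write in the page table
stale = mmu.can_write(0x1000)      # True: the cached entry is stale
mmu.tlbi(0x1000)                   # invalidate the cached entry
fresh = mmu.can_write(0x1000)      # False: permissions reread from the page table
print(stale, fresh)
```

The model has no notion of accesses in flight, so it cannot reproduce a timing race; it only illustrates what the invalidation is for.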

The architectural promise is that once the TLBI plus DSB sequence completes, no in-flight access can still be using the old permissions. CVE-2025-10263 breaks that promise. On affected cores, a memory access that started before the permission change can complete after the TLBI and DSB report done, using the stale, more-permissive mapping. An attacker who can line up the timing (hence the "specific timing condition" language in the advisory) can land a write against memory that now belongs to a higher exception level.

The practical attack shape is a race: trigger a permission downgrade or ownership transfer, and squeeze a write through the window before the core actually honors the invalidation. Win the race and you are writing into EL1 (kernel) or higher-owned resources from a lower privilege level.

The software mitigation, and why it is not free

Arm's prescribed workaround is blunt: any software performing TLB invalidation that applies to stage 1, or to stage 1 and stage 2 translation information, must perform an additional TLBI followed by another DSB. In other words, the existing single invalidate-plus-barrier is no longer sufficient on its own; you issue the sequence twice to close the window.

That is exactly what the Linux kernel patch series posted today implements. The cost is real, if not catastrophic. TLBI and DSB are already among the more expensive operations in the kernel's memory-management hot paths, particularly on large multi-core systems where broadcast invalidations have to be acknowledged across every core. Doubling them in the affected paths adds latency to any workload that churns page permissions: think mprotect()-heavy applications, JIT runtimes that flip pages between writable and executable, copy-on-write fault storms after fork(), and KVM guests where stage-2 changes ripple through nested page tables.

Operation                | Before mitigation | After mitigation
Stage 1 TLB invalidation | TLBI + DSB        | TLBI + DSB + TLBI + DSB
Stage 1+2 invalidation   | TLBI + DSB        | TLBI + DSB + TLBI + DSB
Typical impact           | baseline          | extra barrier round-trip per invalidation
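The extra round-trip per invalidation can be turned into rough arithmetic. The sketch below is a deliberately simple model, assuming the invalidating core stalls for the full extra time; the function name and every input value are hypothetical placeholders, not measurements from any core or from the advisory.

```python
def extra_cpu_fraction(invalidations_per_sec, extra_ns_per_invalidation):
    """Fraction of one core's time spent on the added TLBI + DSB pair.

    Simplistic model: assumes the core is fully stalled for the extra time.
    """
    return invalidations_per_sec * extra_ns_per_invalidation * 1e-9


# Hypothetical inputs: 1 microsecond of added latency per invalidation.
mostly_static = extra_cpu_fraction(100, 1_000)      # maps once, then computes
churn_heavy = extra_cpu_fraction(200_000, 1_000)    # constant permission churn
print(f"{mostly_static:.4%} vs {churn_heavy:.1%}")
```

Replace both inputs with figures measured on your own hardware; the point is only that the product of invalidation rate and added latency, not raw compute, drives the result.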

The overhead scales with how often your workload invalidates the TLB, not with raw compute. A number-crunching job that maps memory once and grinds will barely notice. A high-fork-rate service mesh sidecar, a database doing lots of madvise, or a busy hypervisor host is where you will want before/after numbers. I would benchmark mprotect micro-loops and a fork()/exec() heavy build workload on patched versus unpatched kernels before assuming the hit is negligible on your specific core.
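An mprotect micro-loop of the kind described above could look roughly like this. It is a Linux/POSIX-only sketch that calls libc's mprotect() through ctypes; the function name mprotect_flip_ns and the iteration count are my own choices, and the Python interpreter overhead inflates the absolute numbers, so only compare results between kernels on the same machine.

```python
import ctypes
import mmap
import time

# Linux/POSIX-only sketch: bind libc's mprotect(addr, len, prot).
libc = ctypes.CDLL(None, use_errno=True)
libc.mprotect.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int]
libc.mprotect.restype = ctypes.c_int


def mprotect_flip_ns(iterations=100_000):
    """Average nanoseconds per read-only/read-write flip of one page."""
    size = mmap.PAGESIZE
    region = mmap.mmap(-1, size)                # anonymous, page-aligned mapping
    view = ctypes.c_char.from_buffer(region)    # used only to obtain the address
    addr = ctypes.addressof(view)
    rw = mmap.PROT_READ | mmap.PROT_WRITE
    try:
        start = time.perf_counter_ns()
        for _ in range(iterations):
            if libc.mprotect(addr, size, mmap.PROT_READ) != 0:
                raise OSError(ctypes.get_errno(), "mprotect to read-only failed")
            if libc.mprotect(addr, size, rw) != 0:
                raise OSError(ctypes.get_errno(), "mprotect to read-write failed")
            region[0] = 1   # write so a writable translation is in use again
        elapsed = time.perf_counter_ns() - start
    finally:
        del view            # release the buffer export before closing the mapping
        region.close()
    return elapsed / iterations


if __name__ == "__main__":
    print(f"{mprotect_flip_ns():.0f} ns per flip")
```

Run it several times on both the patched and unpatched kernel, pinned to one core if possible, and compare medians rather than single runs.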

The affected silicon is the real story

What makes this advisory notable is breadth. The list spans Arm's datacenter Neoverse line, the entire recent Cortex-X flagship series, and a deep bench of Cortex-A cores:

Neoverse (server / infrastructure): Neoverse V3 and V3AE, V2, V1, N2, and N1. The N1 alone underpins a huge installed base, including first-generation Ampere Altra and AWS Graviton2 class deployments, while V2 sits behind Grace and Graviton4-era parts.

Cortex-X (flagship mobile/client): Cortex-X925, X4, X3, X2, X1, and X1C.

Cortex-A (mainstream): Cortex-A710, A78 / A78AE / A78C, A77, and A76 / A76AE.

Newest cores: the latest C1-Ultra and C1-Premium are on the list too, so this is not a legacy-only problem that newer designs already dodged.

{{IMAGE:2}}

The span from N1 to the freshest C1 cores tells you this is a long-standing micro-architectural behavior rather than a one-off bug in a single generation. If a core implements the affected TLBI completion behavior, it needs the software mitigation regardless of how new it is.

NVIDIA's Olympus cores and Vera are affected too

A separate follow-up patch confirms the reach extends to silicon that has barely shipped: NVIDIA's new Olympus cores, the custom Arm cores powering the NVIDIA Vera CPU, are also vulnerable and are mitigated by the same approach. Vera is the CPU half of NVIDIA's Vera Rubin platform, the successor to the Grace Blackwell generation, so this lands right as those parts move toward broad availability.

NVIDIA Vera with Olympus cores

That NVIDIA had to post its own mitigation patch is a useful signal. Olympus is a custom core, not a licensed Cortex or Neoverse design, yet it shares the same TLBI completion semantics that trigger the flaw. Anyone planning Vera-based nodes should treat the mitigation as baseline-required firmware-and-kernel hygiene, not an optional tuning knob, and should fold the extra-barrier cost into early performance modeling rather than discovering it after the racks land.


What to actually do

For anyone running affected hardware, the playbook is straightforward:

  • Pull the kernel patches. The mitigation lives in the kernel's TLB invalidation paths. Once the series merges and backports to stable trees, your distribution kernels will carry it. If you build your own, the linux-arm-kernel list is where the series landed.
  • Identify your cores. lscpu and /proc/cpuinfo will tell you the part numbers. Cross-reference against the list above. Cloud instances are in scope too; Graviton and Ampere-based instances should be checked, and providers will typically handle the host-side hypervisor mitigation while you patch your guest kernels.
  • Benchmark the invalidation-heavy paths if your workload is sensitive: mprotect loops, fork/exec rates, JIT page flips, and KVM stage-2 churn. Capture before/after so you know your real exposure to the extra barrier cost.
  • Watch firmware. Some of the affected cores, especially on the NVIDIA and custom-silicon side, may distribute part of the mitigation through firmware or hypervisor updates rather than the kernel alone.
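The core-identification step can be scripted. The sketch below parses /proc/cpuinfo-style text and reports matches against a small table of Arm part IDs. The table is a partial, illustrative subset that I believe matches the values in Linux's arch/arm64/include/asm/cputype.h; it omits most of the affected list, including NVIDIA's Olympus, so verify every entry against the Arm bulletin before relying on it. The function and constant names are hypothetical.

```python
# Partial, illustrative subset of Arm part IDs (implementer 0x41). These are
# assumed to match Linux's cputype.h; confirm them, and the complete affected
# list, against the Arm bulletin.
AFFECTED_ARM_PARTS = {
    0xD0B: "Cortex-A76",
    0xD0C: "Neoverse N1",
    0xD0D: "Cortex-A77",
    0xD40: "Neoverse V1",
    0xD41: "Cortex-A78",
    0xD49: "Neoverse N2",
    0xD4F: "Neoverse V2",
}
ARM_IMPLEMENTER = 0x41


def affected_cores(cpuinfo_text):
    """Return the names of listed cores found in /proc/cpuinfo-style text."""
    found = set()
    implementer = None
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "CPU implementer":
            implementer = int(value, 16)
        elif key == "CPU part" and implementer == ARM_IMPLEMENTER:
            name = AFFECTED_ARM_PARTS.get(int(value, 16))
            if name:
                found.add(name)
    return found


# Example with a hand-written snippet in the arm64 /proc/cpuinfo layout.
sample = "processor\t: 0\nCPU implementer\t: 0x41\nCPU part\t: 0xd0c\n"
print(affected_cores(sample))
```

On a real machine, pass in the contents of /proc/cpuinfo instead of the sample string. An empty result from this script means only that none of the listed subset was found, not that the system is unaffected.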

The combination of critical severity, unusually broad hardware coverage, and a mitigation that touches a genuine hot path makes this one to schedule deliberately rather than rubber-stamp. The privilege-escalation risk is the reason to patch quickly; the doubled TLBI sequence is the reason to measure while you do it. Full technical detail lives in the Arm security bulletin for CVE-2025-10263.
