MIT's Fractal Kernel Turns Microarchitecture Reverse Engineering Into Reusable Infrastructure

MIT CSAIL researchers built an operating system kernel from scratch whose only job is to study what processors actually do. Called Fractal, it boots on bare metal, strips away the measurement noise that plagues experiments run on macOS or Linux, and has already exposed speculative-execution behavior in Apple's M1 that earlier work missed, including the first evidence that Phantom attacks reach Apple Silicon.

Anyone who has tried to characterize how a modern CPU speculates knows the awkward part of the job. The processor is the thing you want to measure, but the only way to get close to it is to run your experiment on top of a general-purpose operating system that was built to do the opposite of what you need. macOS and Linux exist to hide the hardware, to multiplex it across processes, to inject scheduling and interrupts and address-space bookkeeping into every cycle. For most software, that abstraction is the whole point. For a security researcher trying to determine whether a Spectre- or Meltdown-style attack is possible on a given chip, it is a layer of fog sitting directly on top of the signal.

A team at MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL) decided to stop working around that fog and instead build an operating system whose only purpose is to get out of the way. The result is Fractal, a kernel written from the ground up that treats the hardware itself as the object of study rather than something to be managed. Its first serious application, a detailed look at the branch predictors inside Apple's M1, has already surfaced behavior that prior published work either missed or got wrong.

Why the usual setup blurs the picture

To understand what Fractal changes, it helps to be concrete about the problem. A processor keeps internal state in a lot of places: branch predictors that guess which way a conditional will go, indirect branch predictors that guess the target of a computed jump, caches at several levels, translation lookaside buffers that cache address translations, and more. These structures are exactly where speculative-execution vulnerabilities live, because they carry information across boundaries the chip is supposed to keep isolated, the boundary between user code and kernel code being the most important one.

Studying that boundary means running nearly identical experiments on each side of it and comparing the results. The catch is that on a conventional OS you cannot hold everything else constant. Switching privilege levels normally means a system call or an interrupt, and the moment you do that, the kernel runs its own code, touches its own data structures, reschedules threads, and generally stamps its fingerprints all over the measurement. When the effect you are hunting for is a few cycles of timing difference in a cache, that background activity is not a minor annoyance. It is often larger than the signal.
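
To make the scale of the problem concrete, here is a minimal toy simulation, not a measurement of any real chip. Every number in it is an assumption chosen for illustration: a 100-cycle base cost, a 4-cycle effect under study, and an operating system that interferes with 30 percent of runs by adding up to 400 cycles.

```python
import random
import statistics

def measure(base_cycles, signal_cycles, noise_prob, noise_cycles, trials, rng):
    """Simulated latencies: a fixed cost, plus the effect under study,
    plus an occasional burst of OS interference (toy model, not real hardware)."""
    samples = []
    for _ in range(trials):
        latency = base_cycles + signal_cycles
        if rng.random() < noise_prob:
            # Hypothetical interrupt or scheduler hit during the measurement.
            latency += rng.randint(1, noise_cycles)
        samples.append(latency)
    return samples

def gap(off, on):
    """Difference of means between runs with and without the effect."""
    return statistics.mean(on) - statistics.mean(off)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
SIGNAL = 4              # assumed size of the effect, in cycles

# Quiet machine: nothing else ever runs.
quiet_off = measure(100, 0, 0.0, 0, 1000, rng)
quiet_on = measure(100, SIGNAL, 0.0, 0, 1000, rng)

# Busy machine: 30% of runs absorb up to 400 cycles of interference.
noisy_off = measure(100, 0, 0.3, 400, 1000, rng)
noisy_on = measure(100, SIGNAL, 0.3, 400, 1000, rng)

print("quiet gap:", gap(quiet_off, quiet_on))
print("quiet spread:", statistics.pstdev(quiet_on))
print("noisy gap:", gap(noisy_off, noisy_on))
print("noisy spread:", statistics.pstdev(noisy_on))
```

In the quiet case the gap is exactly the assumed effect and the spread is zero. In the busy case the spread of the samples is many times larger than the effect, so the measured gap says as much about which runs happened to be interrupted as about the hardware.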

The consequences show up in the literature. Results are hard to reproduce, baselines drift, and some findings turn out to be artifacts of the operating system rather than properties of the chip. On Apple's platforms the situation is worse still, because the kernel-patching tricks researchers have relied on are slated for deprecation, which means the standard methodology has an expiration date.

Inverting the model

Fractal flips the relationship between the experiment and the system. It boots directly on bare metal with nothing else running, so there is no scheduler competing for the core and no background kernel activity polluting the counters. On top of that clean foundation it exposes a set of primitives designed for one specific maneuver: letting a single experiment change privilege levels at runtime while executing the same instructions in the same address space.

The technique, which the team calls multi-privilege concurrency, depends on a construct they introduced: the outer kernel thread. An outer kernel thread sits inside a user process's memory but executes with kernel privileges. That sounds like a contradiction, and in a sense it is the whole trick. It lets the experiment flip the privilege bit without changing anything else: no new address space, no context switch, no detour through system-call handling. The privilege level becomes what lead author Joseph Ravichandran, a PhD student in MIT's Department of Electrical Engineering and Computer Science, calls a true independent variable.
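
The experimental-design point can be sketched in a few lines. This is only an illustration of the bookkeeping, not Fractal's interface: the `RunConditions` fields, their values, and the description of what a system call changes are all assumptions made for the example.

```python
from dataclasses import dataclass, fields, replace

@dataclass(frozen=True)
class RunConditions:
    """Variables that could explain a difference between two runs.
    The field list is illustrative, not exhaustive."""
    privilege: str       # "user" or "kernel"
    address_space: str
    instructions: str
    core: str

def changed(a, b):
    """Names of the conditions that differ between two runs."""
    return [f.name for f in fields(a) if getattr(a, f.name) != getattr(b, f.name)]

user_run = RunConditions("user", "process-A", "probe-loop", "core-0")

# Conventional OS (assumed for illustration): reaching kernel mode through a
# system call also swaps the code being executed and the active mappings.
syscall_run = RunConditions("kernel", "kernel-map", "syscall-handler", "core-0")

# Outer kernel thread as the article describes it: only the privilege flips.
outer_thread_run = replace(user_run, privilege="kernel")

print(changed(user_run, syscall_run))       # ['privilege', 'address_space', 'instructions']
print(changed(user_run, outer_thread_run))  # ['privilege']
```

When the second list has exactly one entry, any difference in the outcome can be attributed to that entry. When it has three, the experiment cannot say which one mattered.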

"You change the privilege level, nothing else changes," Ravichandran says. "The only thing that could explain whether the attack succeeds or not is the privilege level." His preferred analogy is optical. A hand magnifying glass shows you a little. An electron microscope shows you what is actually there. Fractal, he argues, is the electron microscope of operating systems. In practice that means flat baselines and clean signals where measurements under macOS or Linux are smeared by interrupts and scheduler noise.

What turned up on the M1

The M1 implements an ARM feature called CSV2, which is meant to stop code in one privilege level from steering speculation in another. Using Fractal, the team confirmed that CSV2 does what it claims for the execute stage of indirect branch prediction. A user-mode program cannot make the kernel speculatively execute an attacker-chosen target through the indirect branch predictor. So far, the protection holds.

The interesting part is where it does not. The CPU still fetches the predicted target into the instruction cache before the protection engages. That fetch is observable through a side channel, which means user code can still influence what the kernel pulls into its caches across the privilege boundary, even though it cannot force the kernel to actually run the wrong code. The same leakage appeared between processes assigned different address-space identifiers. It is a gap between what the architecture promises and what the microarchitecture actually does, and it is precisely the kind of thing that gets lost when your measurement floor is noisy.
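
The ordering described above, fetch first and privilege check second, can be captured in a toy model. It is a sketch of that ordering only and makes no claim about how the M1 is actually built: the `ToyCore` class, its dictionary-based predictor, and the set standing in for the instruction cache are all hypothetical.

```python
class ToyCore:
    """Toy model: the predicted target is fetched before the privilege check."""

    def __init__(self):
        self.predictor = {}   # branch address -> (target, privilege that trained it)
        self.icache = set()   # addresses currently held in the instruction cache
        self.executed = []    # targets that were speculatively executed

    def train(self, branch, target, privilege):
        """Record a target for a branch, tagged with the privilege that trained it."""
        self.predictor[branch] = (target, privilege)

    def speculate(self, branch, privilege):
        """Predict a branch encountered at the given privilege level."""
        entry = self.predictor.get(branch)
        if entry is None:
            return
        target, trained_by = entry
        self.icache.add(target)          # fetch stage: happens unconditionally
        if trained_by == privilege:      # execute stage: gated on a privilege match
            self.executed.append(target)

    def probe(self, address):
        """Stand-in for a timing side channel: True means 'fast', i.e. cached."""
        return address in self.icache


core = ToyCore()
core.train(branch=0x1000, target=0xBEEF, privilege="user")  # attacker trains in user mode
core.speculate(branch=0x1000, privilege="kernel")           # kernel later hits the branch

print(core.probe(0xBEEF))  # True: the fetch crossed the boundary
print(core.executed)       # []: the execution did not
```

The model reproduces the pattern the team reported: the check blocks execution of the attacker-chosen target, but by the time it runs the cache state has already changed, and cache state is what a timing probe reads.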

Fractal also produced the first evidence that Apple Silicon exhibits Phantom speculation, a misprediction class previously demonstrated only on AMD and Intel parts. In a Phantom event, ordinary instructions, including a no-op, get misinterpreted by the CPU as branches and trigger speculative behavior the program never requested. On the M1, Fractal showed Phantom fetches succeeding across both privilege levels and address spaces, with the execute phase still blocked. Same pattern as before: the fetch leaks, the execution is contained.

One result amounted to a correction of the record. Earlier work on the M1's conditional branch predictor had reported that cross-privilege training worked on Apple's performance cores but not its efficiency cores, an asymmetry that would have been genuinely strange. Fractal showed there is no such asymmetry. The conditional branch predictor has no privilege isolation at all on either core type. The earlier finding was most likely an artifact of macOS quietly migrating threads between cores during system calls, exactly the sort of invisible OS behavior that Fractal's bare-metal design eliminates. That is a good illustration of the broader argument: the tool did not just find new vulnerabilities, it also explained away a spurious one created by the measurement apparatus itself.
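
A toy model shows how thread migration alone could manufacture that asymmetry. The migration policy below is invented for the example; the article says only that migration during system calls was the most likely cause, not what macOS actually does. The two-core machine and its per-core predictor sets are likewise hypothetical.

```python
class ToyMachine:
    """Two cores, each with its own predictor and no privilege isolation."""

    def __init__(self):
        self.trained = {"P": set(), "E": set()}

    def train(self, core, branch):
        self.trained[core].add(branch)

    def kernel_sees_training(self, core, branch):
        return branch in self.trained[core]


def observed(start_core, kernel_core_for):
    """Train in user mode on start_core, then check from kernel mode on
    whichever core the kernel-side code ends up running on."""
    machine = ToyMachine()
    machine.train(start_core, "branch-1")
    kernel_core = kernel_core_for(start_core)
    return machine.kernel_sees_training(kernel_core, "branch-1")


def pinned(core):
    """Bare-metal case: the experiment stays on the core it started on."""
    return core


def hypothetical_os(core):
    """Invented policy: system calls always end up on a performance core."""
    return "P"


for name, policy in [("pinned", pinned), ("hypothetical_os", hypothetical_os)]:
    print(name, {core: observed(core, policy) for core in ("P", "E")})
```

With the thread pinned, training carries across the privilege boundary on both core types. Under the invented policy, the efficiency core appears isolated even though its predictor behaves identically, because the kernel-side check ran on a different core whose predictor was never trained.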

Built to be used, not just demonstrated

What separates Fractal from a one-off research artifact is that the team built it as infrastructure. It supports x86_64, ARM64, and RISC-V, runs to more than 31,000 lines of code, and deliberately offers a familiar environment: POSIX system calls, a C library, and ports of standard tools including vim, GCC, and the dash shell. The goal is that a researcher with existing experiment code can move it over without rewriting everything, lowering the cost of adopting the platform.

That design choice is the difference between a paper and a tool other people actually use. Ravichandran's stated ambition is for Fractal to become to microarchitecture research what QEMU and FFmpeg are to their respective domains, shared infrastructure the whole community builds on rather than a result everyone admires and then reimplements badly. "My hope is that our results as a community get significantly more reliable, significantly more accurate," he says, pointing to the reduced noise and the guarantee that an experiment is running on the right core on the right system.

The outside assessment lines up with that framing. "Fractal is a strong architecture contribution because it turns an often ad hoc microarchitectural reverse-engineering workflow into reusable research infrastructure," says University of Southern California assistant professor Mengyuan Li, who was not involved in the work. "By reducing software noise and giving researchers tighter control across privilege boundaries, it makes difficult hardware experiments much easier to interpret."

Limits worth being honest about

Fractal is a research instrument, and its strengths come with corresponding constraints. Running on bare metal with nothing else present is what gives the clean signal, but it also means the conditions are not those of a production system under real workload. An attack primitive that succeeds in Fractal's quiet environment still has to contend with contention, noise, and scheduling pressure to matter in the field. The M1 findings so far describe leakage through the fetch path rather than full speculative execution across the privilege boundary, which is a meaningful distinction for assessing real-world severity. The tool sharpens the question of what a chip is doing; it does not by itself settle how exploitable that behavior is in practice.

There is also the matter of where this sits in the disclosure cycle. The MIT team reported its M1 findings to Apple's product security team, and in a notable reversal Apple's own engineers examined Fractal in turn, a sign that the value of a cleaner measurement tool is recognized on both sides of the vulnerability conversation.

The larger pattern here is one that keeps recurring in hardware security. Speculative execution exists because waiting for certainty is slow, so processors guess and clean up later, and the cleanup is never quite perfect. Every generation of mitigation, CSV2 included, narrows the gap between architectural promise and microarchitectural reality without closing it entirely. Tools that can see into that gap with low noise and tight control are what keep the field honest.

Ravichandran carried out the work with MIT associate professor Mengjia Yan. They presented the paper, "Fractal: An Operating System Designed for Microarchitecture Reverse Engineering," at the IEEE Symposium on Security and Privacy in San Francisco. The work was supported in part by the National Science Foundation, the U.S. Air Force Office of Scientific Research, and the DARPA-sponsored ACE program.
