CHERIoT Rust: Porting Rust to a Capability-Based Architecture

Tech Essays Reporter

A detailed status update on the CHERIoT Rust project, documenting the progress of porting Rust to the CHERIoT platform over six months, including compiler modifications, address space handling, and recent bug discoveries.

After six months of intensive development, the CHERIoT Rust project has reached a significant milestone: the team has ported Rust to the CHERIoT platform. Their status update describes what it takes to adapt a mainstream programming language to a capability-based architecture, covering both the technical challenges and the engineering work that resolved them.

The project has accumulated 54 commits in its beta branch, touching 84 files with 805 insertions and 280 deletions. These numbers do not tell the full story, but they give a sense of the scope of the modifications required to make Rust compatible with CHERIoT's architecture. The work builds on Rust's strict-provenance effort, which the team credits with significantly easing development.

Foundation: LLVM Integration and Address Space Configuration

The initial work focused on establishing the fundamental infrastructure. The team updated the LLVM submodule to point to the CHERIoT port and then dealt with a difference in the function signatures for memory operations: the functions that create calls to memcpy and memmove need to know whether they are copying capabilities and whether metadata should be preserved.

A crucial early modification involved addressing the default address space configuration. Unlike traditional architectures that use address space 0, CHERIoT exclusively uses address space 200 for capabilities. This change was particularly significant because it could benefit other targets like amdgpu, and it required the compiler to interpret the datalayout string that specifies address space properties.
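As a rough illustration of what reading address spaces from a datalayout string involves, the sketch below scans for the `A`, `P` and `G` components that LLVM uses for the alloca, program and globals address spaces. The type and function names are hypothetical, and the CHERIoT-style string is an illustrative assumption rather than the exact string the port uses; this is not how rustc itself parses the layout.

```rust
/// Address spaces read out of an LLVM datalayout string (hypothetical type).
#[derive(Debug, Default, PartialEq, Eq)]
struct AddressSpaces {
    alloca: u32,  // `A<n>`: stack allocations
    program: u32, // `P<n>`: functions
    globals: u32, // `G<n>`: global variables
}

/// Scan the `-`-separated components for A/P/G entries.
/// Anything that is absent keeps the default address space 0.
fn parse_address_spaces(datalayout: &str) -> AddressSpaces {
    let mut spaces = AddressSpaces::default();
    for spec in datalayout.split('-') {
        let Some(kind) = spec.chars().next() else { continue };
        // Only components of the form `<letter><number>` are of interest here.
        let Ok(n) = spec[kind.len_utf8()..].parse::<u32>() else { continue };
        match kind {
            'A' => spaces.alloca = n,
            'P' => spaces.program = n,
            'G' => spaces.globals = n,
            _ => {} // e.g. `n32` or `S128`, which are not address spaces
        }
    }
    spaces
}

fn main() {
    // A conventional 32-bit layout with no explicit address spaces.
    let conventional = parse_address_spaces("e-m:e-p:32:32-i64:64-n32-S128");
    // Illustrative capability-style layout (an assumption, not the project's string).
    let capability =
        parse_address_spaces("e-m:e-pf200:64:64:64:32-p:32:32-i64:64-n32-S128-A200-P200-G200");
    println!("conventional globals: {}", conventional.globals);
    println!(
        "capability alloca/program/globals: {}/{}/{}",
        capability.alloca, capability.program, capability.globals
    );
}
```

A compiler that hard-codes address space 0 would never consult these entries, which is why a target living entirely in address space 200 needs them to be read from the string.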

The team successfully created the riscv32cheriot-unknown-cheriotrtos target, marking the first step toward compiling code for CHERIoT. This seemingly simple achievement represented months of foundational work.

Pointer Width and Offset Challenges

One of the most complex aspects of the port involved reconciling CHERIoT's 32-bit addresses with 64-bit capabilities. While the address requires only 4 bytes, the capability structure includes 4 additional bytes of metadata that define what operations can be performed with that capability. This architectural decision fundamentally differs from most architectures Rust supports, where pointer size typically matches address size.

The solution relies on the datalayout string's ability to specify pointer_offset values. The change affected numerous compiler components. For example, the int_ty_max function previously used pointer_size() and now uses pointer_offset(), so the maximum value it reports for CHERIoT is u32::MAX rather than u64::MAX.
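The arithmetic behind that change can be sketched in a few lines. The `PointerSpec` struct and both helper functions below are hypothetical simplifications and not rustc's actual types; only the 8-byte capability and 4-byte address figures come from the description above.

```rust
/// Hypothetical, simplified pointer description (sizes in bytes).
#[derive(Clone, Copy)]
struct PointerSpec {
    /// Bytes a pointer occupies in memory (address plus capability metadata).
    pointer_size: u64,
    /// Bytes of the pointer that hold the address.
    pointer_offset: u64,
}

const CHERIOT: PointerSpec = PointerSpec { pointer_size: 8, pointer_offset: 4 };
const RISCV32: PointerSpec = PointerSpec { pointer_size: 4, pointer_offset: 4 };

/// Maximum value of an unsigned integer that is `bytes` wide.
fn unsigned_max(bytes: u64) -> u128 {
    (1u128 << (bytes * 8)) - 1
}

/// Old behaviour: derive the maximum from the full pointer size.
fn max_from_size(p: PointerSpec) -> u128 {
    unsigned_max(p.pointer_size)
}

/// New behaviour: derive the maximum from the address width only.
fn max_from_offset(p: PointerSpec) -> u128 {
    unsigned_max(p.pointer_offset)
}

fn main() {
    println!("CHERIoT via size:   {:#x}", max_from_size(CHERIOT));
    println!("CHERIoT via offset: {:#x}", max_from_offset(CHERIOT));
    println!("RISC-V 32 (either): {:#x}", max_from_offset(RISCV32));
}
```

On a conventional 32-bit target the two functions agree, which is why the distinction only surfaces once pointer size and address size diverge.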

Core Library Integration and Testing

After establishing the basic infrastructure, the team focused on building the core library and running codegen-llvm tests. This phase involved splitting the size method of internal scalar representations into two distinct measurements: one for data capacity and one for memory storage size. They also added CHERI-specific intrinsics to core and adapted discriminant generation to account for pointer-address distinctions.
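One way to picture the split of the scalar size method is the sketch below. The `Scalar` and `Target` types and both method names are assumptions made for illustration; the compiler's internal representation is more involved.

```rust
/// Hypothetical target description (sizes in bytes).
struct Target {
    pointer_size: u64,   // full capability as stored in memory
    pointer_offset: u64, // address portion only
}

/// Hypothetical, pared-down scalar kinds.
#[derive(Clone, Copy)]
enum Scalar {
    Int { bytes: u64 },
    Pointer,
}

impl Scalar {
    /// Bytes the scalar occupies when stored in memory.
    fn storage_size(self, target: &Target) -> u64 {
        match self {
            Scalar::Int { bytes } => bytes,
            Scalar::Pointer => target.pointer_size,
        }
    }

    /// Bytes of integer-like data the scalar can carry.
    fn data_size(self, target: &Target) -> u64 {
        match self {
            Scalar::Int { bytes } => bytes,
            Scalar::Pointer => target.pointer_offset,
        }
    }
}

const CHERIOT: Target = Target { pointer_size: 8, pointer_offset: 4 };

fn main() {
    let ptr = Scalar::Pointer;
    // An array of four pointers needs room for four whole capabilities...
    println!("array of 4 pointers: {} bytes", 4 * ptr.storage_size(&CHERIOT));
    // ...while each pointer still carries only a 4-byte address as data.
    println!("address data per pointer: {} bytes", ptr.data_size(&CHERIOT));
}
```

For integer scalars the two measurements coincide; they differ only for pointers, where the storage size includes the capability metadata.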

Bug Discovery and Resolution

A particularly interesting discovery emerged from investigating a formatting issue. The team found that code using the alternate format flag (e.g., {x:#b}) would crash with a PermitExecuteViolation error, while the standard format worked correctly. The investigation revealed a subtle bug in CHERIoT-LLVM's virtual register rewriter.

During tail calls, the register containing the jump address was being overwritten with another value, potentially causing runtime traps. The assembly analysis showed that while the code correctly loaded the function address into a register, it subsequently overwrote that register before the tail call executed. This discovery led to a quick fix in the CHERIoT-LLVM implementation.
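For readers unfamiliar with the flag, the two format specifiers differ only in the `#`, which asks for the alternate form and, for binary output, adds a `0b` prefix. The snippet below shows the difference on an ordinary host; the helper names are made up for the example, and it does not reproduce the CHERIoT crash.

```rust
/// Standard binary formatting.
fn plain_binary(x: u32) -> String {
    format!("{x:b}")
}

/// Alternate binary formatting: the `#` flag adds the `0b` prefix.
fn alternate_binary(x: u32) -> String {
    format!("{x:#b}")
}

fn main() {
    let x = 5u32;
    println!("{}", plain_binary(x)); // prints "101"
    println!("{}", alternate_binary(x)); // prints "0b101"
}
```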

Additional Improvements

The team made several other important modifications:

  • Created a new GitHub organization to host patched forks of third-party crates, including updated e_flags for CHERI and CHERIoT
  • Enabled AtomicPtr support by informing the compiler that the platform supports atomic operations on pointer-sized values
  • Fixed a bug in their custom test runner that prevented exceptions from being printed in specific cases
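To show what the AtomicPtr change makes available, here is a minimal sketch using the standard library's `AtomicPtr`. The function name is made up for the example, and the code runs on any host with pointer-sized atomics; it says nothing about how the target specification itself was changed.

```rust
use std::sync::atomic::{AtomicPtr, Ordering};

/// Store one pointer, atomically swap in another, then read both back.
fn publish_and_read() -> u32 {
    let mut first = 1u32;
    let mut second = 2u32;
    let slot = AtomicPtr::new(&mut first as *mut u32);
    // `swap` returns the pointer that was stored before.
    let previous = slot.swap(&mut second, Ordering::AcqRel);
    let current = slot.load(Ordering::Acquire);
    // SAFETY: both pointers refer to locals that are still alive here.
    unsafe { *previous * 10 + *current }
}

fn main() {
    // True when the target advertises atomic operations on pointer-sized values.
    println!("pointer atomics: {}", cfg!(target_has_atomic = "ptr"));
    println!("result: {}", publish_and_read());
}
```

Code that relies on `AtomicPtr` can only compile for a target once the compiler knows pointer-sized atomics exist there, which is what the change above provides.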

Current Status and Future Work

The project has reached a point where developers can build and test Rust code for CHERIoT using a straightforward process. The team has shifted focus from adding new features to verifying that generated code matches expectations and investigating discovered bugs.

The work demonstrates the flexibility of Rust's compiler architecture and the effectiveness of the strict-provenance effort. The team's ability to add a new target with CHERI-specific requirements using relatively few changes speaks to the quality of Rust's engineering practices.

For those interested in exploring the work, the team provides a one-liner command to clone and build the modified Rust compiler for CHERIoT. The project remains open for contributions and feedback through their public Signal group and GitHub repository.

This six-month journey represents a significant achievement in bringing modern systems programming capabilities to capability-based architectures, potentially opening new avenues for secure systems development.
