CalyxOS's journey to implement HSM-based signing reveals critical security lessons about key management, backup strategies, and audit trails that extend far beyond Android ROM development.
The security of software distribution depends on how cryptographic keys are managed, and CalyxOS's HSM-based signing redesign offers useful insight into building resilient signing infrastructure. The work, detailed in a FOSDEM 2026 talk, shows how careful security architecture can address both technical and organizational challenges.
The core problem CalyxOS faced was straightforward but critical: traditional file-based key storage creates unacceptable risks. When signing keys exist as files on disk, anyone with access can make unlimited valid signatures, and there's no way to track where copies exist or who has them. This becomes particularly problematic for verified boot systems that establish trust chains for entire operating systems.
Hardware Security Modules (HSMs) emerged as the solution, providing tamper-resistant storage that makes key extraction extremely difficult even with physical access. A sufficiently resourced attacker might still extract keys from an HSM, but the improvement over plaintext files is substantial. The advantage is clearest during a security incident: if a CI machine is compromised, the attacker never held the key material itself, so you can revoke that machine's access to the HSM without regenerating all your keys.
However, implementing HSM-based signing proved more complex than simply swapping file storage for hardware. CalyxOS had to navigate multiple constraints around security, operational practicality, and transparency. Their evaluation considered cloud-based solutions like AWS CloudHSM, enterprise appliances from Thales and Entrust, and more affordable options like Nitrokey NetHSM. They ultimately chose YubiHSM 2 for its balance of security and accessibility, while building in migration paths for future improvements.
A critical challenge emerged from YubiHSM 2's limited storage capacity. The device couldn't hold all the signing keys needed for CalyxOS's device-specific and component-specific signing requirements. This limitation affects many projects beyond Android ROMs, including F-Droid. The solution was key wrapping: storing keys encrypted outside the HSM and only decrypting them inside when needed. The wrap key remains securely inside the HSM, ensuring signing keys never exist in plaintext outside the secure module.
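The wrapping idea can be illustrated with a small in-memory model. This is a toy sketch, not CalyxOS's tooling or the YubiHSM 2 API: `ToyHsm` and its methods are hypothetical names, the SHA-256 keystream plus HMAC tag stands in for the vetted authenticated cipher a real device would use, and an HMAC stands in for a real asymmetric signature.

```python
import hashlib
import hmac
import secrets


class ToyHsm:
    """In-memory stand-in for an HSM. The wrap key never leaves this object."""

    def __init__(self) -> None:
        wrap_key = secrets.token_bytes(32)
        # Derive separate encryption and integrity keys from the wrap key.
        self._enc_key = hmac.new(wrap_key, b"enc", hashlib.sha256).digest()
        self._mac_key = hmac.new(wrap_key, b"mac", hashlib.sha256).digest()

    def _keystream(self, nonce: bytes, length: int) -> bytes:
        # Toy stream cipher: SHA-256 in counter mode (illustration only).
        out = b""
        counter = 0
        while len(out) < length:
            block = self._enc_key + nonce + counter.to_bytes(4, "big")
            out += hashlib.sha256(block).digest()
            counter += 1
        return out[:length]

    def _wrap(self, plaintext: bytes) -> bytes:
        nonce = secrets.token_bytes(16)
        stream = self._keystream(nonce, len(plaintext))
        ciphertext = bytes(a ^ b for a, b in zip(plaintext, stream))
        tag = hmac.new(self._mac_key, nonce + ciphertext, hashlib.sha256).digest()
        return nonce + ciphertext + tag

    def _unwrap(self, blob: bytes) -> bytes:
        nonce, ciphertext, tag = blob[:16], blob[16:-32], blob[-32:]
        expected = hmac.new(self._mac_key, nonce + ciphertext, hashlib.sha256).digest()
        if not hmac.compare_digest(tag, expected):
            raise ValueError("wrapped key failed integrity check")
        stream = self._keystream(nonce, len(ciphertext))
        return bytes(a ^ b for a, b in zip(ciphertext, stream))

    def generate_wrapped_key(self) -> bytes:
        """Create a signing key inside the 'HSM' and export only its wrapped form."""
        return self._wrap(secrets.token_bytes(32))

    def sign(self, wrapped_key: bytes, message: bytes) -> bytes:
        """Unwrap inside the 'HSM', sign, and let the plaintext key go out of scope."""
        signing_key = self._unwrap(wrapped_key)
        # HMAC is a placeholder for a real signature algorithm.
        return hmac.new(signing_key, message, hashlib.sha256).digest()
```

The caller only ever stores the opaque blob returned by `generate_wrapped_key()`, so the number of signing keys is bounded by disk space rather than by the device's internal storage. A blob that was tampered with, or wrapped by a different device, fails the integrity check instead of yielding a key.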
Backup strategies presented another complex challenge. Since the wrap key exists only inside the HSM, losing the device means losing access to all signing capabilities. Simple plaintext backups would defeat the entire security model. CalyxOS implemented Shamir's Secret Sharing, splitting the wrap key into five shards with a three-of-five threshold for reconstruction. This approach distributes trust across multiple people rather than concentrating it in a single individual.
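A minimal sketch of a three-of-five Shamir split follows, assuming the wrap key is treated as an integer below a fixed prime. The function names and the choice of prime are illustrative assumptions, not details of CalyxOS's ceremony tooling, and production use would call for an audited implementation.

```python
import secrets

# Mersenne prime larger than any 256-bit secret; all arithmetic is mod PRIME.
PRIME = 2**521 - 1


def split_secret(secret: int, shares: int = 5, threshold: int = 3) -> list[tuple[int, int]]:
    """Split `secret` into `shares` points; any `threshold` of them recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret out of range")
    if not 1 <= threshold <= shares:
        raise ValueError("threshold must be between 1 and the number of shares")
    # Random polynomial of degree threshold-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def evaluate(x: int) -> int:
        result = 0
        for coeff in reversed(coeffs):  # Horner's rule
            result = (result * x + coeff) % PRIME
        return result

    return [(x, evaluate(x)) for x in range(1, shares + 1)]


def recover_secret(points: list[tuple[int, int]]) -> int:
    """Lagrange-interpolate the polynomial at x = 0 to recover the secret."""
    secret = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

Any three of the five points returned by `split_secret` reconstruct the original value through `recover_secret`, while two points are consistent with every possible secret and so reveal nothing about it.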
The key provisioning ceremony required careful design to prevent key material leakage. Since YubiHSM 2 lacks a native Shamir's Secret Sharing (SSS) implementation and even the official utilities briefly hold keys in memory, CalyxOS developed a secure ceremony using Tails on randomly purchased hardware. The process included DVD-based initial data transfer, Windows-side hash verification using reproducible ISO files, and offline operation throughout. The ceremony was audited by Trail of Bits, adding a further layer of confidence.
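The hash verification step amounts to checking that an image matches a digest published or reproduced independently. The sketch below assumes SHA-256 and uses hypothetical helper names; the source does not say which hash or tool the ceremony actually used.

```python
import hashlib
import hmac


def sha256_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: str, expected_hex: str) -> bool:
    """Compare the file's digest with an independently obtained reference value."""
    return hmac.compare_digest(sha256_file(path), expected_hex.strip().lower())
```

Because the ISO is reproducible, several participants can build it separately and confirm that each arrives at the same digest before anything is burned to disc.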
Android's signing complexity added another dimension to the challenge. The platform uses three different tools (apksigner, signapk, and OpenSSL), each with its own integration requirements. PKCS#11, the standard interface for HSM communication, worked with all three but required significant customization. Performance optimizations became necessary, including batch modes for apksigner and disabling v1 (JAR-based) signatures, which required excessive HSM operations.
Audit capabilities proved essential for maintaining security beyond just protecting the keys. CalyxOS discovered that YubiHSM 2's audit log space was as limited as its key storage. They implemented tooling to intercept signing commands, flush logs to disk, and push them to append-only git repositories with automated hash chain verification. However, limitations remained: logs weren't cryptographically signed, verification required manual artifact retention, and the HSM's sparse logging information necessitated additional investigation steps.
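A hash chain of this kind can be sketched in a few lines. The record fields and function names below are assumptions for illustration; the source does not describe CalyxOS's actual log format or how their verification is wired into git.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, record: dict) -> str:
    """Hash a record together with the hash of the entry before it."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(prev_hash.encode() + payload).hexdigest()


def append_entry(log: list[dict], record: dict) -> None:
    """Append a record, linking it to the current tail of the log."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"prev": prev_hash, "record": record, "hash": entry_hash(prev_hash, record)})


def verify_chain(log: list[dict]) -> bool:
    """Recompute every link; an edited, reordered, or removed entry breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True
```

A chain like this detects edits and deletions in the middle of the log, but not truncation from the end unless the latest hash is also recorded somewhere the attacker cannot rewrite. Without a signature over the entries, it also cannot prove who wrote them, which matches the limitation noted above.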
The project's success relied heavily on open-source tools and community collaboration. CalyxOS open-sourced all their signing-related patches and documentation, inviting other custom ROM developers to learn from and contribute to their work. This transparency aligns with their mission to advance open-source technology while maintaining security standards.
Several broader lessons emerge from CalyxOS's experience. First, security infrastructure must balance technical requirements with operational practicality: the most secure solution is useless if it is too complex to implement or maintain. Second, backup and recovery strategies require as much attention as primary security measures. Third, audit capabilities are essential for detecting and responding to security incidents, not just preventing them. Finally, community collaboration and open-source development can accelerate security improvements across the entire ecosystem.
The CalyxOS signing redesign demonstrates that implementing robust security infrastructure requires addressing technical, operational, and organizational challenges simultaneously. Their journey from identifying the need for HSM-based signing to implementing a working solution offers a roadmap for other projects facing similar security requirements, while highlighting the ongoing work needed to achieve truly resilient signing infrastructure.