Building Fast Eventual Consistency with CRDTs: Inside Fly.io's Corrosion System

Somtochi Onyekwere explains how Fly.io achieves millisecond-local and sub-second-global data replication using Conflict-free Replicated Data Types (CRDTs) in their Corrosion distributed system. The discussion covers practical conflict resolution strategies, gossip protocols, and why eventual consistency is often preferable to strong consistency for modern applications.

In this InfoQ podcast, Somtochi Onyekwere, software engineer at Fly.io, provides a deep dive into the architecture and design decisions behind Corrosion - Fly.io's distributed system for replicating SQLite data across global clusters. The conversation reveals how modern distributed systems can achieve both speed and consistency by relaxing traditional guarantees while maintaining data integrity through sophisticated conflict resolution mechanisms.

The Philosophy of Speed Over Strong Consistency

Fly.io's approach to distributed systems centers on a fundamental tradeoff: speed versus strong consistency. As Onyekwere explains, "We favor speed over consistency, and this has also shown up in how we build software." This philosophy emerged from practical experience - before Corrosion, Fly.io used Console, which employed Raft for strong consensus, but this approach didn't scale well for their use case of fast machine creation and deployment.

The key insight is that for most internet applications, eventual consistency works perfectly fine when you relax consistency guarantees. "You're like, okay, this data might be a little out of style and maybe I might make a sort of optimization, but I can do that quickly," Onyekwere notes. This approach enables response times measured in milliseconds for local operations and P99 latency of 1-2 seconds for global replication.

Understanding Conflict-free Replicated Data Types

CRDTs represent a class of data structures designed specifically for eventually consistent distributed systems. The core concept is that independent replicas can accept writes and update data without communicating with each other, yet when replicas exchange information, they converge to the same state regardless of the order in which changes were received.

Two Primary CRDT Approaches

State-based CRDTs: Nodes exchange their complete state and merge functions determine how to combine different states. For a grow-only counter, merging simply takes the maximum value. This approach requires more bandwidth but has fewer consistency requirements.

Operation-based CRDTs: Nodes exchange individual operations rather than complete state. This requires less network traffic but demands careful handling of duplicates and ordering. For example, with a counter, nodes would send increment operations rather than current values.

Practical Examples Beyond Counters

While counters provide an easy-to-understand example, CRDTs extend to more complex structures:

Grow-only sets: Elements can be added but never removed, eliminating duplicates
Observed-remove sets: Track both additions and removals using tombstones
Last-writer-wins: Uses logical clocks (like Lamport clocks) to resolve conflicts based on timestamps
Text editing: More complex CRDTs track character insertions and deletions at granular levels

Corrosion's Architecture and Implementation

Corrosion is built in Rust, chosen for its combination of speed and safety. The system uses several key technologies:

Tokio: Async runtime for handling concurrent operations
Rusqlite: SQLite interface for data storage
Hyper/Axum: HTTP API exposure
QUIC protocol: Replaces TCP for improved transport characteristics

The Gossip Protocol for Data Dissemination

Corrosion uses a gossip protocol for efficient data propagation. When a write occurs:

Local broadcast: Changes are immediately sent to nearby nodes
Epidemic spread: Each node forwards changes to 8 other nodes, creating exponential growth
Periodic sync: Background synchronization catches any missed updates

This hybrid approach ensures both speed and reliability. Local writes return in milliseconds, while the gossip protocol ensures eventual global availability.

Handling Stale Data

The system deals with stale data through several patterns:

Proxy routing: When an edge server receives a request for a machine that may have been deleted, it routes to the worker. If the worker has more recent state, it responds with an error, allowing the edge to reroute.

Owner-based writes: Each server owns specific machines, writing only to its own rows. This minimizes conflicts since different servers rarely write to the same data.

Retry logic: Applications can implement simple retries with delays matching expected replication times.

CRSQL: The SQLite Extension

The heart of Corrosion's conflict resolution is CRSQL, an SQLite extension that transforms regular tables into CRDT-compatible structures. When loaded:

Shadow tables: Creates clock tables tracking metadata for each row and column
Hooks: Intercepts insert, update, and delete operations to maintain metadata
Conflict resolution: Uses a three-tier tie-breaking system:
- Update frequency (more updates win)
- Deletion status (non-deleted wins over deleted)
- Node ID (deterministic fallback)

This columnar tracking means conflicts are resolved at the granular level, not per-row, enabling more precise merging.

When CRDTs Aren't the Answer

Onyekwere is clear about limitations:

Metadata overhead: Complex CRDTs like observed-remove sets require tombstone tracking, increasing storage requirements.

Ordering requirements: Applications requiring strict operation ordering (like financial transactions) need strong consistency.

Complex conflicts: When business logic requires custom resolution beyond standard merge functions, CRDTs may not fit.

Space constraints: The additional metadata may be prohibitive in resource-limited environments.

Local-First Software and CRDTs

The conversation connects Corrosion's approach to the broader local-first software movement. Local-first applications need to work offline while supporting collaboration. CRDTs enable this by allowing:

Immediate local writes without network dependency
Automatic conflict resolution when devices reconnect
Seamless collaboration across multiple devices

This pattern appears in text editors, note-taking apps, and collaborative tools where offline capability is essential.

Real-World Performance Characteristics

Fly.io's measurements show:

Local writes: Milliseconds (same as SQLite)
Global P99: 1-2 seconds
Network: QUIC provides advantages over TCP for distributed systems

The key is that "eventual consistency doesn't have to be very long." One to two seconds is often sufficient for user expectations, especially with retry logic.

Lessons for Distributed System Design

Several principles emerge from Corrosion's design:

Relax constraints strategically: Not every operation needs strong consistency. Identify where you can relax guarantees.

Design for conflict resolution: Structure data ownership to minimize conflicts (owner-based writes).

Hybrid protocols: Combine immediate local broadcasts with periodic global sync for both speed and reliability.

Application-level handling: Stale data is manageable through patterns like retries and proxy routing.

Use appropriate tools: CRDTs excel for specific use cases but aren't universal solutions.

The Future of CRDTs

Onyekwere sees growing interest in CRDTs, particularly for:

Text editing applications with collaborative features
Local-first software requiring offline capability
Distributed databases needing fast replication
Applications where network partitions are common

The technology is maturing with more research and practical implementations becoming available.

Conclusion

Fly.io's Corrosion demonstrates how modern distributed systems can achieve both performance and correctness by carefully choosing consistency models and implementing sophisticated conflict resolution. The combination of CRDTs, gossip protocols, and thoughtful architecture enables global-scale applications with sub-second local response times and reliable eventual consistency.

For teams building distributed systems, the key takeaway is that strong consistency isn't always necessary. By understanding CRDTs, gossip protocols, and conflict resolution patterns, you can design systems that are both fast and reliable, even in the face of network partitions and concurrent writes.

Mentioned in this episode: