Jepsen: MariaDB Galera Cluster 12.1.2

MariaDB Galera Cluster 12.1.2 exhibits serious data consistency issues including lost committed writes, stale reads, and lost updates, failing to provide even basic snapshot isolation guarantees.

MariaDB Galera Cluster, an active-active replication system for MariaDB, promises strong consistency guarantees but fails to deliver them in practice. MariaDB acquired Codership Oy in 2025, bringing Galera Cluster under its umbrella, yet the system still exhibits serious data consistency issues that were first identified in 2015.

The core problem stems from Galera's design choices around transaction certification. The system claims to offer "Snapshot Isolation" and promises that transactions are "instantly replicated to all other nodes, ensuring no replica lag and no lost transactions." In testing, it lost committed writes, served stale reads, and allowed lost updates.

Write Loss on Coordinated Failures

When configured with the recommended setting innodb_flush_log_at_trx_commit = 0, Galera Cluster regularly loses committed transactions during coordinated process crashes. In one test run, the cluster lost nine values appended to three different rows within a single minute. The transactions were acknowledged as successfully committed, but their effects disappeared when the cluster restarted.

This behavior occurs because Galera's synchronous replication isn't as synchronous as claimed. The documentation repeatedly states that transactions aren't truly committed until they've passed certification on all nodes, but in practice the cluster keeps operating after a minority of nodes has failed, so a commit cannot actually depend on every node.

Setting innodb_flush_log_at_trx_commit = 1 significantly reduces but doesn't eliminate data loss. Even with this safer setting, MariaDB Galera Cluster occasionally loses the effects of committed transactions when process crashes and network partitions occur.
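The kind of write loss described in this section can be illustrated with a small check: compare the values a client was told were committed against what a final read returns after the cluster restarts. This is only a sketch under stated assumptions, not the checker used in the analysis. The function name, the data layout, and the sample values are all hypothetical.

```python
from typing import Dict, List, Set


def lost_appends(
    acked: Dict[str, List[int]], final: Dict[str, List[int]]
) -> Dict[str, Set[int]]:
    """Return, per key, acknowledged values that are missing from the final read."""
    lost: Dict[str, Set[int]] = {}
    for key, values in acked.items():
        # A key absent from the final read counts as an empty list.
        missing = set(values) - set(final.get(key, []))
        if missing:
            lost[key] = missing
    return lost


# Hypothetical data: every value in `acked` was acknowledged as committed.
acked = {"x": [1, 2, 3], "y": [4, 5]}
# Hypothetical final reads taken after the cluster restarted.
final = {"x": [1, 3], "y": [4, 5]}

print(lost_appends(acked, final))  # {'x': {2}}
```

Any non-empty result means an acknowledged write is gone, which is the anomaly reported above regardless of which flush setting was in use.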

Lost Updates and Stale Reads

Even in healthy clusters without faults, Galera Cluster allows serious consistency anomalies. The system exhibits Lost Update (P4) violations, where one transaction's updates can be overwritten by another transaction that apparently modified data between the first transaction's read and write operations.

For example, if Transaction A reads a key, then Transaction B appends to that same key, Transaction A can later append its own value in a way that appears to overwrite Transaction B's changes. Snapshot Isolation prohibits this: when two concurrent transactions write the same key, only the first to commit may succeed, and the other must abort.
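The read-then-append pattern in this example can be checked mechanically when values are appended to a list, because the final list records the order in which writes took effect. The following is a simplified sketch, not the analysis's actual checker. It assumes each transaction reads a key once and then appends one unique value, and the names Txn and intervening_writes are hypothetical.

```python
from typing import List, NamedTuple


class Txn(NamedTuple):
    name: str
    read: List[int]   # list the transaction observed before appending
    appended: int     # unique value the transaction appended


def intervening_writes(txn: Txn, final: List[int]) -> List[int]:
    """Values that took effect between txn's read and its own append.

    Raises ValueError if the appended value is absent from the final list.
    """
    position = final.index(txn.appended)
    # Everything after what txn read, and before what it wrote, came from others.
    return final[len(txn.read):position]


# Hypothetical history for one key: A and B both read [1], B appends 2, A appends 3.
final_x = [1, 2, 3]
txn_a = Txn("A", read=[1], appended=3)
txn_b = Txn("B", read=[1], appended=2)

for txn in (txn_a, txn_b):
    between = intervening_writes(txn, final_x)
    if between:
        print(f"{txn.name}: {between} written between its read and its append")
```

Here A committed even though B's append landed between A's read and A's write. Under Snapshot Isolation, A would be expected to abort instead.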

Stale Reads are also common, occurring every few minutes even without fault injection. A transaction can commit and be acknowledged to the client, then a second transaction can begin and fail to observe the first transaction's writes. This directly contradicts Galera's claims of instant, lag-free replication.
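A stale read of this kind can be sketched as a comparison of timestamps: if a write was acknowledged before a read began, the read should observe it. The example below is a minimal illustration under the assumption that all timestamps come from a single clock (for instance, the test harness), so they are directly comparable. The type names and sample history are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Append(NamedTuple):
    value: int
    acked_at: float      # time the client received the commit acknowledgement


class Read(NamedTuple):
    started_at: float    # time the reading transaction began
    observed: List[int]  # list contents the read returned


def stale_reads(appends: List[Append], reads: List[Read]) -> List[Tuple[Read, Append]]:
    """Pair each read with every acknowledged append it should have seen but did not."""
    found = []
    for read in reads:
        for append in appends:
            if append.acked_at < read.started_at and append.value not in read.observed:
                found.append((read, append))
    return found


# Hypothetical history: value 2 is acknowledged at t=2.0.
appends = [Append(1, acked_at=1.0), Append(2, acked_at=2.0)]
reads = [
    Read(started_at=1.5, observed=[1]),  # fine: 2 was not yet acknowledged
    Read(started_at=2.5, observed=[1]),  # stale: began after 2 was acknowledged
]

for read, append in stale_reads(appends, reads):
    print(f"read started at {read.started_at} missed value {append.value}")
```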

Documentation Issues

The documentation makes it difficult to understand what consistency models Galera Cluster actually supports. While it claims to offer an isolation level "between Serializable and Repeatable Read," the system appears weaker than Read Uncommitted in practice.

The documentation's recommendation to use innodb_flush_log_at_trx_commit = 0 as a "safer, recommended option" with Galera Cluster is particularly problematic, as this setting allows data loss during coordinated failures.

Recommendations

Users should:

Set innodb_flush_log_at_trx_commit = 1 to reduce the probability of write loss on coordinated failure
Assume that committed transactions may not be visible to later transactions
Avoid read-modify-write patterns, as they are likely unsafe
Not rely on Galera Cluster for strong consistency guarantees
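The first recommendation in this list can be checked with a short script that reads an option file and reports the explicit value of innodb_flush_log_at_trx_commit. This is a rough sketch using Python's standard configparser. It assumes the option lives in a [mysqld] section and that the file contains no include directives, which configparser cannot parse. The sample file contents and the function name are hypothetical.

```python
import configparser
from typing import Optional

# Hypothetical option file contents, for illustration only.
SAMPLE_CNF = """
[mysqld]
innodb_flush_log_at_trx_commit = 0
"""


def explicit_flush_setting(cnf_text: str) -> Optional[str]:
    """Return the value set in [mysqld], or None if the file does not set it."""
    parser = configparser.ConfigParser(
        allow_no_value=True, strict=False, interpolation=None
    )
    parser.read_string(cnf_text)
    if not parser.has_section("mysqld"):
        return None
    for option, value in parser.items("mysqld"):
        # Option files accept dashes and underscores interchangeably.
        if option.replace("-", "_") == "innodb_flush_log_at_trx_commit":
            return value.strip() if value is not None else None
    return None


setting = explicit_flush_setting(SAMPLE_CNF)
if setting != "1":
    print(f"innodb_flush_log_at_trx_commit is {setting!r}; expected '1'")
```

A result of None only means this file does not set the option. The effective value then depends on the server default and on any other option files in use.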

MariaDB should revise its documentation to clearly state that setting innodb_flush_log_at_trx_commit to 0 allows data loss in Galera Cluster, and that even with the safer setting of 1, users should expect occasional loss of committed writes when node failures and network partitions occur.

Future Work

The current analysis used simple append operations, but it seems likely that Galera Cluster would also exhibit Lost Update with blind writes to registers, potentially failing the simulated banking workload used in earlier Jepsen tests. Money could be destroyed or created out of thin air.

Other areas for investigation include predicates, slow networks, clock skew, and disk faults, all of which might reveal additional consistency issues.

The findings suggest that MariaDB Galera Cluster, despite its promises of strong consistency and synchronous replication, provides a consistency model weaker than Read Uncommitted, making it unsuitable for applications that require reliable transaction processing.
