Columnar databases excel at analytical queries but often stumble when faced with row-level updates. Traditional row stores like PostgreSQL mutate individual rows cheaply, but that design comes at the cost of analytical speed. ClickHouse, optimized for blistering-fast inserts and aggregations, initially avoided this trade-off entirely—until real-world demands for updates in IoT sensor feeds, inventory systems, and financial records forced innovation. The solution? Treat updates as inserts.


The Insert-Only Philosophy: Why Updates Are a Column Store Nightmare

In column stores, data is split into per-column files. An update affecting multiple columns requires rewriting fragments across these files, creating I/O bottlenecks. ClickHouse flips this model by exploiting its core strength: insert performance. With no global indexes to lock, inserts scale linearly, handling over 1 billion rows per second in some deployments. By recasting updates and deletes as new row inserts, ClickHouse defers the heavy lifting to background merges—leveraging the continuous consolidation already inherent in its MergeTree engine.

"We sidestepped the update problem by treating updates as inserts. This isn’t a workaround; it’s a deliberate design choice," explains the ClickHouse team. The approach avoids random I/O, instead using sorted, immutable data parts that merge efficiently in linear passes.

Foundation: Parts, Merges, and Sorted Efficiency

Every ClickHouse insert creates a self-contained part—a folder of column files, sorted by the table’s sorting key (e.g., (order_id, item_id)). For example, an initial insert into an orders table:

INSERT INTO orders VALUES
    (1001, 'kbd', 10, 45.00, 0.00),
    (1001, 'mouse', 6, 25.00, 0.00);

This generates a part such as all_1_1_0, with rows stored contiguously in sorted order. Background merges combine parts through single-pass interleaving of already-sorted data (no re-sorting needed), making consolidation cheap. That same cheap, continuous merging is what enables lightweight handling of mutations.
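
You can watch this lifecycle directly. A minimal sketch, assuming the orders table above (part names and row counts will vary per server): inspect the active parts in system.parts, force a merge with OPTIMIZE, and inspect again:

-- Each INSERT created its own part
SELECT name, rows FROM system.parts
WHERE table = 'orders' AND active;

-- Trigger the consolidation that background merges would do anyway
OPTIMIZE TABLE orders FINAL;

-- The sorted parts are now interleaved into one larger part
SELECT name, rows FROM system.parts
WHERE table = 'orders' AND active;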

Purpose-Built Engines: Three Paths to Lightning-Fast Mutations

1. ReplacingMergeTree: Updates as New Versions

Define a table with ENGINE = ReplacingMergeTree and insert a new row with the same sorting key to update. Merges retain only the latest version:

CREATE TABLE orders (
    order_id Int32,
    item_id String,
    quantity UInt32,
    price Decimal(10,2),
    discount Decimal(5,2)
) ENGINE = ReplacingMergeTree ORDER BY (order_id, item_id);

-- Initial insert
INSERT INTO orders VALUES (1001, 'mouse', 6, 25.00, 0.00);
-- Update: Insert increased quantity
INSERT INTO orders VALUES (1001, 'mouse', 60, 25.00, 0.20);

During merges, outdated rows are discarded.
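
To observe this without waiting for a background merge, you can force one manually. A minimal sketch (without a version column, ReplacingMergeTree keeps the most recently inserted row per key):

-- Both versions coexist until a merge runs
SELECT count() FROM orders WHERE item_id = 'mouse'; -- 2

-- Force the merge; only the latest version survives
OPTIMIZE TABLE orders FINAL;
SELECT * FROM orders WHERE item_id = 'mouse';
-- → 1001, 'mouse', 60, 25.00, 0.20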


alt="Article illustration 2"
loading="lazy">

visualizes this process. #### 2. CoalescingMergeTree: Partial Updates with Null Magic Mark non-key columns as `Nullable` and insert only changed fields. Merges coalesce values into a complete row:
CREATE TABLE orders (
    order_id Int32,
    item_id String,
    quantity Nullable(UInt32),
    price Nullable(Decimal(10,2)),
    discount Nullable(Decimal(5,2))
) ENGINE = CoalescingMergeTree ORDER BY (order_id, item_id);

-- Partial update: Only quantity and discount change
INSERT INTO orders VALUES (1001, 'mouse', 60, NULL, 0.20);
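
Putting the pieces together, here is a minimal sketch that repeats the partial insert after an assumed initial full row; coalescing keeps the most recent non-NULL value per column:

-- Assumed initial full row, then the sparse update from above
INSERT INTO orders VALUES (1001, 'mouse', 6, 25.00, 0.00);
INSERT INTO orders VALUES (1001, 'mouse', 60, NULL, 0.20);

-- Reads (or merges) fill NULLs from the latest older values
SELECT * FROM orders FINAL;
-- → 1001, 'mouse', 60, 25.00, 0.20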

alt="Article illustration 3"
loading="lazy">

shows how NULLs allow sparse inserts to merge into full rows. #### 3. CollapsingMergeTree: Deletes as Anti-Rows Add a `sign` column (e.g., `is_valid`). Insert `1` for new rows and `-1` to delete:
CREATE TABLE orders (
    order_id Int32,
    item_id String,
    quantity UInt32,
    price Decimal(10,2),
    discount Decimal(5,2),
    is_valid Int8 -- sign column: 1 = state, -1 = cancel (must be a signed type)
) ENGINE = CollapsingMergeTree(is_valid) ORDER BY (order_id, item_id);

-- Delete: Insert a "cancelling" row (mirror the original values, sign = -1)
INSERT INTO orders VALUES (1001, 'mouse', 60, 25.00, 0.20, -1);


Merges collapse matching +1 and -1 rows. Article illustration 4 illustrates the process. Uniquely among these engines, this pattern also supports changing sorting-key columns: cancel the old row with a -1 insert, then insert the new row under the new key.
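
Because collapsing happens asynchronously, queries against such tables are commonly written to be sign-aware, so uncollapsed +1/-1 pairs cancel out arithmetically at read time. A minimal sketch, assuming the matching +1 state row was inserted before the cancelling row above:

-- State row that the -1 row above cancels
INSERT INTO orders VALUES (1001, 'mouse', 60, 25.00, 0.20, 1);

-- Sign-aware aggregation: fully cancelled rows vanish from the result
SELECT order_id, item_id,
       sum(quantity * is_valid) AS quantity
FROM orders
GROUP BY order_id, item_id
HAVING sum(is_valid) > 0;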

Immediate Consistency with FINAL: No Merge Wait Needed

Background merges introduce eventual consistency. For real-time accuracy, use FINAL in queries:

SELECT * FROM orders FINAL;

This performs in-memory consolidation during reads, applying the engine’s logic (e.g., keeping the latest rows or collapsing deletes) without waiting for on-disk merges. ClickHouse also limits how much work FINAL does: parts whose key ranges cannot contain duplicates are read directly, skipping the consolidation step, which keeps reads fast even on large datasets.
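
FINAL can be applied per table or, in newer ClickHouse releases, enabled query-wide with a setting; a brief sketch:

-- Per-table FINAL
SELECT * FROM orders FINAL WHERE order_id = 1001;

-- Query-level: applies FINAL to every eligible table in the query
SELECT * FROM orders WHERE order_id = 1001 SETTINGS final = 1;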

The Trade-Off: Power vs. Complexity

While these engines handle high-throughput scenarios elegantly, they demand ClickHouse-specific modeling. Developers must understand sorting keys, merge semantics, and engine quirks—ReplacingMergeTree can’t update keys, and schemas often require Nullable columns or sign flags. Yet, the payoff is profound: updates become as cheap as inserts, sustaining ClickHouse’s analytical prowess. As the team notes, this insert-and-merge rhythm mirrors traditional row stores in outcome but achieves it through scalable, lock-free parallelism.

In Part 2, ClickHouse evolves further, wrapping this ingenuity into SQL-standard UPDATE syntax via patch parts—proving performance and familiarity can coexist.

Source: ClickHouse Blog