Rethinking Kleppmann’s “Designing Data‑Intensive Applications” for AI‑First, Cloud‑Native Systems

The second edition of Martin Kleppmann’s seminal book arrives with new chapters on object storage, multi‑cloud deployment, and AI‑augmented data pipelines. Co‑author Chris Riccomini explains how the shift to cloud‑native architectures, embedded databases, and vector search reshapes the fundamentals that made the first edition a classic.

The book that defined data‑intensive systems

When Designing Data‑Intensive Applications (DDIA) hit shelves in 2017, it became the go‑to reference for anyone building databases, streaming platforms, or distributed caches. Its strength lay in focusing on timeless concepts (storage engines, replication, partitioning, and consistency models) while keeping implementation details general enough to outlast rapid changes in languages and tooling.

Nearly a decade later, the underlying hardware and deployment model have shifted dramatically. Object stores now replace local disks for many workloads, serverless functions run alongside traditional VMs, and AI‑driven query patterns demand vector indexes and multimodal storage. The authors, Martin Kleppmann and Chris Riccomini, used the Monster Scale Summit 2025 panel to walk through why a second edition was due and what readers can expect.

Why now? The forces that pushed the rewrite

Cloud‑native dominance – While the first edition mentioned cloud services, most examples assumed on‑prem or self‑managed clusters. Today, managed object stores (e.g., Amazon S3, Azure Blob Storage, Google Cloud Storage) are the default storage layer for new applications. The new edition dedicates an entire chapter to the object‑store abstraction, explaining how built‑in replication, consistency guarantees, and tiered pricing affect system design.
Edge and embedded deployments – Databases are no longer confined to data‑center racks. DuckDB, SQLite, and the emerging PGlite bring SQL to laptops, browsers, and IoT devices, while MotherDuck extends DuckDB into a managed cloud service. The book now treats the split between control, data, and compute planes as a design pattern, showing how a thin control plane can orchestrate compute on anything from a phone to a Kubernetes pod.
AI‑driven workloads – Vector search, Retrieval‑Augmented Generation (RAG), and multimodal embeddings have created new query shapes. The authors add sections on storage formats like Lance (the format behind LanceDB) and Arrow‑based columnar layouts, and they discuss the trade‑offs of indexing high‑dimensional vectors versus traditional B‑trees.
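
To make the contrast between vector lookups and B‑tree lookups concrete, here is a minimal Python sketch. It is not taken from the book: the function names and the toy `EMBEDDINGS` data are assumptions for illustration. It shows why ordered one‑dimensional keys support logarithmic seeks, whereas an exact nearest‑neighbour query has no such ordering to exploit and falls back to scanning every vector unless an approximate index is built.

```python
import bisect
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, vectors, k):
    """Exact nearest neighbours by scanning every vector: O(n * d) work."""
    scored = [(cosine_similarity(query, v), doc_id) for doc_id, v in vectors.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored[:k]]


def range_lookup(sorted_keys, low, high):
    """Ordered scalar keys allow O(log n) seeks, the property a B-tree relies on."""
    start = bisect.bisect_left(sorted_keys, low)
    end = bisect.bisect_right(sorted_keys, high)
    return sorted_keys[start:end]


# Hypothetical toy data: three 3-dimensional embeddings.
EMBEDDINGS = {
    "doc-a": [1.0, 0.0, 0.0],
    "doc-b": [0.9, 0.1, 0.0],
    "doc-c": [0.0, 0.0, 1.0],
}
```

Calling `top_k([1.0, 0.0, 0.0], EMBEDDINGS, 2)` ranks `doc-a` and `doc-b` first. Approximate indexes trade some recall for avoiding the full scan, which is the kind of trade‑off the new sections are described as covering.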

Database architecture after a decade of change

From monolith‑disk to object‑store‑first

Kleppmann points out that the classic model—each node owning a local file system and replicating at the application layer—has been supplanted by a model where the storage service itself is the replication engine. Object stores provide durability, geo‑distribution, and lifecycle policies out of the box. The new edition evaluates when this model is advantageous (e.g., write‑once‑read‑many analytics) and when local SSDs still win (low‑latency OLTP).

The rise of “move‑with‑you” databases

Riccomini’s hypothesis that successful databases will follow the workload is now observable:

DuckDB / MotherDuck – Starts as an embedded analytical engine, then scales to a managed cloud service without changing the SQL dialect.
PGlite – A lightweight Postgres library that runs inside browsers or mobile apps, syncing with a remote Postgres cluster when connectivity returns.
SQLite – Remains the de‑facto embedded store, but extensions like sqlite‑vec bring vector search to the edge.

These examples illustrate a design where the control plane (metadata, authentication, scaling policies) is decoupled from the data plane (actual storage) and the compute plane (query execution). The book provides a decision matrix for architects to decide which plane to host where, based on latency budgets, regulatory constraints, and cost.
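
As a rough illustration of what such a placement decision might look like in code, the sketch below maps a few requirements to a host for each plane. It is not the book's decision matrix: the `Requirements` fields, the 50 ms threshold, and the placement labels are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirements:
    latency_budget_ms: float    # end-to-end budget for a typical query
    data_must_stay_local: bool  # e.g. a regulatory or privacy constraint
    cost_sensitive: bool


def place_planes(req):
    """Return a hypothetical placement for the control, data, and compute planes."""
    # The control plane (metadata, auth, scaling policy) stays thin and central.
    placement = {"control": "cloud"}

    # Regulatory constraints pin the data plane to the device.
    placement["data"] = "device" if req.data_must_stay_local else "object-store"

    # Compute follows the data when it cannot move, or when the latency
    # budget is too tight for a network round trip (50 ms is an assumed cutoff).
    if req.data_must_stay_local or req.latency_budget_ms < 50:
        placement["compute"] = "device"
    elif req.cost_sensitive:
        placement["compute"] = "serverless"
    else:
        placement["compute"] = "cluster"
    return placement
```

A real matrix would weigh more dimensions than this, but the shape is the same: each plane gets its own answer, and the constraints decide which answers are allowed.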

Streaming, SQL, and the incremental view frontier

Streaming platforms have traditionally exposed low‑level record‑by‑record APIs (Kafka consumer callbacks, Pulsar functions). The new edition argues that the next logical step is to treat streams as incremental relational views. Projects like Materialize let developers write a continuous SQL query and let the engine handle state, fault‑tolerance, and exactly‑once semantics.
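
The idea of a stream as an incremental view can be sketched in a few lines of Python. This is a toy under stated assumptions, not how Materialize is implemented: the class name and the `+1`/`-1` change encoding are illustrative, and state persistence and fault tolerance are left out entirely. It keeps the result of a query like `SELECT customer, SUM(amount) FROM orders GROUP BY customer` up to date by applying each change as it arrives instead of recomputing from scratch.

```python
from collections import defaultdict


class RevenueByCustomer:
    """Incrementally maintained SUM(amount) GROUP BY customer."""

    def __init__(self):
        self._totals = defaultdict(int)  # customer -> running sum in cents
        self._counts = defaultdict(int)  # customer -> number of live rows

    def apply(self, customer, amount_cents, diff):
        """Apply one change: diff is +1 for an inserted row, -1 for a retraction."""
        self._totals[customer] += diff * amount_cents
        self._counts[customer] += diff
        # A group with no remaining rows disappears from the view.
        if self._counts[customer] == 0:
            del self._totals[customer]
            del self._counts[customer]

    def snapshot(self):
        """Current contents of the view as a plain dict."""
        return dict(self._totals)
```

Each update costs constant time regardless of how many rows the stream has carried so far, which is the property that makes the relational‑view framing attractive compared with hand‑written consumer callbacks.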

Riccomini adds a practical note on the latency‑throughput trade‑off: batching improves throughput but adds latency, while micro‑batching (e.g., 10 ms windows) can hit a sweet spot for many real‑time dashboards. The book includes a worked example comparing three configurations (pure record‑by‑record, micro‑batch, and full SQL view) and showing how the same business KPI can be delivered with different latency guarantees.
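
The latency‑throughput trade‑off can be approximated with a back‑of‑the‑envelope model. The sketch below is not the book's worked example: the function name, its parameters, and the cost model (a fixed per‑batch overhead plus a per‑record cost) are assumptions made for illustration, and queueing delay under overload is ignored.

```python
def batch_stats(arrival_rate_per_s, window_ms, overhead_ms, per_record_ms):
    """Return (records per batch, worst-case latency in ms, busy fraction).

    A window of 0 ms models record-by-record processing.
    """
    # Records that accumulate during one window (at least one per batch).
    records_per_batch = max(1.0, arrival_rate_per_s * window_ms / 1000.0)

    # Fixed cost per batch (network round trip, commit) plus per-record work.
    processing_ms = overhead_ms + records_per_batch * per_record_ms

    # A record arriving at the start of a window waits for the whole window,
    # then for its batch to be processed.
    worst_case_latency_ms = window_ms + processing_ms

    # Time between batch starts: the window, or the gap between records
    # when every record is its own batch.
    interval_ms = max(window_ms, 1000.0 / arrival_rate_per_s)

    # Above 1.0 the consumer cannot keep up with the stream.
    busy_fraction = processing_ms / interval_ms
    return records_per_batch, worst_case_latency_ms, busy_fraction
```

With an assumed 10,000 records per second, 0.5 ms of overhead per batch, and 0.01 ms per record, record‑by‑record processing (`window_ms=0`) yields a busy fraction of about 5.1, meaning the consumer falls behind. A 10 ms window brings that down to about 0.15 at a worst‑case latency of roughly 11.5 ms, and a one‑second window pushes latency past a second for little further gain.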

AI as a first‑class participant in data pipelines

Kleppmann sketches a future where AI agents interact with databases through well‑defined command APIs. Instead of letting a model write arbitrary rows, the system exposes a limited set of operations (e.g., UPDATE orders SET … WHERE …) that preserve consistency guarantees. This mirrors the emerging tool‑calling pattern, in which the model invokes a tool rather than directly mutating storage.
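
One way to picture such a restricted command API is an allow‑list of parameterized statements. The sketch below uses Python's standard `sqlite3` module. The `cancel_order` command, the `orders` schema, and the helper names are hypothetical and not taken from the talk or the book.

```python
import sqlite3

# Allow-list: command name -> (parameterized SQL, required argument names).
# The agent can only pick a command and supply values, never raw SQL.
COMMANDS = {
    "cancel_order": (
        "UPDATE orders SET status = 'cancelled' "
        "WHERE id = :order_id AND status = 'pending'",
        {"order_id"},
    ),
}


def execute_command(conn, name, args):
    """Run one allow-listed command and return the number of rows it changed."""
    if name not in COMMANDS:
        raise PermissionError(f"command not allowed: {name}")
    sql, required = COMMANDS[name]
    if set(args) != required:
        raise ValueError(f"{name} expects exactly {sorted(required)}")
    with conn:  # commits on success, rolls back if the statement raises
        cursor = conn.execute(sql, args)
    return cursor.rowcount


def demo_connection():
    """In-memory database with two sample orders (illustrative data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT NOT NULL)")
    conn.execute("INSERT INTO orders (id, status) VALUES (1, 'pending'), (2, 'shipped')")
    conn.commit()
    return conn
```

Because the business rule (only pending orders can be cancelled) lives in the command rather than in the agent's prompt, a confused or adversarial model can at worst issue a valid command with the wrong arguments.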

On the storage side, the authors highlight Lance, the columnar format behind LanceDB, which is built for multimodal data (images, audio, embeddings) and stores Arrow‑compatible columnar data alongside vector indexes. The chapter on dynamic indexing references a Google paper that used reinforcement learning to adapt B‑tree fan‑out based on recent query patterns, a glimpse of databases that learn their own physical layout.

Market implications: consolidation vs specialization

Riccomini, wearing his investor hat, predicts a dual market trajectory:

Postgres‑centric stack – Extensions like pgvector, pg_search, and pg_duckdb let developers start with a single, familiar engine and add capabilities as needed. For many startups, this “one‑stop shop” approach reduces operational overhead.
Specialized services for scale – When workloads demand billions of vectors or sub‑millisecond latency, dedicated services such as Pinecone, Turbopuffer, or Weaviate become attractive. The book advises a migration path: prototype on Postgres, benchmark, then switch to a purpose‑built store if cost or performance thresholds are crossed.

What readers get from the second edition

Three free chapters – Introductory sections on object storage, edge‑first databases, and AI‑augmented query pipelines are available for download on the publisher’s site.
Updated diagrams – New visualizations of the control/data/compute split, and side‑by‑side comparisons of traditional replication vs. object‑store replication.
Practical case studies – Real‑world stories from ScyllaDB, Confluent, and WarpStream illustrate how SaaS vs. BYOC architectures affect multi‑tenancy, security, and cost.
Exercises – End‑of‑chapter problems now include designing a migration from a local‑disk PostgreSQL cluster to an S3‑backed analytics pipeline, reinforcing the concepts with hands‑on practice.

Where to watch the full conversation

The recorded panel from Monster Scale Summit 2025 is available on demand, as is a follow‑up Q&A from the 2026 summit. Both videos include timestamps for the object‑store chapter, the edge‑database discussion, and the AI‑agent API design segment.

The second edition of DDIA does not try to rewrite fundamentals; it simply places them in the context of the cloud‑first, AI‑driven world we now live in. For architects who grew up with the first book, it offers a roadmap to modernize without discarding the solid mental models that have guided the industry for years.