How Stripe built a custom zero-downtime data movement platform to scale their MongoDB-based database infrastructure to handle 5 million QPS with 5.5 nines of reliability, processing $1.4 trillion in payments annually.
Stripe's payment infrastructure handles an astonishing $1.4 trillion annually—equivalent to 1.3% of global GDP—with 5.5 nines of reliability. At the heart of this massive operation is their document database service, DocDB, built on open-source MongoDB, which processes over 5 million queries per second across more than 2,000 shards containing petabytes of financial data. In this deep dive, we'll explore how Stripe engineered a zero-downtime data movement platform that enables horizontal scaling, version upgrades, and multi-tenant migrations without disrupting their global payment processing.
The Evolution of Stripe's Database Infrastructure
Stripe's database journey began in 2011 with MongoDB as their primary datastore, chosen for its developer productivity and automated failover capabilities. Initially, product applications connected directly to a handful of MongoDB shards. As Stripe grew, so did their database fleet, expanding to tens of shards by 2017.
At this stage, maintenance operations like spinning up new shards, building indexes, replacing nodes, and resharding were handled manually through ad hoc scripts, creating a scaling and reliability bottleneck. To address this, Stripe developed a database proxy service that not only provided connection pooling but also served as a centralized enforcement point for reliability, scalability, admission control, and access control.
The exponential growth in online commerce—particularly among businesses with .ai domains starting in 2019-2020—presented new challenges. Stripe's database infrastructure needed to handle dramatically increased query volumes and storage requirements. While vertical scaling worked for a time, with some shards reaching tens of terabytes, they eventually hit physical limits and needed horizontal scaling.
Why Build DocDB In-House?
At various points in Stripe's 15-year history, they considered off-the-shelf Database-as-a-Service offerings. However, three critical factors led them to build DocDB in-house:
Security: For a financial platform like Stripe, security isn't just a feature—it's fundamental. Building DocDB allowed them to bake in security from the start through robust enforcement of authorization policies at the data layer.
Reliability and Performance: MongoDB, while powerful, has a large surface area in its querying capabilities that can lead to unintended performance issues. By exposing only a minimal, battle-tested set of functions to engineers, DocDB prevents these issues while offering true multi-tenancy with enforced quotas.
Scale: DocDB was designed for seamless horizontal scaling with sharding, ensuring Stripe's business needs could be met without compromising reliability or performance.
DocDB Logical Constructs
Product engineers interact with two main logical constructs in DocDB: logical databases and collections. A logical database serves as a container for one or more related collections. When creating these, engineers specify a shard key that determines how data is distributed across physical shards.
Under the hood, data in a logical database is sharded across multiple database servers by dividing the keyspace into contiguous chunks. For example, in the core_payments database's payment_intents collection, documents with shard keys 1-50 might reside on shard_a, while keys 50-100 live on shard_b. This chunk map is maintained in Stripe's routing metadata service.
Zero-Downtime Data Movement Platform
Horizontal scaling requires distributing data across more servers while maintaining availability and consistency. To achieve this, Stripe built a sophisticated zero-downtime data movement platform based on several key design principles:
- Consistency: Ensuring data remains complete and accurate during migration
- Availability: Maintaining service during the migration process
- Performance: Preserving throughput of shards involved in migration
- Granularity and adaptability: Supporting migration of arbitrary numbers of chunks of varying sizes
The Migration Process
The zero-downtime data movement follows a carefully orchestrated process:
Chunk Migration Registration: The intent to migrate chunks is registered in the routing metadata service.
Bulk Import: A point-in-time snapshot of the chunk is imported to the target shard. Throughput limitations during bulk loading were overcome by sorting data based on common index attributes before insertion, leveraging MongoDB's B-tree structure to boost write throughput by 10x.
Replication: Writes from the time of the snapshot are replicated to target shards using Stripe's CDC systems, not directly from MongoDB shards to avoid impacting source shard performance. This replication is bidirectional during migration to enable fast rollback if needed.
Correctness Check: Comprehensive verification ensures data completeness and consistency between source and target shards.
Traffic Switch: The most critical phase involves switching traffic from source to target shards without disrupting client applications.
The Traffic Switch Protocol
The traffic switch relies on a custom implementation of "version gating" with a modified version of MongoDB:
- Database proxy servers annotate requests with version numbers reflecting their routing metadata version
- The coordinator verifies replication is synchronized before proceeding
- The source shard's version number is bumped, causing requests with stale versions to be rejected
- After confirming all outstanding writes have been replicated, the coordinator updates the routing metadata
- Proxy servers discover the updated routes and redirect traffic to target shards
This entire process takes milliseconds to a maximum of 2 seconds, with failed reads and writes succeeding automatically on retries.
Beyond Horizontal Scalability
The zero-downtime data movement platform enables several critical operations beyond simple horizontal scaling:
Shard Splitting and Merging: Stripe can split any MongoDB shard n ways to achieve n-fold throughput and storage, then merge shards when utilization decreases. This is particularly valuable during peak traffic periods like Black Friday and Cyber Monday.
Database Version Upgrades: The platform allows Stripe to upgrade their entire fleet of 2,000+ database shards to new major versions without downtime. This significantly reduces the work required for upgrades and provides a fast rollback path if issues emerge.
Multi-tenant Migrations: Stripe can migrate databases between single-tenancy and multi-tenancy models as product needs evolve. Product teams typically start with multi-tenant databases and migrate to single-tenant infrastructure as their scale grows.
Key Engineering Insights
Several technical innovations make this system possible:
- Bidirectional Replication: During migration, data replicates both from source to target and back, with a custom MongoDB patch that tags replication service writes to prevent cyclic replication loops.
- Idempotency: The MongoDB oplog provides natural idempotency, allowing safe replay of write-ahead log entries.
- Eventual Consistency Handling: The routing metadata update is eventually consistent across hundreds of proxy servers, but version gating ensures stale proxies can't serve requests to the old shard.
Strategic Considerations
Stripe's approach offers valuable lessons for organizations facing similar scaling challenges:
Reliability is Non-Negotiable: Obsessing over API reliability has been a key differentiator for Stripe. Their data movement process is designed explicitly for zero downtime.
Invest in Strong Foundations: The zero-downtime data movement capability enables not just horizontal scaling but also database upgrades and tenant migrations—providing disproportionate value beyond its original purpose.
Build vs. Buy Decision Framework: Building capabilities in-house makes sense when they drive long-term strategic advantage, require unique reliability/security controls, or provide better long-term ROI while reducing vendor lock-in. Buying solutions makes sense for undifferentiated capabilities where third-party providers meet security and cost requirements.
Performance and Scale
Stripe's zero-downtime data movement platform achieves impressive performance metrics:
- Approximately 1.5-2 terabytes of data migrated per target shard per day
- Migration process takes milliseconds to 2 seconds for traffic switching
- Platform supports arbitrary numbers of concurrent migrations across the fleet
This infrastructure has been battle-tested, enabling Stripe to scale their database fleet while maintaining the strict consistency required for global commerce.
Conclusion
Stripe's DocDB and zero-downtime data movement platform represent a sophisticated approach to managing massive-scale financial data. By building custom solutions on open-source MongoDB, they've achieved the reliability and scalability needed to process trillions of dollars in payments annually while maintaining the flexibility to evolve their infrastructure as business needs change.
The platform demonstrates how thoughtful engineering can turn scaling challenges into opportunities to build more robust, flexible systems that serve multiple use cases beyond their original purpose. For organizations facing similar scaling challenges, Stripe's approach offers valuable insights into the technical and strategic considerations of building high-reliability data infrastructure.
For more information about Stripe's infrastructure, you can explore their engineering blog or watch the original presentation on InfoQ.

Comments
Please log in or register to join the discussion