SpacetimeDB's flashy launch and questionable benchmarks mask some interesting technical choices - but also serious limitations that position it more as a powerful Redis than a distributed database
The database market is notoriously difficult to break into. Launching a new product and differentiating yourself from established players is hard enough, but gaining long-term traction is even harder. Earlier this week, SpacetimeDB launched version 2.0 with a marketing approach that, as far as I can tell, hasn't been done before: a surreal, meme-y video mocking competitors while drinking their "tears," paired with benchmarks that seem too good to be true (and indeed, they are).
I'll be upfront: I find this marketing distasteful. But beneath the flashy exterior, there are interesting technical ideas worth examining. Let me walk through a fair technical review.
The Benchmark Problem
Newcomers to the database space often make the mistake of thinking they can win by having "the best performance." I've never seen this work in practice. The few companies that have built sustainable database offerings won by providing good, honest technical work that stands on its own merits. Benchmarks can help, but they need to be honest technical work themselves.
The benchmarks SpacetimeDB provided are neither good nor honest. They contain several technical flaws in what they measure, and you can find alternate benchmarks where SpacetimeDB performs poorly against the competition. But the fundamental flaw is that they're not honest.
I understand the temptation. SpacetimeDB is fundamentally different from traditional databases - it's an all-in-one database plus application server where your code runs inside the database itself. This makes it very tempting to write unfair comparisons. It's like when PlanetScale shipped a MySQL extension for vector search that was fully transactional with data stored on disk. We could have shown it was "10,000 times faster than pgvector" by comparing it to in-memory approaches on the same machine, but that wouldn't have been honest. We chose to publish technical breakdowns instead.
What Makes SpacetimeDB Fast?
SpacetimeDB's impressive write performance comes from several factors. The obvious one is that application logic runs locally next to the database, making writes extremely efficient. They boost this further with batching and other tricks, but to achieve their numbers, they've made significant trade-offs.
The data store is in-memory, unlike traditional RDBMS. Writes are linearizable because the system is essentially a hash table with a lock in front of it. The committed state for the whole database is wrapped in a single Read-Write Mutex. All write operations happen sequentially, which trivially proves linearizability - two writes can't happen simultaneously, so they can't conflict.
But here's the catch: a read and a write cannot happen at the same time either! What happens if there are too many writes? Do readers starve? Building a data store on top of a single global lock is a valid technical choice, but if you're going all-in with this approach, you need explicit, customizable semantics for prioritizing readers and writers to ensure the server remains responsive under any workload.
In SpacetimeDB's case, the behavior is an implementation detail, not defined or explained anywhere in particular. They use parking_lot::RwLock from the parking_lot crate, which provides eventual fairness: readers will eventually acquire the lock even during high write-throughput scenarios, but they can be delayed by up to roughly 0.5ms.
The Critical Section Problem
While the global lock is held, a Wasmtime runtime executes "reducers" (arbitrary user code compiled to WebAssembly). During this execution, no other reducers can write to the database, and no code can read from it either. From their documentation, reducers "cannot perform HTTP requests" - and yes, that's exactly right. The critical section for all writes is exclusive and serialized, executing arbitrary user code. You'd better not be doing HTTP requests in the middle of it.
There's an escape hatch: "Procedures" in the server (still in Beta as of this week's release) allow running expensive code including HTTP requests. But from inside a procedure, you can open a transaction that again acquires the global mutex, so you need to commit very quickly or the whole system will stall.
For reads, the story is similar. They happen through "Views," which are read-only equivalents to reducers. Since they acquire a reader lock on the global mutex, several views can run concurrently, but the database cannot be written to while views are executing.
Durability Trade-offs
One obvious consequence of this single-mutex design is that you need to minimize work in the critical path. HTTP requests are out, but so are other "expensive" operations like persisting transactions to disk.
This fully in-memory database is backed by a Write-Ahead Log, but the WAL is not committed to disk as part of the write transaction. It's asynchronous, flushed to disk periodically in the background (by default, every 50ms). Can you make writes fully durable? The single-mutex design makes this complicated - the WAL can never be written synchronously without completely stalling all other writes and reads.
The system does offer a withConfirmedReads flag, which makes a read return only data that has been synced to disk - by blocking on the server until it sees the relevant WAL entries flushed. That can mean waiting up to 50ms, a long time for a request. It's not very ergonomic, but the assumption is that this is a database for "mostly ephemeral" data where strong durability guarantees aren't needed.
This whole approach gives big MongoDB-2011 vibes. MongoDB launched with impressive benchmarks and a shitty database, got called out by the internet, and eventually acquired WiredTiger to become a serious database company. But the bad technical reputation lingers forever.
The Real Tradeoffs
These technical choices position SpacetimeDB as "a more powerful Redis," not "a more performant relational database." It's puzzling why they chose to benchmark against the latter.
This is not a distributed system and has very hard limits on scalability or availability. You can deploy a "SpacetimeDB cluster" with a primary instance and eventually consistent followers, but your whole system is bottlenecked by the CPU and RAM of the main instance. You need enough CPU for both database queries and application logic (which live inside the database). You need enough RAM to fit all your data in-memory.
If your dataset grows larger than RAM, your database (and application) will fall over. The only option for scalability is vertical: buying a bigger machine. These tradeoffs are valid, but they clearly position SpacetimeDB differently than how it's being marketed.
Use Cases and Future Potential
The original SpacetimeDB was developed as the backend for an MMORPG, which makes perfect sense. The technical choices fit this use case - asynchronously flushing a WAL entry that says a player looted an item, with 50ms delay being acceptable.
But the pivot to targeting LLMs seems like the worst possible technical choice for that use case. The whole shtick is that performance and availability are dominated by short segments of user code that cannot have side effects or stalls, because they're compiled to WebAssembly and executed inside a critical section that serializes all writes and reads.
The absence of side effects cannot be enforced by the type system and depends on the particular WASM bytecode generated at runtime. Any mistake inside these critical sections that could cause stalls under load will likely only be seen in production and will degrade your whole application's performance - probably to the point of causing an availability issue.
This is not the ideal environment for an LLM to program in.
Looking Forward
I think there's a product here and some lessons to learn. Perhaps the authors will apply these lessons to SpacetimeDB v3 and launch a more resilient, LLM-friendly database where application code is isolated and can run for as long as it needs without affecting other code; where transactions can run for as long as they need without affecting other transactions; where they're implicitly throttled if taking too long.
Perhaps we'll see a system that's much more resilient to failure but with less "impressive performance"; perhaps it will be trivially distributed so the AI agent doesn't have to plan a distributed system itself; perhaps it will launch with fewer silly benchmarks and more technical details.
That'd be a product worth watching.
