TidesDB Outpaces RocksDB by 4x in Seek Performance Through Aggressive Caching Strategy
Architectural Gambit: How TidesDB Achieves Radical Seek Performance Gains
When LSM-tree databases face point lookup operations, traditional designs often bottleneck on disk I/O. TidesDB architect Alex Gaetano Padula confronted this limitation head-on with a controversial design philosophy: sacrifice memory efficiency for near-zero-latency seeks. The resulting benchmarks reveal TidesDB v6.1.0 achieving:
- 4.17× faster random seeks (3.73M vs 893K ops/sec)
- 4.74× faster sequential seeks (8.87M vs 1.87M ops/sec)
- 5.25× faster Zipfian seeks (3.56M vs 679K ops/sec)
// Core cache-first seek path in TidesDB
tidesdb_ref_counted_block_t *rc_block = NULL;
tidesdb_klog_block_t *kb = tidesdb_cache_block_get(
    sst->db, cf_name, sst->klog_path, cursor->current_pos, &rc_block);
Code snippet showing TidesDB's atomic cache lookup – the foundation of its speed advantage.
The LSM-Tree Seek Bottleneck
Traditional LSM-trees like RocksDB are tuned for write throughput at the expense of reads, forcing seek operations through a gauntlet:
1. Bloom filter checks across SSTable layers
2. Disk-loaded block indices
3. Binary searches within indices
4. Block decompression
Each disk access adds latency on the order of hundreds of microseconds to milliseconds – catastrophic for latency-sensitive workloads. RocksDB mitigates the cost with caches, but TidesDB's solution is architectural: keep all metadata perpetually resident in memory.
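To make that gauntlet concrete, here is a minimal sketch of a disk-centric point lookup. The type and function names (lsm_seek, bloom_maybe_contains, index_find_block, and so on) are illustrative assumptions, not RocksDB's actual API.
#include <stdbool.h>
#include <stddef.h>

// Illustrative disk-centric LSM point lookup; all names are hypothetical.
typedef struct sstable sstable_t;

extern bool  bloom_maybe_contains(const sstable_t *sst, const char *key);   // step 1: bloom filter
extern long  index_find_block(const sstable_t *sst, const char *key);       // steps 2-3: load and binary-search the block index
extern void *block_read_and_decompress(const sstable_t *sst, long blk);     // step 4: disk read + decompression
extern void *block_binary_search(void *block, const char *key);             // final search inside the data block

void *lsm_seek(sstable_t **levels, size_t nlevels, const char *key)
{
    for (size_t i = 0; i < nlevels; i++) {
        sstable_t *sst = levels[i];
        if (!bloom_maybe_contains(sst, key))                 // cheap, but per level
            continue;
        long blk = index_find_block(sst, key);               // may touch disk if the index is cold
        if (blk < 0)
            continue;
        void *block = block_read_and_decompress(sst, blk);   // the expensive part
        void *value = block_binary_search(block, key);
        if (value != NULL)
            return value;
    }
    return NULL;                                             // not found at any level
}
Every step that touches cold storage is a chance to pay the full I/O penalty, which is exactly what TidesDB's design tries to remove from the common path.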
TidesDB's Memory-Centric Arsenal
Persistent Metadata Residency
TidesDB keeps the following resident in memory until it is explicitly freed (a minimal sketch follows the list):
- Sampled, prefix-compressed block indices
- Bloom filters (targeting a 1% false-positive rate)
- Open SSTable file handles
- Per-SSTable min/max keys for range filtering
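Roughly speaking, "resident" means a per-SSTable record along these lines. The struct below is a hypothetical sketch built from the list above, not TidesDB's actual definition.
#include <stdint.h>
#include <stdio.h>

// Hypothetical per-SSTable metadata kept in memory for the table's lifetime.
typedef struct {
    uint8_t *sampled_index;       // prefix-compressed, sampled block index
    size_t   sampled_index_len;
    uint8_t *bloom_filter;        // sized for roughly a 1% false-positive rate
    size_t   bloom_filter_len;
    FILE    *klog_handle;         // file handle stays open; no reopen on each seek
    char    *min_key;             // range filter: skip SSTables whose
    char    *max_key;             //   [min_key, max_key] cannot contain the key
} resident_sst_meta_t;
Because all of this survives between operations, a seek can reject or route to an SSTable without touching disk at all.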
Atomic Reference Counting
typedef struct {
    tidesdb_klog_block_t *block;
    atomic_int ref_count;    // Lock-free concurrency
    size_t block_memory;
} tidesdb_ref_counted_block_t;
Blocks stay alive during concurrent operations via atomic increments/decrements, eliminating lock contention.
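A minimal sketch of how that pattern typically looks with C11 atomics; the acquire/release helpers below are assumptions for illustration, not TidesDB's exported functions.
#include <stdatomic.h>
#include <stdlib.h>

// Hypothetical acquire/release pair for the ref-counted block above.
// Readers bump the count before touching the block and drop it afterwards;
// whoever drops the last reference frees the block, with no lock required.
static void block_acquire(tidesdb_ref_counted_block_t *rc)
{
    atomic_fetch_add_explicit(&rc->ref_count, 1, memory_order_relaxed);
}

static void block_release(tidesdb_ref_counted_block_t *rc)
{
    if (atomic_fetch_sub_explicit(&rc->ref_count, 1, memory_order_acq_rel) == 1) {
        free(rc->block);   // last reference gone: block can be reclaimed
        free(rc);
    }
}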
Partitioned CLOCK Cache
- 2 partitions per CPU core (16 partitions on 8-core test hardware)
- CLOCK eviction replaces LRU for lower contention
- Zero-copy access for cache hits
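The two ideas combine roughly as below: a key hashes to one of the partitions, and each partition runs its own CLOCK (second-chance) sweep. The layout and names are assumptions for illustration, not TidesDB's internals.
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NPARTITIONS          16     // e.g. 2 partitions per core on the 8-core test box
#define SLOTS_PER_PARTITION  1024   // illustrative size

typedef struct {
    void        *block;             // cached block pointer, or NULL if the slot is free
    atomic_bool  referenced;        // CLOCK "second chance" bit, set on every hit
} clock_slot_t;

typedef struct {
    clock_slot_t slots[SLOTS_PER_PARTITION];
    size_t       hand;              // clock hand (per-partition state)
} cache_partition_t;

static cache_partition_t partitions[NPARTITIONS];

// Hash the key to a partition so unrelated lookups rarely contend.
static cache_partition_t *partition_for(uint64_t key_hash)
{
    return &partitions[key_hash % NPARTITIONS];
}

// CLOCK eviction: sweep the slots, clearing reference bits as we go, and
// evict the first slot that has not been touched since the previous sweep.
static size_t clock_evict(cache_partition_t *p)
{
    for (;;) {
        clock_slot_t *s = &p->slots[p->hand];
        p->hand = (p->hand + 1) % SLOTS_PER_PARTITION;
        if (!atomic_exchange(&s->referenced, false))
            return (size_t)(s - p->slots);   // victim slot index
    }
}
Partitioning keeps hot keys from serializing on a single cache lock, and CLOCK approximates LRU with a single atomic bit per slot instead of list maintenance on every hit.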
Performance Tradeoffs Quantified
| Metric | TidesDB | RocksDB | Difference |
|---|---|---|---|
| Random Seek Throughput | 3.73M ops/sec | 893K ops/sec | 4.17× |
| P99 Seek Latency | 5μs | 17μs | 70% lower |
| Memory Overhead | 851MB | 246MB | 3.46× higher |
The verdict: For modern servers with abundant RAM, the tradeoff is compelling. TidesDB consumes 851MB (vs RocksDB's 246MB) but delivers sub-microsecond median latency.
When The Tradeoff Makes Sense
TidesDB shines when:
- Point lookups dominate (OLTP workloads)
- Hot datasets fit in available RAM
- Latency sensitivity outweighs memory constraints
- Concurrent access demands scalability
RocksDB remains preferable for:
- Memory-constrained environments (e.g., 2GB containers)
- Large-range scan workloads
- Cold-data-dominant datasets
The Range Query Reality
While seek performance dazzles, range queries tell a different story:
- Small ranges (100 keys): 2.41× faster
- Large ranges (1000 keys): 1.11× faster (near parity)
- Sequential ranges: 1.14× faster
TidesDB's architecture optimizes positioning speed, not sequential scanning. When iteration dominates, advantages diminish.
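One way to see why: model a range query as one positioning step plus n per-key iteration steps. The constants below are purely illustrative, not measured values, but they reproduce the shape of the results above: a large positioning advantage is amortized away as the range grows.
#include <stddef.h>
#include <stdio.h>

// Toy amortization model: range_time = positioning_cost + n * per_key_step_cost.
// All constants are hypothetical, chosen only to show the trend.
int main(void)
{
    const double seek_fast_us = 0.3;   // hypothetical fast-engine positioning cost
    const double seek_slow_us = 1.2;   // hypothetical slow-engine positioning cost
    const double step_us      = 0.05;  // hypothetical per-key iteration cost (same for both)
    const int    range_lengths[] = { 10, 100, 1000 };

    for (size_t i = 0; i < sizeof range_lengths / sizeof range_lengths[0]; i++) {
        int    n    = range_lengths[i];
        double fast = seek_fast_us + n * step_us;
        double slow = seek_slow_us + n * step_us;
        printf("range of %4d keys -> relative speedup %.2fx\n", n, slow / fast);
    }
    return 0;
}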
The New Latency Frontier
TidesDB proves sub-microsecond storage engine latency is achievable through:
- Cache-locality prioritization: optimizing the hit path pays off more than optimizing the miss path
- Block-granular caching: Single cache entries serve hundreds of keys
- Memory economics: Accepting higher RAM usage for radical latency reduction
As Padula notes: "The difference between a cache hit (1μs) and miss (100μs+) is two orders of magnitude. On modern hardware, memory is cheaper than latency."
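The arithmetic behind that quote is easy to check: expected seek latency is hit_rate * hit_cost + (1 - hit_rate) * miss_cost. Using the 1μs and 100μs figures from the quote, even a 1% miss rate roughly doubles the average cost, which is why the design spends memory to keep the hit rate high.
#include <stddef.h>
#include <stdio.h>

// Expected seek latency for a few hit rates, using the 1us hit / 100us miss
// figures quoted above. A 1% miss rate already doubles the average cost.
int main(void)
{
    const double hit_us = 1.0, miss_us = 100.0;
    const double hit_rates[] = { 0.90, 0.99, 0.999 };

    for (size_t i = 0; i < sizeof hit_rates / sizeof hit_rates[0]; i++) {
        double h = hit_rates[i];
        printf("hit rate %5.1f%% -> expected latency %6.2f us\n",
               h * 100.0, h * hit_us + (1.0 - h) * miss_us);
    }
    return 0;
}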
For seek-heavy workloads on capable hardware, TidesDB’s architecture demonstrates that abandoning traditional disk-centric constraints can unlock unprecedented performance – a lesson extending far beyond a single database engine.
Source: Benchmark analysis by Alex Gaetano Padula (Original Article)