Architectural Gambit: How TidesDB Achieves Radical Seek Performance Gains

Point lookups in LSM-tree databases traditionally bottleneck on disk I/O. TidesDB architect Alex Gaetano Padula confronted this limitation head-on with a controversial design philosophy: sacrifice memory efficiency for near-zero-latency seeks. The resulting benchmarks against RocksDB show TidesDB v6.1.0 achieving:

  • 4.17× faster random seeks (3.73M vs 893K ops/sec)
  • 4.74× faster sequential seeks (8.87M vs 1.87M ops/sec)
  • 5.25× faster Zipfian seeks (3.56M vs 679K ops/sec)

// Core cache-first seek path in TidesDB
tidesdb_ref_counted_block_t *rc_block = NULL;
tidesdb_klog_block_t *kb = tidesdb_cache_block_get(
    sst->db, cf_name, sst->klog_path, cursor->current_pos, &rc_block
);

Code snippet showing TidesDB's atomic cache lookup – the foundation of its speed advantage.
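
What happens on either side of that call follows from the design: a hit pins the block, a miss falls back to disk. The control flow below is an inference from the excerpt, with the release step left as a comment because the excerpt does not name the unpin helper; it is not verbatim TidesDB source.

if (kb != NULL)
{
    /* cache hit: rc_block pins the block via its atomic refcount, so an
       eviction cannot free it mid-read; search kb in place (zero-copy) */
    /* ... binary-search the block for the key at cursor->current_pos ... */
    /* ... then drop the pin on rc_block once the read completes ... */
}
else
{
    /* cache miss: read the block from the klog file on disk, then insert
       it into the cache so the next seek to this block is a memory hit */
}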

The LSM-Tree Seek Bottleneck

Traditional LSM-tree engines like RocksDB are tuned for write throughput, so every seek pays for that choice in read amplification, running a gauntlet of steps:
1. Bloom filter checks across SSTable layers
2. Disk-loaded block indices
3. Binary searches within indices
4. Block decompression

Each disk access in that chain adds latency, from tens of microseconds on NVMe to milliseconds on spinning disks, which is catastrophic for latency-sensitive workloads. RocksDB mitigates this with block and table caches, but TidesDB's solution is architectural: keep all metadata perpetually resident in memory.
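
To make the gauntlet concrete, here is a schematic point lookup in C. Every type and helper below is an invented stand-in for "any traditional LSM engine"; it mirrors the four steps above, not RocksDB's actual code.

#include <stdbool.h>
#include <stddef.h>

/* Invented stand-ins; each load_* call implies disk I/O. */
typedef struct sstable sstable_t;
bool   bloom_maybe_contains(const sstable_t *sst, const char *key);
void  *load_index_block(const sstable_t *sst);                  /* disk I/O */
size_t index_binary_search(const void *idx, const char *key);
void  *load_data_block(const sstable_t *sst, size_t block_no);  /* disk I/O + decompress */
bool   block_find(const void *blk, const char *key, char **value_out);

/* The four-step gauntlet, checked newest SSTable first */
bool lsm_point_lookup(sstable_t **tables, size_t n, const char *key,
                      char **value_out)
{
    for (size_t i = 0; i < n; i++)
    {
        if (!bloom_maybe_contains(tables[i], key)) continue;  /* 1. Bloom check   */
        void *idx = load_index_block(tables[i]);              /* 2. index block   */
        size_t block_no = index_binary_search(idx, key);      /* 3. binary search */
        void *blk = load_data_block(tables[i], block_no);     /* 4. decompression */
        if (block_find(blk, key, value_out)) return true;
    }
    return false; /* not present in any SSTable */
}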

TidesDB's Memory-Centric Arsenal

Persistent Metadata Residency

TidesDB keeps the following resident in memory until it is explicitly freed (a hypothetical layout is sketched after this list):

  • Sampled block indices (prefix-compressed)
  • Bloom filters (1% false-positive rate)
  • SSTable file handles
  • Min/max keys for range filtering
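
One plausible shape for that resident set, as a per-SSTable struct. Field names and layout are assumptions for illustration, not TidesDB's actual definitions:

#include <stdint.h>
#include <stdio.h>

typedef struct {
    struct {
        char   *compressed_keys;   /* prefix-compressed sampled keys */
        size_t *block_offsets;     /* file offset of each sampled block */
        size_t  num_samples;
    } sparse_index;                /* sampled block index */
    uint8_t *bloom_bits;           /* Bloom filter sized for ~1% FPR */
    size_t   bloom_num_bits;
    FILE    *klog_handle;          /* handle kept open, no reopen per seek */
    char    *min_key, *max_key;    /* bounds for cheap range filtering */
    size_t   min_key_len, max_key_len;
} sstable_resident_meta_t;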

Atomic Reference Counting

#include <stdatomic.h>  // C11 atomics for atomic_int

typedef struct {
    tidesdb_klog_block_t *block;  // the cached klog block itself
    atomic_int ref_count;         // Lock-free concurrency
    size_t block_memory;          // bytes of memory held by this block
} tidesdb_ref_counted_block_t;

Blocks stay alive during concurrent operations via atomic increments/decrements, eliminating lock contention.
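
A minimal sketch of that pin/unpin discipline in C11 atomics. The helper names and the free-on-last-reference policy are assumptions for illustration, not confirmed TidesDB internals:

#include <stdatomic.h>
#include <stdlib.h>

typedef struct {
    void      *block;        /* cached block payload */
    atomic_int ref_count;    /* starts at 1 for the cache's own reference */
    size_t     block_memory;
} rc_block_t;

/* Pin before reading: one atomic increment, no lock taken */
static void rc_block_retain(rc_block_t *rc)
{
    atomic_fetch_add_explicit(&rc->ref_count, 1, memory_order_relaxed);
}

/* Unpin after reading; whoever drops the last reference frees the block,
   so eviction never races with an in-flight reader */
static void rc_block_release(rc_block_t *rc)
{
    if (atomic_fetch_sub_explicit(&rc->ref_count, 1, memory_order_acq_rel) == 1)
    {
        free(rc->block);
        free(rc);
    }
}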

Partitioned CLOCK Cache

  • 2 partitions per CPU core (16 partitions on the 8-core test hardware)
  • CLOCK eviction replaces LRU for lower contention (sketched below)
  • Zero-copy access for cache hits
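
A minimal sketch of partition selection plus the CLOCK sweep, with invented types and a hard-coded partition count; nothing here is TidesDB's actual implementation:

#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_PARTITIONS 16  /* 2 per core on the 8-core test machine */

typedef struct {
    void       *block;
    atomic_bool referenced;  /* CLOCK bit, set on every cache hit */
} clock_slot_t;

typedef struct {
    clock_slot_t *slots;
    size_t        num_slots;
    size_t        hand;      /* guarded by this partition's own lock */
} cache_partition_t;

/* Hash each key to a partition so threads mostly touch disjoint locks */
static size_t pick_partition(uint64_t key_hash)
{
    return key_hash % NUM_PARTITIONS;
}

/* CLOCK sweep: clear the reference bit as the hand passes; a slot that
   was not touched since the last pass becomes the eviction victim */
static size_t clock_evict(cache_partition_t *p)
{
    for (;;)
    {
        clock_slot_t *s = &p->slots[p->hand];
        p->hand = (p->hand + 1) % p->num_slots;
        if (!atomic_exchange(&s->referenced, false))
            return (size_t)(s - p->slots);
    }
}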

Performance Tradeoffs Quantified

Metric                   TidesDB         RocksDB        Advantage
Random Seek Throughput   3.73M ops/sec   893K ops/sec   4.17× faster
P99 Seek Latency         5μs             17μs           70% lower
Memory Overhead          851MB           246MB          3.46× higher

The verdict: For modern servers with abundant RAM, the tradeoff is compelling. TidesDB consumes 851MB (vs RocksDB's 246MB) but delivers sub-microsecond median latency.

When The Tradeoff Makes Sense

TidesDB shines when:

  • Point lookups dominate (OLTP workloads)
  • Hot datasets fit in available RAM
  • Latency sensitivity outweighs memory constraints
  • Concurrent access demands scalability

RocksDB remains preferable for:
  • Memory-constrained environments (e.g., 2GB containers)
  • Large-range scan workloads
  • Cold-data-dominant datasets

The Range Query Reality

While seek performance dazzles, range queries tell a different story:

  • Small ranges (100 keys): 2.41× faster
  • Large ranges (1000 keys): 1.11× faster (near parity)
  • Sequential ranges: 1.14× faster

TidesDB's architecture optimizes positioning speed, not sequential scanning. Once a cursor is positioned, both engines pay a similar per-key iteration cost, so the fixed saving on the initial seek is amortized away as the range grows; the back-of-envelope model below illustrates the convergence.
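
The per-operation costs in this model are assumed for illustration, chosen only to be roughly consistent with the published ratios; they are not measurements:

#include <stdio.h>

int main(void)
{
    /* Assumed costs in microseconds: fast engine positions in ~0.3us,
       slow engine in ~15us; both iterate at ~0.1us per key. */
    const double seek_fast = 0.3, seek_slow = 15.0, iter = 0.1;

    for (int n = 100; n <= 1000; n *= 10)
    {
        double total_fast = seek_fast + n * iter;
        double total_slow = seek_slow + n * iter;
        printf("range of %4d keys -> speedup %.2fx\n",
               n, total_slow / total_fast);
    }
    return 0; /* prints 2.43x and 1.15x, echoing the published trend */
}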

The New Latency Frontier

TidesDB proves sub-microsecond storage engine latency is achievable through:

  1. Cache locality prioritization: optimizing the hit path pays off more than optimizing misses
  2. Block-granular caching: a single cache entry serves hundreds of keys (sketched below)
  3. Memory economics: accepting higher RAM usage for a radical latency reduction
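
Point 2 is worth a sketch: one cached block can answer lookups for every key it covers via an in-block binary search. The fixed-width keys and block size below are invented for illustration:

#include <stddef.h>
#include <string.h>

#define KEYS_PER_BLOCK 256  /* one cache entry, hundreds of keys */
#define KEY_WIDTH 16

typedef struct {
    char   keys[KEYS_PER_BLOCK][KEY_WIDTH];  /* sorted fixed-width keys */
    long   values[KEYS_PER_BLOCK];
    size_t count;
} cached_block_t;

/* Any key between keys[0] and keys[count-1] is served by this single
   cache entry, amortizing the rare miss cost across the whole block */
static int block_lookup(const cached_block_t *b, const char key[KEY_WIDTH],
                        long *value_out)
{
    size_t lo = 0, hi = b->count;
    while (lo < hi)
    {
        size_t mid = lo + (hi - lo) / 2;
        int cmp = memcmp(b->keys[mid], key, KEY_WIDTH);
        if (cmp == 0) { *value_out = b->values[mid]; return 1; }
        if (cmp < 0) lo = mid + 1; else hi = mid;
    }
    return 0; /* key absent from this block */
}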

As Padula notes: "The difference between a cache hit (1μs) and miss (100μs+) is two orders of magnitude. On modern hardware, memory is cheaper than latency."

For seek-heavy workloads on capable hardware, TidesDB’s architecture demonstrates that abandoning traditional disk-centric constraints can unlock unprecedented performance – a lesson extending far beyond a single database engine.

Source: Benchmark analysis by Alex Gaetano Padula (Original Article)