Valkey maintainer Madelyn Olson details how the Redis fork achieved a 40% memory reduction in some workloads, without sacrificing performance, through a radical redesign of its hash table.

When Redis moved away from its BSD license to restrictive source-available licensing in 2024, a coalition of engineers from Amazon, Alibaba, Ericsson, Tencent, Huawei, and Google launched the Valkey fork in just eight days. Eighteen months later, Valkey maintainer Madelyn Olson explains how the team redesigned Valkey's core hash table structure - achieving significant memory savings while maintaining backward compatibility and avoiding performance regressions.
The Memory Efficiency Challenge
Valkey's original architecture dated back to 2009, optimized for simplicity rather than modern hardware capabilities. "We were doing lots of independent memory allocations," Olson explains. "When storing an object, we built container objects using linked lists to handle hash collisions, with relatively high load factors."
The team identified three key inefficiencies:
- Separate allocations for keys and RedisObjects
- Pointer-heavy linked list collision resolution
- Suboptimal cache utilization
In production environments like Amazon ElastiCache, analysis showed median key-value pairs around 100 bytes - meaning pointer overhead could consume nearly 25% of memory. For users storing billions of objects, this translated to significant infrastructure costs.
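To make that overhead concrete, here is a minimal sketch of the classic chained layout Olson describes (field names are illustrative, not Valkey's exact definitions):

```c
/* Classic chained layout: every stored key costs three separate heap
 * allocations (entry, key, value object) plus three 8-byte pointers. */
typedef struct dictEntry {
    void *key;              /* separately allocated key string */
    void *val;              /* separately allocated RedisObject */
    struct dictEntry *next; /* collision chain pointer */
} dictEntry;
```

On a 64-bit system, the three pointers alone account for 24 bytes per entry before allocator padding - which is how a 100-byte key-value pair loses roughly a quarter of its footprint to bookkeeping.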
Radical Restructuring
The Valkey team approached the overhaul in phases:
Phase 1: Slot-Centric Dictionaries (Valkey 8.0)
- Replaced global linked list with per-slot dictionaries
- Implemented binary index trees for cluster-wide sampling (sketched after this list)
- Enabled efficient slot migration during horizontal scaling
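The binary index tree (also known as a Fenwick tree) is what keeps operations like random-key sampling uniform once the keyspace is split across 16,384 per-slot dictionaries: it maintains running key counts per slot and maps a random index to its slot in logarithmic time. Here is a minimal sketch of the data structure - an illustration of the technique, not Valkey's actual code:

```c
#include <stdint.h>

#define NUM_SLOTS 16384 /* cluster hash slots */

/* Fenwick (binary index) tree over per-slot key counts, 1-indexed. */
static int64_t tree[NUM_SLOTS + 1];

/* Adjust a slot's key count: +1 on insert, -1 on delete. O(log n). */
void slot_count_update(int slot, int64_t delta) {
    for (int i = slot + 1; i <= NUM_SLOTS; i += i & (-i))
        tree[i] += delta;
}

/* Find the slot holding the k-th key (0-based, k < total key count)
 * by descending the implicit tree in O(log n); this is what keeps
 * random-key sampling uniform across the whole keyspace. */
int slot_for_index(int64_t k) {
    int pos = 0;
    for (int bit = NUM_SLOTS; bit > 0; bit >>= 1) { /* power of two */
        if (pos + bit <= NUM_SLOTS && tree[pos + bit] <= k) {
            pos += bit;
            k -= tree[pos];
        }
    }
    return pos; /* 0-based slot containing the k-th key */
}
```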
Phase 2: Memory Consolidation (Valkey 8.1)
- Embedded keys directly into entry structures (see the sketch below)
- Collocated RedisObject metadata with entries
- Reduced allocation count by 60%
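Embedding means a key and its metadata travel in a single allocation. A rough sketch of such a layout - illustrative field names and sizes, not Valkey's actual struct:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* One allocation holds the entry header, object metadata, and the key
 * bytes inline, replacing three allocations and two pointer hops. */
typedef struct entry {
    uint32_t type : 4;     /* object type (string, hash, ...) */
    uint32_t encoding : 4; /* internal value encoding */
    uint32_t keylen : 24;  /* length of the embedded key */
    void *value;           /* value payload */
    char key[];            /* key bytes stored inline */
} entry;

entry *entry_create(const char *key, uint32_t keylen, void *value) {
    entry *e = malloc(sizeof(*e) + keylen + 1); /* single allocation */
    if (!e) return NULL;
    e->type = 0;
    e->encoding = 0;
    e->keylen = keylen;
    e->value = value;
    memcpy(e->key, key, keylen);
    e->key[keylen] = '\0';
    return e;
}
```

Collapsing separate key and metadata allocations into one is what drives the reduced allocation count.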
Phase 3: Cache-Optimized Probing (Valkey 9.0)
- Replaced linked lists with SwissTable-inspired buckets (sketched below)
- Packed 7 entry pointers into 64-byte cache lines
- Used SIMD instructions for parallel comparison checks
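The arithmetic behind the bucket layout: one metadata byte per slot (holding a hash fragment) plus seven 8-byte entry pointers fills a 64-byte cache line exactly, and a single SIMD comparison screens all seven fragments at once. A simplified sketch using SSE2 intrinsics - the layout and names are illustrative, not Valkey's exact code:

```c
#include <emmintrin.h> /* SSE2 */
#include <stdint.h>

/* SwissTable-style bucket sized to one 64-byte cache line:
 * 8 metadata bytes + 7 entry pointers = 64 bytes. */
typedef struct bucket {
    uint8_t meta[8];   /* meta[0..6]: hash fragments; meta[7]: flags */
    void *entries[7];  /* pointers to the stored entries */
} bucket;

/* Compare all 7 hash fragments against the probe's fragment in one
 * instruction; returns a bitmask of candidate slots to verify with
 * a full key comparison. */
static inline int bucket_match(const bucket *b, uint8_t fragment) {
    __m128i meta = _mm_loadl_epi64((const __m128i *)b->meta);
    __m128i probe = _mm_set1_epi8((char)fragment);
    int mask = _mm_movemask_epi8(_mm_cmpeq_epi8(meta, probe));
    return mask & 0x7F; /* keep only the 7 entry slots */
}
```

A set bit in the returned mask marks a candidate slot; only those candidates need a full key comparison, so most probes resolve within a single cache line.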
"We saved approximately 23 bytes per entry through these changes," says Olson. "For a customer with 8-byte keys and values, that translated to nearly 40% memory reduction."
Performance Validation
Maintaining Valkey's legendary throughput (250K requests/sec/core) was non-negotiable. The team employed multi-layered benchmarking:
- Microbenchmarks: Isolated hash table operations
- Throughput tests: valkey-benchmark at scale
- CPU profiling: Perf counters for cache misses
- Real-world sampling: Flame graphs for execution hotspots
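As an illustration of the first layer, an isolated-operation harness might look like the following - the lookup function is a hypothetical stand-in, not Valkey's API, and the real suite is far more elaborate:

```c
#include <stdio.h>
#include <time.h>

/* Hypothetical stand-in for the hash table lookup under test;
 * volatile sink prevents the compiler from eliding the loop. */
static volatile long sink;
static void *hashtable_find(const char *key) { sink += key[0]; return 0; }

int main(void) {
    enum { N = 10 * 1000 * 1000 };
    char key[32];
    struct timespec start, end;

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (int i = 0; i < N; i++) {
        /* Cycle through a bounded keyspace to exercise cache behavior. */
        snprintf(key, sizeof(key), "key:%d", i % 100000);
        hashtable_find(key);
    }
    clock_gettime(CLOCK_MONOTONIC, &end);

    double secs = (end.tv_sec - start.tv_sec) +
                  (end.tv_nsec - start.tv_nsec) / 1e9;
    printf("%.2f million lookups/sec\n", N / secs / 1e6);
    return 0;
}
```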
"Surprisingly, our key-value workload showed no regression," Olson notes. "The aggressive prefetching we'd already implemented kept everything in L1/L2 cache. Some secondary workloads like set operations saw 20-30% improvements."
Migration Simplicity
Despite the under-the-hood changes, Valkey maintains drop-in compatibility with Redis 7.2. Managed services like Amazon ElastiCache, Google Memorystore, and Aiven offer one-click migrations. "Users report migrating with zero code changes," Olson remarks. "We're victims of our own compatibility success."
The Rust Question
When asked about rewriting Valkey in Rust, Olson offers a nuanced perspective:
"While I advocate writing new infrastructure in Rust, porting Valkey's optimized C code would be risky. We'd lose our dependency-free stance (current build: 10MB) and potentially regress on performance. Our module system already uses Rust for extensions like LDAP auth."
Future Directions
The Valkey Technical Steering Committee (representing the six founding companies) governs the project, with plans to expand membership. Ongoing work focuses on:
- Vertical scaling (1.4M requests/sec)
- Enhanced observability
- Plugin ecosystem growth
For developers exploring Valkey, Olson recommends:
- Valkey Blog for technical deep dives
- Slack community for real-time discussion
- Cloud provider documentation for migration paths
"The hash table overhaul proves we can evolve core infrastructure without compromising performance," Olson concludes. "When you're processing millions of requests per second, every byte and cache miss matters."
