Reg: Reimagining Docker Registries with SQLite-Powered Metadata
For developers managing container ecosystems, Docker registries like Docker Hub and Quay have long been black boxes. While they excel at storing and serving container images, their object-storage backbone makes simple queries—like listing all repositories or finding the largest layers—prohibitively expensive operations. As one developer laments: "Querying arbitrary information about stored images is either impossible or requires scanning massive S3 buckets."
Enter Reg, an experimental open-source OCI registry that replaces traditional metadata storage with SQLite. By decoupling metadata management from blob storage, Reg enables the kind of rich querying that DevOps teams have long desired while maintaining compatibility with existing S3-backed infrastructure.
The Metadata Bottleneck
The OCI distribution specification relies on HTTP APIs for image management, but implementations like Docker Distribution store everything—from image layers to manifest data—in S3-compatible storage. This design delivers scalability and durability at the cost of query flexibility. As the Reg developer explains: "It's cheap, it's persistent, it scales to infinity... but there's one major flaw."
Any non-trivial metadata operation requires scanning entire bucket prefixes, a process that becomes increasingly slow and expensive as registries grow. Want to find which images reference a specific layer? Prepare for a full bucket scan. Need to identify repositories with the most tags? Another scan.
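To make that cost concrete, here is a minimal Python sketch of answering "which manifests reference this layer?" when only an object store is available. The dict standing in for a bucket, the `manifests/` key prefix, and the function name are assumptions made for illustration; they are not Reg's or Docker Distribution's actual layout.

```python
import json

# Stand-in for an S3 bucket: key -> object body (hypothetical layout).
bucket = {
    "manifests/app-a": json.dumps({"layers": [{"digest": "sha256:aaa"}, {"digest": "sha256:bbb"}]}),
    "manifests/app-b": json.dumps({"layers": [{"digest": "sha256:bbb"}]}),
    "manifests/app-c": json.dumps({"layers": [{"digest": "sha256:ccc"}]}),
}


def manifests_referencing(bucket, layer_digest, prefix="manifests/"):
    """Return (matching keys, number of objects fetched)."""
    matches, fetched = [], 0
    for key, body in bucket.items():  # models a LIST over the whole prefix
        if not key.startswith(prefix):
            continue
        fetched += 1  # models one GET per manifest object
        layers = json.loads(body)["layers"]
        if any(layer["digest"] == layer_digest for layer in layers):
            matches.append(key)
    return sorted(matches), fetched


matches, fetched = manifests_referencing(bucket, "sha256:bbb")
```

Every manifest is fetched and parsed even though only two of the three match, so the work grows with the total number of manifests rather than with the size of the answer.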
SQLite as Metadata Engine
Reg's innovation lies in its hybrid architecture:
-- Simplified Reg metadata schema
CREATE TABLE manifests (id INTEGER PRIMARY KEY, digest TEXT);
CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT, manifest_id INTEGER);
CREATE TABLE blobs (digest TEXT PRIMARY KEY, size INTEGER);
When images are pushed:
1. Blobs (image layers) go directly to S3
2. Metadata updates write to SQLite first
3. Changes propagate to S3 for durability
This "write-through" pattern makes SQLite the system of record for metadata while preserving S3 as the canonical blob store. Because every metadata change is also propagated to S3, the database can be treated as a fast, queryable index that understands the relationships between images, tags, and layers, without becoming the only durable copy of that information.
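The three push steps can be sketched in Python as follows. This is only an illustration of the write-through idea under stated assumptions: the `ObjectStore` class, the `push_image` function, the `blobs/` and `tags/` key names, and storing the tag as `repo:tag` in `tags.name` are all hypothetical, and only the three tables come from the simplified schema above.

```python
import sqlite3


class ObjectStore:
    """In-memory stand-in for an S3-compatible bucket (hypothetical)."""

    def __init__(self):
        self.objects = {}

    def put(self, key, body):
        self.objects[key] = body


def push_image(db, store, repo, tag, manifest_digest, layers):
    # 1. Blobs (image layers) go directly to the object store.
    for digest, data in layers.items():
        store.put(f"blobs/{digest}", data)
    # 2. Metadata is written to SQLite first, inside one transaction.
    with db:
        db.executemany(
            "INSERT OR IGNORE INTO blobs (digest, size) VALUES (?, ?)",
            [(digest, len(data)) for digest, data in layers.items()],
        )
        cur = db.execute(
            "INSERT INTO manifests (digest) VALUES (?)", (manifest_digest,)
        )
        # Assumption: the tag is stored as "repo:tag" in tags.name.
        db.execute(
            "INSERT INTO tags (name, manifest_id) VALUES (?, ?)",
            (f"{repo}:{tag}", cur.lastrowid),
        )
    # 3. The same metadata is then propagated to the object store.
    store.put(f"tags/{repo}/{tag}", manifest_digest.encode())


db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE manifests (id INTEGER PRIMARY KEY, digest TEXT);
CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT, manifest_id INTEGER);
CREATE TABLE blobs (digest TEXT PRIMARY KEY, size INTEGER);
""")
store = ObjectStore()
push_image(db, store, "app", "v1", "sha256:m1",
           {"sha256:aaa": b"12345", "sha256:bbb": b"123"})
```

If the SQLite transaction fails, the final `put` is never reached, so in this sketch the object store never holds tag metadata that the database does not know about. A real implementation would also need to handle a failure of the final step.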
Bootstrap and Recovery
Reg handles bootstrapping by rebuilding the SQLite database from an existing registry's S3 bucket in a one-time scan. The scan is slow for very large registries, but it lets Reg act as a drop-in replacement for existing implementations. For production resilience, Reg relies on:
- Turso's embedded replicas for SQLite synchronization
- Litestream for continuous backup
- Traditional rsync workflows
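The one-time rebuild described above can be sketched in Python as a single pass over a bucket listing. The key layout (`blobs/<digest>` and `tags/<repo>/<tag>`), the `rebuild_metadata` function, and the `repo:tag` naming are assumptions for illustration only; a real registry bucket uses a different and more detailed layout.

```python
import sqlite3

SCHEMA = """
CREATE TABLE manifests (id INTEGER PRIMARY KEY, digest TEXT);
CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT, manifest_id INTEGER);
CREATE TABLE blobs (digest TEXT PRIMARY KEY, size INTEGER);
"""


def rebuild_metadata(objects):
    """Rebuild a metadata database from a full listing of bucket objects.

    `objects` maps key -> bytes using a hypothetical layout:
      blobs/<digest>      -> layer bytes
      tags/<repo>/<tag>   -> manifest digest as text
    """
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    manifest_ids = {}  # manifest digest -> row id, so shared manifests are stored once
    with db:
        for key in sorted(objects):
            body = objects[key]
            if key.startswith("blobs/"):
                db.execute(
                    "INSERT INTO blobs (digest, size) VALUES (?, ?)",
                    (key[len("blobs/"):], len(body)),
                )
            elif key.startswith("tags/"):
                # The repository name may itself contain slashes; the tag is the last segment.
                repo, _, tag = key[len("tags/"):].rpartition("/")
                digest = body.decode()
                if digest not in manifest_ids:
                    cur = db.execute(
                        "INSERT INTO manifests (digest) VALUES (?)", (digest,)
                    )
                    manifest_ids[digest] = cur.lastrowid
                db.execute(
                    "INSERT INTO tags (name, manifest_id) VALUES (?, ?)",
                    (f"{repo}:{tag}", manifest_ids[digest]),
                )
    return db


objects = {
    "blobs/sha256:aaa": b"12345",
    "blobs/sha256:bbb": b"123",
    "tags/org/app/v1": b"sha256:m1",
    "tags/org/app/latest": b"sha256:m1",
}
db = rebuild_metadata(objects)
```

The scan touches every object once, which is why it is slow on large buckets, but it only has to run when the database is missing or is being created for the first time.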
Unleashing SQL Superpowers
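With metadata in SQLite, queries that were previously impractical become short SQL statements. The examples below assume a repo column on tags and a manifest_blobs join table, neither of which appears in the simplified schema above: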
-- Top 10 repositories by tag count
SELECT repo, COUNT(*) AS tag_count
FROM tags GROUP BY repo
ORDER BY tag_count DESC
LIMIT 10;
-- Most reused layers across images
SELECT blob_digest, COUNT(DISTINCT manifest_id) AS usage_count
FROM manifest_blobs
GROUP BY blob_digest
ORDER BY usage_count DESC;
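Against a local SQLite database, queries like these can return in milliseconds, whereas the equivalent bucket scan can take hours on a large registry. That difference unlocks new visibility into container ecosystems.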
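As a quick way to try the two queries, the Python sketch below runs them against an in-memory SQLite database. The table definitions and sample rows are assumptions: the `repo` column and the `manifest_blobs` table are implied by the queries but are not part of the simplified schema shown earlier, and the data is invented for illustration.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
-- Assumed tables, inferred from the queries (not Reg's confirmed schema).
CREATE TABLE tags (id INTEGER PRIMARY KEY, repo TEXT, name TEXT, manifest_id INTEGER);
CREATE TABLE manifest_blobs (manifest_id INTEGER, blob_digest TEXT);
""")
db.executemany(
    "INSERT INTO tags (repo, name, manifest_id) VALUES (?, ?, ?)",
    [("web", "v1", 1), ("web", "v2", 2), ("web", "latest", 2), ("worker", "v1", 3)],
)
db.executemany(
    "INSERT INTO manifest_blobs (manifest_id, blob_digest) VALUES (?, ?)",
    [
        (1, "sha256:base"), (1, "sha256:web1"),
        (2, "sha256:base"), (2, "sha256:web2"),
        (3, "sha256:base"), (3, "sha256:work"),
    ],
)

# Top 10 repositories by tag count.
top_repos = db.execute("""
    SELECT repo, COUNT(*) AS tag_count
    FROM tags GROUP BY repo
    ORDER BY tag_count DESC
    LIMIT 10
""").fetchall()

# The single most reused layer across manifests.
most_reused = db.execute("""
    SELECT blob_digest, COUNT(DISTINCT manifest_id) AS usage_count
    FROM manifest_blobs
    GROUP BY blob_digest
    ORDER BY usage_count DESC
    LIMIT 1
""").fetchall()
```

With this sample data, `top_repos` is `[("web", 3), ("worker", 1)]` and `most_reused` is `[("sha256:base", 3)]`, since the base layer is shared by all three manifests.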
The Path Forward
Reg remains experimental but already supports core OCI operations:
- Image pushing/pulling via Docker clients
- Basic repository/tag listing
- S3-compatible storage backend
The project currently lacks HTTPS support (requiring --tls-verify=0 in clients), but its MIT-licensed codebase welcomes contributors. As container registries evolve beyond simple storage endpoints, Reg demonstrates how thoughtful metadata architecture can transform infrastructure from opaque data silos into query-ready knowledge graphs—proving that sometimes the most powerful innovations emerge from rethinking foundational layers.
Source: Write That Blog