Blacksky releases its optimized AT Protocol implementation with performance fixes and community features, addressing challenges that only emerge at network scale.
Blacksky has open-sourced its fork of the Bluesky AT Protocol reference implementation, offering a production-tested alternative that addresses performance bottlenecks and operational challenges encountered when running a full-scale social network.

The repository contains two main components: a gRPC dataplane for PostgreSQL access and an HTTP API server for Bluesky's app.bsky.* endpoints. While the core remains compatible with upstream Bluesky, Blacksky's modifications target the specific pain points that emerge at network scale.
Why Build a Separate Implementation?
The upstream AT Protocol includes a TypeScript firehose consumer that indexes events directly from the network. At Bluesky's scale of approximately 1,000 events per second and 18.5 billion total records, this approach hits fundamental limits: a full network backfill using the built-in consumer would take over six years at typical indexing speeds.
Blacksky replaced this with rsky-wintermute, a Rust-based indexer designed for parallel processing. The service targets 10,000+ records per second through concurrent queue processing, making a complete network backfill feasible within weeks rather than years.
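The gap between those two figures is easy to verify with back-of-envelope arithmetic. The sketch below assumes a roughly 100 records/second sustained rate for a typical single-threaded consumer; that rate is an illustration, not a number from the repository.

```typescript
// Back-of-envelope backfill math using the record count quoted above.
const TOTAL_RECORDS = 18.5e9;

// Days needed to index the full network at a given sustained rate.
function backfillDays(recordsPerSecond: number): number {
  return TOTAL_RECORDS / recordsPerSecond / 86_400;
}

// At an assumed ~100 records/s, the backfill takes years; at
// wintermute's 10,000 records/s target it drops to about three weeks.
console.log(backfillDays(100).toFixed(0));    // "2141" (about 5.9 years)
console.log(backfillDays(10_000).toFixed(0)); // "21"
```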
Architecture at Scale
The system separates concerns into independent processing paths:
- Ingester: WebSocket connection to Bluesky's firehose, writes events to embedded Fjall queues
- Indexer: Parses records from queues, writes to PostgreSQL with conflict resolution
- Backfiller: Fetches full repository CAR files from personal data stores (PDSes), unpacks records
- Label indexer: Processes moderation labels from WebSocket streams
This architecture ensures live events never block on backfill operations, a critical distinction for maintaining real-time functionality during initial synchronization.
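The decoupling can be sketched as two independent queues feeding one indexer. This is a conceptual TypeScript sketch with hypothetical names; the real pipeline is Rust backed by persistent Fjall queues, not in-memory arrays.

```typescript
// Conceptual sketch: live and backfill traffic land in separate queues,
// so a multi-week backfill never sits in front of real-time events.
type QueuedRecord = { uri: string; source: "live" | "backfill" };

class RecordQueue {
  private items: QueuedRecord[] = [];
  enqueue(r: QueuedRecord): void { this.items.push(r); }
  // Hand the whole batch to the indexer and start fresh.
  drain(): QueuedRecord[] { const batch = this.items; this.items = []; return batch; }
}

const liveQueue = new RecordQueue();
const backfillQueue = new RecordQueue();

// Ingester path: firehose events go straight to the live queue.
liveQueue.enqueue({ uri: "at://did:plc:abc/app.bsky.feed.post/1", source: "live" });
// Backfiller path: records unpacked from CAR files get their own queue.
backfillQueue.enqueue({ uri: "at://did:plc:xyz/app.bsky.feed.post/9", source: "backfill" });

// The indexer drains each queue on its own schedule; draining one
// leaves the other untouched.
console.log(liveQueue.drain().length);     // 1
console.log(backfillQueue.drain().length); // 1
```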
Performance Optimizations
Several database-level improvements target common scaling issues:
LATERAL JOIN query optimization in timeline and list feed endpoints forces per-user index usage instead of full table scans. This dramatically improves performance for users following thousands of accounts.
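The shape of the rewrite looks roughly like the query below, shown as a SQL string in TypeScript. The table and column names are illustrative, not the actual Blacksky schema: the key idea is that the LATERAL subquery runs once per followed account with its own LIMIT, so the planner can hit a per-creator index instead of scanning the post table.

```typescript
// Schematic timeline query illustrating the LATERAL rewrite.
// Table/column names are hypothetical, not the real schema.
const timelineQuery = `
  SELECT p.*
  FROM follow f
  CROSS JOIN LATERAL (
    -- Executed once per followed account, so the planner can use a
    -- (creator, sortAt) index rather than a full table scan.
    SELECT *
    FROM post
    WHERE post.creator = f."subjectDid"
    ORDER BY post."sortAt" DESC
    LIMIT 50
  ) p
  WHERE f.creator = $1
  ORDER BY p."sortAt" DESC
  LIMIT 50`;

console.log(timelineQuery.includes("LATERAL")); // true
```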
Redis caching layer for actor profiles (60-second TTL), records (5-minute TTL), interaction counts (30-second TTL), and post metadata (5-minute TTL). However, a protobuf timestamp serialization bug currently prevents reliable caching: timestamps lose their .toDate() method after a JSON round-trip through Redis. The workaround is to serialize timestamps as ISO strings on cache write and reconstruct them on read.
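A minimal sketch of that workaround follows. The ProtoTimestamp shape and function names are illustrative, not the actual dataplane API; the point is that JSON.stringify drops the prototype (and any .toDate() method), so the value is stored as a plain ISO string and rebuilt on read.

```typescript
// Protobuf-style timestamp: seconds since epoch plus nanoseconds.
type ProtoTimestamp = { seconds: number; nanos: number };

// Cache write: convert to an ISO string before JSON-encoding for Redis.
function toIso(ts: ProtoTimestamp): string {
  return new Date(ts.seconds * 1000 + Math.floor(ts.nanos / 1e6)).toISOString();
}

// Cache read: reconstruct a usable timestamp from the stored string.
function fromIso(iso: string): ProtoTimestamp {
  const ms = Date.parse(iso);
  return { seconds: Math.floor(ms / 1000), nanos: (ms % 1000) * 1e6 };
}

const original: ProtoTimestamp = { seconds: 1_700_000_000, nanos: 500_000_000 };
// Simulated round-trip through Redis as a JSON payload.
const cached = JSON.stringify({ indexedAt: toIso(original) });
const restored = fromIso(JSON.parse(cached).indexedAt);
console.log(restored.seconds === original.seconds); // true
```

Note that sub-millisecond precision is lost in the round-trip, which is usually acceptable for cache entries with TTLs measured in seconds.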
Notification preferences enforcement moves preference checking from client to server, ensuring saved settings actually affect which notifications users receive.
JSON sanitization strips null bytes (\u0000) and other control characters from stored records before parsing. Escaped null characters are valid JSON per RFC 8259, but raw control bytes in stored text are rejected by Node.js's JSON.parse(), causing silent row-to-record failures that manifest as missing posts.
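A sanitizer in the spirit described above can be a single regex pass; the repository's actual implementation may differ. This sketch strips NUL and other raw control characters while keeping tab, newline, and carriage return, which JSON permits as whitespace between tokens.

```typescript
// Strip raw control characters that make JSON.parse reject the input.
function sanitizeJson(raw: string): string {
  // Remove U+0000-U+001F except tab (09), newline (0A), and CR (0D).
  return raw.replace(/[\u0000-\u0008\u000B\u000C\u000E-\u001F]/g, "");
}

// A record whose text field contains a raw NUL byte would otherwise
// throw in JSON.parse and silently drop the row.
const corrupted = '{"text":"hello\u0000world"}';
const record = JSON.parse(sanitizeJson(corrupted));
console.log(record.text); // "helloworld"
```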
Community Features
Blacksky adds infrastructure for private community posts that live on the AppView rather than individual PDSes. This includes:
- Custom lexicon namespace community.blacksky.feed.*
- Separate community_post table with membership gating
- Integration with getPostThreadV2 for mixed standard/community post threads
- Requires a separate membership database
Operational Challenges Solved
The repository documents numerous issues encountered during production deployment:
COPY text format JSON corruption: PostgreSQL's COPY text protocol treats backslash as an escape character. Without proper escaping, \" becomes " and records are silently corrupted. Blacksky found approximately 66,000 corrupted records and had to repair them by re-fetching from the public API.
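The required escaping is mechanical but easy to omit. The helper below is a sketch of the rule, not Blacksky's code: backslashes must be doubled before the delimiter characters are escaped, or COPY will consume them, so that a stored \" survives as \" instead of collapsing to a bare quote.

```typescript
// Escape a value for PostgreSQL's COPY ... FROM text format.
function escapeCopyText(value: string): string {
  return value
    .replace(/\\/g, "\\\\") // escape backslash first, or later rules double-escape
    .replace(/\t/g, "\\t")  // tab is the column delimiter
    .replace(/\n/g, "\\n")  // newline is the row delimiter
    .replace(/\r/g, "\\r");
}

// A JSON record with an escaped quote: the single backslash before the
// quote must become two backslashes on the wire, so the server restores
// exactly one backslash when it unescapes the line.
const json = '{"text":"she said \\"hi\\""}';
const wire = escapeCopyText(json);
console.log(wire.length === json.length + 2); // true: two backslashes doubled
```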
Notification table bloat: Without unique constraints on (did, recordUri, reason), the notification table grows unbounded with duplicates. Blacksky's table reached 1.3 billion rows (663 GB) before adding conflict resolution.
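The conflict-resolution semantics amount to treating (did, recordUri, reason) as a unique key and making duplicate inserts no-ops, as INSERT ... ON CONFLICT DO NOTHING does in PostgreSQL. The in-memory model below is illustrative only; the field names follow the article, but the helper is hypothetical.

```typescript
// Model of ON CONFLICT DO NOTHING on a (did, recordUri, reason) key.
type Notification = { did: string; recordUri: string; reason: string };

const seen = new Set<string>();
const table: Notification[] = [];

function insertNotification(n: Notification): boolean {
  const key = `${n.did}\u0001${n.recordUri}\u0001${n.reason}`;
  if (seen.has(key)) return false; // conflict: duplicate silently dropped
  seen.add(key);
  table.push(n);
  return true;
}

const like: Notification = {
  did: "did:plc:alice",
  recordUri: "at://did:plc:bob/app.bsky.feed.like/1",
  reason: "like",
};
insertNotification(like);
insertNotification(like); // replayed during backfill: ignored, no bloat
console.log(table.length); // 1
```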
Label negation ordering: During backfill, negation events (label removals) can arrive before the original labels, causing them to be silently dropped. The label_sync tool replays the full label stream to catch these cases.
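Why replaying fixes the ordering problem can be seen in a small sketch: if events are applied keyed by (uri, label) in sequence order, the latest event for each pair decides the final state, so a negation that was originally seen too early lands correctly on replay. The field names loosely follow the com.atproto.label.defs shape; the function is illustrative, not the label_sync implementation.

```typescript
// Apply label events in sequence order; the last event per (uri, val)
// pair determines whether the label is active.
type LabelEvent = { seq: number; uri: string; val: string; neg: boolean };

function applyLabels(events: LabelEvent[]): Set<string> {
  const active = new Map<string, boolean>();
  for (const e of [...events].sort((a, b) => a.seq - b.seq)) {
    active.set(`${e.uri}|${e.val}`, !e.neg);
  }
  const result = new Set<string>();
  for (const [key, on] of active) if (on) result.add(key);
  return result;
}

// During backfill the negation (seq 2) arrived before the label (seq 1).
// Replaying by sequence number restores the correct end state: removed.
const replayed = applyLabels([
  { seq: 2, uri: "at://post/1", val: "spam", neg: true },
  { seq: 1, uri: "at://post/1", val: "spam", neg: false },
]);
console.log(replayed.size); // 0
```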
Fjall queue poisoning: The embedded database used for wintermute's queues can enter a "poisoned" state after crashes, blocking all operations. The fix requires deleting the queue database directory and restarting.
Resource Requirements
Based on running a full-network AppView indexing approximately 42 million users and 18.5 billion records:
- Minimum: 16 CPU cores, 64 GB RAM, 10 TB NVMe storage
- Recommended: 48+ CPU cores, 256 GB RAM, 28+ TB NVMe storage (RAID)
- Network: Sustained 100 Mbps minimum, 1 Gbps+ recommended
Storage breaks down to approximately 3.5 TB for posts and records, 2 TB for likes, 500 GB for follows, 600 GB for notifications, and 4 TB for indexes. OpenSearch for search adds another 500 GB.
Setup and Deployment
The implementation requires Node.js 18+, pnpm, PostgreSQL 17 with the Bluesky schema, and optionally Redis for caching. The dataplane applies migrations automatically on first run, with only one Blacksky-specific migration for community posts.
Key environment variables include database connection strings, gRPC and HTTP ports, DID configuration, and moderation service settings. The system supports read replicas for database scaling.
Licensing and Maintenance
Blacksky maintains the fork under the same dual MIT/Apache 2.0 license as upstream. The repository explicitly states it's not accepting contributions, issues, or pull requests, directing users to the canonical Bluesky implementation for those needs.
This release represents a practical case study in scaling decentralized social infrastructure, documenting the real-world challenges and solutions that emerge when moving from prototype to production at network scale.
