Distributed systems still trip on old network assumptions


Startups Reporter

APNIC revisits the eight fallacies of distributed computing, a Sun-era checklist that still catches cloud teams, app developers and network operators.


APNIC revisits the eight fallacies of distributed computing in a new blog post that traces the list from Sun Microsystems to modern cloud, mobile and content delivery networks.

The post gives engineers a useful test: check each design against loss, delay, cost, security and control before users find the weakness in production.

Bill Joy and Tom Lyon collected the first four fallacies at Sun Microsystems. L. Peter Deutsch added three more. James Gosling added the eighth. Sun's work on Unix, Java, NFS and networked workstations gives the list weight because those engineers built systems that pushed networked software into daily use.

APNIC frames the fallacies as engineering traps:

  1. The network is reliable.
  2. Latency is zero.
  3. Bandwidth is infinite.
  4. The network is secure.
  5. Topology doesn't change.
  6. There is one administrator.
  7. Transport cost is zero.
  8. The network is homogeneous.

Figure 1 — The eight fallacies of distributed computing.

The first fallacy still hurts application teams because developers treat a sent packet as a delivered packet. IP gives no delivery promise. Engineers use TCP, QUIC, retries, acknowledgments and idempotent operations because packets can drop, arrive late or arrive twice.
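
A minimal sketch of how retries and idempotent operations fit together, assuming a hypothetical in-memory `FlakyServer` that applies a deposit but loses its first reply. The class, the method names and the idempotency-key scheme are illustrative assumptions, not anything taken from the APNIC post.

```python
import uuid


class FlakyServer:
    """Hypothetical in-memory server that loses its first reply."""

    def __init__(self):
        self.balance = 0
        self.seen = {}               # idempotency key -> stored result
        self.drop_next_reply = True

    def deposit(self, key, amount):
        # A repeated key returns the stored result instead of applying twice.
        if key not in self.seen:
            self.balance += amount
            self.seen[key] = self.balance
        if self.drop_next_reply:
            self.drop_next_reply = False
            raise TimeoutError("reply lost in transit")
        return self.seen[key]


def deposit_with_retry(server, amount, attempts=3):
    key = str(uuid.uuid4())          # one key shared by every attempt
    for _ in range(attempts):
        try:
            return server.deposit(key, amount)
        except TimeoutError:
            continue                 # the request may or may not have been applied
    raise TimeoutError(f"gave up after {attempts} attempts")


server = FlakyServer()
result = deposit_with_retry(server, 50)
```

The client cannot tell a lost request from a lost reply, so it retries with the same key and the server applies the deposit only once.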

Latency adds another constraint. Distance, fiber paths, radio links, congestion and device buffers all add delay. Game developers and streaming teams face that cost first because users notice lag and jitter. Netflix-style buffering and error correction help because engineers assume delay and loss from the start.
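
The distance component alone sets a floor that no protocol can remove. As a rough worked example, assume light travels through fiber at about 200,000 km/s (roughly two thirds of its speed in a vacuum) and take a hypothetical 10,000 km cable path; both figures are illustrative assumptions.

```python
SPEED_IN_FIBER_KM_PER_S = 200_000    # approximation, about two thirds of c


def min_round_trip_ms(path_km):
    """Lower bound on round-trip time from propagation delay alone."""
    return 2 * path_km / SPEED_IN_FIBER_KM_PER_S * 1000


rtt_ms = min_round_trip_ms(10_000)   # about 100 ms before any queuing or processing
```

Congestion, buffering and radio links only add to that floor, which is why timeouts and buffers need headroom above it.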

Bandwidth creates the same trap at a different scale. A cloud service may sit near users through a content delivery network, yet a home router, Wi-Fi channel or mobile tower can still constrain the session. Engineers who design upload flows, video calls or backup tools need backpressure controls, backoff and clear user feedback when the local link cannot keep up.
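
One common way to implement backoff is exponential delay with random jitter. The sketch below assumes a 0.5 second base and a 30 second cap; those values and the function name are illustrative, not recommendations from the post.

```python
import random


def backoff_delays(attempts, base=0.5, cap=30.0, rng=random.random):
    """Yield one randomized delay in seconds per retry ("full jitter")."""
    for attempt in range(attempts):
        ceiling = min(cap, base * (2 ** attempt))   # doubles until it hits the cap
        yield rng() * ceiling                       # random point below the ceiling


delays = list(backoff_delays(8))
```

The jitter spreads retries from many clients over time, so a congested link is not hit by synchronized bursts.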

Security gives teams less room for wishful thinking. Packets cross providers, exchanges, routers and middleboxes outside one company's control. Developers protect users with TLS, sound key management and protocol design that limits metadata exposure. Traffic patterns can still reveal behavior through packet size and timing, so product teams should treat encryption as one layer in a broader privacy design.
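
As a small illustration of the TLS layer, Python's standard `ssl` module can build a client context that verifies certificates and hostnames by default. The TLS 1.2 floor below is an assumed policy choice, and the sketch does nothing about key management or traffic analysis.

```python
import ssl


def make_client_context():
    # create_default_context() enables certificate and hostname verification.
    context = ssl.create_default_context()
    # Assumed policy: refuse protocol versions older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


ctx = make_client_context()
```

A client would pass `ctx` to `ctx.wrap_socket(sock, server_hostname=...)` or to an HTTP library that accepts an SSL context.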

Topology changes turn stable diagrams into guesses. A phone switches towers. A provider changes routes. A failover system shifts traffic. Network operators use BGP, VRRP, CARP and multipath tools to keep service alive, but application developers still see packet loss, delay and duplicate data at the edges.
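
On the application side, one modest defense is to avoid pinning a cached IP address and to look the name up again on every connection attempt. The helper below is a hypothetical sketch; `connect_fresh` and its parameters are illustrative, and the `connect` argument exists only so the function can be exercised without a network.

```python
import socket


def connect_fresh(host, port, attempts=3, timeout=2.0,
                  connect=socket.create_connection):
    """Resolve the name on every attempt instead of caching an address."""
    last_error = None
    for _ in range(attempts):
        try:
            # create_connection resolves host and tries each returned address.
            return connect((host, port), timeout)
        except OSError as exc:
            last_error = exc         # the path or endpoint may have moved; retry
    raise ConnectionError(f"could not reach {host}:{port}") from last_error
```

A production version would likely add delay between attempts and report which addresses failed.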

The administrator fallacy cuts across cloud services. One team may run the app, another may own the network, a vendor may manage the load balancer and a provider may change routing policy. Good incident response names those owners before an outage, because confusion during a failure costs time.

Transport cost also deserves more attention. SMS makes the point in plain terms: the protocol carries small payloads well, but sending a movie through text messages at per-message prices would cost an absurd amount. Cloud storage tiers show the same pattern. Amazon S3 Glacier can make retrieval cost more than storage because Amazon priced the service for rare access.
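
The SMS arithmetic is easy to sketch. A single SMS carries up to 140 bytes of payload; the 1 GB movie size and the price of one cent per message are assumptions chosen for illustration, and the calculation ignores encoding and segmentation overhead.

```python
SMS_PAYLOAD_BYTES = 140              # payload limit of a single SMS


def sms_count(size_bytes):
    """Number of messages needed, rounded up."""
    return -(-size_bytes // SMS_PAYLOAD_BYTES)


def sms_cost_cents(size_bytes, cents_per_message):
    return sms_count(size_bytes) * cents_per_message


movie_bytes = 10**9                               # assumed 1 GB file
messages = sms_count(movie_bytes)                 # 7,142,858 messages
cost_cents = sms_cost_cents(movie_bytes, 1)       # about $71,429 at the assumed price
```

The same per-unit reasoning applies to cloud egress and retrieval fees, where the unit price looks small until it is multiplied by volume.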

Homogeneity fails in small networks and global backbones alike. Ethernet, Wi-Fi, mobile data, satellite and undersea cables differ in loss profile, capacity and cost. A system that treats all links the same wastes money or gives users poor service.


APNIC's post works because it pulls an old list into current practice without hype. The eight fallacies still describe the checks engineers need before they ship distributed software: retry design, timeout choices, observability, cost modeling, security boundaries and operational ownership.

Cloud teams can use the list during design review. Ask who handles retries. Ask which operation can run twice without damage. Ask who owns the network path during an outage. Ask what the user sees when bandwidth drops. Those questions catch fragile assumptions before customers do.
