The iPad Was Fine. The Network Path Was Not.
#DevOps

Trends Reporter

A blank page on one iPad turned into a useful reminder that modern peer-to-peer software often fails where abstractions meet packet size, VPN routing, and quiet defaults.

A small debugging story from p2claw is getting attention because it captures a pattern many developers recognize: the bug looked like a browser problem, behaved like a device problem, and turned out to be a network path problem.

The original post, “The iPad was on Tailscale,” describes a p2claw app that loaded on a Mac, a Linux box, and a phone, but hung on an iPad. The browser painted the initial loading state. The Service Worker registered. The WebRTC handshake completed. The data channel opened. Then the first request over that channel waited forever.

That combination is exactly why the story resonates. The visible failure was simple: a blank page. The real failure sat several layers lower, in the interaction between webrtc-rs, Tailscale, IPv6 fragmentation behavior, and the UDP transport that WebRTC runs on.

The trend observation is larger than one iPad. Developer tools are increasingly built on tunnels, overlays, peer-to-peer links, local agents, browser transports, and self-hosted routing tricks. That stack is powerful, but it also puts application developers closer to packet-level behavior that HTTPS usually hides.

In this case, p2claw’s box agent sent WebRTC data channel messages through a Rust WebRTC implementation. webrtc-rs used a hardcoded SCTP MTU constant of 1,228 bytes. After encryption and headers, packets could exceed what Tailscale’s tunnel path could carry cleanly, especially over IPv6. The operating system fragmented the packets, which is allowed for IPv6 when done by the sender. The fragments then disappeared before reaching the receiving stack.
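The arithmetic behind that overflow can be sketched in a few lines. The 1,228-byte figure comes from the post; the DTLS overhead and the 1,280-byte tunnel MTU below are assumptions chosen for illustration, not values confirmed by the p2claw write-up, and the real overhead depends on the negotiated cipher suite.

```python
# Back-of-the-envelope size of one SCTP packet once it is wrapped for the wire.
SCTP_MTU = 1228       # from the post: the hardcoded webrtc-rs constant
DTLS_OVERHEAD = 37    # ASSUMPTION: 13-byte DTLS 1.2 record header + 8-byte nonce + 16-byte AEAD tag
UDP_HEADER = 8
IPV4_HEADER = 20      # no IP options
IPV6_HEADER = 40      # no extension headers
TUNNEL_MTU = 1280     # ASSUMPTION: illustrative tunnel interface MTU


def ip_packet_size(sctp_bytes: int, ip_header: int) -> int:
    """Total IP packet size for an SCTP packet carried over DTLS over UDP."""
    return sctp_bytes + DTLS_OVERHEAD + UDP_HEADER + ip_header


def overshoot(packet_size: int, mtu: int = TUNNEL_MTU) -> int:
    """Bytes by which a packet exceeds the MTU (0 if it fits)."""
    return max(0, packet_size - mtu)


if __name__ == "__main__":
    for name, header in (("IPv4", IPV4_HEADER), ("IPv6", IPV6_HEADER)):
        size = ip_packet_size(SCTP_MTU, header)
        print(f"{name}: {size} bytes, {overshoot(size)} over a {TUNNEL_MTU}-byte MTU")
```

Under these assumed numbers a full-size packet does not fit on either address family, so the sender has to fragment. What differed in the incident was what happened to the fragments afterwards.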

The p2claw team eventually found that Tailscale’s client classified IPv6 fragment packets as an unknown protocol, so they failed to match allow rules and were dropped under an ACL-related counter. The relevant Tailscale source is in the open at github.com/tailscale/tailscale, which made the diagnosis possible. The reported issues are webrtc-rs/webrtc#806 for the MTU constant and tailscale/tailscale#20083 for the IPv6 fragment drop.

The evidence is unusually instructive because the first theories were reasonable. WebKit was a believable suspect because all iOS browsers share WebKit. Message size limits were a believable suspect because WebRTC peers negotiate maximum message sizes. Wi-Fi was a believable suspect because intermittent packet loss often looks random. None of those held up.

The adoption signal here is not only that developers use Tailscale. It is that Tailscale, WebRTC, Rust networking stacks, and browser-hosted local tools are now normal ingredients in developer workflows. Tailscale has become a default private network layer for many teams because it makes machines feel local without traditional VPN setup. WebRTC has escaped video calls and now shows up in collaboration tools, local-first apps, remote agents, and peer-to-peer deployment experiments. Rust networking projects like webrtc-rs are attractive because they let developers build these systems without depending on a browser or a large C++ stack.

That popularity changes the bug profile. A decade ago, many web developers could assume TCP, TLS, and a cloud load balancer. TCP path MTU handling, MSS clamping, retransmission, and mature middlebox behavior absorbed many sharp edges. In newer peer-to-peer systems, developers often send application payloads over transports where packet size, fragmentation, candidate pair selection, and tunnel overhead matter again.

The p2claw case also shows how AI-assisted debugging can both help and distort. The post mentions Claude helping calculate a message cap and Anthropic’s Fable being used later to search earlier logs. That is a familiar new workflow: large traces, chat logs, screenshots, packet captures, and code references get pulled into an AI-assisted investigation. The benefit is real. The risk is also real. Once the team had a WebKit theory, the Tailscale discovery was interpreted inside that theory instead of overturning it.

That is a useful warning for the current developer mood around AI tooling. Assistants are good at expanding a hypothesis. They are less reliable at forcing a hypothesis to lose status when new facts contradict it. The hard part of debugging remains deciding which evidence should change the story.

Technically, the core issue is path MTU. Every network path has a maximum packet size it can carry without fragmentation. VPNs and overlays reduce the usable payload because they wrap packets in extra headers. If an application sends packets that are too large, the network has to fragment them, reject them, or rely on the sender to discover a smaller size.
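Sender-side discovery can be as simple as probing with different sizes and keeping the largest one that is confirmed end to end. The sketch below is not how webrtc-rs, Tailscale, or any particular stack does it. `probe` is a hypothetical callback supplied by the caller that returns True only when a payload of the given size was acknowledged by the far side, and the search assumes that if one size works, every smaller size works too.

```python
from typing import Callable, Optional


def largest_working_size(probe: Callable[[int], bool], low: int, high: int) -> Optional[int]:
    """Binary-search the largest size in [low, high] for which probe(size) succeeds.

    Returns None if even the smallest size fails, which points at a
    reachability problem rather than a size problem.
    """
    if not probe(low):
        return None
    # Invariant: probe(low) succeeded; anything above high is known or assumed to fail.
    while low < high:
        mid = (low + high + 1) // 2   # round up so the loop always makes progress
        if probe(mid):
            low = mid
        else:
            high = mid - 1
    return low


if __name__ == "__main__":
    # Simulated path that silently drops anything above 1232 bytes.
    simulated = lambda size: size <= 1232
    print(largest_working_size(simulated, 64, 1500))
```

A real probe needs a timeout and a few retries per size, because a single lost packet would otherwise be mistaken for a size limit.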

IPv4 and IPv6 differ here in ways that still surprise application developers. In IPv4, routers can fragment packets in flight. In IPv6, routers do not fragment packets. The sender may fragment, and the receiver is expected to reassemble, as specified in RFC 8200, the current IPv6 specification. In the p2claw incident, the sending kernel created IPv6 fragments. The receiving OS never got them because the tunnel filter dropped them earlier.
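To make the sender-side split concrete, here is a minimal sketch of how an oversized IPv6 packet divides into fragments. It follows the RFC 8200 rules that each fragment carries an 8-byte Fragment header and that every fragment except the last holds a multiple of 8 bytes of data. The 1,280-byte MTU and the 1,273-byte example are illustrative assumptions, and the sketch assumes no other extension headers.

```python
IPV6_HEADER = 40
FRAGMENT_HEADER = 8


def ipv6_fragment_sizes(fragmentable_len: int, mtu: int = 1280) -> list:
    """Sizes of the IPv6 packets a sender emits for `fragmentable_len` bytes
    of upper-layer data (for UDP: the UDP header plus its payload)."""
    if IPV6_HEADER + fragmentable_len <= mtu:
        return [IPV6_HEADER + fragmentable_len]   # fits, no Fragment header needed
    # Data per fragment must be a multiple of 8 bytes, except in the last fragment.
    per_fragment = (mtu - IPV6_HEADER - FRAGMENT_HEADER) // 8 * 8
    sizes = []
    remaining = fragmentable_len
    while remaining > 0:
        chunk = min(per_fragment, remaining)
        sizes.append(IPV6_HEADER + FRAGMENT_HEADER + chunk)
        remaining -= chunk
    return sizes


if __name__ == "__main__":
    # 1273 bytes of UDP header plus payload: one full fragment and one small tail.
    print(ipv6_fragment_sizes(1273))
```

Both resulting packets carry the Fragment header, so a filter that rejects fragments drops the whole datagram, not only the small tail.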

WebRTC adds another layer. Data channels run over SCTP over DTLS over UDP. SCTP provides ordered reliable delivery for data channels when configured that way, but one missing chunk can block later data. That is why the app looked alive at the transport level but dead at the application level. Small packets, heartbeats, checks, and acknowledgements could still flow, while the actual payload was stuck behind a fragment that would never arrive.
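The blocking behavior is easy to model. The sketch below is a toy model of ordered delivery, not SCTP itself: the receiver may hand data to the application only in sequence, so one missing sequence number holds back everything behind it.

```python
def deliverable_in_order(arrived, next_expected: int = 0) -> list:
    """Return the sequence numbers an ordered receiver can pass to the app.

    `arrived` is any iterable of sequence numbers that reached the receiver.
    Delivery stops at the first gap, however much later data has arrived.
    """
    buffered = set(arrived)
    delivered = []
    while next_expected in buffered:
        delivered.append(next_expected)
        next_expected += 1
    return delivered


if __name__ == "__main__":
    # Sequence number 2 was in the oversized packet whose fragments were dropped.
    print(deliverable_in_order([0, 1, 3, 4, 5]))
    # Once the missing piece arrives, everything drains at once.
    print(deliverable_in_order([0, 1, 2, 3, 4, 5]))
```

A reliable transport will retransmit the missing piece, but if the retransmission is just as large as the original, it is likely to be fragmented and dropped the same way, so the gap never closes.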

The most useful diagnostic turn came from stepping outside WebRTC. A small ping to a Tailscale IPv6 address worked. A large ping that forced fragmentation failed. The same test over IPv4 worked. That reduced a confusing browser problem into a compact truth table. The full reproduction is linked at github.com/phact/mtu-webrtc-bug, including packet captures and notes on where packets disappear.
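That truth table is small enough to encode directly. The sketch below is illustrative and is not taken from the linked reproduction: the function names and result format are hypothetical, and the ping results are supplied by hand instead of being measured. The payload helper assumes an echo request with an 8-byte ICMP header and no IP options or extension headers.

```python
IPV4_HEADER = 20
IPV6_HEADER = 40
ICMP_ECHO_HEADER = 8


def min_payload_to_fragment(mtu: int, family: str) -> int:
    """Smallest echo payload that no longer fits in a single MTU-sized packet."""
    ip_header = IPV6_HEADER if family == "ipv6" else IPV4_HEADER
    return mtu - ip_header - ICMP_ECHO_HEADER + 1


def diagnose(results: dict) -> str:
    """Summarize ping outcomes.

    `results` maps (family, size) to True if a reply came back, with
    family in {"ipv4", "ipv6"} and size in {"small", "large"}.
    """
    findings = []
    for family in ("ipv4", "ipv6"):
        if not results[(family, "small")]:
            findings.append(f"{family}: no basic reachability")
        elif not results[(family, "large")]:
            findings.append(f"{family}: large packets lost, suspect MTU or fragment handling")
        else:
            findings.append(f"{family}: ok")
    return "; ".join(findings)


if __name__ == "__main__":
    # The pattern described in the post.
    observed = {
        ("ipv4", "small"): True, ("ipv4", "large"): True,
        ("ipv6", "small"): True, ("ipv6", "large"): False,
    }
    print(min_payload_to_fragment(1280, "ipv6"))
    print(diagnose(observed))
```

The value of the table is that it removes the browser, WebRTC, and the application from the picture: four pings are enough to show which address family and which packet sizes fail.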

Community sentiment around this kind of post usually splits into three camps.

One camp sees it as validation for conservative packet sizing. If you are building real-time apps, multiplayer systems, remote agents, or peer-to-peer dev tools, assume the path is smaller than you think. Keep messages small, implement probing, watch receiver-side counters, and do not treat a sender-side packet capture as proof of delivery.

A second camp sees it as a Tailscale bug. From that view, silently dropping valid IPv6 fragments under an ACL counter is the part that turns a recoverable MTU mismatch into a hard-to-debug failure. The counter-argument is that many production networks treat fragments suspiciously because fragments can complicate filtering, inspection, and abuse handling. Dropping fragments is not rare. The objection is less about the existence of the policy and more about the invisibility of the failure.

A third camp sees webrtc-rs as the more actionable fix. A hardcoded MTU constant that cannot adapt to the path is brittle in VPN-heavy environments. The counter-argument is that the chosen value was not wildly irresponsible, and similar constants exist elsewhere. The failure only appears when combined with specific tunnel overhead, IPv6 selection, fragmentation behavior, and policy filtering.

That is the central lesson. No single component had to be absurd for the combined system to fail. webrtc-rs trusted the network to carry or fragment its packets. Tailscale chose not to support IPv6 fragments in its filter path. WebRTC kept the connection apparently healthy because small control traffic still worked. The browser waited because ordered delivery blocked behind missing data. The app showed a blank page.

This is why the story is spreading among developers. It is not just a weird iPad bug. It is a clean example of modern software depending on layers whose failure modes do not compose nicely.

The practical takeaway is plain: if your software sends UDP, WebRTC, QUIC-like custom traffic, game state, tunnel traffic, or peer-to-peer payloads, treat MTU as part of the product surface. Log send sizes. Log receive sizes. Check getStats() when using WebRTC. Test over VPNs. Test over IPv6. Test with payloads large enough to force fragmentation. Most of all, test the route the failing device actually takes.
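One way to act on the logging advice is to compare what one side sent with what the other side received and flag a gap that never closes. The sketch below is a hypothetical helper, not part of p2claw or any WebRTC library. It assumes you already collect cumulative application-payload byte counters from both ends, for example from the data channel entries in a getStats() report, and feed them in as samples.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    t: float             # time of the sample, in seconds
    bytes_sent: int      # cumulative payload bytes the sender reports
    bytes_received: int  # cumulative payload bytes the receiver reports


def stalled(samples: list, timeout: float = 5.0) -> bool:
    """True if sent data has been outstanding, with the receive counter
    frozen, for at least `timeout` seconds up to the latest sample."""
    if not samples:
        return False
    last = samples[-1]
    if last.bytes_sent <= last.bytes_received:
        return False                      # nothing outstanding
    stuck_since = last.t
    # Walk backwards while the gap is open and the receive counter is unchanged.
    for s in reversed(samples):
        if s.bytes_received != last.bytes_received or s.bytes_sent <= s.bytes_received:
            break
        stuck_since = s.t
    return last.t - stuck_since >= timeout


if __name__ == "__main__":
    trace = [
        Sample(0.0, 100, 100),    # small request and reply got through
        Sample(1.0, 5000, 100),   # large payload sent
        Sample(3.0, 5000, 100),
        Sample(7.0, 5000, 100),   # receiver still has not moved
    ]
    print(stalled(trace))
```

Transport-level counters can keep moving because small control packets still flow, so the comparison is only meaningful on counters that track the application payload.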

Consensus tends to blame the visible layer first: the browser, the device, the framework, the newest dependency. This incident argues for a colder habit: ask which path is unique, which packets are large, which counters stop moving, and which layer never got the chance to fail because a lower layer quietly discarded the evidence.
