SPA vs. Hypermedia: Real-World Performance Under Load | zwei und eins gmbh | LavX News

A comprehensive performance comparison between SPA and hypermedia architectures for an AI chat application, revealing 7.5× faster mobile load times and 26× smaller bundle sizes under real-world network constraints.

CASE STUDY Michael Bolli · 25.02.2026 Measured February 2026 · Lighthouse + Chrome DevTools Performance Profiling · View source & raw data →

TL;DR We built an AI chat application using hypermedia architecture (sending HTML from the server) and compared it against a production SPA implementation (running a complex JavaScript app in the browser) under poor mobile network conditions. The simpler approach eliminates main-thread blocking entirely, is 26× smaller, and 7.5× faster to become interactive than the popular approach. The difference matters most where it counts: on phones with slow connections and limited battery.

But low complexity also brings other advantages: fewer dependencies to maintain, simpler deployment, and a dramatically smaller security surface to monitor.

Methodology: Apples to Apples

A fair comparison requires context. We compared our custom PHP/Swoole/Datastar hypermedia implementation against the Vercel AI SDK Chatbot—a production-grade reference implementation with 86 contributors and 600+ commits of optimization.

Why this comparison is valid

We chose the Vercel AI SDK because it represents the standard approach most teams would use when building an AI chat with Next.js. Our goal wasn't to benchmark the fastest possible SPA against the fastest possible hypermedia app, but to show what typical implementation choices produce. The Vercel SDK is well-maintained, widely used, and follows React/Next.js best practices—making it a representative baseline for real-world SPA development.

Both applications implement the same core features: real-time AI streaming, chat history, document artifacts, and authentication. Test conditions were identical: Chrome 144, Slow 4G network simulation (1.6 Mbps, 150ms RTT), and 4× CPU throttling to simulate mid-range mobile devices.

Lighthouse Scores: The Mobile Gap

On desktop with fast connections, both architectures perform reasonably well. The real differences emerge under constrained conditions—precisely the environment where most of the world's users operate.

Desktop (Unthrottled)

Metric	SPA	Hypermedia	Difference
Performance Score	93	100	similar
First Contentful Paint	0.5s	0.3s	1.7× faster
Largest Contentful Paint	1.6s	0.3s	5.3× faster
Total Blocking Time	110ms	0ms	much better
Time to Interactive	1.6s	0.3s	5.3× faster
Cumulative Layout Shift	0	0.001	~tie

Mobile (Slow 4G + 4× CPU Throttling)

This is where architectural choices become visible. The JavaScript bundle that's barely noticeable on a fast desktop connection becomes a critical bottleneck when the network is slow and the CPU is constrained.

Metric	SPA	Hypermedia	Difference
Performance Score	54	100	1.9× more
First Contentful Paint	1.6s	1.1s	1.5× faster
Largest Contentful Paint	8.1s	1.1s	7.4× faster
Total Blocking Time	780ms	0ms	eliminated
Time to Interactive	8.2s	1.1s	7.5× faster
Speed Index	3.9s	1.1s	3.5× faster

The SPA's Total Blocking Time of 780ms vs 0ms means the main thread is blocked for nearly a full second just parsing and executing JavaScript—time during which the page is visually present but completely unresponsive to input. The hypermedia version eliminates blocking entirely: its server-rendered HTML is immediately usable the moment it arrives.

What Gets Sent Over the Wire

Metric	SPA	Hypermedia	Ratio
HTTP Requests	36+	8	4.5× fewer
Transferred (compressed)	~1.1 MB	41.9 KB	26× smaller
Resources (uncompressed)	~4.5 MB	173 KB	26× smaller
JavaScript (transferred)	~1.1 MB	13.5 KB	80× smaller

The 26× reduction in transferred bytes directly translates to faster loads on slow networks. On a 1.6 Mbps connection, downloading 1.1 MB takes approximately 5.5 seconds before the browser can even begin parsing. 41.9 KB takes about 0.2 seconds.

Visual Proof: Where the CPU Time Goes

Performance profiling during actual page loads and chat interactions reveals how each architecture uses CPU resources. The flamegraphs below show Chrome DevTools Performance traces under identical conditions: Slow 4G network throttling and 4× CPU slowdown.

Scripting — JavaScript execution
Rendering — Style/layout calculation
Painting — Pixels to screen
System — Browser internals
Loading — Network activity
Idle — Available for user input
Memory — JS heap (bottom graph)

Page Load Comparison

Drag the slider to compare SPA (left) vs Hypermedia (right)

SPA vs. Hypermedia: Real-World Performance Under Load | zwei und eins gmbh

Page load flamegraphs under Slow 4G + 4× CPU throttling. Notice the dense, tall scripting blocks (yellow) in the SPA vs. the shallow, sparse execution in Hypermedia. The SPA's memory graph (bottom) shows JS heap growing +13.5 MB during initial load—Hypermedia grows only +1.5 MB (9× less).

Metric	SPA	Hypermedia	Ratio
Largest Contentful Paint	8.1s	1.1s	7.4× faster
Total Time	11.56s	2.45s	4.7× faster
Scripting Time	5,613ms	257ms	21.8× less
Main Thread Time	4,145ms	651ms	6.4× less
JS Heap Growth	+13.5 MB	+1.5 MB	9× smaller
Transfer Size	~1,107 KB	41.9 KB	26× smaller
DOM Nodes, final	326	963	3× more
Event Listeners	349	59	5.9× fewer

The key visual difference in the flamegraphs: the SPA shows tall, dense stacks of JavaScript execution (yellow blocks) that dominate the trace. Hypermedia's flamegraph is shallow and sparse—most of the time is spent idle or in native browser rendering, not executing JavaScript.

Note: Scripting Time and Main Thread Time are from different DevTools views—the Summary panel tallies all scripting across threads (including workers), while Main Thread Time is the aggregate task duration from the Activity tab. The two overlap but don't cover identical scope, which is why Scripting can appear higher than Main Thread Time.

The higher DOM node count in Hypermedia (963 vs 326) reflects server-rendered content that's immediately visible. The SPA starts with a minimal DOM and builds it client-side through hydration—which explains why it has 5.9× more event listeners attached despite fewer nodes.

Chat Response Comparison (Streaming)

Drag the slider to compare SPA (left) vs Hypermedia (right)

Hypermedia

Chat response flamegraphs. The SPA spends over 4× more time in JavaScript execution (10,500ms vs 2,485ms). Hypermedia distributes work more evenly—spending proportionally more time in rendering and painting as HTML patches hit the real DOM on every SSE chunk.

Metric	SPA	Hypermedia	Notes
LCP	206ms	35ms	6× faster
Scripting	10,500ms	2,485ms	4.2× less JS work
Rendering	1,274ms	2,225ms	Hypermedia patches real DOM on every chunk
Painting	538ms	1,240ms	More paint = more DOM content
Total	19,686ms	12,500ms	37% faster overall
JS Heap Growth	+10.4 MB	+1.9 MB	5.5× less memory
V8 Node Allocs ¹	394 → 822	3,164 → 34,559	Hypermedia high GC churn; SPA modest growth

During streaming, the SPA spends over 4× more time executing JavaScript than the hypermedia version (10,500ms vs 2,485ms). Hypermedia compensates with more rendering and painting time—because Datastar patches the real DOM on every incoming SSE chunk rather than buffering updates in a virtual DOM. The browser's native HTML parser does the heavy lifting, which shows up as rendering cost rather than scripting cost.

Overall Hypermedia completes the full chat response cycle 37% faster (12,500ms vs 19,686ms).

¹ V8 Node Allocs counts all nodes in the V8 heap—attached and detached (pending GC). Datastar replaces DOM patches on each SSE event, creating orphaned nodes that accumulate until garbage-collected; the live DOM after streaming completes is ~400–600 nodes. React reconciles mostly in-place, so far fewer detached nodes accumulate (+428 vs +31,395). This metric reflects GC pressure, not live DOM size.

Why Memory Matters

The 5.5× difference in memory growth (10.4 MB vs 1.9 MB per chat interaction) compounds over time. On mobile devices, high memory pressure triggers garbage collection pauses—visible as UI jank that users perceive as poor performance. Each spike in the memory graph represents potential frame drops.

React's reconciliation overhead is the primary driver: it maintains a virtual DOM tree in memory on top of the real DOM, and that overhead grows with each streamed token.

Streaming Efficiency: SSE Compression

Real-time AI chat relies on Server-Sent Events (SSE) for token-by-token streaming. The two architectures differ not just in compression, but in the fundamental connection model: Hypermedia keeps a single persistent SSE stream open for the whole session, while the SPA opens a new streaming POST for every message turn. That difference compounds dramatically over a conversation: after 10 turns, Brotli on a persistent stream achieves a 58.5× compression ratio—turning a 29× raw-data disadvantage into a 2× transferred-data advantage.

Single Turn (Claude 4.5 Haiku, same prompt)

Metric	SPA	Hypermedia	Notes
Endpoint	/chat (per-request)	/updates (persistent)
Compression	None ²	Brotli
Transferred	14.4 KB	6.0 KB	2.4× smaller
Uncompressed	14.4 KB	112 KB	Hypermedia re-streams full fragment per token (naïve impl)
Compression Ratio	1×	18.7×

Multi-Turn Conversation (10 turns, identical prompts)

Metric	SPA	Hypermedia	Notes
SSE connections	10 (one per turn)	1 (persistent)	Hypermedia reuses one open stream
Compression	None ²	Brotli	58.5×
Cross-turn repetition inflates ratio	Transferred (streaming)	~114 KB	55.8 KB
Uncompressed (streaming)	~114 KB	3,265 KB	Naïve impl re-streams full fragment per token
58.5× Brotli compression ratio after 10 turns on a persistent SSE stream

Worth being honest about what our PHP implementation actually does: every time the AI outputs a new word, it re-streams the entire message HTML fragment from scratch—not just the new token. Each chunk replaces the previous one in full. This is a deliberate simplicity choice, not a Datastar requirement: write the naive thing until you have a reason not to. It explains the 29× raw data gap.

Despite that self-imposed inefficiency, the combination of Brotli on a persistent connection beats the SPA over 10 turns. The compression more than compensates for the naïve implementation. The single-turn advantage (2.4×) already holds: Brotli's 18.7× compression ratio turns a 7.8× raw-size disadvantage into a net win. The multi-turn advantage grows further: a persistent stream lets Brotli compress all turns together, exploiting the cross-turn repetition in the chat HTML that builds up as the conversation grows. After 10 turns Hypermedia transfers half the data of the SPA—even starting from an implementation that wastes 29× more bandwidth before compression.

² Next.js streaming POST responses (/chat) have no Content-Encoding header. Enabling compression middleware on the streaming route would partially close this gap, but the per-connection overhead and inability to share a Brotli dictionary across turns would remain.

Codebase & Operational Complexity

Metric	SPA	Hypermedia	Notes
Dependencies (production)	799	69	11.6× fewer
Disk (node_modules/vendor)	793 MB	25 MB	31.7× smaller
Build Step	Required	No	—
Cold Start	1.85s	0ms	No serverless penalty
Hosting	Vercel (usage-based)	$20/year VPS	Predictable cost

Fewer dependencies mean a smaller attack surface, simpler security audits, and fewer breaking changes to manage during upgrades. The 31.7× reduction in production disk usage also translates to faster CI/CD pipelines and simpler container images.

The Economics of Simplicity

Technical metrics are interesting, but what do they mean for the business? Architecture decisions compound over time, affecting not just performance but development velocity, operational costs, and team productivity.

Features vs. Maintenance: Where Does Time Go?

Consider what 799 production dependencies mean in practice. Each dependency is a potential source of security vulnerabilities that need patching, breaking changes that need addressing, and compatibility issues that need debugging. With the JavaScript ecosystem's rapid release cycles, a significant portion of development time goes to keeping the dependency tree healthy rather than building features.

11.6× fewer dependencies to audit, update, and secure
0 build step means faster iteration and simpler deploys
$20/yr predictable hosting vs. usage-based pricing surprises

The Hidden Cost of Complexity

Complex architectures don't just slow down the application—they slow down the team. Every architectural layer adds cognitive load: developers must understand React's reconciliation, Next.js's rendering modes, the Vercel deployment model, and the interaction between client and server state. This knowledge takes time to acquire and maintain.

In contrast, a server-authoritative hypermedia architecture has one source of truth (the server), one rendering location (the server), and one deployment target (a simple process). New team members can become productive faster, and debugging follows a straightforward request-response model rather than tracing state through multiple layers.

Total Cost of Ownership

When evaluating architecture, consider the full picture:

Development costs: How much time is spent on framework-specific problems vs. business logic? How long does onboarding take?
Infrastructure costs: What's the monthly bill? How predictable is it? What happens during traffic spikes?
Opportunity costs: What features could have been built with the time spent on dependency upgrades and build pipeline maintenance?
Risk costs: How large is the attack surface? How quickly can security patches be deployed?

When Simplicity Pays Off

The economic case for reduced complexity is strongest wherever time and money are finite—which is most projects. A small team without a dedicated DevOps engineer or front-end specialist simply cannot afford to babysit a 799-dependency graph. When a transitive package ships a breaking change, it isn't an abstract inconvenience; it's a half-day of unplanned work that displaces a feature. The leaner the dependency tree, the fewer of those interruptions accumulate.

For applications expected to live years rather than months, this dynamic becomes even more pronounced. Initial development costs are a one-time investment, but maintenance compounds indefinitely. Every major framework upgrade, every deprecated API, every shift in the JavaScript ecosystem adds to the running total. Fewer dependencies and a simpler mental model keep that cost manageable—and institutional knowledge stays relevant longer.

Hosting economics follow the same logic. Usage-based pricing can absorb a modest traffic spike and return a billing surprise that forces an architecture review you didn't plan for. A predictable $20/year VPS doesn't do that. The cost is known in advance, scales to the team's budget, and keeps the focus on the product rather than on cloud spending dashboards.

Finally, there's the audience itself. The total elimination of blocking time (0ms vs 780ms) and the 7.5× improvement in Time to Interactive aren't benchmark curiosities—they determine whether a user on a mid-range phone with a patchy connection actually gets to use the product, or abandons it in the first ten seconds.

Limitations & Caveats

What This Article Doesn't Cover

Hardware specificity: Tests were conducted on specific hardware (AMD Ryzen 7 Pro 7840U desktop, emulated mobile). Results may vary on different devices.
Network emulation: Chrome's Slow 4G throttling simulates but doesn't perfectly replicate real mobile network conditions with packet loss and jitter.
Content types: This comparison focuses on an AI chat application. Performance characteristics may differ for other content types (e-commerce, dashboards, etc.).
Team expertise: Developers coming from SPA backgrounds must unlearn client-side state management patterns and embrace server-authoritative thinking—a mental shift that takes time.
Ecosystem maturity: The React/Next.js ecosystem has more tutorials, Stack Overflow answers, and hiring availability than hypermedia approaches.
Benchmarking is hard: Meaningful performance measurement is inherently difficult. Results vary across runs, throttling profiles approximate but don't replicate real-world conditions, and small methodology choices (cache state, viewport size, network profile) can shift numbers significantly. AI applications add another layer of variability: no two LLM responses are identical in length or timing, making streaming benchmarks particularly hard to reproduce exactly. Treat the ratios as directional, not absolute.
Feature completeness: The PHP port is a working proof-of-concept. Some features (file attachments, edit/regenerate) aren't implemented.

When to Choose What

Architecture decisions should be driven by requirements, not trends. Here's a framework for thinking about the tradeoffs:

CONSIDER HYPERMEDIA IF:

Content-heavy applications (documents, articles, dashboards)
SEO is important for your business
Global audience with varied network conditions
Real-time streaming is a core feature
You want predictable hosting costs
Team has backend expertise
Long-term maintainability is prioritized
You want to minimize security attack surface

CONSIDER SPA IF:

Offline-first is a hard requirement
Heavy client-side computation (image/video editing, CAD)
Complex drag-and-drop or canvas interactions
Existing React team with established patterns
Integration with React Native for mobile apps
Third-party component ecosystem is critical
Users have consistently fast connections

Many applications don't need the complexity of a full SPA. The question isn't "which is better" but "which tradeoffs match your constraints."

Summary

Category	Winner	Key Metric
Mobile Performance Score	Hypermedia	100 vs 54
Time to Interactive (mobile)	Hypermedia	1.1s vs 8.2s (7.5×)
Total Blocking Time (mobile)	Hypermedia	0ms vs 780ms (eliminated)
Page Weight	Hypermedia	42 KB vs 1.1 MB (26×)
JavaScript Bundle	Hypermedia	13.5 KB vs 1.1 MB (80×)
JS Execution Time (load)	Hypermedia	257ms vs 5,613ms (21.8×)
Memory per Interaction	Hypermedia	+1.9 MB vs +10.4 MB (5.5×)
SSE Compression	Hypermedia	Brotli 58.5× vs none; 2× less transferred over 10 turns
Dependencies (prod)	Hypermedia	69 vs 799 (11.6×)

Hypermedia wins all 9 measured categories outright.

These results don't mean SPAs are "bad" or that every application should use hypermedia. They demonstrate that the default choice of SPA architecture carries real costs that become significant under constrained conditions—and that alternatives exist.

Reconsidering your web architecture? At zwei und eins gmbh, we specialize in reducing complexity while improving real-world performance. These architectural tradeoffs are exactly what we optimize in production systems. Get in touch →

#spa #hypermedia #web performance #JavaScript #Server-Sent Events