HTTP/1.1 → HTTP/2 → HTTP/3 (QUIC)

Multiplexing, head-of-line blocking, QUIC over UDP, compression, and flattening request waterfalls — the three-generation story of what changed and why it matters on the platform's mobile traffic.

1. The one model: one bottleneck fixed per generation

Every HTTP version runs the same bargain — client sends requests, server returns responses — over a transport layer. What changed across three generations is which bottleneck each version fixed, and at which layer.

Version	Year	Transport	Key fix	Remaining problem
HTTP/1.1	1997	TCP	Persistent connections, pipelining (broken in practice)	One request in flight per connection → 6 connections per origin as workaround
HTTP/2	2015	TCP	Binary framing + multiplexing → many streams on one connection	TCP head-of-line blocking: one dropped packet freezes ALL streams
HTTP/3	2022	QUIC/UDP	Per-stream reliability → lost packet only affects that stream	QUIC CPU cost; UDP firewalls; observability harder

One-liner

"HTTP/1 had head-of-line blocking at the HTTP layer, HTTP/2 fixed that but not TCP's, HTTP/3 fixed TCP's by replacing it with QUIC. Each version solves a different layer of the same problem."

2. HTTP/1.1: one request at a time & the workarounds

HTTP/1.1 is a text protocol over TCP. The core constraint: each connection handles one request at a time. HTTP pipelining (send multiple requests without waiting for responses) exists in the spec but is broken in practice — responses must arrive in order, so one slow response blocks everything behind it.
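
The in-order rule can be sketched with a toy model. It assumes each response becomes ready on the server after a known number of milliseconds and ignores transfer time; the function name and the timings are illustrative, not taken from any library.

```typescript
// Toy model: requests are pipelined on one HTTP/1.1 connection.
// readyAtMs[i] is when the server has finished producing response i.
// Pipelining requires responses in request order, so response i cannot
// be delivered before every earlier response has been delivered.
function pipelinedDeliveryTimes(readyAtMs: number[]): number[] {
  const delivered: number[] = [];
  let previous = 0;
  for (const ready of readyAtMs) {
    previous = Math.max(previous, ready); // held back by the slowest earlier response
    delivered.push(previous);
  }
  return delivered;
}

// Hypothetical workload: one slow response followed by two fast static files.
const readyTimes = [300, 20, 25];
console.log(pipelinedDeliveryTimes(readyTimes)); // values: 300, 300, 300
```

The two fast files were ready after 20 and 25 ms but are delivered at 300 ms, which is the head-of-line blocking that made pipelining unattractive.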

How browsers worked around it

Browsers open ~6 TCP connections per origin. More connections = more parallel slots. This spawned the anti-pattern of domain sharding: splitting assets across static1.cdn.com, static2.cdn.com to get 12–18 parallel connection slots. It works, but costs one DNS lookup + TCP handshake + TLS handshake per shard domain.

HTTP/1.1 workarounds that become anti-patterns under HTTP/2:

Domain sharding — actively harmful under H2: fragments one H2 connection into many TCP connections, defeats multiplexing
JS/CSS concatenation — bundling everything into one file to save roundtrips; under H2, many small files with long cache TTLs beat one big file you must fully invalidate
Image sprites / data URIs — same logic as concatenation; H2 parallelism makes them unnecessary

3. HTTP/2: binary, multiplexed, still on TCP

HTTP/2 replaced the text protocol with a binary framing layer. Requests and responses are split into frames that interleave on a single TCP connection via independent streams. This is multiplexing: one connection, many parallel in-flight requests, no HTTP-layer head-of-line blocking.
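
A minimal sketch of the framing idea, under heavy simplification: the Frame shape below is hypothetical (real HTTP/2 frames also carry a type, flags, and a length), and the round-robin scheduler is an assumption made for illustration. It shows two bodies being chunked, interleaved on one "wire", and reassembled by stream ID.

```typescript
interface Frame {
  streamId: number;
  payload: string;
  endStream: boolean;
}

interface Message {
  streamId: number;
  body: string;
}

// Split one message into fixed-size frames tagged with its stream ID.
function toFrames(msg: Message, chunkSize: number): Frame[] {
  const frames: Frame[] = [];
  for (let offset = 0; offset < msg.body.length; offset += chunkSize) {
    frames.push({
      streamId: msg.streamId,
      payload: msg.body.slice(offset, offset + chunkSize),
      endStream: offset + chunkSize >= msg.body.length,
    });
  }
  return frames;
}

// Take one frame from each stream in turn until every queue is empty.
function interleave(messages: Message[], chunkSize: number): Frame[] {
  const queues = messages.map((m) => toFrames(m, chunkSize));
  const wire: Frame[] = [];
  let sentSomething = true;
  while (sentSomething) {
    sentSomething = false;
    for (const queue of queues) {
      const frame = queue.shift();
      if (frame !== undefined) {
        wire.push(frame);
        sentSomething = true;
      }
    }
  }
  return wire;
}

// The receiver rebuilds each body by grouping payloads per stream ID.
function reassemble(wire: Frame[]): Map<number, string> {
  const bodies = new Map<number, string>();
  for (const frame of wire) {
    bodies.set(frame.streamId, (bodies.get(frame.streamId) ?? "") + frame.payload);
  }
  return bodies;
}

// Client-initiated HTTP/2 streams use odd IDs: 1, 3, 5, ...
const wireFrames = interleave(
  [
    { streamId: 1, body: "body{margin:0}" },
    { streamId: 3, body: "console.log(1)" },
  ],
  5,
);
console.log(wireFrames.map((f) => f.streamId).join(",")); // 1,3,1,3,1,3
```

Neither response has to finish before the other starts, which is what removes the HTTP-layer queueing of HTTP/1.1.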

What HTTP/2 delivers

Multiplexing

Core win

Many request/response pairs share one TCP connection, interleaved as frames. Eliminates HTTP-layer HoL blocking. Domain sharding becomes actively harmful.

HPACK header compression

Bandwidth

Static table of 61 common headers + dynamic table built during the connection. Repeated headers (Cookie, User-Agent) sent as 1–4 bytes instead of hundreds. Critical for HTTP APIs with large cookies.
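
The table lookup can be illustrated with a toy encoder. This is a sketch only: it emits descriptive objects rather than real HPACK wire bytes, skips Huffman coding and table eviction, and the class name ToyHpackEncoder is invented for this example. The three static indices shown are real entries from the HPACK static table (RFC 7541, Appendix A).

```typescript
type EncodedField =
  | { kind: "indexed"; index: number }
  | { kind: "literal"; name: string; value: string };

// A few real static-table entries; the full table has 61.
const STATIC_ENTRIES = new Map<string, number>([
  [":method GET", 2],
  [":path /", 4],
  [":scheme https", 7],
]);

const FIRST_DYNAMIC_INDEX = 62; // the static table occupies indices 1..61

class ToyHpackEncoder {
  private dynamicTable: string[] = []; // newest entry first, as in HPACK

  encode(headers: Array<[string, string]>): EncodedField[] {
    return headers.map(([name, value]): EncodedField => {
      const key = `${name} ${value}`;
      const staticIndex = STATIC_ENTRIES.get(key);
      if (staticIndex !== undefined) return { kind: "indexed", index: staticIndex };
      const position = this.dynamicTable.indexOf(key);
      if (position !== -1) return { kind: "indexed", index: FIRST_DYNAMIC_INDEX + position };
      // First sighting: send the literal and remember it for later requests.
      this.dynamicTable.unshift(key);
      return { kind: "literal", name, value };
    });
  }
}

const encoder = new ToyHpackEncoder();
const requestHeaders: Array<[string, string]> = [
  [":method", "GET"],
  [":path", "/"],
  ["cookie", "session=abc123"],
];
const firstRequest = encoder.encode(requestHeaders); // cookie goes out as a literal
const secondRequest = encoder.encode(requestHeaders); // cookie is now index 62
console.log(JSON.stringify(firstRequest[2]), JSON.stringify(secondRequest[2]));
```

On the second request every header collapses to a small index, which is where the savings on repeated Cookie and User-Agent values come from.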

Stream priorities

Control

Client can signal relative priority of streams. Critical CSS/JS can be weighted higher. In practice, browser resource hints (L01) are more reliable than H2 priority negotiation.

Server Push (deprecated 2022)

Gone

Server pre-sent resources before the client asked. Fatal flaw: the server could not know what the browser already had cached, so it often pushed content that was already in the disk cache and wasted bandwidth. Replaced by 103 Early Hints.

The problem HTTP/2 didn't solve: TCP head-of-line blocking

TCP guarantees ordered, reliable byte delivery. If one packet is lost, TCP waits for retransmission before delivering any subsequent bytes — even bytes from completely unrelated streams. Because HTTP/2 puts all streams on one TCP connection, a single dropped packet blocks every stream simultaneously. On a lossy mobile network (1% packet loss is common across SEA), this can make H2 worse than H1.1's parallel connections.

[Interactive figure: head-of-line blocking under packet loss. Panel: H1.1. Lanes: Stream A (CSS), Stream B (JS), Stream C (image). Legend: blocked / retransmit.]

4. HTTP/3 & QUIC: transport layer redesign

HTTP/3 (RFC 9114, 2022) solves TCP head-of-line blocking by replacing TCP entirely with QUIC, a multiplexed, encrypted transport protocol built on UDP. Each stream is tracked independently: a lost packet delays only the streams whose data it carried, while other streams keep flowing.

Per-stream reliability

Core fix

QUIC implements reliability and ordering per-stream in userspace, not globally. Drop a packet on stream B → only B waits. A and C flow uninterrupted. This is what kills TCP HoL blocking.
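
The difference can be made concrete with a small simulation. It is a simplified model, not a protocol implementation: each packet is assumed to carry data for exactly one stream (real QUIC packets can carry frames from several), packets are listed in send order, and the function name is invented for this sketch.

```typescript
interface Packet {
  seq: number;
  stream: string;
}

// Which packets can the application read while the lost packet is still
// waiting to be retransmitted?
function readableBeforeRetransmit(
  packets: Packet[],
  lostSeq: number,
  transport: "tcp" | "quic",
): Packet[] {
  const lost = packets.find((p) => p.seq === lostSeq);
  return packets.filter((p) => {
    if (p.seq === lostSeq) return false; // never arrived
    if (p.seq < lostSeq) return true; // delivered before the gap
    if (transport === "tcp") return false; // one ordered byte stream: everything after the gap waits
    return lost === undefined || p.stream !== lost.stream; // QUIC: only the lost packet's stream waits
  });
}

const sent: Packet[] = [
  { seq: 1, stream: "A" },
  { seq: 2, stream: "B" },
  { seq: 3, stream: "C" },
  { seq: 4, stream: "A" },
  { seq: 5, stream: "B" },
  { seq: 6, stream: "C" },
];

const overTcp = readableBeforeRetransmit(sent, 2, "tcp").map((p) => p.seq);
const overQuic = readableBeforeRetransmit(sent, 2, "quic").map((p) => p.seq);
console.log(overTcp.join(","), "|", overQuic.join(",")); // 1 | 1,3,4,6
```

With packet 2 (stream B) lost, TCP hands the application only packet 1, while the QUIC model keeps delivering streams A and C and holds back only stream B.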

0-RTT connection resumption

Latency

Returning visitors send data in the first packet (0-RTT). Setup cost before the first request: TCP + TLS 1.2 needs 3 round trips; TCP + TLS 1.3 needs 2 (1 for TCP, 1 for TLS); QUIC needs 1 for a new connection and 0 for a returning one. Significant on high-latency SEA links.
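
As a worked example of the handshake arithmetic, the sketch below multiplies setup round trips by an assumed RTT. The round-trip counts include the TCP handshake where TCP is used; the 100 ms RTT and the constant names are illustrative assumptions.

```typescript
// Round trips spent on connection setup before the first HTTP request can leave.
const SETUP_ROUND_TRIPS: Record<string, number> = {
  "TCP + TLS 1.2": 3, // 1 for the TCP handshake + 2 for TLS 1.2
  "TCP + TLS 1.3": 2, // 1 for the TCP handshake + 1 for TLS 1.3
  "QUIC, new connection": 1, // transport and TLS 1.3 handshakes are combined
  "QUIC, 0-RTT resumption": 0, // the request rides in the first flight
};

function setupDelayMs(roundTrips: number, rttMs: number): number {
  return roundTrips * rttMs;
}

const ASSUMED_RTT_MS = 100; // illustrative value only

for (const label of Object.keys(SETUP_ROUND_TRIPS)) {
  const delay = setupDelayMs(SETUP_ROUND_TRIPS[label] ?? 0, ASSUMED_RTT_MS);
  console.log(`${label}: ${delay} ms before the first request`);
}
```

At 100 ms RTT that is 300 ms, 200 ms, 100 ms, and 0 ms respectively, before any server processing or transfer time.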

Connection migration

Mobile

Connections are identified by a Connection ID, not IP:port. Wi-Fi → 4G switch = the same QUIC connection survives the address change, seamless handoff. No reconnect, no dropped booking form. Big win for the platform's mobile users.

Built-in TLS 1.3

Security

QUIC mandates TLS 1.3. The QUIC handshake and TLS are unified, so there is no separate TLS negotiation. Encrypted by default, including transport metadata such as packet numbers and acknowledgements that TCP exposes in cleartext (makes DPI harder → QUIC-aware tooling needed).

QPACK — header compression for out-of-order delivery

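HTTP/2's HPACK assumes header blocks are decoded in the order they were encoded, which TCP guarantees but independent QUIC streams do not; reusing it over QUIC would reintroduce head-of-line blocking at the header layer. HTTP/3 uses QPACK: the same static + dynamic table approach, but dynamic-table updates travel on a dedicated stream, and the encoder can limit itself to entries the decoder has already acknowledged so that a header block never stalls waiting on another stream.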

HTTP/3 trade-offs — the honest answer under push-back:

UDP firewalls: some corporate/ISP infrastructure blocks UDP port 443. Servers advertise H3 over an existing TCP connection via the Alt-Svc header (Alt-Svc: h3=":443"; ma=86400); if the QUIC attempt fails, browsers stay on H2 over TCP automatically. An estimated ~30% of connections still fall back.
QUIC CPU overhead: reliability + congestion control + encryption in userspace is heavier than kernel TCP. At origin-server scale, measure before adopting; CDN-terminated QUIC sidesteps this.
0-RTT replay risk: 0-RTT early data can be captured and replayed by attackers. Safe for idempotent GETs only; never for booking POSTs or mutations.
Observability: encrypted QUIC transport headers break traditional load-balancer and packet-capture tooling. Requires QUIC-aware infrastructure.
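
To make the Alt-Svc advertisement concrete, here is a minimal parser sketch for the common form of the header value. It is deliberately simplified: it ignores the special value "clear", percent-encoded protocol IDs, the "persist" parameter, and commas inside quoted authorities. The type and function names are invented for this example.

```typescript
interface AltSvcEntry {
  protocol: string; // e.g. "h3"
  authority: string; // e.g. ":443" (same host, port 443)
  maxAgeSeconds: number;
}

const DEFAULT_MAX_AGE_SECONDS = 86400; // RFC 7838: 24 hours when "ma" is absent

function parseAltSvc(headerValue: string): AltSvcEntry[] {
  const entries: AltSvcEntry[] = [];
  for (const part of headerValue.split(",")) {
    const pieces = part.split(";").map((s) => s.trim());
    const alternative = pieces[0] ?? "";
    const match = /^([^=\s]+)="([^"]*)"$/.exec(alternative);
    if (match === null) continue; // skip anything this sketch does not understand
    let maxAgeSeconds = DEFAULT_MAX_AGE_SECONDS;
    for (const param of pieces.slice(1)) {
      const maMatch = /^ma=(\d+)$/.exec(param);
      if (maMatch !== null) maxAgeSeconds = Number(maMatch[1]);
    }
    entries.push({
      protocol: match[1] ?? "",
      authority: match[2] ?? "",
      maxAgeSeconds,
    });
  }
  return entries;
}

const advertised = parseAltSvc('h3=":443"; ma=86400');
const offersH3 = advertised.some((entry) => entry.protocol === "h3");
console.log(offersH3); // true
```

A client that sees an h3 entry may try QUIC on the advertised authority for up to maxAgeSeconds; if UDP is blocked, it simply keeps using the TCP connection it already has.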

5. Compression: gzip vs Brotli

HTTP compression is negotiated via Accept-Encoding (client lists what it accepts) and Content-Encoding (server declares what it applied). Two algorithms dominate in 2024:

Algorithm	Ratio vs gzip	Compress speed	Decompress	Requires
gzip (level 6)	Baseline	Fast — viable for dynamic	Fast	Nothing; universal support
Brotli (level 11)	~20–26% smaller for text	~100× slower than gzip	Similar to gzip	HTTPS; all modern browsers
Brotli (level 4–5)	~10–15% smaller for text	~2–3× slower than gzip	Fast	HTTPS; viable for dynamic

The strategy

Static assets (JS, CSS, HTML): pre-compress with Brotli level 11 at build/deploy. Store the .br file alongside the original on the CDN. Serve with Content-Encoding: br to browsers that send Accept-Encoding: br.
Dynamic SSR responses: gzip level 6 (good ratio, fast enough CPU); offer Brotli level 4–5 only if you have CPU headroom.
Already-compressed formats (JPEG, WebP, AVIF, woff2, zip): never recompress. They gain almost nothing and can even grow slightly.
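
The strategy can be written down as a small decision function. This is a sketch under stated assumptions: the function names, the content-type list, and the isStaticAsset flag are hypothetical, and the Accept-Encoding handling only drops codings with q=0 rather than ranking by q-value.

```typescript
type Encoding = "br" | "gzip" | "identity";

// Formats that are already compressed; recompressing wastes CPU.
const ALREADY_COMPRESSED = new Set<string>([
  "image/jpeg",
  "image/webp",
  "image/avif",
  "font/woff2",
  "application/zip",
]);

function acceptedEncodings(acceptEncoding: string): Set<string> {
  const accepted = new Set<string>();
  for (const item of acceptEncoding.split(",")) {
    const parts = item.trim().toLowerCase().split(";").map((s) => s.trim());
    const coding = parts[0] ?? "";
    const refused = parts.some((p) => /^q=0(\.0{1,3})?$/.test(p)); // q=0 means "not acceptable"
    if (coding !== "" && !refused) accepted.add(coding);
  }
  return accepted;
}

function chooseEncoding(
  acceptEncoding: string,
  contentType: string,
  isStaticAsset: boolean,
): Encoding {
  if (ALREADY_COMPRESSED.has(contentType)) return "identity";
  const accepted = acceptedEncodings(acceptEncoding);
  // Static assets: serve the .br file produced at build time.
  if (isStaticAsset && accepted.has("br")) return "br";
  // Dynamic responses (and clients without Brotli): gzip is the cheap default.
  if (accepted.has("gzip")) return "gzip";
  return "identity";
}

console.log(chooseEncoding("gzip, deflate, br", "text/javascript", true)); // br
console.log(chooseEncoding("gzip, deflate, br", "text/html", false)); // gzip
console.log(chooseEncoding("gzip, deflate, br", "image/jpeg", true)); // identity
```

In practice a CDN or web server makes this choice for you; the point of the sketch is that the decision depends on three inputs: what the client accepts, whether the bytes are already compressed, and whether compression cost is paid at build time or per request.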

6. Request waterfalls, and how to flatten them

A waterfall is a cascade of requests where each step must complete before the next begins. The classic LCP waterfall on a client-rendered app:

// ❌ Classic useEffect waterfall — each step adds 1+ round trips
HTML arrives
  → browser discovers JS bundle → downloads + executes
      → useEffect fires → fetch("/api/hotels")
          → response arrives → React renders hotel cards
              → hero image discovered → starts downloading
                  → LCP fires  ← 800–1200ms later on a slow link

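Four of these steps are network fetches (HTML, JS bundle, API call, hero image), and each costs at least one round trip. On a 100ms RTT link (Bangkok → Singapore), that's 400ms of purely sequential waiting before any download or execution time is counted. The fix is not faster HTTP; it's making the steps parallel or eliminating them.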
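
The round-trip arithmetic can be checked with a tiny critical-path calculator. It is a rough model that counts only network round trips (no server time, download time, or JS execution), and the step labels and the flattened variant are illustrative assumptions.

```typescript
interface FetchStep {
  label: string;
  roundTrips: number;
  dependsOn?: string; // label of the step that must finish first
}

// Steps must be listed after the step they depend on.
function criticalPathMs(steps: FetchStep[], rttMs: number): number {
  const finishAt = new Map<string, number>();
  let longest = 0;
  for (const step of steps) {
    const startAt = step.dependsOn === undefined ? 0 : (finishAt.get(step.dependsOn) ?? 0);
    const doneAt = startAt + step.roundTrips * rttMs;
    finishAt.set(step.label, doneAt);
    longest = Math.max(longest, doneAt);
  }
  return longest;
}

// The client-rendered waterfall: every fetch waits for the previous one.
const clientWaterfall: FetchStep[] = [
  { label: "html", roundTrips: 1 },
  { label: "js", roundTrips: 1, dependsOn: "html" },
  { label: "api", roundTrips: 1, dependsOn: "js" },
  { label: "hero", roundTrips: 1, dependsOn: "api" },
];

// Flattened: data is rendered into the HTML on the server and the hero image
// is discoverable from the HTML, so JS and the image depend only on the HTML.
const flattenedWaterfall: FetchStep[] = [
  { label: "html", roundTrips: 1 },
  { label: "js", roundTrips: 1, dependsOn: "html" },
  { label: "hero", roundTrips: 1, dependsOn: "html" },
];

console.log(criticalPathMs(clientWaterfall, 100)); // 400
console.log(criticalPathMs(flattenedWaterfall, 100)); // 200
```

Halving the chain depth halves the sequential waiting, which no protocol upgrade can do on its own.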

Colocate data with SSR

Biggest win

Fetch hotel data on the server during rendering. Data arrives inside the initial HTML response, so there is zero client waterfall for critical content. React Server Components (RSC) and Next.js server-side data fetching (for example getServerSideProps) make this ergonomic.

103 Early Hints

SSR + CDN

Server sends preload hints before the full 200 response is ready. Browser starts fetching CSS/fonts while SSR runs the hotel query. Cuts SSR processing time off the critical path for subresources.
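
A small sketch of what the interim response carries. It builds the Link header values for a 103 response as plain strings; the resource paths and function names are hypothetical, and the status line is shown in HTTP/1.1 text form (over HTTP/2 and HTTP/3 the same information travels as a header block with :status 103).

```typescript
interface PreloadHint {
  href: string;
  as: "style" | "font" | "script" | "image";
}

function toLinkValue(hint: PreloadHint): string {
  const parts = [`<${hint.href}>`, "rel=preload", `as=${hint.as}`];
  // Fonts are fetched in CORS mode, so the preload needs crossorigin to be reused.
  if (hint.as === "font") parts.push("crossorigin");
  return parts.join("; ");
}

// Header lines of the interim response, sent before the final 200 is ready.
function earlyHintsLines(hints: PreloadHint[]): string[] {
  return ["HTTP/1.1 103 Early Hints", ...hints.map((h) => `Link: ${toLinkValue(h)}`)];
}

const criticalHints: PreloadHint[] = [
  { href: "/static/main.css", as: "style" },
  { href: "/static/inter.woff2", as: "font" },
];

console.log(earlyHintsLines(criticalHints).join("\n"));
// HTTP/1.1 103 Early Hints
// Link: </static/main.css>; rel=preload; as=style
// Link: </static/inter.woff2>; rel=preload; as=font; crossorigin
```

The browser can start these downloads as soon as the 103 arrives, while the server is still running the query that produces the final HTML.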

Preconnect / preload

Resource hints

<link rel=preconnect> to API origin — DNS + TCP + TLS done before JS fires the fetch. rel=preload for the LCP hero image so it starts downloading immediately from HTML. (L01.)

Parallel queries

Data fetching

Promise.all([fetchHotelInfo(), fetchAvailability()]) instead of sequential awaits. RSC parallel routes and sibling <Suspense> boundaries fetch in parallel by default.
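
A minimal sketch of the difference, using timers in place of real network calls. fetchHotelInfo and fetchAvailability are stand-ins with made-up delays and return values; the callLog array exists only to make the start/end order visible.

```typescript
const callLog: string[] = [];

// Stand-in for a network request: resolves with `value` after `delayMs`.
function fakeFetch<T>(label: string, delayMs: number, value: T): Promise<T> {
  callLog.push(`start ${label}`);
  return new Promise<T>((resolve) => {
    setTimeout(() => {
      callLog.push(`end ${label}`);
      resolve(value);
    }, delayMs);
  });
}

const fetchHotelInfo = () => fakeFetch("info", 10, { name: "Example Hotel" });
const fetchAvailability = () => fakeFetch("availability", 20, { roomsLeft: 3 });

// Sequential awaits: the second request does not start until the first ends.
async function loadSequential() {
  const info = await fetchHotelInfo();
  const availability = await fetchAvailability();
  return { info, availability };
}

// Promise.all: both requests start immediately and overlap in time.
async function loadParallel() {
  const [info, availability] = await Promise.all([fetchHotelInfo(), fetchAvailability()]);
  return { info, availability };
}
```

Calling loadSequential() logs start info, end info, start availability, end availability; calling loadParallel() logs both starts before either end, so the total wait is roughly the slower request rather than the sum of both. This only helps when the requests are independent; if the second needs data from the first, the dependency itself has to be removed.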

CDN edge caching

ISR pages

Serve pre-rendered or ISR-revalidated HTML from a CDN edge node near the user. Eliminates the origin round trip for cache hits, so TTFB drops from hundreds of ms to roughly the round trip to the nearest edge node.

Speculative prefetch

Navigation

Speculation Rules API or rel=prefetch preloads the next page while the user reads the current one. Converts cold navigation to a cache hit — zero waterfall on click. (L01.)

7. Platform whiteboard: the Lead answer

If asked "what's your network/HTTP strategy for a global travel platform?", structure the answer in three layers:

Protocol: HTTP/2 is table stakes. CDN support is universal, so enable it. Layer HTTP/3 at the CDN edge (Cloudflare and Akamai support it), advertised via Alt-Svc with automatic fallback to H2. No code change on the app. H3 matters most for SEA mobile users: high packet loss + connection migration during booking flows.
Compression: Pre-compressed Brotli (level 11) for all static assets in the CDN deploy pipeline. gzip level 6 for dynamic SSR. Never recompress already-compressed formats (JPEG, WebP, AVIF, woff2).
Waterfalls: Hotel search results and hotel detail data colocated with SSR — no client fetch waterfall for critical content. 103 Early Hints for CSS/fonts from the CDN layer. Parallel queries with Promise.all / RSC parallel routes. Preconnect to API and CDN origins in <head>. Edge-cached ISR pages for hotel static content (description, photos). Speculation Rules for anticipated navigations (search results → hotel detail).

One-liner

"Enable H3 at the CDN edge: it's a flag flip, not a rewrite, and the CDN handles the H2 fallback. The bigger architectural win is eliminating the data waterfall: move API fetches into SSR so critical data arrives with the HTML."

Full loop

Concept: each HTTP generation fixes one bottleneck. HTTP/1.1 added keep-alive but serialized requests, HTTP/2 multiplexed over one TCP connection, and HTTP/3 moved to QUIC to kill TCP head-of-line blocking; Brotli then shaves 20–26% off text on top.
Trade-off: HTTP/3 needs UDP + edge support, and Brotli level 11 is ~100× slower than gzip (build-time only), so gzip stays the safer default on dynamic paths, and the real waterfall fix is architectural, not protocol choice.
Anchor: "We flipped on HTTP/3 at the CDN edge and added pre-compressed Brotli to the deploy pipeline (JS bundles dropped ~24% with no code change), but the bigger win was moving API fetches into SSR so critical data arrived with the HTML."
Impact: fewer round trips + fewer bytes → critical resources arrive sooner → lower LCP and INP, especially for users on slow connections in SEA markets.
Invite: "I'd revisit dynamic Brotli and the SSR data strategy if we moved rendering to Edge Functions, because the edge CPU cost model is different from an origin fleet at scale."

8. Check yourself: scenario quiz


1. A PM says "we've upgraded to HTTP/2 — we can remove domain sharding now, right?" What's the Lead answer?

2. An engineer says "HTTP/2 multiplexing means we've eliminated head-of-line blocking." Is that correct?

3. Why did Chrome deprecate HTTP/2 Server Push in 2022?

4. A user in Bangkok is booking a hotel. Halfway through the checkout form, their phone switches from Wi-Fi to 4G. Under HTTP/2 the connection drops. What does HTTP/3 do differently?

The checkout POST has not been submitted yet — they're still filling in the form.

5. Your team enables Brotli compression at level 11 for dynamic SSR API responses. Load tests show origin CPU spikes 40%. What happened and what's the fix?

6. A senior engineer reviews your Next.js hotel detail page and says the LCP (hero image) fires 900ms after HTML. DevTools shows a waterfall: HTML→JS bundle→useEffect fetch→render→hero image. What's the architectural fix?

The hotel data fetch in useEffect calls your internal API to get hotel photos, name, and description.

7. As a Lead, you're asked "Should we move to HTTP/3 for our platform?" What's your answer?

8. A new engineer proposes solving your LCP waterfall by switching from gzip to Brotli and upgrading to HTTP/3. Your response as Lead?

The actual root cause is hotel search data fetched in a client-side useEffect, adding two round trips to the critical path.

Out-loud drill — do this before next session

Say this out loud without notes: "Walk me through HTTP/1 vs 2 vs 3 — what problem does each fix, and which matters most for the platform?"

Target: 60 seconds. Hit these: HTTP-layer HoL blocking (H1→H2), TCP-layer HoL blocking (H2→H3/QUIC), connection migration for mobile, Brotli pre-compress strategy, and the waterfall architectural fix being SSR data colocation, not protocol choice.

Good follow-up topics:

How does QUIC congestion control differ from TCP's?
Show me a real HPACK encoding example.
How do I verify a site is using HTTP/3 in DevTools?
What is the Alt-Svc header exactly?
How does 0-RTT resumption work mechanically?
How do I read a DevTools waterfall to find the root cause?
Can Brotli be used with HTTP/1.1?

Lesson 16 of Interview prep. Reference card: cheatsheet/0016-http-versions-network-perf-cheatsheet.html

web.dev — HTTP/2 — multiplexing over one connection to remove app-layer head-of-line blocking.

MDN — HTTP/1.x connection management — keep-alive, pipelining limits, and head-of-line blocking.

MDN — QUIC — the UDP-based transport underneath HTTP/3.

RFC 9114 — HTTP/3 — the spec mapping HTTP semantics onto QUIC.

web.dev — WebTransport — low-latency bidirectional streams over HTTP/3.

Chrome for Developers — HTTP/2 Server Push deprecation — why push was removed and what replaces it.

MDN — Accept-Encoding — content negotiation for compression.

MDN — Content-Encoding — gzip/Brotli response compression.

web.dev — 103 Early Hints — preloading critical assets before the final response.

web.dev — Prerender pages — speculative full-page rendering for instant navigations.

Web Almanac 2024 — HTTP chapter — adoption stats for HTTP/2 and HTTP/3 across the web.

Written by Vikas Kumar Yadav · Tech Lead · thejsdeveloper.com