Diagnosing a Slow Page

The methodology — not a tool tour. How a Lead goes from “the page feels slow” to a named root cause: RUM to find it, lab to explain it, the DevTools flame chart, and a per-symptom playbook.

1. The one rule: measure, don't guess

The single thing that separates a Lead answer from a junior one: you never optimize blind. You find where the time actually goes, fix that, and verify it moved. The loop:

  1. RUM first — find the problem. Field data (real users, p75) tells you which metric, which page, and which segment (device, country, network) is slow. This is where you decide what's worth fixing.
  2. Reproduce in the lab. Open DevTools, throttle to match the slow segment (mid-tier mobile CPU 4×, slow 4G), and record. Lab is reproducible and detailed — it's where you find why.
  3. Localize to one metric/phase. Is it LCP (loading) or INP (responsiveness)? Then break that metric into its sub-parts (below) to point at one culprit.
  4. Fix the root cause, then verify. Re-measure in lab to confirm, then watch RUM to confirm it moved for real users — not just your machine.
One-liner

“RUM to find it, lab to explain it. Field data tells me which metric and which users are slow; the DevTools flame chart tells me why. I fix the root cause and verify it moved in the field, and I never optimize on my own fast laptop.”
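
Step 1 of the loop (find which segment is slow at p75) can be sketched in a few lines. This is a minimal illustration, assuming RUM samples have already been collected as plain objects; the field names (`device`, `country`, `value`) and the sample numbers are hypothetical, and the percentile uses the simple nearest-rank method rather than whatever interpolation a given RUM vendor applies.

```javascript
// Nearest-rank percentile over an array of numbers (p in 0..100).
function percentile(values, p) {
  if (values.length === 0) return undefined;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length); // 1-based rank
  return sorted[Math.max(rank, 1) - 1];
}

// Group samples by a segment key and report p75 per group, slowest first.
function p75BySegment(samples, keyOf) {
  const groups = new Map();
  for (const sample of samples) {
    const key = keyOf(sample);
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(sample.value);
  }
  return [...groups]
    .map(([segment, values]) => ({
      segment,
      count: values.length,
      p75: percentile(values, 75),
    }))
    .sort((a, b) => b.p75 - a.p75);
}

// Hypothetical LCP samples in milliseconds.
const samples = [
  { device: 'mobile', country: 'ID', value: 4200 },
  { device: 'mobile', country: 'ID', value: 3900 },
  { device: 'mobile', country: 'ID', value: 2100 },
  { device: 'mobile', country: 'ID', value: 5100 },
  { device: 'desktop', country: 'US', value: 1200 },
  { device: 'desktop', country: 'US', value: 1500 },
  { device: 'desktop', country: 'US', value: 900 },
  { device: 'desktop', country: 'US', value: 1800 },
];

const ranked = p75BySegment(samples, (s) => `${s.device}/${s.country}`);
console.log(ranked[0]); // the slowest segment is the one to reproduce in the lab
```

The slowest segment at the top of `ranked` is the target to hand to the lab in step 2: throttle to match it, then record.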

2. RUM vs lab: two jobs, not rivals

RUM / field

finds & prioritizes

Real users via PerformanceObserver / a RUM vendor / CrUX. Real devices, networks, geographies. Answers whether you have a problem and for whom. Can't step into a single session.

Lab / synthetic

explains & reproduces

Lighthouse + DevTools Performance panel, one controlled run. Flame charts, waterfalls, automated audits. Answers why. But one synthetic device ≠ your real long tail (and can't see real INP).

Use them as a relay: RUM hands the lab a target. Optimizing a lab number that no real user hits is the classic waste. [MDN]

3. Reading the Performance panel

Record a load (or an interaction) and you get a flame chart of the main thread. Three things to know:

[Figure: main-thread flame chart; x axis = time, y axis = call stack. Shown: a 180 ms long task (flagged red because it exceeds 50 ms), a 40 ms task, and idle time, with hydrate(), parseJSON, render, and reconcile in the call stack.]

Modern DevTools also has an Insights sidebar in the Performance panel that auto-flags LCP sub-parts, render-blocking requests, third parties, and duplicated JS. Name it; it's the current workflow. [Chrome]
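
The 50 ms rule behind the red flag can also be applied outside DevTools. The sketch below is an illustration rather than a replacement for the panel: `summarizeTasks` is a hypothetical helper that counts tasks over 50 ms and sums the time each one spends past that budget, and `observeLongTasks` shows browser-only wiring through the Long Tasks API (`PerformanceObserver` with type `'longtask'`), which is currently supported in Chromium-based browsers.

```javascript
const LONG_TASK_MS = 50; // tasks longer than this are flagged as long tasks

// Summarize a list of main-thread task durations (milliseconds).
// The "blocking" part of a task is whatever exceeds the 50 ms budget.
function summarizeTasks(durations) {
  const longTasks = durations.filter((d) => d > LONG_TASK_MS);
  const blockingMs = longTasks.reduce((sum, d) => sum + (d - LONG_TASK_MS), 0);
  return {
    longTaskCount: longTasks.length,
    blockingMs,
    worstMs: Math.max(0, ...durations),
  };
}

// Browser-only: report long tasks already buffered and any that follow.
// Defined here but not called, so the file also loads outside a browser.
function observeLongTasks(onTask) {
  const observer = new PerformanceObserver((list) => {
    for (const entry of list.getEntries()) {
      onTask(entry.duration, entry.startTime);
    }
  });
  observer.observe({ type: 'longtask', buffered: true });
  return observer;
}

// The two tasks from the flame chart: one of 180 ms, one of 40 ms.
const summary = summarizeTasks([180, 40]);
console.log(summary);
```

Only the 180 ms task counts as long; the 40 ms task is under budget and contributes nothing to the blocking total.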

4. The per-symptom playbook

Slow LCP → break it into 4 sub-parts

web.dev splits LCP into four phases — find which dominates, and the fix is obvious:

LCP sub-part         | What it is                                       | If it dominates → fix
---------------------|--------------------------------------------------|------------------------------------------------------------------
TTFB                 | server + network to first byte                   | CDN, edge cache, faster backend (Lesson 03)
Resource load delay  | gap before the LCP image starts downloading      | it was discovered late → preload + fetchpriority=high (Lesson 01)
Resource load time   | how long the LCP image takes to download         | compress, AVIF/WebP, responsive srcset
Element render delay | downloaded but not yet painted                   | render-blocking CSS/JS → inline critical CSS, defer (Lesson 02)

Source: web.dev — Optimize LCP
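
The four sub-parts are simple differences between timestamps, so the breakdown can be computed by hand. The sketch below follows the same idea as the web.dev breakdown but is not its exact code: `lcpSubParts` and `readLcpInputs` are hypothetical helper names, and the example numbers are made up to show a page where resource load delay dominates.

```javascript
// Split an LCP time into its four sub-parts (all values in milliseconds).
// Each boundary is clamped so no phase goes negative and the four parts
// add up to the final render time.
function lcpSubParts({ ttfb, resourceRequestStart, resourceResponseEnd, lcpTime }) {
  const requestStart = Math.max(ttfb, resourceRequestStart ?? 0);
  const responseEnd = Math.max(requestStart, resourceResponseEnd ?? 0);
  const renderTime = Math.max(responseEnd, lcpTime);
  const parts = {
    ttfb,
    resourceLoadDelay: requestStart - ttfb,
    resourceLoadTime: responseEnd - requestStart,
    elementRenderDelay: renderTime - responseEnd,
  };
  // The largest part is the one to fix first.
  const dominant = Object.entries(parts).sort((a, b) => b[1] - a[1])[0][0];
  return { ...parts, dominant };
}

// Browser-only: gather the inputs from the Performance APIs.
// Defined but not called here. A text LCP has no resource entry, in which
// case the two resource phases come out as zero.
function readLcpInputs(lcpEntry) {
  const nav = performance.getEntriesByType('navigation')[0];
  const res = performance
    .getEntriesByType('resource')
    .find((e) => e.name === lcpEntry.url);
  return {
    ttfb: nav.responseStart,
    // requestStart is 0 for cross-origin resources without Timing-Allow-Origin.
    resourceRequestStart: res ? res.requestStart || res.startTime : undefined,
    resourceResponseEnd: res ? res.responseEnd : undefined,
    lcpTime: lcpEntry.startTime,
  };
}

// Hypothetical page: LCP at 3.8 s, hero image requested very late.
const example = lcpSubParts({
  ttfb: 600,
  resourceRequestStart: 2400,
  resourceResponseEnd: 3100,
  lcpTime: 3800,
});
console.log(example.dominant, example);
```

In this example the image is not requested until 1.8 s after the first byte arrives, so the table points at the late-discovery fix (preload plus fetchpriority=high) rather than at compression.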

Poor INP → record the interaction, split 3 phases

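Interact while recording; DevTools breaks the slow interaction into input delay / processing / presentation (Lesson 03). Then map the dominant phase to its usual cause: a long input delay means the main thread was busy with other tasks when the input arrived, long processing means slow event handlers, and a long presentation delay means expensive rendering work after the handlers finish.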
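
The three phases fall out of the timestamps on an Event Timing entry. The sketch below is a simplification under stated assumptions: it treats one `PerformanceEventTiming` entry as the whole interaction, whereas a real interaction can span several events (for example pointerdown, pointerup, and click) that need to be grouped. `interactionPhases` and `observeInteractions` are hypothetical helper names.

```javascript
// Split one event-timing entry into the three interaction phases (ms).
function interactionPhases(entry) {
  const phases = {
    inputDelay: entry.processingStart - entry.startTime,
    processing: entry.processingEnd - entry.processingStart,
    presentation: entry.startTime + entry.duration - entry.processingEnd,
  };
  const slowest = Object.entries(phases).sort((a, b) => b[1] - a[1])[0][0];
  return { ...phases, slowest };
}

// Browser-only: watch for slow events and report their phases.
// Defined but not called here, so the file also loads outside a browser.
function observeInteractions(onPhases) {
  const observer = new PerformanceObserver((list) => {
    for (const entry of list.getEntries()) {
      onPhases(entry.name, interactionPhases(entry));
    }
  });
  // durationThreshold filters out events faster than 40 ms.
  observer.observe({ type: 'event', durationThreshold: 40, buffered: true });
  return observer;
}

// Hypothetical click: the handler waits 200 ms before it even starts.
const phases = interactionPhases({
  startTime: 1000,
  processingStart: 1200,
  processingEnd: 1260,
  duration: 304,
});
console.log(phases);
```

Here input delay dominates, which points away from the handler itself and toward whatever long task was occupying the main thread when the user clicked.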

Other tools to name

// minimal field measurement — feed your RUM
new PerformanceObserver((list) => {
  for (const entry of list.getEntries()) sendToRUM(entry);
}).observe({ type: 'largest-contentful-paint', buffered: true });

But that snippet gives you raw entries, not the metric. In production you ship Google's web-vitals library (npm web-vitals, by the Chrome team) — it encodes the rules that match how CrUX measures, the very numbers you're graded on:

import { onLCP, onINP, onCLS } from 'web-vitals';

function sendToRUM({ name, value, rating, delta, id }) {
  // rating = 'good' | 'needs-improvement' | 'poor'  (the p75 thresholds)
  navigator.sendBeacon('/rum', JSON.stringify({ name, value, rating, delta, id }));
}
onLCP(sendToRUM); onINP(sendToRUM); onCLS(sendToRUM);

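// the attribution build tells you WHY, in the field.
// It exports the same functions, so import from it INSTEAD of 'web-vitals'
// (importing onINP from both builds in one module is a duplicate-binding error):
import { onINP } from 'web-vitals/attribution';
onINP(({ name, value, rating, id, attribution }) => {
  // attribution.interactionTarget → the element the user hit (a selector string)
  // inputDelay / processingDuration / presentationDelay → which phase was slow
  const { interactionTarget, inputDelay, processingDuration, presentationDelay } = attribution;
  navigator.sendBeacon('/rum', JSON.stringify({ name, value, rating, id, interactionTarget, inputDelay, processingDuration, presentationDelay }));
});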
Trap — don't hand-roll it. Rolling your own PerformanceObserver RUM almost always mis-computes CLS session-windowing and INP, so your field numbers silently disagree with Google's CrUX. Ship web-vitals and your numbers match the ones you're ranked on — and the attribution build closes the “RUM to find it, lab to explain it” loop by pointing at the slow element/phase from the field itself.
One-liner

“I don't hand-roll PerformanceObserver — CLS windowing and INP are too easy to get wrong. I ship Google's web-vitals so my field numbers match CrUX, and attribution tells me which element to fix.”
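
One practical detail when shipping this: CLS and INP can report more than once per page, so sending a beacon on every callback creates duplicate rows. A common pattern is to queue reports and flush once when the page is hidden. The sketch below is one way to do that, not the library's own API: `createMetricQueue` is a hypothetical helper, the `/rum` endpoint is the same placeholder used above, and the wiring at the bottom is shown as comments because it needs a browser and the web-vitals import.

```javascript
// Queue metric reports and send them in one batch.
// `send` is injected so the queue can be exercised without a browser.
function createMetricQueue(send) {
  const pending = new Map(); // metric id -> latest report for that metric
  return {
    add(metric) {
      const { name, value, rating, id } = metric;
      // A later report for the same id replaces the earlier one.
      pending.set(id, { name, value, rating, id });
    },
    flush() {
      if (pending.size === 0) return false;
      send(JSON.stringify([...pending.values()]));
      pending.clear();
      return true;
    },
  };
}

// Browser wiring (assumes the web-vitals imports shown earlier):
//
//   const queue = createMetricQueue((body) => navigator.sendBeacon('/rum', body));
//   onLCP(queue.add); onINP(queue.add); onCLS(queue.add);
//   document.addEventListener('visibilitychange', () => {
//     if (document.visibilityState === 'hidden') queue.flush();
//   });
```

Flushing on `visibilitychange` rather than on unload matters on mobile, where pages are often backgrounded and killed without an unload event ever firing.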

Where the data lands — RUM vendors to name. The smart move isn't reciting brands, it's showing you know the categories and pick by constraint (cost, scale, privacy, build-vs-buy):

Free Google field data

the baseline everyone should know

CrUX — the dataset Search ranks on. Read it via PageSpeed Insights, the CrUX API, BigQuery, or the CrUX Dashboard (Looker Studio). Caveat: aggregated, 28-day rolling, origin/page-level — not per-session, so you can't debug one user. Free, zero-instrumentation.

Dedicated perf-RUM

when performance is the product

SpeedCurve (LUX), Akamai mPulse, DebugBear, Calibre, Raygun. Web-Vitals-first, p75 by route/segment, RUM + synthetic in one. Best signal-to-noise for a perf team.

APM / observability suites

when you already have one

Datadog RUM, New Relic Browser, Dynatrace, Sentry (Web Vitals + error/trace), Grafana Faro (open-source). RUM lives next to backend traces & errors — one pane, correlate front to back.

Roll-your-own pipeline

at scale, full control / cost

web-vitals → sendBeacon → your data lake (or pipe to GA4, Cloudflare Web Analytics, Elastic APM). What a big org like the platform often does: own the pipeline, no per-event vendor cost, slice by any dimension.

One-liner

“CrUX is free but aggregated and lagging — fine for ranking, useless for debugging one session. For that I want session-level RUM: a perf-native vendor like SpeedCurve, or our existing APM (Datadog/Sentry), or at our scale a web-vitals→beacon pipeline we own.”
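
For the free baseline, reading CrUX programmatically is a single POST. The sketch below reflects the CrUX API's `queryRecord` request and response shape as I understand it, so check the field names against the current API reference before relying on it. `buildCruxQuery`, `readP75`, and `queryCrux` are hypothetical helper names, and the API key is yours to supply.

```javascript
const CRUX_ENDPOINT =
  'https://chromeuserexperience.googleapis.com/v1/records:queryRecord';

// Request body for an origin-level query on one form factor.
function buildCruxQuery(origin, formFactor = 'PHONE') {
  return {
    origin,
    formFactor,
    metrics: [
      'largest_contentful_paint',
      'interaction_to_next_paint',
      'cumulative_layout_shift',
    ],
  };
}

// Pull the p75 value for each metric out of a queryRecord response.
// CLS arrives as a string, so every value is normalized to a number.
function readP75(response) {
  const result = {};
  const metrics = response.record?.metrics ?? {};
  for (const [name, metric] of Object.entries(metrics)) {
    if (metric.percentiles) result[name] = Number(metric.percentiles.p75);
  }
  return result;
}

// Network call; defined but not invoked here because it needs an API key.
async function queryCrux(origin, apiKey) {
  const res = await fetch(`${CRUX_ENDPOINT}?key=${encodeURIComponent(apiKey)}`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(buildCruxQuery(origin)),
  });
  if (!res.ok) throw new Error(`CrUX API returned ${res.status}`);
  return readP75(await res.json());
}
```

The result is still the aggregated 28-day view described above: enough to see whether an origin passes at p75, not enough to debug a session.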

5. The Lead move: systemic, not heroic

The toolkit grades root cause + systemic fix. So the answer isn't “I opened DevTools once” — it's a process and a guardrail:

Full loop

Concept: diagnose with RUM→lab→localize→fix→verify. Trade-off: deep lab profiling is time-expensive, so I let RUM prioritize what's worth profiling rather than chase every Lighthouse nit. Anchor: “A listing page regressed LCP; RUM pointed at p75 mobile in SEA, the flame chart showed a render-blocking experiment script — we deferred it and added a budget so it couldn't recur.” Impact: root-causing a class prevents a fleet of future regressions. Invite: “If we lacked RUM I'd start by instrumenting it — guessing from my laptop is how teams waste a sprint.”
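
The "added a budget so it couldn't recur" part of the anchor can be as small as a threshold check in CI or in a RUM alert. The sketch below is one minimal form of that guardrail, assuming p75 values are already available per metric; `checkBudgets` is a hypothetical helper, and the limits are the published "good" thresholds for Core Web Vitals (LCP 2500 ms, INP 200 ms, CLS 0.1).

```javascript
// "Good" Core Web Vitals thresholds, evaluated at p75.
const BUDGETS = { LCP: 2500, INP: 200, CLS: 0.1 };

// Return every metric whose p75 exceeds its budget.
// Metrics missing from the input are skipped rather than failed.
function checkBudgets(p75ByMetric, budgets = BUDGETS) {
  return Object.entries(budgets)
    .filter(([name, limit]) => name in p75ByMetric && p75ByMetric[name] > limit)
    .map(([name, limit]) => ({ name, limit, actual: p75ByMetric[name] }));
}

// Hypothetical p75 values for the regressed listing page.
const violations = checkBudgets({ LCP: 3800, INP: 180, CLS: 0.05 });
for (const v of violations) {
  console.log(`${v.name} over budget: ${v.actual} > ${v.limit}`);
}
// In CI, a non-empty `violations` list would fail the build.
```

Teams often set tighter per-page budgets than the global thresholds; passing a custom `budgets` object covers that case.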

6. Check yourself: scenario quiz

Push-back style, like the round.

1. A PM says “the site feels slow.” What's your first move?

2. In the Performance flame chart, what does a red triangle on a task mean?

3. LCP is 3.8s. The LCP breakdown shows resource load delay dominates — the hero image starts downloading very late. Best fix?

They want you to read the sub-part and map it to a lever.

4. Your Lighthouse run on a fast machine is green, but you suspect real users are slower. What's the gap, and the fix?

5. INP is poor. Recording an interaction shows the handler waits ~200ms to even start. Which phase and where do you look?

6. Best Lead framing of “how do you diagnose a slow page?”

7. You're asked to instrument field data for Core Web Vitals on the listing page. What do you do?

Scenario: a junior offers to write a quick PerformanceObserver for each metric.

8. “We already pull CrUX from PageSpeed Insights — why pay for a RUM vendor?” What's your answer?

Scenario: a PM is pushing back on a SpeedCurve/Datadog spend.


Try this aloud before next session: “A teammate says our hotel listing page is slow. Walk me through exactly how you'd find the root cause, first move to last, and what you'd put in place so it doesn't regress again.” Time yourself and aim for 90 seconds.