The methodology — not a tool tour. How a Lead goes from “the page feels slow” to a named root cause: RUM to find it, lab to explain it, the DevTools flame chart, and a per-symptom playbook.
The single thing that separates a Lead answer from a junior one: you never optimize blind. You find where the time actually goes, fix that, and verify it moved. The loop:
“RUM to find it, lab to explain it. Field data tells me which metric and which users are slow; the DevTools flame chart tells me why. I fix the root cause and verify it moved in the field — never optimize on my own fast laptop.”
Field (RUM): real users via PerformanceObserver / a RUM vendor / CrUX. Real devices, networks, geographies.
Answers whether you have a problem and for whom. Can't step into a single session.
Lab: Lighthouse + DevTools Performance panel, one controlled run. Flame charts, waterfalls, automated audits. Answers why. But one synthetic device ≠ your real long tail, and a lab run can't see real INP because no real user is interacting.
Use them as a relay: RUM hands the lab a target. Optimizing a lab number that no real user hits is the classic waste. [MDN]
Record a load (or an interaction) and you get a flame chart of the main thread. Three things to know:
Modern DevTools also surfaces Insights in the Performance panel sidebar that auto-flag LCP sub-parts, render-blocking requests, third parties, and duplicated JS. Name it; it's the current workflow. [Chrome]
web.dev splits LCP into four phases — find which dominates, and the fix is obvious:
| LCP sub-part | What it is | If it dominates → fix |
|---|---|---|
| TTFB | server + network to first byte | CDN, edge cache, faster backend (Lesson 03) |
| Resource load delay | gap before the LCP image starts downloading | it was discovered late → preload + fetchpriority=high (Lesson 01) |
| Resource load time | how long the LCP image takes to download | compress, AVIF/WebP, responsive srcset |
| Element render delay | downloaded but not yet painted | render-blocking CSS/JS → inline critical CSS, defer (Lesson 02) |
Source: web.dev — Optimize LCP
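The breakdown in the table above can be sketched as a small pure function. This is a minimal sketch, not web.dev's or DevTools' own code: the function name `lcpSubParts` and its input field names are hypothetical, and it assumes you already have the navigation entry's `responseStart` (TTFB), the LCP resource's `requestStart` and `responseEnd`, and the LCP entry's time, all in milliseconds since navigation start.

```javascript
// Split an LCP time into the four sub-parts from the table.
// Inputs are ms since navigation start; field names are illustrative.
function lcpSubParts({ ttfb, requestStart, responseEnd, lcpTime }) {
  // Clamp each boundary so it is never earlier than the previous one.
  // A text LCP has no resource, so both resource parts collapse to 0.
  const reqStart = Math.max(ttfb, requestStart ?? 0);
  const resEnd = Math.max(reqStart, responseEnd ?? 0);
  const rendered = Math.max(resEnd, lcpTime);

  const parts = {
    ttfb,
    resourceLoadDelay: reqStart - ttfb,
    resourceLoadTime: resEnd - reqStart,
    elementRenderDelay: rendered - resEnd,
  };

  // The largest sub-part is the one worth fixing first.
  const dominant = Object.entries(parts).sort((a, b) => b[1] - a[1])[0][0];
  return { parts, dominant };
}
```

Because the four parts always sum to the LCP time, whichever one dominates points straight at a row of the table.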
Interact while recording; DevTools breaks the slow interaction into input delay / processing / presentation (Lesson 03). Then:
content-visibility.
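The three-phase split of an interaction can be derived from the fields of a single Event Timing entry. The sketch below is illustrative only: `interactionPhases` is a hypothetical helper, and it simplifies by looking at one entry, whereas a real interaction can span several events that share an `interactionId`.

```javascript
// Break one Event Timing entry into the three INP phases.
// `entry` mirrors PerformanceEventTiming: startTime, processingStart,
// processingEnd and duration, all in milliseconds.
function interactionPhases(entry) {
  const phases = {
    // Time the event waited before its handlers could start
    inputDelay: entry.processingStart - entry.startTime,
    // Time spent running event handlers
    processing: entry.processingEnd - entry.processingStart,
    // Time from handlers finishing to the next paint
    presentation: entry.startTime + entry.duration - entry.processingEnd,
  };
  const slowest = Object.keys(phases).reduce((a, b) =>
    phases[b] > phases[a] ? b : a
  );
  return { ...phases, slowest };
}
```

A large `inputDelay` means the main thread was busy with something else when the user acted, so the place to look is the long task that ran just before the handler, not the handler itself.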
PerformanceObserver is the API your own RUM uses to capture LCP/INP/CLS/long tasks from real users.
// minimal field measurement: feed your RUM
new PerformanceObserver((list) => {
  for (const entry of list.getEntries()) sendToRUM(entry);
}).observe({ type: 'largest-contentful-paint', buffered: true });
But that snippet gives you raw entries, not the metric. In production you ship Google's
web-vitals library (npm web-vitals, by the Chrome team) — it encodes the rules
that match how CrUX measures, the very numbers you're graded on:
visibilitychange/pagehide.
import { onLCP, onINP, onCLS } from 'web-vitals';
function sendToRUM({ name, value, rating, delta, id }) {
  // rating = 'good' | 'needs-improvement' | 'poor' (the Core Web Vitals thresholds, applied per sample)
  navigator.sendBeacon('/rum', JSON.stringify({ name, value, rating, delta, id }));
}
onLCP(sendToRUM); onINP(sendToRUM); onCLS(sendToRUM);
// the attribution build tells you WHY, in the field.
// Aliased so it does not clash with the onINP import above when both snippets share a file:
import { onINP as onINPWithAttribution } from 'web-vitals/attribution';
onINPWithAttribution(({ value, attribution }) => {
  // attribution.interactionTarget → selector for the element the user hit
  // attribution.inputDelay / processingDuration / presentationDelay → which phase was slow
});
Hand-rolled PerformanceObserver RUM almost always mis-computes CLS session windowing and INP, so your field numbers silently disagree with Google's CrUX.
Ship web-vitals and your numbers match the ones you're ranked on. The attribution build then closes the “RUM to find it, lab to explain it” loop by pointing at the slow element and phase from the field itself.
“I don't hand-roll PerformanceObserver — CLS windowing and INP are too easy to get wrong. I ship
Google's web-vitals so my field numbers match CrUX, and attribution tells me which element to fix.”
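To make concrete why CLS windowing is easy to get wrong, here is a minimal sketch of the session-window rule: shifts less than 1 second apart belong to the same window, a window is capped at 5 seconds, shifts that follow recent user input are ignored, and CLS is the largest window total. The function name `clsFromShifts` is hypothetical and the input is assumed to be `layout-shift` entries already sorted by `startTime`. It is an illustration of the rule, not a replacement for web-vitals, which also handles page lifecycle and bfcache cases this sketch ignores.

```javascript
// Compute CLS from layout-shift entries using session windows.
// Each entry: { startTime (ms), value, hadRecentInput }.
function clsFromShifts(entries) {
  let maxWindow = 0;     // largest window total seen so far
  let windowTotal = 0;   // running total of the current window
  let windowStart = 0;   // startTime of the first shift in the window
  let lastShift = 0;     // startTime of the previous counted shift
  let open = false;      // whether a window is in progress

  for (const e of entries) {
    if (e.hadRecentInput) continue; // expected shifts do not count

    const sameWindow =
      open &&
      e.startTime - lastShift < 1000 &&   // gap under 1 s
      e.startTime - windowStart < 5000;   // window under 5 s

    if (sameWindow) {
      windowTotal += e.value;
    } else {
      windowTotal = e.value;
      windowStart = e.startTime;
      open = true;
    }
    lastShift = e.startTime;
    maxWindow = Math.max(maxWindow, windowTotal);
  }
  return maxWindow;
}
```

A naive implementation that simply sums every `layout-shift` value over the page's lifetime will report a higher number than this on long-lived pages, which is one way field numbers drift away from CrUX.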
Where the data lands — RUM vendors to name. The smart move isn't reciting brands, it's showing you know the categories and pick by constraint (cost, scale, privacy, build-vs-buy):
the baseline everyone should know
CrUX — the dataset Search ranks on. Read it via PageSpeed Insights, the CrUX API, BigQuery, or the CrUX Dashboard (Looker Studio). Caveat: aggregated, 28-day rolling, origin/page-level — not per-session, so you can't debug one user. Free, zero-instrumentation.
when performance is the product
SpeedCurve (LUX), Akamai mPulse, DebugBear, Calibre, Raygun. Web-Vitals-first, p75 by route/segment, RUM + synthetic in one. Best signal-to-noise for a perf team.
when you already have one
Datadog RUM, New Relic Browser, Dynatrace, Sentry (Web Vitals + error/trace), Grafana Faro (open-source). RUM lives next to backend traces & errors — one pane, correlate front to back.
at scale, full control / cost
web-vitals → sendBeacon → your data lake (or pipe to GA4,
Cloudflare Web Analytics, Elastic APM). What a big org like the platform often does: own the
pipeline, no per-event vendor cost, slice by any dimension.
“CrUX is free but aggregated and lagging — fine for ranking, useless for debugging one session. For that I want
session-level RUM: a perf-native vendor like SpeedCurve, or our existing APM (Datadog/Sentry), or at our scale a
web-vitals→beacon pipeline we own.”
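If you own the pipeline, the aggregation step that turns raw beacons into "p75 by segment" might look like the sketch below. Everything about the shape is an assumption for illustration: the helper names, the beacon fields beyond `name` and `value`, and the nearest-rank percentile method (a real data lake would do this in SQL, and CrUX's exact percentile computation may differ). The thresholds are the published Core Web Vitals ones.

```javascript
// Published Core Web Vitals thresholds (LCP and INP in ms, CLS unitless).
const THRESHOLDS = {
  LCP: { good: 2500, poor: 4000 },
  INP: { good: 200, poor: 500 },
  CLS: { good: 0.1, poor: 0.25 },
};

// Nearest-rank 75th percentile; assumes a non-empty array.
function p75(values) {
  const sorted = [...values].sort((a, b) => a - b);
  return sorted[Math.ceil(0.75 * sorted.length) - 1];
}

function rate(metric, value) {
  const t = THRESHOLDS[metric];
  if (value <= t.good) return 'good';
  return value <= t.poor ? 'needs-improvement' : 'poor';
}

// Group beacons for one metric by a segment field (e.g. 'device')
// and report p75, rating and sample count per segment.
function p75BySegment(beacons, metric, segmentKey) {
  const groups = new Map();
  for (const b of beacons) {
    if (b.name !== metric) continue;
    const segment = b[segmentKey];
    if (!groups.has(segment)) groups.set(segment, []);
    groups.get(segment).push(b.value);
  }
  const result = {};
  for (const [segment, values] of groups) {
    const value = p75(values);
    result[segment] = { p75: value, rating: rate(metric, value), samples: values.length };
  }
  return result;
}
```

Keeping the sample count next to each p75 matters in practice: a segment with a handful of beacons can look alarming without being worth a profiling session.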
The toolkit grades root cause + systemic fix. So the answer isn't “I opened DevTools once” — it's a process and a guardrail:
Concept: diagnose with RUM→lab→localize→fix→verify. Trade-off: deep lab profiling is time-expensive, so I let RUM prioritize what's worth profiling rather than chase every Lighthouse nit. Anchor: “A listing page regressed LCP; RUM pointed at p75 mobile in SEA, the flame chart showed a render-blocking experiment script — we deferred it and added a budget so it couldn't recur.” Impact: root-causing a class prevents a fleet of future regressions. Invite: “If we lacked RUM I'd start by instrumenting it — guessing from my laptop is how teams waste a sprint.”
1. A PM says “the site feels slow.” What's your first move?
2. In the Performance flame chart, what does a red triangle on a task mean?
3. LCP is 3.8s. The LCP breakdown shows resource load delay dominates — the hero image starts downloading very late. Best fix?
They want you to read the sub-part and map it to a lever.
4. Your Lighthouse run on a fast machine is green, but you suspect real users are slower. What's the gap, and the fix?
5. INP is poor. Recording an interaction shows the handler waits ~200ms to even start. Which phase and where do you look?
6. Best Lead framing of “how do you diagnose a slow page?”
7. You're asked to instrument field data for Core Web Vitals on the listing page. What do you do?
Scenario: a junior offers to write a quick PerformanceObserver for each metric.
8. “We already pull CrUX from PageSpeed Insights — why pay for a RUM vendor?” What's your answer?
Scenario: a PM is pushing back on a SpeedCurve/Datadog spend.