How HTTP Caching Works Across the Browser, CDN, and Origin
HTTP caching is the set of rules that decides whether a request for a resource is answered from a stored copy or sent to the origin server, and which stored copy is allowed to answer. A modern web request can pass through up to four caches before it reaches origin: the browser's private cache, an optional service worker, a CDN edge node, and any reverse proxy in front of the origin. Each layer reads the same Cache-Control header but applies different rules. Get those rules right and a page that hit origin in 350ms returns from edge in 12ms; get them wrong and a logged-in user sees someone else's dashboard.
Key takeaways
- Four caches sit between user and origin: browser, service worker, CDN edge, reverse proxy. Each can hold a different copy with a different freshness deadline.
- Cache-Control is the protocol's language for cache rules: max-age for clients, s-maxage for shared caches, immutable for hashed assets, no-store for nothing-ever-cacheable.
- ETag + If-None-Match lets a cache revalidate cheaply: the response body never leaves origin again when nothing changed; the client just gets a 304 Not Modified.
- stale-while-revalidate hides revalidation latency by serving the stale copy immediately and refreshing in the background. Supported in all modern browsers since 2020 and across Cloudflare, Fastly, KeyCDN, and Azure CDN with various levels of async behaviour.
- The most common cache bug is a missing Vary (or an s-maxage on a personalised response): a logged-in page gets cached on a shared cache and served to the next visitor.
The four cache layers, from client to origin
A request from a browser passes through caches in a fixed order. Each layer that has a fresh copy can answer; the request only travels further if every cache above it is stale or missing the resource.
User ── Browser cache ── Service worker ── CDN edge ── Reverse proxy ── Origin
(private) (private, scripted) (shared) (shared)
Browser cache is the disk-backed store inside Chrome, Firefox, Safari, or Edge. It is private — it can hold responses tied to a single user including authenticated content. Cache entries live until their freshness expires or the user clears them. Chrome currently caps the HTTP cache around 80MB to 320MB per profile depending on disk space.
Service worker is a scriptable proxy that runs in the browser, between page JavaScript and the network. It can cache anything the page touches, write to the Cache Storage API directly, and serve responses offline. Unlike the HTTP cache, it does what its code says: there is no automatic freshness logic unless you write one. Useful for offline-first apps and PWAs, easy to misuse for everything else. A common pattern is stale-while-revalidate at the application level: respond from Cache Storage immediately, kick off a fetch() to update the cache, and re-render when it resolves. That's the same semantic as the HTTP directive, but expressed in JavaScript so the page can decide when to invalidate the cached copy.
CDN edge is a shared cache run by Cloudflare, Fastly, CloudFront, Akamai, Bunny, or similar. One response is reused for many users from a point of presence near the user. Because it is shared, the response must not contain user-specific data unless the cache is keyed by something user-specific.
Reverse proxy is the Varnish, Nginx, or HAProxy instance an origin team often puts in front of their own application servers. Same logical role as the CDN — shared cache, serves the response to many users — but inside the origin's own infrastructure.
Origin is the application server that actually runs the code that builds the response.
The interesting case is what Cache-Control lets you say to each of those layers independently.
Cache-Control — the directives that matter
Cache-Control is the single HTTP response header that carries caching policy in 2026. The old pair Expires + Pragma: no-cache still works on legacy clients, but every modern cache gives Cache-Control precedence. RFC 9111 is the current spec. Published in 2022, it obsoleted RFC 7234, which had itself replaced the caching sections of RFC 2616, and it is protocol-version agnostic: the same directives behave the same way on HTTP/1.1, HTTP/2, and HTTP/3. HTTP/3 changes the transport (QUIC over UDP, QPACK header compression) but not the caching semantics.
The directives are read by every cache layer that understands them. Layers that don't recognise an extension directive silently ignore it.
| Directive | Who it targets | What it does |
|---|---|---|
| max-age=N | Every cache (private + shared) | Fresh for N seconds from the time the response was generated. |
| s-maxage=N | Shared caches only (CDN, proxy) | Overrides max-age for shared caches. The browser still uses max-age. |
| public | Every cache | Explicitly cacheable by shared caches, even when the request was authenticated. |
| private | Browser only | Shared caches must not store it. Use for any response keyed to a user. |
| no-cache | Every cache | Response can be stored, but must be revalidated with origin before serving. |
| no-store | Every cache | Never store. No copy on disk, no copy in memory. |
| must-revalidate | Every cache | Once stale, must revalidate; cannot serve stale on error or for any other reason. |
| immutable | Browser | "This URL's content will never change." Skips revalidation on reloads. |
| stale-while-revalidate=N | Every cache | After max-age expires, serve stale for N more seconds while revalidating in the background. |
| stale-if-error=N | Mostly CDNs | If origin returns an error, serve stale for up to N more seconds. |
A working response usually combines a few of them:
Cache-Control: public, max-age=300, s-maxage=86400, stale-while-revalidate=60
That tells browsers to cache for 5 minutes, tells CDNs to cache for a day, and tells both layers they can serve a stale response for an extra minute while a fresh one is fetched in the background.
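How a browser and a CDN read different lifetimes out of that one header can be sketched in a few lines of Python. This is an illustrative model, not any cache's real implementation: the function names `parse_cache_control` and `freshness_lifetime` are made up for this example, and the comma split is naive (it would mishandle quoted arguments that contain commas).

```python
from typing import Optional


def parse_cache_control(value: str) -> dict[str, Optional[str]]:
    """Split a Cache-Control value into {directive: argument or None}."""
    directives: dict[str, Optional[str]] = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else None
    return directives


def freshness_lifetime(value: str, shared: bool) -> int:
    """Seconds the response stays fresh for a shared cache or a private one."""
    directives = parse_cache_control(value)
    # s-maxage only applies to shared caches, where it overrides max-age.
    if shared and directives.get("s-maxage") is not None:
        return int(directives["s-maxage"])
    if directives.get("max-age") is not None:
        return int(directives["max-age"])
    return 0  # no explicit lifetime; heuristic freshness is out of scope here


header = "public, max-age=300, s-maxage=86400, stale-while-revalidate=60"
print(freshness_lifetime(header, shared=False))  # browser: 300
print(freshness_lifetime(header, shared=True))   # CDN: 86400
```

The same parsed dictionary also exposes stale-while-revalidate, which both layers would read identically.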
no-cache vs no-store — the confusing pair
These two trip up most developers. They sound similar; they do almost opposite things.
- no-store means do not write this response to any cache at all. Every request goes back to origin.
- no-cache means you can store it, but you must check with origin before serving it. The response is cached, then revalidated with a conditional request (If-None-Match / If-Modified-Since) on every use.
no-cache is therefore close to "always revalidate" — and if origin returns 304 Not Modified, the cached copy is reused without re-downloading the body. no-store is the right answer for genuinely sensitive responses where even the disk copy is a risk; no-cache is the right answer for documents that change often but for which conditional requests are cheap.
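The difference is easiest to see as a decision a cache makes when a response arrives. The sketch below is a simplified model with a hypothetical function name (`storage_policy`); real caches weigh more directives than these two.

```python
def storage_policy(cache_control: str) -> str:
    """Classify what a cache may do with a response, looking only at
    no-store and no-cache (a deliberate simplification)."""
    directives = {
        part.strip().split("=")[0].lower()
        for part in cache_control.split(",")
        if part.strip()
    }
    if "no-store" in directives:
        # Nothing is written anywhere; every request goes to origin.
        return "do-not-store"
    if "no-cache" in directives:
        # A copy is kept, but each use needs a conditional request first.
        return "store-but-revalidate-every-use"
    return "store-and-serve-while-fresh"


print(storage_policy("no-store"))
print(storage_policy("no-cache"))
print(storage_policy("public, max-age=300"))
```

Note that no-store wins when both directives are present, which matches the intent of keeping sensitive data out of every cache.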
immutable — the reload optimisation
Cache-Control: immutable was first shipped in Firefox 49 and is now broadly supported across Firefox, Edge, Safari, and Chrome (Chrome already optimised reloads to skip revalidation on subresources, then added explicit immutable parsing on top). It tells the browser that the resource at this URL will never change, so even when the user hits reload, the browser should skip the conditional revalidation step.
Use it only on URLs that are guaranteed not to change — typically hash-versioned assets like app.a3f9c7d2.js. The canonical pairing:
Cache-Control: public, max-age=31536000, immutable
A year of max-age, plus a promise that the URL is content-addressed, plus zero conditional requests when the user reloads the page. For sites that ship a new bundle every day, this is the single most impactful caching header to add.
ETag and Last-Modified — revalidation without re-download
max-age answers the question "is this fresh?". When the answer is no — when a cached response has expired but is still in storage — the cache can either re-fetch the whole body from origin or revalidate and ask origin "did this actually change?". Revalidation costs a round trip but no bandwidth if nothing changed.
Two response headers enable it: ETag and Last-Modified. The cache sends them back to origin in the corresponding request headers If-None-Match and If-Modified-Since. Origin compares them to the current version of the resource and either returns the full new response (200 OK) or a tiny 304 Not Modified with no body.
The ETag flow
# First request
GET /api/users/42 HTTP/2
Host: example.com
# First response
HTTP/2 200 OK
Cache-Control: max-age=60
ETag: "a3f9c7d2-1729"
Content-Type: application/json
Content-Length: 1843
{...response body...}
Sixty seconds later, the cache is stale. The browser asks origin to confirm:
GET /api/users/42 HTTP/2
Host: example.com
If-None-Match: "a3f9c7d2-1729"
If the resource is unchanged, origin returns nothing but the headers:
HTTP/2 304 Not Modified
ETag: "a3f9c7d2-1729"
Cache-Control: max-age=60
No body. The browser reuses the cached copy and resets its freshness window. Total bytes over the wire: maybe 400 instead of 1843, and the round-trip is the only cost.
ETag is strong by default — it changes whenever the bytes change. The weak form W/"abc123" allows for semantically equivalent responses (e.g., different whitespace) to keep the same ETag, useful when origin can't byte-stable serialise its output but knows when the meaning changed.
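On the origin side, the flow above comes down to hashing the body and comparing it with whatever the client sent. A minimal sketch, assuming the ETag is derived from a SHA-256 of the response bytes; the function names `make_etag` and `respond` are hypothetical and no web framework is involved.

```python
import hashlib
from typing import Optional


def make_etag(body: bytes) -> str:
    """Strong ETag derived from the response bytes (quoted, as the header requires)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, if_none_match: Optional[str] = None) -> tuple[int, dict[str, str], bytes]:
    """Return (status, headers, payload) for a GET, honouring If-None-Match."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "max-age=60"}
    if if_none_match is not None:
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        # If-None-Match uses weak comparison, so a W/ prefix is ignored.
        matches = any(tag.removeprefix("W/") == etag for tag in candidates)
        if "*" in candidates or matches:
            return 304, headers, b""  # headers only, no body
    return 200, headers, body


body = b'{"id": 42, "name": "Ada"}'
status, headers, payload = respond(body)
print(status, len(payload))

status, _, payload = respond(body, if_none_match=headers["ETag"])
print(status, len(payload))
```

Because the tag comes from the bytes themselves, any change to the body produces a new ETag and the next conditional request gets a full 200.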
Last-Modified — the older sibling
Last-Modified works the same way but uses a timestamp instead of an opaque token. Origin sends Last-Modified: Fri, 22 May 2026 09:00:00 GMT; the cache echoes it back as If-Modified-Since. Origin compares timestamps.
ETag is the better default because file-system timestamps are unreliable (replicas drift, container builds reset mtimes) and second-precision misses updates that happen within the same second. Use Last-Modified only when the resource is genuinely backed by a file with a trustworthy modification time and the legacy compatibility matters.
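The second-precision problem is easy to reproduce. The sketch below uses only the Python standard library; the helper names `last_modified_header` and `is_not_modified` are hypothetical, and timestamps are assumed to be timezone-aware.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def last_modified_header(mtime: datetime) -> str:
    """Format a timestamp as an HTTP date, which has no sub-second field."""
    return format_datetime(mtime.astimezone(timezone.utc), usegmt=True)


def is_not_modified(mtime: datetime, if_modified_since: str) -> bool:
    """True when origin may answer 304 for this If-Modified-Since value."""
    since = parsedate_to_datetime(if_modified_since)
    # Compare at one-second granularity, because that is all the header carries.
    return mtime.replace(microsecond=0) <= since


first_write = datetime(2026, 5, 22, 9, 0, 0, 100_000, tzinfo=timezone.utc)
second_write = datetime(2026, 5, 22, 9, 0, 0, 900_000, tzinfo=timezone.utc)

header = last_modified_header(first_write)
print(header)
# The second write landed in the same second, so the check wrongly says "unchanged".
print(is_not_modified(second_write, header))
```

An ETag derived from the content would differ between the two writes, which is why it is the safer validator here.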
stale-while-revalidate and stale-if-error
RFC 5861 defines two extension directives that change what a cache can do after freshness expires. Both are now widely deployed.
stale-while-revalidate
Cache-Control: max-age=600, stale-while-revalidate=86400
The response is fresh for 10 minutes. Between 10 minutes and 24 hours, the cache may serve the stale copy immediately and revalidate in the background. The user never waits on the origin round-trip; the next user gets the refreshed copy.
Browser support landed in Chrome 75 and Firefox 68 (mid-2019) and is universal across modern browsers as of 2020 — see the MDN compatibility entry. On the CDN side, Cloudflare announced in February 2026 that its stale-while-revalidate implementation is now fully asynchronous — previously the first request for a stale asset waited on origin while later requests got the cached copy. Fastly has supported the directive natively for years; KeyCDN and Azure CDN support it with caveats around their revalidation behaviour.
stale-if-error
Same shape, different trigger:
Cache-Control: max-age=300, stale-if-error=86400
If origin returns a 5xx error or times out, the cache may serve the stale copy for up to a day. Browsers largely don't implement this — it's primarily a CDN directive. Fastly documents it explicitly with a default 12-hour fallback window; Cloudflare supports it on the free tier when Origin Cache Control is enabled.
The two directives compose well: stale-while-revalidate hides revalidation latency on the happy path, stale-if-error keeps the lights on when origin is down. A typical hardened API response:
Cache-Control: public, max-age=60, stale-while-revalidate=600, stale-if-error=86400
Fresh for a minute, gracefully revalidates for 10 minutes after that, survives up to a day of origin outage with the last good copy.
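The three windows in that header can be modelled as one small decision function. This is a sketch of the logic, not of any particular CDN: `serve_decision` and its return labels are invented for illustration, and real caches differ in how they detect an origin failure.

```python
def serve_decision(age: int, max_age: int, swr: int = 0, sie: int = 0,
                   origin_ok: bool = True) -> str:
    """Decide how a cache answers a request for an entry that is `age` seconds old."""
    if age < max_age:
        return "fresh"
    staleness = age - max_age
    if staleness < swr:
        # Serve the stale copy now and refresh it in the background.
        return "stale-while-revalidate"
    if origin_ok:
        return "fetch-from-origin"
    if staleness < sie:
        # Origin failed, but the stale copy is still inside the error window.
        return "stale-if-error"
    return "error"


# Cache-Control: public, max-age=60, stale-while-revalidate=600, stale-if-error=86400
policy = dict(max_age=60, swr=600, sie=86400)
print(serve_decision(30, **policy))                      # fresh
print(serve_decision(300, **policy))                     # stale-while-revalidate
print(serve_decision(3600, **policy))                    # fetch-from-origin
print(serve_decision(3600, origin_ok=False, **policy))   # stale-if-error
print(serve_decision(90000, origin_ok=False, **policy))  # error
```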
The Vary header and the cache-key trap
A cache stores a response under a cache key — by default, the request method and URL. Two requests with the same URL get the same cached response. That breaks the moment your response varies by something else: the user's language, their auth state, whether they accept WebP, whether they're on mobile.
Vary tells caches "key this response by these additional request headers too":
Cache-Control: public, max-age=300
Vary: Accept-Language, Accept-Encoding
Now the cache stores one entry per (URL, Accept-Language, Accept-Encoding) triple. An English speaker and a French speaker hitting the same URL get different cached copies.
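A rough model of how Vary extends the cache key, assuming a cache that keys on the raw header values (real caches often normalise values such as Accept-Encoding first). The function name `cache_key` is hypothetical.

```python
def cache_key(method: str, url: str, request_headers: dict[str, str], vary: str) -> tuple:
    """Build a cache key from the method, the URL, and every header named in Vary."""
    names = sorted(name.strip().lower() for name in vary.split(",") if name.strip())
    lowered = {key.lower(): value for key, value in request_headers.items()}
    # A header that is absent from the request is keyed as an empty string.
    variant = tuple((name, lowered.get(name, "")) for name in names)
    return (method.upper(), url, variant)


vary = "Accept-Language, Accept-Encoding"
english = cache_key("GET", "/pricing", {"Accept-Language": "en", "Accept-Encoding": "gzip"}, vary)
french = cache_key("GET", "/pricing", {"Accept-Language": "fr", "Accept-Encoding": "gzip"}, vary)

print(english == french)  # different entries for different languages
print(english == cache_key("get", "/pricing",
                           {"accept-language": "en", "accept-encoding": "gzip"}, vary))
```

With an empty Vary value the key collapses back to method plus URL, which is the default behaviour described above.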
The trap is forgetting Vary when you should have it:
- A page that renders differently for logged-in users but doesn't include Vary: Cookie or Vary: Authorization, and is public, will get one user's logged-in view cached at the CDN and served to everyone else who hits that URL. This is one of the highest-severity caching bugs in the wild.
- A response compressed only when the client supports it but missing Vary: Accept-Encoding will serve gzip to a client that asked for identity.
The defensive default: any response that depends on a request header must list that header in Vary. For personalised responses, the safer move is Cache-Control: private (or no-store) — never cache them on a shared cache at all.
CDN behaviour here varies. Cloudflare ignores Vary for most fields by default and offers a "Cache by Device Type" rule for the common case. Fastly supports Vary natively. CloudFront keys by whatever the cache policy specifies. Read your provider's docs before relying on Vary alone.
Cache busting — when freshness windows aren't enough
If app.js is cached for a year and you ship a new build, the new version won't reach users for a year. The fix is to change the URL every time the content changes. Three patterns are common.
Content-hashed filenames are the modern default. The bundler emits app.a3f9c7d2.js; when content changes, the hash changes, and the URL is new:
<script src="/static/app.a3f9c7d2.js"></script>
Pair these with Cache-Control: public, max-age=31536000, immutable. Users get the old file from disk forever — until the HTML references the new hash, at which point they download once and then cache that for another year. Vite, Webpack, esbuild, Turbopack, and Parcel all emit hashed filenames by default in production builds.
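The idea behind a content-hashed filename fits in a few lines. This sketch assumes SHA-256 truncated to eight hex characters; actual bundlers choose their own hash function and length, and `hashed_name` is a hypothetical helper.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_name(filename: str, content: bytes, length: int = 8) -> str:
    """Insert a short content hash between the stem and the extension."""
    digest = hashlib.sha256(content).hexdigest()[:length]
    path = PurePosixPath(filename)
    return f"{path.stem}.{digest}{path.suffix}"


old_build = hashed_name("app.js", b"console.log('v1');")
new_build = hashed_name("app.js", b"console.log('v2');")
print(old_build)
print(new_build)
print(old_build != new_build)  # a content change always yields a new URL
```

Identical content always maps to the same name, so an unchanged file keeps its cached copy across deploys.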
Version query strings (/app.js?v=123) work in a pinch but some intermediaries strip or ignore query strings in cache keys. They're a fallback, not a default.
Path versioning (/v3/app.js) is the cleanest for APIs and SDKs where you want explicit version control rather than content-hash invariance.
The HTML document itself — the one that references all the hashed assets — must not be cached aggressively, because that's the file whose change triggers everything else. A typical pattern:
# HTML document
Cache-Control: public, max-age=0, must-revalidate
# Hashed JS/CSS/images
Cache-Control: public, max-age=31536000, immutable
The HTML is always revalidated (cheap, with ETag); the assets it references are never revalidated.
Common caching bugs and how they break things
Five bugs cover a disproportionate share of "the cache is wrong" tickets.
Logged-in pages cached on the CDN. The page renders the user's name in the header and gets served with Cache-Control: public, max-age=300. The next visitor to that URL (or every visitor, depending on the CDN) sees the original user's page. Fix: send personalised responses with Cache-Control: private, no-store. If they must be cached on a shared cache, send Vary: Cookie and confirm that the CDN respects it.
Stale 304 with a wrong ETag. Origin generates an ETag from a property that doesn't change when the body does — for example, the database row's updated_at while the response also includes a translated string whose source changed. Browsers receive 304 Not Modified and keep the stale body forever. Fix: derive ETag from a hash of the actual response body or from every input that contributes to it.
no-cache confused with no-store. A developer sees that secrets are leaking into a cache and adds Cache-Control: no-cache. The response is still stored — no-cache only forces revalidation. Fix: no-store for anything sensitive.
Compression mismatch. Origin gzips a response only when the client supports it but doesn't send Vary: Accept-Encoding. The CDN caches the gzipped response and serves it to a client that asked for identity. Fix: always set Vary: Accept-Encoding when the response varies by it; most CDNs add it automatically but origin should not rely on that.
Query string in cache key, but not in business logic. A campaign tracker appends ?utm_source=newsletter. The CDN treats /blog/foo and /blog/foo?utm_source=newsletter as different cache entries. Cache hit rate craters because every variation is uncached. Fix: configure the CDN to ignore irrelevant query parameters in cache keys.
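Most CDNs expose this as configuration, but the underlying normalisation looks roughly like the sketch below. The `utm_` prefix rule and the function name `normalise_for_cache_key` are assumptions for illustration; the right list of ignorable parameters depends on the application.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Parameters that change analytics but not the response (illustrative list).
IGNORED_PREFIXES = ("utm_",)


def normalise_for_cache_key(url: str) -> str:
    """Drop tracking parameters so equivalent URLs share one cache entry."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.startswith(IGNORED_PREFIXES)
    ]
    # The fragment is never sent to the server, so it is dropped as well.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


print(normalise_for_cache_key("https://example.com/blog/foo?utm_source=newsletter"))
print(normalise_for_cache_key("https://example.com/blog/foo?page=2&utm_medium=email"))
```

Parameters that do affect the response, such as `page` here, stay in the key.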
The age problem — Date, Age, and how freshness is calculated
Every cacheable response carries a Date header (when origin generated it) and, once it has been held in a shared cache, an Age header (how many seconds it has been sitting there). The freshness calculation a cache does on each request is straightforward but easy to misread.
# simplified from RFC 9111, section 4.2.3
response_age = max(Age header value, time_received - Date) + (now - time_received)
fresh = response_age < max-age
A response with Cache-Control: max-age=600 and Age: 540 is still fresh — for another 60 seconds. After that it becomes stale, and either revalidation kicks in or stale-while-revalidate takes over. When a request reaches the browser through a CDN, the Age header tells you how long the CDN has been holding that copy; a steady stream of responses with Age values jumping back to zero suggests the CDN is regularly evicting or revalidating.
This matters for a non-obvious reason: the browser's freshness countdown starts from Date, not from when the browser received the response. A page that hits the CDN with Cache-Control: max-age=300 and Age: 290 has only 10 seconds of remaining browser freshness — not the full 300 the developer might have expected. For short max-age values behind a long-lived CDN cache, this can mean a browser cache that effectively never fires. The fix is either longer max-age, or using s-maxage for the CDN and a separate (longer) max-age for the browser.
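The age calculation from RFC 9111 can be written out directly. The sketch below follows the variable names the RFC uses, with all times as plain seconds on a shared clock; the function names are hypothetical.

```python
def current_age(age_value: float, date_value: float, request_time: float,
                response_time: float, now: float) -> float:
    """Age of a stored response, following the steps in RFC 9111 section 4.2.3."""
    apparent_age = max(0.0, response_time - date_value)
    response_delay = response_time - request_time
    corrected_age_value = age_value + response_delay
    corrected_initial_age = max(apparent_age, corrected_age_value)
    resident_time = now - response_time
    return corrected_initial_age + resident_time


def remaining_freshness(max_age: float, age: float) -> float:
    """Seconds of freshness left, never negative."""
    return max(0.0, max_age - age)


# Origin generated the response at t=1000. The CDN held it for 290 seconds,
# so the browser receives it at t=1290 with Age: 290 and max-age=300.
age_on_arrival = current_age(age_value=290, date_value=1000,
                             request_time=1290, response_time=1290, now=1290)
print(age_on_arrival)                            # 290.0
print(remaining_freshness(300, age_on_arrival))  # 10.0
```

Ten seconds after arrival the same entry has an age of 300 and is stale, even though the browser has held it for only ten seconds.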
Provider quirks: Cloudflare, Fastly, CloudFront
Same headers, different defaults. A response that caches perfectly on one provider might not cache at all on another.
| Provider | Honours s-maxage by default | Honours no-store | Supports stale-while-revalidate | CDN-specific override |
|---|---|---|---|---|
| Cloudflare | Yes (with Origin Cache Control) | Bypasses cache | Yes, async since Feb 2026 | CDN-Cache-Control, Cloudflare-CDN-Cache-Control |
| Fastly | Yes | Ignored by cache, passed to browser | Yes (native) | Surrogate-Control |
| CloudFront | Bounded by MinTTL/MaxTTL on cache policy | MinTTL can override | Via Lambda@Edge or cache policy | Cache policy + origin policy |
| Bunny CDN | Yes | Yes | Removed RFC 5861 support | Edge rules |
The CDN-Cache-Control header, which Cloudflare and Akamai pushed and which was published in 2022 as RFC 9213 (Targeted HTTP Cache Control), lets origin target CDNs separately from browsers without overloading Cache-Control. A response that wants the CDN to hold an asset for a day but the browser to hold it for a minute can say:
Cache-Control: public, max-age=60
CDN-Cache-Control: public, max-age=86400
CDN-Cache-Control overrides Cache-Control at the CDN layer; the browser sees only Cache-Control. Cloudflare also supports a Cloudflare-only variant (Cloudflare-CDN-Cache-Control) that isn't proxied downstream. Fastly's older Surrogate-Control plays the same role and has been supported since well before the new header existed.
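The precedence can be sketched as a lookup: a CDN that understands the targeted header uses it and ignores Cache-Control, while the browser only ever reads Cache-Control. The names `directive_seconds` and `effective_ttl` are hypothetical, and provider-specific headers such as Surrogate-Control are left out.

```python
from typing import Optional


def directive_seconds(header_value: str, name: str) -> Optional[int]:
    """Return the integer argument of one directive, or None if it is absent."""
    for part in header_value.split(","):
        key, sep, arg = part.strip().partition("=")
        if sep and key.strip().lower() == name:
            return int(arg)
    return None


def effective_ttl(headers: dict[str, str], is_cdn: bool) -> int:
    """Freshness lifetime in seconds as seen by a CDN or by a browser."""
    lowered = {key.lower(): value for key, value in headers.items()}
    cache_control = lowered.get("cache-control", "")
    if is_cdn:
        targeted = lowered.get("cdn-cache-control")
        if targeted is not None:
            # The targeted header replaces Cache-Control entirely at this layer.
            return directive_seconds(targeted, "max-age") or 0
        shared = directive_seconds(cache_control, "s-maxage")
        if shared is not None:
            return shared
    return directive_seconds(cache_control, "max-age") or 0


response_headers = {
    "Cache-Control": "public, max-age=60",
    "CDN-Cache-Control": "public, max-age=86400",
}
print(effective_ttl(response_headers, is_cdn=False))  # browser: 60
print(effective_ttl(response_headers, is_cdn=True))   # CDN: 86400
```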
The practical rule: read your provider's caching docs before deploying a new caching strategy. The same header can mean three different things at three providers.
Debugging caching in DevTools
The Network tab is the first place to look. Three columns to add and three patterns to recognise.
Right-click the Network table header and enable Size, Time, and (if not already on) Status. The Size column reads:
- (disk cache): served from the HTTP cache on disk.
- (memory cache): served from the in-memory cache, which is held only for the current tab session and is fastest.
- (prefetch cache): served from a <link rel="prefetch"> hint.
- (service worker): served by a service worker; the underlying source is shown separately.
- A byte count: the response came over the network.
The Status column shows 200, 304, or (from cache) (which Chrome renders for fully-fresh cache hits with no revalidation).
For deeper inspection, click any request. The Headers tab shows the response's Cache-Control, ETag, Last-Modified, Vary, and Age headers — the last one tells you how many seconds the response has already lived in caches, which is invaluable for working out whether a CDN edge is serving a stale copy.
A few useful DevTools moves:
- Disable cache (a checkbox in the Network panel, only active while DevTools is open) forces every request to origin — useful for testing caching headers without clearing your cache.
- Application → Storage → Cache Storage lets you inspect what a service worker has cached, manually evict entries, and confirm the keys.
- The x-cache or cf-cache-status response headers (added by most CDNs) tell you whether the request hit the edge or went to origin: HIT, MISS, REVALIDATED, EXPIRED, STALE, BYPASS. Trust these more than the browser-side cache columns when you're debugging the CDN layer.
For an end-to-end debugging walkthrough that pairs the cache columns with the rest of DevTools, the Chrome DevTools complete guide covers the panels in detail.
A working caching strategy for a modern web app
Putting all of this together, here is a sensible default for a 2026 web stack with hash-versioned assets, an API behind a CDN, and personalised HTML.
# Hash-versioned static assets — JS, CSS, fonts, images with content hash
Cache-Control: public, max-age=31536000, immutable
# HTML documents
Cache-Control: public, max-age=0, must-revalidate
Vary: Accept-Encoding
# Public, anonymous API responses
Cache-Control: public, max-age=60, s-maxage=300, stale-while-revalidate=600, stale-if-error=86400
ETag: "..."
# Logged-in, personalised responses
Cache-Control: private, no-store
# Anything genuinely sensitive (auth tokens, payment flows)
Cache-Control: no-store
Pragma: no-cache
This combination keeps the disk cache useful for assets, keeps HTML revalidated cheaply, lets the CDN absorb most API traffic, protects personalised data from leaking through shared caches, and survives a brief origin outage by serving stale on error.
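One way to apply these defaults is a single function that every response passes through. The sketch below is framework-agnostic and makes several assumptions: the name `cache_headers`, the regular expression used to spot hashed assets, the content-type checks, and the conservative no-cache fallback are all illustrative choices, not part of the strategy above.

```python
import re

# Matches names like /static/app.a3f9c7d2.js (assumed hash format: 8+ hex chars).
HASHED_ASSET = re.compile(r"\.[0-9a-f]{8,}\.(?:js|css|woff2|png|jpg|svg|webp)$")

API_POLICY = "public, max-age=60, s-maxage=300, stale-while-revalidate=600, stale-if-error=86400"


def cache_headers(path: str, content_type: str, personalised: bool = False,
                  sensitive: bool = False) -> dict[str, str]:
    """Pick caching headers for a response; the strictest rule is checked first."""
    if sensitive:
        return {"Cache-Control": "no-store", "Pragma": "no-cache"}
    if personalised:
        return {"Cache-Control": "private, no-store"}
    if HASHED_ASSET.search(path):
        return {"Cache-Control": "public, max-age=31536000, immutable"}
    if content_type.startswith("text/html"):
        return {"Cache-Control": "public, max-age=0, must-revalidate",
                "Vary": "Accept-Encoding"}
    if content_type.startswith("application/json"):
        return {"Cache-Control": API_POLICY}
    # Assumption: anything unclassified is stored but always revalidated.
    return {"Cache-Control": "no-cache"}


print(cache_headers("/static/app.a3f9c7d2.js", "text/javascript"))
print(cache_headers("/dashboard", "text/html", personalised=True))
```

Checking the sensitive and personalised flags before anything else means a mistake in the path or content-type rules cannot make a private response public.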
For a refresher on what's happening before the cache even gets a chance to act, how a CDN works under the hood covers the edge layer in more depth.
FAQ
What is the difference between no-cache and no-store?
no-store means do not write the response to any cache. Every request goes back to origin. no-cache means you can store the response, but must revalidate with origin before serving it again — usually with a conditional If-None-Match request. Use no-store for sensitive data; use no-cache for documents that change often but where conditional requests are cheap.
Does HTTP/3 change how caching works?
No. HTTP caching is defined by RFC 9111 and is protocol-version agnostic. HTTP/3 changes the transport (QUIC over UDP, QPACK header compression) but Cache-Control directives behave identically on HTTP/1.1, HTTP/2, and HTTP/3. HTTP/3 makes cache delivery faster on lossy networks; it does not change cache policy.
What does stale-while-revalidate actually do?
It tells the cache that, after the response's max-age expires, the cache may continue to serve the stale copy for the specified number of seconds while it revalidates in the background. The user sees the cached response immediately; the refreshed copy lands in cache for the next request. Supported by all modern browsers since 2020 and across most major CDNs.
Should I use ETag or Last-Modified?
Default to ETag. It is content-derived and works regardless of file-system timestamps, replicas, or rebuilds. Last-Modified is fine when the resource is backed by a file with a trustworthy modification time and you don't need sub-second precision, but it's the older mechanism and ETag is the safer default.
Why is my CDN serving a logged-in page to other users?
Almost always one of two things: the response is Cache-Control: public (or has no Cache-Control and the CDN defaults to caching), and either the Vary header doesn't include Cookie or Authorization, or the CDN ignores Vary on those fields. Set Cache-Control: private, no-store on any response keyed to a user, or — if you genuinely want to cache personalised responses — configure the CDN cache key to include the user identity.
Can I cache responses to authenticated requests?
Yes, but only if the response explicitly opts in to shared caching (with public, s-maxage, or must-revalidate) and the cache key separates users, typically by including the auth header in the cache key or by using Vary: Authorization. The default in RFC 9111 is that responses to requests carrying an Authorization header are not reused by shared caches; those directives opt in. Be careful: this is the bug surface where logged-in pages leak across users.
Why does my browser keep using a stale file even after I changed it?
max-age hasn't expired yet, or you used immutable and the URL hasn't changed. The fix during development is the Disable cache toggle in DevTools. The fix in production is content-hashed filenames — the URL changes when the content changes, and the browser fetches the new file because it has never seen that URL before.
Where Crosscheck fits
Caching bugs are notoriously hard to report from the user side. By the time the user can reproduce the issue, the cache has often turned over and the broken state is gone — leaving the engineer to guess at what the network actually returned. Crosscheck captures the full network log of a session, including response headers and cache-status fields, alongside a screen recording and console output. Whoever debugs the ticket sees the exact Cache-Control, Vary, and cf-cache-status headers the user got, which is usually enough to diagnose a cache bug in minutes rather than hours.
The Crosscheck team has also written about debugging web applications step by step and Chrome DevTools performance auditing, both of which sit alongside this one in the developer toolkit.



