How to Reproduce a Bug That Only Happens Sometimes
Some bugs are straightforward. You click a button, something breaks, you fix it. But intermittent bugs — the ones that appear once, disappear when you reload, and then surface again three days later at 2am — are a different problem entirely. They're frustrating not because they're hard to understand once you see them, but because you can never seem to see them when you need to.
Reproducing an intermittent bug is itself a skill. It requires understanding why these bugs behave the way they do, building a methodology for systematically coaxing them into view, and having the right tools to capture evidence when they do appear — because sometimes a bug will cooperate exactly once before going dormant again.
This guide covers the common causes of intermittent bugs, the techniques that give you the best chance of reproducing them, and how to build a capture workflow so that when a bug does appear, you walk away with everything you need to fix it.
Why Some Bugs Only Happen Sometimes
Before you can reproduce an intermittent bug, it helps to understand the mechanism behind it. Most non-deterministic bugs fall into a handful of categories.
Race Conditions
A race condition occurs when the outcome of a process depends on the sequence or timing of events that aren't guaranteed to happen in a consistent order. In web development, these are everywhere: two async operations both update the same state, a component mounts before an API call resolves, or a WebSocket message arrives while a form is mid-submission.
Race conditions tend to surface more often under load, on slow devices, or when the network is congested — because those conditions change the relative timing of the competing operations. On a fast machine with a local dev server, the race never happens because operation A always finishes well before operation B starts. Deploy to a slower environment, or add a few milliseconds of real-world latency, and suddenly the race can be lost.
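A minimal sketch of the lost-update flavor of this race, with setTimeout delays standing in for real network latency (the state shape and the delay values are illustrative, not from any particular app):

```javascript
// Two async operations read-modify-write the same state. The slower
// operation writes back a stale snapshot, and one update is lost.
const state = { count: 0 };

const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function increment(latencyMs) {
  const snapshot = state.count; // read
  await delay(latencyMs);       // simulated network latency
  state.count = snapshot + 1;   // write back a possibly stale value
}

async function demo() {
  state.count = 0;
  await Promise.all([increment(30), increment(5)]);
  return state.count; // 1, not 2: the slow write clobbered the fast one
}
```

Shrink the gap between the two latencies to near zero and the bug disappears, which is exactly why it never shows up against a local dev server.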
Timing Dependencies
Closely related to race conditions, timing-dependent bugs occur when code makes implicit assumptions about how long something takes. Animations that depend on setTimeout with hardcoded values, debounce thresholds that are too tight for some devices, or event listeners attached after the event has already fired — all of these can create bugs that are present in some environments and absent in others.
A particularly common pattern: a function polls for the existence of a DOM element or a global variable, and on fast machines it's always there by the first check. On a slow device or with a cold cache, it takes slightly longer to initialize, the poll fires before it's ready, and the function fails silently.
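A sketch of that pattern, alongside a bounded-retry version that fails loudly instead of silently (the widget global, attempt count, and intervals are placeholders):

```javascript
// A global that some other script initializes "eventually".
let widget = null;
function initWidget(afterMs) {
  setTimeout(() => { widget = { ready: true }; }, afterMs);
}

// Fragile: a single check. On slow init, the poll fires too early
// and the callback is silently never invoked.
function pollOnce(onReady) {
  setTimeout(() => { if (widget) onReady(widget); }, 10);
}

// Safer: retry with a bounded attempt count, and reject on timeout
// so the failure is visible instead of silent.
function pollUntilReady(attempts = 20, intervalMs = 10) {
  return new Promise((resolve, reject) => {
    const timer = setInterval(() => {
      if (widget) {
        clearInterval(timer);
        resolve(widget);
      } else if (--attempts <= 0) {
        clearInterval(timer);
        reject(new Error('widget never initialized'));
      }
    }, intervalMs);
  });
}
```

The rejection path is the important part: a poll that can time out loudly turns an intermittent silent failure into a reproducible, logged one.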
Stale State and Caching
Modern web applications maintain a lot of state — in memory, in localStorage, in sessionStorage, in IndexedDB, in service worker caches, in HTTP caches, and in the browser's back/forward cache. When that state gets out of sync with the server, or when a component reads stale values because something upstream didn't trigger a re-render, bugs appear that are tied not to what you did but to what you did previously.
These bugs are maddening to reproduce because the "previous state" that caused them is invisible. A user reports a bug, you open the page fresh, and it works fine — because you don't have the specific combination of cached data and session history that their browser had accumulated.
Network Latency and Request Ordering
HTTP requests don't always resolve in the order they were made. A slow API response, a CDN cache miss, a momentary network hiccup — any of these can cause responses to arrive out of order, leading to the wrong data being displayed or the wrong state being applied.
This is especially common in search-as-you-type implementations where each keystroke fires a request. If request #3 resolves after request #4, despite being fired earlier, its stale response overwrites the newer results, and the UI shows data for a query the user has already typed past. Users notice this as a "flash" or "glitch"; developers looking at a clean local environment never see it because all four requests resolve in under 10ms.
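The standard guard is to tag each request with a sequence number and discard any response that is no longer the newest. A sketch, where fetchResults and render stand in for the real search call and UI update:

```javascript
let latestRequestId = 0;

// Only the response to the most recent query is allowed to render;
// a slower, older response checks its tag on arrival and is dropped.
async function search(query, fetchResults, render) {
  const requestId = ++latestRequestId;        // tag this request
  const results = await fetchResults(query);
  if (requestId !== latestRequestId) return;  // superseded by a newer keystroke
  render(results);
}
```

AbortController-based cancellation achieves the same outcome while also freeing up the network; the sequence-number guard is simply the smallest version of the fix.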
Environment and Configuration Differences
Some bugs exist only in specific environments. A feature flag that's enabled in production but not staging. A third-party script that behaves differently depending on the user's ad blocker settings. A CSS rule that only applies in Safari. A permission prompt that behaves differently on mobile Chrome. Browser extensions that inject JavaScript or modify DOM events.
These bugs are intermittent from the perspective of a developer testing on one machine, but they're entirely consistent for users with the right (or wrong) configuration.
Data-Dependent Logic
Code that works correctly for most inputs but breaks on specific values — an empty array where a non-empty one is expected, a null where an object should be, a string with special characters, a number at the edge of an expected range — produces bugs that appear only for the subset of users whose data happens to trigger the edge case. Intermittent from a population view; deterministic once you know the right input.
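A tiny illustration with a hypothetical averageScore helper: the bug is invisible until the one user with an empty list shows up, and perfectly deterministic after that.

```javascript
// Works for every non-empty input; breaks only on the edge case.
function averageScore(scores) {
  return scores.reduce((a, b) => a + b, 0) / scores.length;
}

averageScore([80, 90]); // 85
averageScore([]);       // NaN — 0 / 0, which then propagates into the UI
```

This is why "intermittent" bug reports sometimes deserve the question "which account were you logged into?" before any timing investigation starts.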
A Systematic Approach to Reproducing Intermittent Bugs
Start With Everything the Reporter Knows
Before trying to reproduce anything, extract maximum information from the person who saw the bug. The questions that matter most:
- What exactly happened? Not just "it broke" — what was the visual symptom? An error message? A blank screen? Wrong data? Something missing?
- What were they doing immediately before it happened? The triggering action and the two or three steps before it are equally important.
- What environment were they in? Browser, OS, device type, network connection type (WiFi, mobile data), browser extensions installed.
- Is it reproducible for them? Can they reliably trigger it, or did it happen once? If they can trigger it, ask them to describe the exact sequence.
- When did it start? A bug that started after a deployment has a much narrower search space than one that's been happening intermittently for months.
- Does it happen to other users? If multiple users report the same symptom, it's probably not a unique local state issue.
The answers shape everything that follows. A bug that's reproducible only in Safari on iOS with a specific data set is a very different investigation from a bug that happens to 5% of users on any platform.
Match the Reporter's Environment as Closely as Possible
If a bug is environment-specific, reproducing it requires matching that environment. This means:
- Using the same browser and version (check the exact version, not just Chrome vs. Firefox)
- Testing on the same OS if the bug may be OS-specific
- Disabling or enabling browser extensions to match what the reporter had
- Testing on a real mobile device if the report came from mobile
- Connecting on a similar network type if network conditions might be relevant
Browser DevTools can help here. The device emulation mode in Chrome DevTools lets you simulate different screen sizes, pixel densities, and network conditions. The Sensors panel can simulate different geolocation values, orientation, and touch events. These aren't perfect substitutes for real devices, but they get you closer faster.
Simulate Adverse Conditions
Many intermittent bugs are reliable bugs hiding behind favorable conditions. Deliberately worsen those conditions and the bug often becomes consistent.
Throttle the network. In Chrome DevTools, open the Network tab and use the throttling dropdown to switch from No throttling to Slow 3G or a custom profile. Bugs caused by out-of-order responses, race conditions between API calls, or components rendering before data is available will often reproduce immediately under simulated latency.
Throttle the CPU. In the Performance tab, click the gear icon and set CPU throttling to 4x or 6x slowdown. This slows JavaScript execution, which widens the windows where timing-dependent races can occur. Bugs that require a specific ordering of microtasks will often emerge under CPU throttling when they never appear at full speed.
Use the Application tab to manipulate state. Clear localStorage, clear sessionStorage, clear cached data, and unregister service workers selectively to test how the application behaves with different combinations of stored state. A bug caused by a corrupted cache entry may only reproduce when that specific entry exists — deleting everything and rebuilding state step by step can help isolate which piece of state is responsible.
Throttle the server (if you control it). Add artificial delay to specific API endpoints to simulate slow responses. This is especially useful for reproducing race conditions involving multiple API calls that normally resolve in a predictable order.
Reduce the Reproduction Steps to a Minimum
Once you can reproduce an intermittent bug, your next goal is to reduce the number of steps required. The more steps involved, the harder it is to reproduce reliably, and the larger the surface area for the bug to hide in.
Work backward from the symptom. Remove one step at a time and check whether the bug still appears. When removing a step makes the bug disappear, that step is part of the trigger — add it back and try to understand why it matters. Is it setting a specific piece of state? Making a specific network request? Establishing a particular DOM structure?
The minimum reproduction set is valuable for two reasons: it makes the bug easier to reproduce reliably, and it dramatically narrows the scope of what the fix needs to address.
Add Targeted Logging
For bugs you've partially characterized but can't reliably trigger visually, add verbose logging around the suspected area of code. Log function arguments, intermediate state values, timing marks, and the results of async operations. Then run the scenario repeatedly and watch the logs for the specific combination of values that precedes the bug.
console.time() and console.timeEnd() are useful for measuring how long async operations take relative to each other. performance.now() gives sub-millisecond timestamps for precise ordering. For race conditions, logging the order in which async callbacks fire — tagged with timestamps — will often reveal the exact interleaving that produces the bug.
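A sketch of timestamp-tagged logging that makes the interleaving explicit. The operation names and delays are illustrative; performance.now() is a global in both browsers and modern Node.

```javascript
const t0 = performance.now();
const events = [];

// Every log line carries a high-resolution offset from a common start time,
// so the ordering of async callbacks can be read straight off the console.
function log(label) {
  events.push(label);
  console.log(`[+${(performance.now() - t0).toFixed(2)}ms] ${label}`);
}

const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function op(name, latencyMs) {
  log(`${name}: start`);
  await delay(latencyMs);
  log(`${name}: resolved`);
}

// Run two competing operations and capture the interleaving.
async function demo() {
  await Promise.all([op('A', 30), op('B', 5)]);
  return events;
}
```

Collecting the labels into an array as well as printing them means the exact interleaving can also be asserted on in an automated loop, not just eyeballed.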
Repeat the Scenario Programmatically
For bugs that appear after many repetitions of an action, manual testing becomes impractical. Automate the repetition. Write a quick script in the browser console that triggers the relevant action hundreds of times in a loop, or use a test runner to replay the scenario at machine speed. Bugs that appear once every fifty manual clicks will appear in seconds when the loop runs automatically.
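A console-ready sketch of that loop; triggerAction and bugDetected are placeholders for whatever fires the suspect action and checks for the symptom in your app.

```javascript
// Repeat the suspect action until the bug's symptom appears,
// reporting which iteration finally reproduced it.
async function hammer(triggerAction, bugDetected, iterations = 500) {
  for (let i = 1; i <= iterations; i++) {
    await triggerAction();
    if (bugDetected()) {
      console.error(`bug reproduced on iteration ${i}`);
      return i;
    }
  }
  console.log(`no reproduction in ${iterations} iterations`);
  return null;
}
```

The iteration count at first reproduction is itself useful evidence: a bug that fires roughly every N iterations under one set of conditions and every N/10 under another tells you which condition is doing the work.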
For UI-level automation, Playwright and Cypress can replay sequences of user interactions. Both have the ability to set viewport size, simulate slow networks, and inject arbitrary state — giving you precise control over the conditions you're testing.
Tools That Help With Intermittent Bug Reproduction
Chrome DevTools
Beyond throttling, Chrome DevTools has several features worth knowing for intermittent bug work:
- Breakpoints with conditions: Rather than pausing execution on every call to a function, you can set a conditional breakpoint that only pauses when a specific expression is true. This is invaluable for catching the exact state that precedes a bug without manually stepping through thousands of normal executions.
- Logpoints: A logpoint is like a breakpoint that logs a message to the console instead of pausing execution. You get the logging benefit without interrupting the flow — useful for capturing state in code paths that are called frequently.
- Event listener breakpoints: In the Sources tab, expand the Event Listener Breakpoints panel to pause execution when specific DOM events fire. If a bug is tied to a click handler or a focus event, you can catch it here without modifying code.
- Network request blocking: Right-click any request in the Network tab and select "Block request URL" to simulate what happens when a specific resource fails to load. Useful for testing error handling and fallback behavior.
The Performance Panel
For timing-dependent bugs, the Performance panel's flame chart is the most precise tool available. Record a session, reproduce the bug, and then examine the exact sequence of function calls, tasks, and browser events. You can see exactly when each async operation started, when it resolved, and in what order callbacks fired — down to the microsecond. This level of detail can definitively confirm or rule out a race condition hypothesis.
Crosscheck's Instant Replay
The fundamental challenge with intermittent bugs is that they require you to be in exactly the right state to see them — and that state is usually gone by the time anyone is in a position to investigate. This is where Crosscheck changes the equation entirely.
Crosscheck runs as a browser extension and captures a continuous session buffer in the background. When a bug appears, clicking the Crosscheck button doesn't just take a screenshot of the current moment — it captures a full instant replay of the session leading up to that moment, including the screen recording, every console log, and every network request.
For intermittent bugs, this is transformative. You no longer need to be actively recording when the bug appears. You don't need to have DevTools open. You don't need to remember exactly what you clicked three steps ago. The session buffer captures it all retroactively. The moment you notice something is wrong, you capture — and the full context of how you got there is already recorded.
For QA engineers running exploratory sessions, this means that a bug appearing once during a two-hour session doesn't require stopping, setting up a recording, and trying to reproduce it immediately. You capture when it happens, and Crosscheck hands you the complete picture: what was on screen, what errors appeared in the console, which network requests were in-flight, and the sequence of actions that led there.
For developers receiving bug reports, the instant replay eliminates the back-and-forth that typically follows an intermittent bug report. Instead of asking "can you reproduce it?" and waiting for a response that never comes, they watch the exact session in which it occurred and see every technical detail they need to begin debugging.
When You Can't Reproduce the Bug Yourself
Sometimes, despite your best efforts, you can't reproduce an intermittent bug in a controlled environment. This happens when the bug depends on specific server-side state, a particular combination of user data, a hardware-specific behavior, or a timing window that's genuinely impossible to hit under controlled conditions.
In these cases, the strategy shifts from reproduction to capture:
Get logs from the user's environment. Ask the user to open the browser console, reproduce what they can, and share any errors they see. Better yet, if they're using Crosscheck, have them capture when the bug appears — you'll get the full session context without requiring them to know anything about DevTools.
Add production-safe error logging. Instrument the code paths you suspect with error monitoring (Sentry, Datadog, Rollbar) so that when the bug triggers in production, you capture the full stack trace, the user's state, and the recent event sequence automatically.
Accept probabilistic reproduction. Some bugs have a 5% failure rate under specific conditions. You may not be able to make them happen on demand, but you can make them happen 1 in 20 times by running the scenario repeatedly. Automated test loops, combined with production monitoring, give you coverage even when the bug resists controlled reproduction.
Treat the first reproduction as gold. If a bug that's been elusive finally appears, stop everything and capture every detail immediately. Console logs, network tab state, application state, the exact URL and query parameters, the localStorage contents — all of it. Don't reload to "confirm" the bug unless you've already captured the state. Reloading will clear everything.
Fix the Capture Problem Before You Fix the Bug
The single biggest improvement most teams can make to their intermittent bug workflow isn't a better reproduction technique — it's a better capture workflow. Intermittent bugs by definition don't happen reliably, which means every instance where one does appear is valuable. Wasting that instance by failing to capture the right context is the real problem.
This is exactly what Crosscheck is built for. The instant replay feature means that every bug — intermittent or otherwise — gets documented with a full session recording, console log history, and network request log the moment it's captured. No setup, no "quick, start the recording," no hoping the user can reproduce it again.
If your team spends meaningful time chasing bugs that only happen sometimes, the first tool you should add to your workflow is one that makes sure you never lose the evidence when they do.
Try Crosscheck for free and see how instant replay changes the way you handle the bugs that refuse to cooperate.



