A 2026 Field Guide to Exploratory Testing
Exploratory testing is the practice of designing and executing tests at the same time — using each observation to shape what you investigate next — and it remains one of the highest-yield quality activities a team can run, even as AI generates more of the scripted layer underneath. The discipline was named by Cem Kaner in 1984, formalised into Session-Based Test Management by Jonathan and James Bach in 2000, and is now the human counterweight to AI-written regression suites that catch known bugs but rarely surface the unknown ones.
This guide covers the definition, the SBTM session structure (charter, time-box, debrief), the two heuristics most worth memorising — SFDIPOT and FCC CUTS VIDS — the tooling that actually helps in 2026, and where exploratory work fits alongside AI test generation.
Key takeaways
- Exploratory testing is structured, not ad hoc — charters, timeboxes, and debriefs make it accountable and measurable.
- SBTM is the dominant framework, built by the Bach brothers at HP and first presented at STAR West 2000.
- Two heuristics carry most of the weight: James Bach's SFDIPOT (Structure, Function, Data, Interfaces, Platform, Operations, Time) and Michael D. Kelly's FCC CUTS VIDS touring mnemonic.
- AI test generation has not replaced exploratory testing — it has narrowed the bottleneck to the bugs only a curious human notices.
- The pinch point in 2026 is documentation, not the testing itself. Capturing console, network, and reproduction context is what makes findings actionable.
What is exploratory testing?
Exploratory testing is a style of software testing in which test design, test execution, and learning happen in parallel: the tester uses what they have just discovered to decide what to probe next. Cem Kaner, who coined the term in 1984 and published it in Testing Computer Software (1988), defined it as "a style of software testing that emphasizes the personal freedom and responsibility of the individual tester to continually optimize the quality of his/her work by treating test-related learning, test design, test execution, and test result interpretation as mutually supportive activities that run in parallel throughout the project."
In scripted testing the tester follows a map. In exploratory testing they draw it as they go.
One distinction worth nailing down early — exploratory testing is not ad hoc testing. Ad hoc testing is unplanned, undocumented, and produces no accountable output. Exploratory testing is governed by charters, time-boxes, heuristics, and session reports. It is disciplined and measurable; it just applies discipline differently than scripted approaches do.
The term was inspired by John Tukey's "exploratory data analysis." Kaner introduced it to give skilled unscripted testing a respectable label — ad hoc had stigmatised the practice. The Context-Driven School (Kaner, James Bach, Brian Marick, Bret Pettichord) then spent two decades turning it into a teachable discipline.
Exploratory testing vs. scripted testing
Both approaches serve legitimate purposes. The difference lies in what they optimise for.
Scripted testing is optimised for consistency, repeatability, and compliance. Every tester follows the same steps and produces a verifiable audit trail. It is the right tool when requirements are stable, when regulatory obligations mandate specific test evidence, or when you need automated regression coverage over hundreds of cases.
Exploratory testing is optimised for discovery and information yield per hour. Testers plan at the charter level rather than writing test cases upfront, so they can redirect their attention the instant they spot something interesting: an off-pattern API response, a stale cache, a UI state that should not exist.
| | Scripted testing | Exploratory testing |
|---|---|---|
| Planning | Extensive upfront test case design | Charter-based — planned at session level |
| Flexibility | Fixed — deviating requires change control | High — tester adapts in real time |
| Best for | Regression, compliance, stable requirements | New features, complex systems, tight timelines |
| Documentation | Pre-written test cases | Session notes, bug reports, debriefs |
| Skill profile | Accessible to junior testers | Rewards domain knowledge and curiosity |
| AI-augmentation in 2026 | Strong — LLMs generate test scaffolds well | Weak — judgement and curiosity are still human work |
The most effective QA programs use both. PractiTest's State of Testing 2026 report finds exploratory testing is used most heavily by small teams without massive regression suites: startup respondents lean on it more (20.6%) than enterprise respondents (16.9%) as their primary safety net. When you cannot afford to script every path, skilled human exploration becomes the main line of defence.
Session-Based Test Management (SBTM)
The framework most teams use to govern exploratory testing is Session-Based Test Management, or SBTM. It was created in 2000 by brothers Jonathan Bach and James Marcus Bach while working at Hewlett-Packard, and first presented publicly by Jon Bach at STAR West 2000 in a talk titled "How to Measure Ad Hoc Testing." SBTM added accountability and metrics to exploratory testing without sacrificing its adaptive nature.
A session, in Jon Bach's words, is "an uninterrupted block of reviewable, chartered test effort." SBTM organises testing into three core elements.
1. The charter
A charter defines the focus of one session. It answers two questions — what are you testing, and what are you looking for? A good charter is specific enough to provide direction but broad enough to allow real discovery.
Elizabeth Hendrickson's Explore It! (Pragmatic Bookshelf) popularised a concise template that many teams adopt verbatim:
- Explore — what feature, workflow, or area are you investigating?
- With — what resources, data, permissions, or tools do you need?
- To discover — what specific risk, outcome, or behaviour are you watching for?
A worked example: Explore the checkout flow with a guest account and an expired payment card to discover whether the error states are handled gracefully and whether retries create duplicate orders. That charter is actionable without being prescriptive — it tells you where to start and what to watch for, while leaving room for the discoveries scripted tests cannot anticipate.
The common failure mode is writing charters that are either too vague ("test the payment module") or too narrow ("verify clicking Cancel on step 3 closes the modal"). The first provides no useful direction. The second is a scripted test case in a charter wrapper.
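The Explore / With / To discover template maps naturally onto a small record type, which can help keep charters consistent across a team. The sketch below is illustrative only: the `Charter` class and its `render` method are hypothetical names, not part of any SBTM tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Charter:
    """One session charter in the Explore / With / To discover shape."""
    explore: str      # feature, workflow, or area under investigation
    with_: str        # resources, data, permissions, or tools
    to_discover: str  # risk, outcome, or behaviour to watch for

    def render(self) -> str:
        # Produce the single-sentence form used at the top of a session sheet.
        return (f"Explore {self.explore} with {self.with_} "
                f"to discover {self.to_discover}.")


checkout = Charter(
    explore="the checkout flow",
    with_="a guest account and an expired payment card",
    to_discover="whether retries create duplicate orders",
)
print(checkout.render())
```

Forcing all three fields to be filled in is a cheap guard against the "test the payment module" style of charter, since a vague charter usually has nothing to put in the last slot.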
2. The time-box
Sessions are time-boxed — typically 60 to 90 minutes, though the original SBTM article allows for short (45 minutes), normal (90 minutes), and long (120 minutes) sessions. During that window, the tester commits to uninterrupted exploration. No email, no Slack, no context switching. The constraint creates urgency and sharpens attention.
Teams new to SBTM often start with shorter sessions — 30 to 45 minutes — and lengthen them as testers build stamina. Shorter sessions also work well for investigating a specific, bounded area rather than a broad workflow.
The time-box does something else important — it makes exploratory testing measurable. Session counts, areas covered, and bug densities per session give managers visibility into manual testing effort without requiring exhaustive pre-written test plans.
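As a rough illustration of how those figures fall out of session records, the following sketch rolls a list of sessions up into counts and bug rates. The `Session` fields and the `summarise` function are assumptions made for this example; real SBTM session sheets track more than this.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Session:
    area: str        # product area named in the charter
    minutes: int     # actual length of the time-box
    bugs_filed: int  # bugs reported during the session


def summarise(sessions: list[Session]) -> dict:
    """Roll per-session records up into simple effort and yield figures."""
    total_bugs = sum(s.bugs_filed for s in sessions)
    total_minutes = sum(s.minutes for s in sessions)
    return {
        "session_count": len(sessions),
        "sessions_per_area": dict(Counter(s.area for s in sessions)),
        "bugs_per_session": total_bugs / len(sessions) if sessions else 0.0,
        "bugs_per_hour": total_bugs / (total_minutes / 60) if total_minutes else 0.0,
    }


log = [
    Session("checkout", 90, 3),
    Session("checkout", 60, 1),
    Session("search", 90, 2),
]
print(summarise(log))
```

For the three sessions above this reports 2.0 bugs per session and 1.5 bugs per hour. Treat such numbers as a visibility aid rather than a performance target, since a session that finds no bugs in a high-risk area is still useful information.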
3. The session report and debrief
At the end of each session, the tester produces a brief report covering what was explored, what bugs were filed, what questions emerged, and what follow-up sessions are warranted. This output is the accountability mechanism that distinguishes SBTM from ad hoc testing.
Jon Bach proposed the PROOF mnemonic for the structure of the post-session debrief:
- Past — what happened during the session?
- Results — what was achieved?
- Obstacles — what got in the way of good testing?
- Outlook — what still needs to be done?
- Feelings — how does the tester feel about all this?
Teams running SBTM at scale hold debriefs of 10 to 15 minutes per session. The debrief is not bureaucracy — it surfaces patterns across sessions, identifies risk areas being missed, and keeps exploratory testing connected to broader project goals.
Heuristics — the testing toolkit you actually carry into a session
The reason skilled exploratory testers do not run out of ideas is that they carry heuristics — mnemonic prompts that guide attention. Two are worth memorising before any session.
SFDIPOT (James Bach's Heuristic Test Strategy Model)
SFDIPOT is part of the Heuristic Test Strategy Model (HTSM), developed by James Bach and refined with Michael Bolton through the Rapid Software Testing workshop. It originally read SFDPOT and was later expanded to add Interfaces explicitly.
- S — Structure — the code, files, components, hardware
- F — Function — what the product does, its features
- D — Data — inputs, outputs, stored data, data flows
- I — Interfaces — APIs, UI, integrations, message buses
- P — Platform — operating systems, browsers, devices, dependencies
- O — Operations — how the product is used, by whom, in what sequences
- T — Time — concurrency, timing, ordering, deadlines
SFDIPOT is not a template or a test plan. James Bach has been explicit about this — it is a generative heuristic, "a way to bring important ideas into your conscious mind while you're testing." When you feel stuck mid-session, walk through the letters and ask which dimension you have not yet probed.
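One lightweight way to do that walk-through is to tag session notes with a letter and list the letters that have no notes yet. The sketch below assumes a hypothetical convention in which each note is a `(tag, text)` pair; neither the convention nor the `unprobed` helper comes from the HTSM itself.

```python
SFDIPOT = {
    "S": "Structure",
    "F": "Function",
    "D": "Data",
    "I": "Interfaces",
    "P": "Platform",
    "O": "Operations",
    "T": "Time",
}


def unprobed(notes: list[tuple[str, str]]) -> list[str]:
    """Return SFDIPOT dimensions with no tagged note, in mnemonic order.

    Each note is a (tag, text) pair where tag is a single SFDIPOT letter.
    """
    seen = {tag.upper() for tag, _ in notes}
    return [name for letter, name in SFDIPOT.items() if letter not in seen]


session_notes = [
    ("F", "Promo code applies twice on refresh"),
    ("D", "Unicode in address field is truncated"),
    ("f", "Cancel button label is inconsistent"),
]
print(unprobed(session_notes))
# prints ['Structure', 'Interfaces', 'Platform', 'Operations', 'Time']
```

The output is a prompt, not a coverage score: an untagged dimension may be irrelevant to the charter, and deciding that is still the tester's call.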
FCC CUTS VIDS (Michael D. Kelly's touring heuristic)
FCC CUTS VIDS was created by Michael D. Kelly, building on James Bach's touring concept. It frames exploration as a series of guided "tours" of the product, each tour focusing attention on a different lens:
- F — Feature tour
- C — Complexity tour
- C — Claims tour (test the marketing claims)
- C — Configuration tour
- U — User tour (impersonate specific personas)
- T — Testability tour
- S — Scenario tour (end-to-end flows)
- V — Variability tour (boundaries, edge cases)
- I — Interoperability tour
- D — Data tour
- S — Structure tour
The two are complementary. SFDIPOT prompts what aspects of the product to consider. FCC CUTS VIDS prompts how to approach the system — through which lens, with which mindset. Michael Bolton's HICCUPPS (consistency oracles) and James Bach's CRUSSPIC STMPL (quality criteria) round out the kit, but SFDIPOT and FCC CUTS VIDS carry most of the day-to-day weight.
Tools that actually help exploratory testers in 2026
The tooling landscape has consolidated around four categories: session capture, note-taking and charter tracking, mind-mapping, and AI heuristic assistants. None of these tools does exploratory testing for you. They reduce the friction around it.
| Category | Representative tools | What they solve |
|---|---|---|
| Session capture | Crosscheck, Jam, Bird Eats Bug | Auto-capture of console, network, video — so context is not lost |
| Note-taking and charter tracking | Rapid Reporter, Xray, qTest Explorer, TestRail Exploratory module | Structured session notes tied to charters |
| Mind-mapping for charters | Xmind, MindMeister | Visual planning of session coverage |
| AI heuristic assistants | ChatGPT, Claude, custom GPTs trained on HTSM | Generating charter ideas, brainstorming test conditions |
A practical 2026 stack for a small team — mind-map test areas at sprint planning, pick three charters per developer-week, run sessions with auto-capture in the background, file bugs straight from the capture tool, and debrief in 15-minute syncs using PROOF.
For teams with stricter compliance needs (medical devices, automotive, finance), a test management platform that maintains a tamper-evident session log — PractiTest, Xray for Jira, or TestRail with the SBTM module — becomes non-optional.
Where exploratory testing fits in 2026, alongside AI test generation
Two things are simultaneously true in 2026. AI now generates a meaningful share of new scripted tests — tools like Playwright with Copilot, Mabl, Testim, and GitHub's experimental test-generation features have made it cheap to scaffold unit and end-to-end suites from natural-language specs or recorded user flows. And exploratory testing has become more important, not less, as a result.
The reason is straightforward. AI-generated tests cover the cases someone described, the cases the model inferred from the codebase, and the cases the recorded user actually took. They do not cover the cases nobody thought of. That is, by definition, the territory exploratory testing exists to investigate.
A few patterns are emerging in the 2026 stack:
- AI handles the regression baseline. Teams use Copilot, Mabl, or their internal LLM workflows to generate Playwright or Cypress tests for known flows, leaving human testers to focus on what those flows missed.
- Exploratory testing inherits the strategic role. The senior QA voice in standup is increasingly the one identifying which areas need human attention — usually new features, complex integrations, accessibility, and security boundaries.
- Charters get drafted with AI assistance. Some teams pipe the sprint's user stories, design diffs, and recent bug history into an LLM and ask it to propose charter candidates. The tester picks, refines, and runs.
- Self-healing locators reduce flakiness in the scripted layer, which buys back the time exploratory testing always needed.
- The bug-reporting bottleneck has overtaken the test-writing bottleneck. Engineering teams across multiple State of Testing surveys report that the friction has shifted — running the test is fast, the slowdown is now in capturing and communicating what broke and how.
That last point is where exploratory testing collides with a practical reality. The moment a tester finds a bug — often in the middle of an unrelated charter — they need to capture context they did not know they would need ten seconds earlier. The console at the time, the network request that failed, the sequence of clicks, the state of the URL. If that context is gone, the bug becomes a Jira ticket nobody can reproduce.
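A common way to have that context available after the fact is a fixed-size rolling buffer: record everything continuously, keep only the most recent events, and copy the buffer when the tester flags a bug. The following is a generic sketch of that idea, not a description of any particular tool's internals; `RollingCapture`, `Event`, and the `/orders` endpoint are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Event:
    kind: str    # e.g. "click", "console", "network"
    detail: str  # human-readable description of what happened


class RollingCapture:
    """Keep only the most recent events so context exists before a bug is noticed."""

    def __init__(self, max_events: int = 200) -> None:
        # A deque with maxlen drops the oldest entry once it is full.
        self._events = deque(maxlen=max_events)

    def record(self, kind: str, detail: str) -> None:
        self._events.append(Event(kind, detail))

    def snapshot(self) -> list[Event]:
        # Copy the buffer at the moment the tester flags a bug.
        return list(self._events)


capture = RollingCapture(max_events=3)
for step in ["open cart", "apply promo", "enter card", "submit order"]:
    capture.record("click", step)
capture.record("network", "POST /orders -> 500")

print([event.detail for event in capture.snapshot()])
# prints ['enter card', 'submit order', 'POST /orders -> 500']
```

The buffer size is the trade-off to tune: too small and the triggering action has already been evicted, too large and the report buries the relevant steps.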
How Crosscheck fits into an exploratory session
Crosscheck is a free Chrome extension built for the bug-reporting moment of an exploratory session. It runs silently in the background while you explore, automatically capturing console logs, network requests, user actions, and a rolling video buffer of the session. When you find a bug, one click assembles screenshot, video, console, network, environment, and reproduction steps into a complete report and sends it to Jira, Linear, ClickUp, Slack, or GitHub.
For exploratory testers specifically, two properties matter. First, you do not have to predict when you will find a bug — the buffer is already running. Second, the session capture serves double duty as raw material for the SBTM session report, reducing the manual write-up at the end of every charter.
Crosscheck is a bug-reporting tool, not a test-management platform. For SBTM session management itself, pair it with Rapid Reporter, the Xray exploratory module in Jira, or PractiTest's exploratory feature.
For a broader view of the testing stack around exploratory work, see Crosscheck's roundups of the best AI testing tools for 2026, the best bug reporting tools for 2026, and how exploratory sessions fit into the 10 SQA methodologies you should know.
Running effective exploratory sessions — practical guidance
A few field-tested rules tend to separate productive sessions from theatrical ones.
Know the system before you explore it. Spend five minutes reviewing recent bug reports, release notes, or user feedback in that area. Domain knowledge sharpens intuition — a tester who has read last sprint's incidents will be more dangerous than one who has not.
Use risk-based prioritisation. Focus exploratory effort where failures hurt most — payment, authentication, data sync, third-party integrations, and any feature with a history of defects. Stack multiple sessions from different angles on the high-risk areas.
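A simple scoring pass can make that prioritisation explicit when planning sessions. The sketch below ranks areas by impact multiplied by recent defect history; the formula, the 1 to 5 impact scale, and the `Area` and `rank_by_risk` names are assumptions chosen for illustration, and teams should substitute whatever risk model they already trust.

```python
from dataclasses import dataclass


@dataclass
class Area:
    name: str
    impact: int          # 1 (minor annoyance) to 5 (money or data loss)
    defect_history: int  # bugs filed against the area in recent sprints


def rank_by_risk(areas: list[Area]) -> list[str]:
    """Order areas so the riskiest is explored first.

    Score = impact * (1 + defect_history), an illustrative formula only.
    """
    ranked = sorted(
        areas,
        key=lambda a: a.impact * (1 + a.defect_history),
        reverse=True,
    )
    return [a.name for a in ranked]


backlog = [
    Area("profile settings", impact=2, defect_history=1),
    Area("payment", impact=5, defect_history=3),
    Area("data sync", impact=4, defect_history=2),
]
print(rank_by_risk(backlog))
# prints ['payment', 'data sync', 'profile settings']
```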
Vary your approach across sessions. Different data, starting states, user roles, browser conditions. Bugs that hide under one set of conditions surface under another. The best exploratory testers think adversarially — they hunt for the sequences the system was not designed to handle.
Pair test for complex areas. Two testers exploring together often outperform one tester running twice as long. One operates, the other observes and asks questions. The real-time dialogue triggers hypotheses neither would have formed alone — and it is the fastest way to onboard a new team member to a product area.
File bugs with full context. A session that surfaces ten bugs but produces vague tickets is not a win. Developers need repro steps, the action sequence, any console errors or network failures, and screenshots or video. The more complete the report, the faster the fix ships, and the more credibility exploratory testing earns with engineering.
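The context items listed in that last rule can be turned into a pre-filing check. This is a minimal sketch under an assumed convention: a report is a plain dictionary, the field names in `CHECKLIST` are hypothetical, and a tester writes "none observed" explicitly instead of leaving the console and network field blank.

```python
CHECKLIST = (
    "repro_steps",
    "action_sequence",
    "console_and_network",
    "screenshot_or_video",
)


def missing_context(report: dict) -> list:
    """Return checklist fields that are absent or contain only whitespace."""
    return [
        field for field in CHECKLIST
        if not str(report.get(field, "")).strip()
    ]


draft = {
    "repro_steps": "1. Add item to cart 2. Pay with expired card 3. Retry",
    "action_sequence": "cart -> payment -> retry",
    "console_and_network": "   ",  # left blank by mistake
}
print(missing_context(draft))
# prints ['console_and_network', 'screenshot_or_video']
```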
FAQ
How long should an exploratory testing session be?
The SBTM standard is 60 to 90 minutes, with short sessions of 45 minutes and long sessions of 120 minutes as variants. Teams new to SBTM often start at 30 to 45 minutes and grow into longer ones. The constraint matters more than the exact duration — the time-box prevents drift and creates measurable units of work.
Is exploratory testing the same as ad hoc testing?
No. Ad hoc testing is unplanned and produces no documented output. Exploratory testing is governed by charters, time-boxes, and session reports. It is structured and accountable — it simply structures the work differently than scripted testing does.
Who invented session-based test management?
Jonathan Bach and James Marcus Bach developed SBTM in 2000 while working at Hewlett-Packard. It was first presented publicly by Jon Bach at STAR West 2000.
What is the difference between SFDIPOT and FCC CUTS VIDS?
SFDIPOT, by James Bach, lists product dimensions to consider — Structure, Function, Data, Interfaces, Platform, Operations, Time. FCC CUTS VIDS, by Michael D. Kelly, lists "tours" you can run — feature, complexity, claims, configuration, user, testability, scenario, variability, interoperability, data, structure. SFDIPOT prompts what to think about; FCC CUTS VIDS prompts how to approach it. Most experienced testers use both.
Can AI replace exploratory testing?
No, though it changes the surrounding workflow. AI is excellent at generating scripted tests, drafting charter candidates, and analysing test output. It is not (yet) good at the judgement calls that drive exploratory work — deciding what to investigate next based on a half-noticed anomaly. In 2026, AI handles more of the regression layer, which lets human testers spend more time on exploratory work, not less.
Does exploratory testing produce evidence good enough for compliance audits?
Yes, when run as SBTM. Charter, session report, bug evidence, and debrief notes together produce a reviewable trail. For tightly regulated domains (medical devices, automotive, finance), most teams pair SBTM with a test management platform that timestamps and signs session artefacts.
Start exploring — without losing a single bug
Exploratory testing is one of the highest-value quality activities a team can run, and the case for it gets stronger as AI handles more of the scripted regression layer. Charters, time-boxes, debriefs, and heuristics — SFDIPOT and FCC CUTS VIDS especially — turn what looks like improvisation into a measurable, accountable practice.
The remaining friction is not the testing itself. It is the documentation around it — capturing enough context about what you found, and how you found it, to make the bug report actionable on the developer's side.
Try Crosscheck free and run your next session with automatic capture of console logs, network requests, and user actions. When you find a bug, every detail is already in the report.



