axe vs WAVE vs Pa11y: Which Accessibility Tool Wins in 2026

Written By  Crosscheck Team

Content Team

December 25, 2025 12 minutes

axe vs WAVE vs Pa11y: Choosing an Accessibility Testing Tool in 2026

axe DevTools, WAVE, and Pa11y are the three most widely used automated accessibility testing tools, and they solve different problems. axe is the framework-native engine for developers who want WCAG checks inside Playwright, Cypress, or Jest. WAVE is the WebAIM browser extension built for in-context visual audits — it powers the annual WebAIM Million report. Pa11y is the open-source CLI that runs accessibility checks against URLs or sitemaps in CI. Most serious teams end up using at least two.

TL;DR

  • axe DevTools (Deque) — best for developers; free extension plus paid Pro; axe-core has been downloaded 4 billion+ times and powers Lighthouse, Pa11y, and most other scanners under the hood.
  • WAVE (WebAIM) — best for designers, content owners, accessibility specialists; visual overlay on the live page; runs locally with no data sent to WebAIM.
  • Pa11y — best for DevOps pipelines; CLI-first, scriptable, free; Pa11y 9.1 (2026) ships axe-core 4.11 and Puppeteer 24.
  • False-positive rates are lowest on axe; WAVE is noisier by design; Pa11y inherits whichever runner you point it at.
  • Automated WCAG 2.2 coverage is intentionally narrow — Deque has stated target-size is likely the only WCAG 2.2 rule axe-core will add, because the rest produce too many false positives without manual review.

Why this comparison matters in 2026

The 2026 WebAIM Million report — published in late March — found 95.9% of the top one million home pages have detectable WCAG 2.2 Level A/AA failures, with an average of 56.1 errors per page. Detected errors rose 10.1% year over year, reversing six years of small improvements. WebAIM attributes the regression to page complexity (1,437 elements per home page, up 22.5% in one year) and a sharp rise in misused ARIA.

The regulatory pressure is no longer hypothetical. Under the ADA Title II rule, state and local government entities serving populations of 50,000+ had a compliance date of April 24, 2026: they must meet WCAG 2.1 Level AA, with federal penalties up to $150,000 per violation. The European Accessibility Act has been enforced across all 27 EU member states since June 28, 2025; France issued legal notices to four grocery retailers within days, and Germany's BFSG carries fines up to €100,000 per violation. EN 301 549 still references WCAG 2.1, but ISO/IEC 40500:2025 codified WCAG 2.2 in January, so most programs are planning against 2.2 to avoid a second remediation cycle.

The question is no longer "should we run an accessibility scanner" but "which one fits where in our delivery pipeline."


axe DevTools: the framework-native option

Made by: Deque Systems
Engine: axe-core (open source, MPL 2.0)
Distribution: Browser extension (Chrome, Firefox, Edge), npm package, CLI, VS Code Linter
Pricing: Free tier; axe DevTools Pro and axe Monitor are paid

axe-core is the accessibility engine that sits underneath most of the tooling ecosystem — Lighthouse uses it, Pa11y can use it, Accessibility Insights uses it. It is a JavaScript library that walks the rendered DOM against rules mapped to WCAG 2.0, 2.1, and 2.2 at levels A, AA, and AAA, plus Section 508, EN 301 549, RGAA, and the ADA tag set.
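Because axe-core tags rules by the WCAG version that introduced them, a run that targets WCAG 2.2 AA has to list the earlier version tags as well. The sketch below builds that cumulative list. axeTagsFor is a hypothetical helper name; the tag strings are the Level A and AA tags axe-core documents, and the resulting array is the kind of value passed to a tag filter such as AxeBuilder's withTags.

```typescript
type WcagVersion = '2.0' | '2.1' | '2.2';
type WcagLevel = 'A' | 'AA';

// axe-core's WCAG tags for levels A and AA, in the order the versions were published.
// There is no Level A tag for 2.2 here because this sketch only lists tags axe-core documents.
const AXE_WCAG_TAGS: { version: WcagVersion; level: WcagLevel; tag: string }[] = [
  { version: '2.0', level: 'A', tag: 'wcag2a' },
  { version: '2.0', level: 'AA', tag: 'wcag2aa' },
  { version: '2.1', level: 'A', tag: 'wcag21a' },
  { version: '2.1', level: 'AA', tag: 'wcag21aa' },
  { version: '2.2', level: 'AA', tag: 'wcag22aa' },
];

const VERSION_ORDER: WcagVersion[] = ['2.0', '2.1', '2.2'];

// Hypothetical helper: returns every tag needed to cover the target version and level,
// including the tags of all earlier WCAG versions.
function axeTagsFor(version: WcagVersion, level: WcagLevel): string[] {
  const maxVersion = VERSION_ORDER.indexOf(version);
  return AXE_WCAG_TAGS
    .filter((entry) => VERSION_ORDER.indexOf(entry.version) <= maxVersion)
    .filter((entry) => level === 'AA' || entry.level === 'A')
    .map((entry) => entry.tag);
}
```

Passing only 'wcag22aa' would run just the rules introduced for 2.2, which is why the helper accumulates tags rather than picking one.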

Where axe earns its reputation

The headline feature is precision. Deque has spent years tuning axe-core to avoid the noise that erodes trust in automated scanners — rules that cannot be evaluated definitively return as incomplete rather than as a flagged error. Deque's published benchmark is that axe-core catches roughly 57% of WCAG issues automatically, above the 30–40% industry baseline.
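To make the violations-versus-incomplete split concrete, here is a minimal sketch that summarises a result object into "definitely failing" and "needs a human". The interfaces are a hand-written subset of the axe-core results shape (violations and incomplete arrays of rule results with an id, an impact, and nodes); triageAxeResults is a hypothetical helper, not part of axe.

```typescript
// Hand-written subset of the axe-core results object; real results carry more fields.
interface AxeRuleResultSubset {
  id: string;
  impact: 'minor' | 'moderate' | 'serious' | 'critical' | null;
  nodes: { target: unknown[] }[];
}

interface AxeResultsSubset {
  violations: AxeRuleResultSubset[];
  incomplete: AxeRuleResultSubset[];
}

interface AxeTriageSummary {
  failing: string[];      // rule ids axe could evaluate definitively and that failed
  needsReview: string[];  // rule ids axe could not decide; a person has to look
  clean: boolean;         // true only when both buckets are empty
}

// Hypothetical helper: a page is only "clean" when nothing failed AND nothing is pending review.
function triageAxeResults(results: AxeResultsSubset): AxeTriageSummary {
  const failing = results.violations.map((rule) => rule.id);
  const needsReview = results.incomplete.map((rule) => rule.id);
  return {
    failing,
    needsReview,
    clean: failing.length === 0 && needsReview.length === 0,
  };
}
```

A gate that only asserts on violations will pass a page whose contrast could not be computed; surfacing the needsReview list keeps that case visible.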

The other earned reputation is framework integration. Adapters exist for Playwright, Cypress, Selenium, Jest, Puppeteer, and WebdriverIO — adding accessibility assertions to existing end-to-end suites is usually a five-line change:

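// Playwright example: fail the test on any WCAG A/AA violation
import { test, expect } from '@playwright/test';
import AxeBuilder from '@axe-core/playwright';

test('homepage has no a11y violations', async ({ page }) => {
  await page.goto('/');
  const results = await new AxeBuilder({ page })
    // axe tags are per WCAG version, so 2.1 rules need their own tags alongside 2.0 and 2.2
    .withTags(['wcag2a', 'wcag2aa', 'wcag21a', 'wcag21aa', 'wcag22aa'])
    .analyze();
  // violations holds definite failures; results.incomplete holds items that need manual review
  expect(results.violations).toEqual([]);
});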

That snippet ships accessibility regression coverage into the same CI run that already executes the rest of the test suite. No new pipeline, no new dashboard.

The paid axe DevTools Pro tier layers on Intelligent Guided Tests — workflows that walk testers through criteria automation cannot resolve on its own (Focus Appearance, drag alternatives, accessible authentication). Pro also adds the components inspector, exportable reports, and the issue-fix code suggestions that integrate with editors.

Where axe falls short

  • WCAG 2.2 automation is intentionally narrow. Deque has been explicit that target-size (2.5.8) is likely the only new WCAG 2.2 rule axe-core will add — automating the rest produces too many false positives. Focus Appearance and Focus Not Obscured live in Pro's guided tests, not in the engine.
  • The free DevTools panel is technical. A non-developer auditing a marketing page will get lost in the rule names. WAVE is gentler.
  • Pro tier sits behind a license. Expect to make a budget case before adopting axe Monitor or the full Pro suite.

Who axe is for

Engineering teams that already write automated tests. If your pull-request pipeline already runs Playwright or Cypress, adding axe-core is the lowest-friction way to keep accessibility failures from shipping.


WAVE: the visual evaluator built by the people who measure the web

Made by: WebAIM at Utah State University
Distribution: Chrome, Firefox, and Edge extensions; web interface at wave.webaim.org; licensed API; Pope Tech for enterprise monitoring
Pricing: Free extension and web tool; paid API and enterprise tiers

WAVE has been the community's reference scanner since 2001, and it carries unique weight: the WebAIM Million annual scan is run with the WAVE engine. When you read that 95.9% of home pages fail WCAG, the failures were detected by WAVE.

Where WAVE earns its reputation

WAVE renders findings directly on the live page. Red icons mark errors, green icons highlight existing accessibility features, yellow icons flag alerts that need human judgment. For a content editor looking at a hero section that fails contrast, the icon literally sits on top of the text.

A few things WAVE does that the others do not:

  • Privacy by default. The extension runs entirely in the browser — no DOM, page content, or URL is sent to WebAIM's servers. Safe for intranet pages, staged auth-walled environments, and anything that cannot leave a corporate network.
  • Structure panel. WAVE can disable styles and show the reading and tab order — what a screen reader would actually announce. One of the fastest ways to spot navigation that visually looks fine but reads as nonsense.
  • AIM Score, updated for 2026. Version 3.3.1.0 (May 2026) aligned the AIM Score with the 2026 WebAIM Million methodology.
  • Contrast checker built in. Against WCAG 2.2 thresholds, with foreground alpha opacity support.
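For readers who want to see what a contrast check computes, the sketch below implements the WCAG relative-luminance and contrast-ratio formulas, with a semi-transparent foreground flattened onto an opaque background first. This is not WAVE's code; the function names are illustrative, and the 4.5:1 and 3:1 thresholds are the Level AA values for normal and large text.

```typescript
interface RgbaColor { r: number; g: number; b: number; a?: number } // channels 0-255, alpha 0-1

// Convert an sRGB channel (0-255) to its linear value, per the WCAG relative luminance definition.
function linearizeChannel(channel: number): number {
  const c = channel / 255;
  return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

function relativeLuminance(color: RgbaColor): number {
  return 0.2126 * linearizeChannel(color.r)
    + 0.7152 * linearizeChannel(color.g)
    + 0.0722 * linearizeChannel(color.b);
}

// Composite a possibly semi-transparent foreground over an opaque background.
function flattenOver(fg: RgbaColor, bg: RgbaColor): RgbaColor {
  const alpha = fg.a === undefined ? 1 : fg.a;
  return {
    r: fg.r * alpha + bg.r * (1 - alpha),
    g: fg.g * alpha + bg.g * (1 - alpha),
    b: fg.b * alpha + bg.b * (1 - alpha),
  };
}

// Contrast ratio = (lighter + 0.05) / (darker + 0.05), ranging from 1 to 21.
function contrastRatio(fg: RgbaColor, bg: RgbaColor): number {
  const l1 = relativeLuminance(flattenOver(fg, bg));
  const l2 = relativeLuminance(bg);
  return (Math.max(l1, l2) + 0.05) / (Math.min(l1, l2) + 0.05);
}

// Level AA: 4.5:1 for normal text, 3:1 for large text.
function meetsContrastAA(ratio: number, largeText: boolean = false): boolean {
  return ratio >= (largeText ? 3 : 4.5);
}
```

Black text at 50% opacity on white, for example, drops below 4.5:1 even though opaque black on white is the maximum 21:1, which is why alpha support matters in a checker.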

Where WAVE falls short

  • No CI integration. WAVE is not scriptable. You cannot fail a build on WAVE output. The licensed WAVE API exists for bulk testing, but it sits outside the free workflow most engineering teams are looking for.
  • More yellow than red. WAVE deliberately surfaces alerts that need human review. Teams new to accessibility sometimes read these as errors and over-count problems.
  • Single-page by default. Site-wide scanning requires the paid API or Pope Tech, the enterprise platform built on top of WAVE.

Who WAVE is for

Designers, content authors, accessibility specialists, and QA engineers running structured manual audits. WAVE is also the strongest teaching tool of the three — the inline explanations turn a scan into a tutorial. If your team is building accessibility literacy from scratch, this is where to start.


Pa11y: the CLI that lives in your pipeline

Made by: The Pa11y open source community
Distribution: CLI (pa11y), CI runner (pa11y-ci), dashboard, web service
License: Open source (LGPL-3.0)
Latest: Pa11y 9.1 (2026), Pa11y CI 4.x

Pa11y — pronounced "pally" — is the headless option. No GUI, no overlay. Point it at a URL, get structured findings with CSS selectors for each failing element, pipe the output wherever your team wants it.

What Pa11y actually is

Pa11y is a runner, not an engine. It uses Puppeteer to drive headless Chrome, then runs the page through one of two pluggable engines: axe-core or HTML_CodeSniffer. The rules come from those engines — Pa11y's job is the orchestration around them.

The current stack as of Pa11y 9.1 (2026): Node.js 20, 22, or 24, Puppeteer 24 (Chrome 135), axe-core 4.11, and a new --level-cap-when-needs-review flag for handling axe's incomplete results.

Where Pa11y earns its reputation

  • CLI-first ergonomics. Running pa11y https://example.com returns issues in the terminal in seconds. JSON, CSV, HTML, and TSV outputs are all selected with the --reporter flag.
  • Pa11y CI. Point it at a list of URLs or a sitemap, set a threshold for issues allowed before a build fails. Cleanest way to wire accessibility into GitHub Actions, GitLab CI, CircleCI, or Jenkins.
  • Pa11y Dashboard. Self-hosted web UI that graphs accessibility metrics over time — useful for tracking whether a design system migration is regressing component accessibility.
  • Incremental remediation. Threshold control makes Pa11y workable for large legacy codebases: set today's count as the ceiling, then drive it down sprint by sprint.
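The ratchet described above can be sketched as a small generator for a Pa11y CI config. It takes today's per-URL issue counts and emits a config object with those counts as thresholds, ready to serialise to a .pa11yci JSON file. buildRatchetConfig and the baseline numbers are hypothetical, and the sketch assumes that a threshold set on an individual URL entry overrides the default the way other per-URL options do.

```typescript
interface BaselineCount { url: string; issues: number }

interface Pa11yCiUrlEntry { url: string; threshold: number }

interface Pa11yCiConfigSketch {
  defaults: { timeout: number; runners: string[] };
  urls: Pa11yCiUrlEntry[];
}

// Hypothetical helper: freeze the current issue count per URL as the allowed ceiling.
// Lowering a baseline number in a later sprint tightens the gate for that URL only.
function buildRatchetConfig(baseline: BaselineCount[]): Pa11yCiConfigSketch {
  return {
    defaults: {
      timeout: 30000,     // milliseconds per page; an arbitrary illustrative value
      runners: ['axe'],   // use the axe runner rather than HTML_CodeSniffer
    },
    urls: baseline.map((entry) => ({ url: entry.url, threshold: entry.issues })),
  };
}

// Illustrative counts, not measurements.
const ratchetConfig = buildRatchetConfig([
  { url: 'https://example.com/', issues: 12 },
  { url: 'https://example.com/pricing', issues: 4 },
]);

// The serialised form is what would be written to the config file.
const ratchetConfigJson = JSON.stringify(ratchetConfig, null, 2);
```

Keeping the baseline in version control makes each threshold reduction a reviewable change rather than a silent edit.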

Where Pa11y falls short

  • No GUI. Non-technical stakeholders cannot use it. Developer tool, full stop.
  • Authenticated pages need scripting. Testing a logged-in dashboard means scripting the login flow (fill fields, click, wait) through Pa11y's actions config, which Pa11y runs in the Puppeteer-driven browser before testing.
  • Rule coverage depends on the engine. Pa11y adds no rules of its own. The axe runner gives axe's coverage; HTML_CodeSniffer is somewhat older and noisier.
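As a rough illustration of the login scripting mentioned above, the sketch below assembles a Pa11y options object whose actions fill in a login form and wait for the post-login URL before the page is tested. The selectors, URLs, and the loginOptions helper are hypothetical; the action phrases ("set field ... to ...", "click element ...", "wait for url to be ...") follow Pa11y's documented actions syntax.

```typescript
interface LoginFlowSketch {
  userSelector: string;    // CSS selector of the username field
  passSelector: string;    // CSS selector of the password field
  submitSelector: string;  // CSS selector of the submit button
  landingUrl: string;      // URL the app redirects to after a successful login
}

interface Pa11yOptionsSketch {
  runners: string[];
  actions: string[];
}

// Hypothetical helper: builds options intended for pa11y(loginPageUrl, options).
// Pa11y runs the actions first, then tests whatever page the browser ends up on.
function loginOptions(flow: LoginFlowSketch, username: string, password: string): Pa11yOptionsSketch {
  return {
    runners: ['axe'],
    actions: [
      `set field ${flow.userSelector} to ${username}`,
      `set field ${flow.passSelector} to ${password}`,
      `click element ${flow.submitSelector}`,
      `wait for url to be ${flow.landingUrl}`,
    ],
  };
}

// Placeholder values; real credentials belong in CI secrets, not in source.
const dashboardOptions = loginOptions(
  {
    userSelector: '#username',
    passSelector: '#password',
    submitSelector: '#login-submit',
    landingUrl: 'https://example.com/dashboard',
  },
  'test-user',
  'test-password',
);
```

The runners entry also shows the point about engines: switching the array to the HTML_CodeSniffer runner changes the rule set without touching the login script.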

Who Pa11y is for

DevOps engineers, platform teams, and any organisation that wants accessibility tested on a schedule across many URLs. The natural fit for design system monitoring and site-wide CI gates.


Side-by-side comparison

| | axe DevTools | WAVE | Pa11y |
|---|---|---|---|
| Maker | Deque Systems | WebAIM | Pa11y open source |
| Primary form | Extension + npm + CLI | Extension + web app | CLI + CI runner |
| License | axe-core MPL 2.0; DevTools proprietary | Proprietary (extension free) | LGPL-3.0 |
| Cost | Free tier; paid Pro/Monitor | Free extension; paid API | Free |
| Engine | axe-core | WAVE engine | axe-core or HTML_CodeSniffer |
| CI/CD integration | First-class (Playwright, Cypress, Jest, CLI) | None in free tier | First-class (pa11y-ci) |
| Visual overlay | DevTools panel | In-page icons (best in class) | None |
| Site-wide scanning | axe Monitor (paid) | WAVE API / Pope Tech (paid) | pa11y-ci (free) |
| WCAG coverage | 2.0 / 2.1 / 2.2 (A, AA, AAA) | 2.2 A/AA, Section 508 | 2.0 / 2.1 / 2.2 via axe runner |
| Automated WCAG 2.2 rules | target-size; rest in Pro IGTs | Yes, where deterministic | Inherits from engine |
| False-positive posture | Low (incomplete rather than fail) | Higher (surfaces alerts for human review) | Inherits from engine |
| Authenticated pages | Yes (browser) | Yes (local-only extension) | Yes via Puppeteer actions |
| Learning curve | Medium | Low | Medium–high |
| Best fit | Dev + QA in CI | Audits, training, design QA | Automated pipelines |

False positives in practice

A few patterns hold across teams Crosscheck has talked to:

  • axe under-reports before it over-reports. If axe is silent, do not assume the page is clean — check the incomplete bucket where rules needing human review live. The trade-off is you rarely chase a phantom failure.
  • WAVE over-reports on purpose. Yellow alerts are not WCAG failures; they flag something that needs human judgment (a heading with no associated section, an empty fragment link). The right mental model is that WAVE's red icons are roughly comparable to axe's failures.
  • Pa11y is whatever runner you chose. Pa11y on axe-core 4.11 produces results close to axe DevTools' free tier. Pa11y on HTML_CodeSniffer is noticeably noisier and lags on WCAG 2.2.

For most teams, the practical answer is: axe for gating CI, WAVE for human audits, Pa11y for scheduled site-wide sweeps. A recurring axe failure that also lights up red in WAVE is almost certainly real.


WCAG 2.2 coverage — read the small print

WCAG 2.2 added nine success criteria when published in October 2023. None of the three tools fully automates all of them, and that is by design. Deque's published position is that most of the new criteria (Focus Appearance, Focus Not Obscured, Dragging Movements, Accessible Authentication) cannot be reliably tested without false positives, so they sit in semi-automated guided tests rather than in the engine.

What is actually automated today:

| WCAG 2.2 criterion | axe-core | WAVE | Pa11y (axe) |
|---|---|---|---|
| 2.4.11 Focus Not Obscured (Minimum) | Pro IGT only | Partial | Manual (IGTs are Pro-only) |
| 2.4.13 Focus Appearance | Pro IGT only | Partial | Manual (IGTs are Pro-only) |
| 2.5.7 Dragging Movements | Manual | Manual | Manual |
| 2.5.8 Target Size (Minimum) | Yes (target-size) | Yes | Yes |
| 3.2.6 Consistent Help | Manual | Manual | Manual |
| 3.3.7 Redundant Entry | Manual | Manual | Manual |
| 3.3.8 / 3.3.9 Accessible Authentication | Manual | Manual | Manual |

Translation: even if every WCAG 2.2 box in your scanner is green, you still need a structured manual pass before claiming conformance. The wider accessibility tooling roundup covers this in more depth — automated scanners reliably catch the same six dominant issues year after year (low contrast, missing alt text, missing form labels, empty links, empty buttons, missing document language), but they cannot adjudicate the nuanced criteria.
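Target Size is the automatable criterion largely because its core test is arithmetic: is the target at least 24 by 24 CSS pixels? The sketch below shows only that core comparison. It is a simplification, not axe-core's target-size rule: it ignores the criterion's exceptions (sufficient spacing, inline targets, equivalent controls, user-agent-controlled and essential presentations), so anything it flags is a candidate for review rather than a confirmed failure. The TargetBox shape and helper name are hypothetical.

```typescript
interface TargetBox {
  id: string;      // any label that identifies the element, e.g. a CSS selector
  width: number;   // rendered width in CSS pixels
  height: number;  // rendered height in CSS pixels
}

// Minimum target dimension in CSS pixels under WCAG 2.5.8 (Level AA).
const MIN_TARGET_SIZE_PX = 24;

// Hypothetical helper: returns ids of targets smaller than 24x24 in either dimension.
// Exceptions in the criterion are NOT evaluated here.
function findUndersizedTargets(boxes: TargetBox[]): string[] {
  return boxes
    .filter((box) => box.width < MIN_TARGET_SIZE_PX || box.height < MIN_TARGET_SIZE_PX)
    .map((box) => box.id);
}
```

Compare that with Focus Appearance or Accessible Authentication, where there is no single measurement to compare against a constant, and the reason those criteria stay manual becomes clearer.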


CI integration — how each tool fits a pipeline

axe-core inside an existing test framework is the lowest-friction option. The accessibility check runs in the same browser session as a feature test, against the same DOM state, with the same authentication. Output goes into the same JUnit/Playwright report. For teams already running Playwright or Cypress, this is the default in 2026.

Pa11y CI as a dedicated stage is the right shape when accessibility needs to run against many URLs (sitemap-driven) or when there is no application test suite to hang assertions off — a marketing site, a documentation portal, a Storybook deployment. The runner exits non-zero if any URL exceeds its threshold.
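The gating idea in a dedicated stage can be sketched in a few lines: count errors per URL, compare against that URL's threshold, and return a non-zero exit code if any URL is over. This is an illustration of the logic, not Pa11y CI's implementation; the issue fields are a subset of what Pa11y reports per issue, and the exit code of 1 is an arbitrary non-zero choice for the sketch.

```typescript
// Subset of the fields Pa11y reports for each issue.
interface Pa11yIssueSubset {
  code: string;
  type: 'error' | 'warning' | 'notice';
  selector: string;
  message: string;
}

interface UrlScanResult {
  url: string;
  threshold: number;           // errors tolerated for this URL
  issues: Pa11yIssueSubset[];
}

interface GateOutcome {
  exitCode: number;            // 0 = pass, non-zero = fail the build
  failedUrls: string[];
}

// Hypothetical helper: a URL fails when its error count exceeds its threshold.
function evaluateGate(scans: UrlScanResult[]): GateOutcome {
  const failedUrls = scans
    .filter((scan) => scan.issues.filter((issue) => issue.type === 'error').length > scan.threshold)
    .map((scan) => scan.url);
  return { exitCode: failedUrls.length > 0 ? 1 : 0, failedUrls };
}
```

In a CI script the returned exitCode would be passed to the process exit so the pipeline stage fails; warnings and notices are deliberately left out of the count here.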

WAVE in CI: not really. The free WAVE extension cannot run headless. The licensed WAVE API can, but most teams reaching for an API-based scanner pick axe Monitor or Pa11y CI instead.

The best test automation frameworks for 2026 post covers how accessibility scans fit alongside Playwright, Cypress, and Selenium pipelines.


Which combination should your team actually run?

Honest answer: two of the three, sometimes all three.

A pragmatic 2026 default for product teams: axe-core in the test suite on every pull request, WAVE for structured human audits and design QA, and Pa11y CI on a schedule if there is a large site or design system to monitor.

For accessibility consultancies and in-house compliance programs: axe DevTools Pro for the guided WCAG 2.2 tests automation cannot cover, WAVE plus real screen readers (NVDA, JAWS, VoiceOver) for the manual evidence audit, and Pa11y CI or axe Monitor for the regression baseline after remediation work lands.

What none of these tools do, and what trips up most accessibility programs, is the bit after the issue is found. A keyboard-trap bug in a modal needs more than "modal has focus issue, WCAG 2.4.3 fail" on a ticket. The developer needs the exact sequence of focus moves, which elements were involved, which ARIA attributes were set at the time, and which console errors fired. The gap between "scanner caught it" and "developer can reproduce it" is where accessibility tickets stall.


FAQ

Which is better, axe or WAVE?

For developers integrating accessibility into a test suite, axe. For designers and content owners doing in-page audits, WAVE. They are built for different jobs, and most programs end up using both. axe-core is also the engine inside Lighthouse and Accessibility Insights, and it is one of Pa11y's two runners (HTML_CodeSniffer is the default unless axe is selected), so even teams who mostly look at WAVE are usually running axe rules somewhere too.

Is Pa11y still actively maintained?

Yes. Pa11y 9.0 shipped in 2025 with Node 20+ and Puppeteer 24; Pa11y 9.1 followed in 2026 with axe-core 4.11 and a new --level-cap-when-needs-review flag. Pa11y CI 4.x tracks the same release line.

How much of WCAG 2.2 can be tested automatically?

About 30–40% of WCAG failures in general are caught by automated tools, and axe-core specifically claims around 57%. For the nine new criteria in WCAG 2.2, only target-size (2.5.8) is reliably automatable. Focus Appearance, Dragging Movements, Accessible Authentication, and the rest need manual review or semi-automated guided tests.

Can these tools test authenticated pages?

axe DevTools and WAVE both work against any page open in the browser. WAVE keeps page content local, which matters for sensitive environments. Pa11y can test authenticated pages but needs Puppeteer actions configured for the login flow.

Are any of these enough on their own for ADA Title II or EAA compliance?

No. Automated scanners catch a meaningful chunk of WCAG failures but cannot certify conformance. ADA Title II and the EAA both reference WCAG/EN 301 549 conformance, which requires manual testing with assistive technology alongside automated scans. Treat axe, WAVE, and Pa11y as the regression baseline, not the audit.


Where Crosscheck fits — the missing layer

axe, WAVE, and Pa11y find rule violations. They do not solve the next problem: documenting a real accessibility bug — the kind a manual tester catches that the scanner missed — well enough for a developer to actually fix it.

That gap is where most programs lose time. A tester walks through a checkout flow with a screen reader, hits a focus trap in a modal, and now has to reconstruct what step they took, what was on the page, what the console said, what request fired, and which ARIA attribute was set incorrectly. A scribbled paragraph and a screenshot are rarely enough: the developer asks for more context, the tester reproduces it, and the cycle eats hours.

Crosscheck is a free Chrome extension built for that exact moment. When a tester hits a bug during manual testing, Crosscheck captures the full context automatically — screen recording, console logs, network requests, browser state, screenshot — and files the report straight into Jira, Linear, ClickUp, GitHub, or Slack. The developer who picks up the ticket has everything needed to reproduce the issue on the first attempt.

For the manual layer that has to sit on top of axe, WAVE, and Pa11y, that reproduction context is the difference between a bug that gets fixed this sprint and one that sits in the backlog because nobody can pin it down. Related: the perfect bug report template and the best bug reporting tools for 2026.


Start documenting accessibility bugs the way developers actually need them

Scanners find rule violations. Humans find the experience failures the scanners cannot reach. Crosscheck is what turns those human findings into reproducible, developer-ready tickets — without the back-and-forth that usually follows.

If your team is preparing for ADA Title II, EAA evidence, or any structured WCAG audit, the tooling stack worth running is axe in CI, WAVE for human audits, Pa11y for scheduled sweeps — and Crosscheck for everything your testers find that the scanners missed.

Try Crosscheck free
