Specification: Integrate @axe-core/playwright for Automated Accessibility Testing
Issue: #929 Status: Draft Created: 2026-04-20
1. Introduction
Overview
Integrate @axe-core/playwright into the existing Playwright E2E test suite to provide automated WCAG 2.2 Level AA accessibility scanning across all key application pages. The scans will run as part of CI, failing on critical/serious violations.
Objectives
- Install and configure `@axe-core/playwright` as a dev dependency
- Create a shared accessibility fixture and helper module
- Add dedicated a11y spec files covering all primary application pages
- Configure axe rules targeting WCAG 2.2 Level AA conformance
- Fail CI on critical/serious violations while allowing a baseline for known issues
- Surface results in Playwright HTML reports across all three browser projects
2. Research Findings
2.1 Existing Playwright Configuration
File: playwright.config.js
| Setting | Value |
|---|---|
| `testDir` | `./tests` |
| `timeout` | 60s (CI) / 90s (local) |
| `workers` | 1 (CI) / auto (local) |
| `retries` | 2 (CI) / 0 (local) |
| `fullyParallel` | `true` |
| `reporter` | `github` (CI) + `html` + optional coverage |
| `baseURL` | `http://127.0.0.1:8080` (Docker) or `http://localhost:5173` (coverage/Vite) |
Projects (6 defined):
| Project | Role | Dependencies |
|---|---|---|
| `setup` | Authentication (`auth.setup.ts`) | None |
| `security-shard-setup` | Security shard init | `setup` |
| `security-tests` | Security enforcement (Chromium-only, serial) | `setup`, `security-shard-setup` |
| `security-teardown` | Disable security modules | Conditionally active |
| `chromium` | Non-security tests | `setup` (+ `security-tests` when enabled) |
| `firefox` | Non-security tests | `setup` (+ `security-tests` when enabled) |
| `webkit` | Non-security tests | `setup` (+ `security-tests` when enabled) |
Key patterns:
- Auth state stored at `playwright/.auth/user.json` via `STORAGE_STATE`
- Coverage via `@bgotink/playwright-coverage` behind `PLAYWRIGHT_COVERAGE=1`
- Global setup in `tests/global-setup.ts` (health check, cleanup)
- All browser projects share `testMatch: /.*\.spec\.(ts|js)$/` with `testIgnore` for security dirs
2.2 Existing E2E Test Files (93 spec files)
Core tests (tests/core/):
| File | Description |
|---|---|
| `dashboard.spec.ts` | Dashboard loading, summary cards, quick actions |
| `proxy-hosts.spec.ts` | Proxy host CRUD operations |
| `navigation.spec.ts` | Menu items, sidebar, breadcrumbs, keyboard nav |
| `certificates.spec.ts` | Certificate management |
| `multi-component-workflows.spec.ts` | Cross-feature workflows |
| `data-consistency.spec.ts` | Data integrity checks |
| `authentication.spec.ts` | Login/logout, session management |
| `caddy-import/*.spec.ts` | Caddy config import (5 files) |
| `admin-onboarding.spec.ts` | First-run setup flow |
| `domain-dns-management.spec.ts` | Domain and DNS management |
Settings tests (tests/settings/): 10 spec files covering account settings, SMTP, notifications (Pushover, Ntfy, Telegram, Email, Slack), user lifecycle, user management.
Security tests (tests/security/, tests/security-enforcement/): 28+ spec files covering CrowdSec, WAF, ACL, rate limiting, audit logs, encryption, RBAC, emergency operations.
Monitoring tests (tests/monitoring/): uptime-monitoring.spec.ts, create-monitor.spec.ts.
Integration tests (tests/integration/): 6 spec files covering import flows, proxy-DNS integration, proxy-certificates, backups.
Task tests (tests/tasks/): Backups create/restore, Caddyfile import, logs viewing, long-running operations.
Other root-level tests: dns-provider-crud.spec.ts, dns-provider-types.spec.ts, manual-dns-provider.spec.ts, certificate-*.spec.ts, crowdsec-whitelist.spec.ts, modal-dropdown-triage.spec.ts.
2.3 Test Fixtures and Helpers
| File | Purpose |
|---|---|
| `tests/fixtures/test.ts` | Base `test`/`expect` re-export with conditional coverage instrumentation |
| `tests/fixtures/auth-fixtures.ts` | Extended fixtures: `adminUser`, `regularUser`, `guestUser`, `testData` (`TestDataManager`) |
| `tests/fixtures/certificates.ts` | Certificate-specific fixtures |
| `tests/fixtures/proxy-hosts.ts` | Proxy host fixtures |
| `tests/fixtures/security.ts` | Security test fixtures |
| `tests/fixtures/settings.ts` | Settings fixtures |
| `tests/fixtures/network.ts` | Network fixtures |
| `tests/fixtures/notifications.ts` | Notification test fixtures |
| `tests/fixtures/encryption.ts` | Encryption fixtures |
| `tests/fixtures/access-lists.ts` | ACL fixtures |
| `tests/fixtures/dns-providers.ts` | DNS provider fixtures |
| `tests/fixtures/test-data.ts` | Shared test data |
| `tests/utils/wait-helpers.ts` | `waitForLoadingComplete`, `waitForTableLoad` |
| `tests/utils/ui-helpers.ts` | UI interaction helpers |
| `tests/utils/api-helpers.ts` | API request helpers |
| `tests/utils/TestDataManager.ts` | Test data lifecycle management |
| `tests/constants.ts` | Shared constants (`STORAGE_STATE`) |
Import pattern: Most tests import from ../fixtures/auth-fixtures which re-exports test/expect from ./test.ts (coverage-aware).
2.4 Frontend Routes (All Navigable Pages)
Extracted from frontend/src/App.tsx:
| Route | Page Component | Auth Required | Role |
|---|---|---|---|
| `/login` | Login | No | - |
| `/setup` | Setup | No | - |
| `/accept-invite` | AcceptInvite | No | - |
| `/passthrough` | PassthroughLanding | Yes | Any |
| `/` | Dashboard | Yes | Any |
| `/proxy-hosts` | ProxyHosts | Yes | Any |
| `/remote-servers` | RemoteServers | Yes | Any |
| `/domains` | Domains | Yes | Any |
| `/certificates` | Certificates | Yes | Any |
| `/dns/providers` | DNSProviders | Yes | Any |
| `/dns/plugins` | Plugins | Yes | Any |
| `/security` | Security | Yes | Any |
| `/security/audit-logs` | AuditLogs | Yes | Any |
| `/security/access-lists` | AccessLists | Yes | Any |
| `/security/crowdsec` | CrowdSecConfig | Yes | Any |
| `/security/rate-limiting` | RateLimiting | Yes | Any |
| `/security/waf` | WafConfig | Yes | Any |
| `/security/headers` | SecurityHeaders | Yes | Any |
| `/security/encryption` | EncryptionManagement | Yes | Any |
| `/access-lists` | AccessLists | Yes | Any |
| `/uptime` | Uptime | Yes | Any |
| `/settings` | Settings > SystemSettings | Yes | admin/user |
| `/settings/system` | SystemSettings | Yes | admin/user |
| `/settings/notifications` | Notifications | Yes | admin/user |
| `/settings/smtp` | SMTPSettings | Yes | admin/user |
| `/settings/users` | UsersPage | Yes | admin |
| `/tasks` | Tasks > Backups | Yes | Any |
| `/tasks/backups` | Backups | Yes | Any |
| `/tasks/logs` | Logs | Yes | Any |
| `/tasks/import/caddyfile` | ImportCaddy | Yes | Any |
| `/tasks/import/crowdsec` | ImportCrowdSec | Yes | Any |
| `/tasks/import/npm` | ImportNPM | Yes | Any |
| `/tasks/import/json` | ImportJSON | Yes | Any |
Total unique pages to scan: ~30 authenticated + 3 unauthenticated = ~33 pages.
2.5 CI Workflows
Primary workflow: .github/workflows/e2e-tests-split.yml
Architecture (15 test jobs plus 1 build job):
- Build: Single job builds Docker image, uploads as artifact
- 3 Security Enforcement jobs (1 per browser, serial, 60min timeout): run `tests/security-enforcement/`, `tests/security/`, `tests/integration/multi-feature-workflows.spec.ts`
- 12 Non-Security jobs (4 shards x 3 browsers, parallel, 60min timeout): run `tests/core`, `tests/dns-provider-*.spec.ts`, `tests/integration`, `tests/manual-dns-provider.spec.ts`, `tests/monitoring`, `tests/settings`, `tests/tasks`
Triggered by: workflow_call, workflow_dispatch, pull_request.
Non-security test directories explicitly listed in each browser job's Playwright invocation:
tests/core tests/dns-provider-crud.spec.ts tests/dns-provider-types.spec.ts
tests/integration tests/manual-dns-provider.spec.ts tests/monitoring
tests/settings tests/tasks
2.6 Package Configuration
Current devDependencies (from package.json):
- `@playwright/test`: `^1.59.1`
- `@bgotink/playwright-coverage`: `^0.3.2`
- `dotenv`: `^17.4.2`
- `typescript`: `^6.0.3`
- `vite`: `^8.0.9`
- `vitest`: `^4.1.4`
@axe-core/playwright is not yet installed. Latest version: 4.11.2.
2.7 Lefthook Configuration
Pre-commit hooks run in parallel: file hygiene, YAML check, shellcheck, actionlint, Go lint, frontend type-check, frontend lint, semgrep. No Playwright-related hooks. No changes needed for this feature.
3. Technical Specifications
3.1 Architecture Decision: Dedicated A11y Spec Files
Decision: Create dedicated accessibility spec files in a new tests/a11y/ directory rather than embedding axe scans into existing spec files.
Rationale:
| Approach | Pros | Cons |
|---|---|---|
| Dedicated specs (chosen) | Clean separation of concerns; a11y failures don't mask functional failures; can be sharded independently; easy to skip/focus; clear ownership | Slight duplication of page navigation |
| Embedded in existing specs | No navigation duplication; tests a11y in real user flows | Mixes functional and a11y failures; harder to triage; slows all tests; harder to baseline/skip |
The dedicated approach is preferred because:
- A11y violations may be numerous initially and need a baseline — mixing them with functional tests would cause noise
- Independent sharding means a11y tests don't slow down existing functional test shards
- Clearer CI reporting: a11y failures are immediately identifiable in workflow job names
3.2 Shared Accessibility Fixture
File: tests/fixtures/a11y.ts
```typescript
// Signature (not implementation)
import { test as base } from './auth-fixtures';
import AxeBuilder from '@axe-core/playwright';

interface A11yFixtures {
  makeAxeBuilder: () => AxeBuilder;
}

export const test = base.extend<A11yFixtures>({
  makeAxeBuilder: async ({ page }, use) => {
    const makeAxeBuilder = () =>
      new AxeBuilder({ page })
        .withTags(['wcag2a', 'wcag2aa', 'wcag22aa'])
        .exclude('.chartjs-canvas'); // Exclude known third-party canvases
    await use(makeAxeBuilder);
  },
});

export { expect } from './auth-fixtures';
```
Note: The `.chartjs-canvas` selector is a placeholder. Verify against the actual DOM before implementation. A more robust approach may be to target `canvas` elements within chart container elements (e.g., `.chart-container canvas`).
Key design points:
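- Extends `auth-fixtures` to inherit `adminUser`, `regularUser`, `guestUser`, and `testData` fixtures through the full extension chain (`auth-fixtures` → `test.ts` → coverage-aware base)
- Factory function (`makeAxeBuilder`) allows per-test customization (`.exclude()`, `.disableRules()`)
- WCAG tags: `wcag2a` (Level A), `wcag2aa` (Level AA), `wcag22aa` (WCAG 2.2 AA-specific rules). axe-core tags are version-specific, so rules introduced with WCAG 2.1 are selected only by `wcag21a` and `wcag21aa`; add those tags if full WCAG 2.2 Level AA coverage is intended
- Global exclusions for known third-party elements that can't be fixed upstream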
3.3 A11y Helper Module
File: tests/utils/a11y-helpers.ts
```typescript
// Signature (not implementation)
import type { AxeResults, Result } from 'axe-core';

type ViolationImpact = 'critical' | 'serious' | 'moderate' | 'minor';

interface A11yAssertionOptions {
  /** Impacts to fail on. Default: ['critical', 'serious'] */
  failOn?: ViolationImpact[];
  /** Known violations to skip (rule IDs) */
  knownViolations?: string[];
}

/**
 * Filters axe results and returns only violations matching the fail criteria.
 * Formats violations for readable Playwright HTML report output.
 */
export function getFailingViolations(
  results: AxeResults,
  options?: A11yAssertionOptions
): Result[];

/**
 * Formats a violation for human-readable output in test reports.
 */
export function formatViolation(violation: Result): string;

/**
 * Standard assertion: expect zero critical/serious violations.
 */
export function expectNoA11yViolations(
  results: AxeResults,
  options?: A11yAssertionOptions
): void;
```
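The signatures above could be implemented roughly as follows. This is a minimal sketch, not the planned implementation: to keep it runnable on its own, it replaces the `axe-core` types with small local stand-ins (`Violation`, `ScanResults`, `ViolationNode` are illustrative names, not library types), and it throws a plain `Error` where the real helper would likely use Playwright's `expect`.

```typescript
// Local stand-ins for the axe-core `Result` / `AxeResults` types (assumption:
// only the fields used here). The real helper would import them from 'axe-core'.
type ViolationImpact = 'critical' | 'serious' | 'moderate' | 'minor';

interface ViolationNode {
  target: string[];
}

interface Violation {
  id: string;
  impact?: ViolationImpact | null;
  help: string;
  helpUrl: string;
  nodes: ViolationNode[];
}

interface ScanResults {
  violations: Violation[];
}

interface A11yAssertionOptions {
  failOn?: ViolationImpact[];
  knownViolations?: string[];
}

const DEFAULT_FAIL_ON: ViolationImpact[] = ['critical', 'serious'];

// Keep only violations whose impact is in `failOn` and whose rule ID is not baselined.
function getFailingViolations(
  results: ScanResults,
  options: A11yAssertionOptions = {},
): Violation[] {
  const failOn = options.failOn ?? DEFAULT_FAIL_ON;
  const known = new Set(options.knownViolations ?? []);
  return results.violations.filter(
    (v) => v.impact != null && failOn.includes(v.impact) && !known.has(v.id),
  );
}

// One block per violation: impact, rule ID, help text, affected selectors, docs link.
function formatViolation(violation: Violation): string {
  const targets = violation.nodes.map((n) => `  - ${n.target.join(' ')}`).join('\n');
  return [
    `[${violation.impact ?? 'unknown'}] ${violation.id}: ${violation.help}`,
    targets,
    `  See: ${violation.helpUrl}`,
  ].join('\n');
}

// Throws with a readable report when any failing violation remains.
function expectNoA11yViolations(results: ScanResults, options?: A11yAssertionOptions): void {
  const failing = getFailingViolations(results, options);
  if (failing.length > 0) {
    const report = failing.map(formatViolation).join('\n\n');
    throw new Error(`${failing.length} accessibility violation(s):\n\n${report}`);
  }
}
```

With this shape, `moderate` and `minor` violations never reach the assertion, which matches the "reported but non-blocking" behavior described for those impact levels.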
3.4 Known Violations Baseline
File: tests/a11y/a11y-baseline.ts
A centralized baseline of known violations that should not block CI. This enables gradual remediation.
```typescript
// Signature (not implementation)
interface BaselineEntry {
  ruleId: string;
  pages: string[];    // Route patterns where this rule is expected to fail
  reason: string;     // Why this is baselined
  ticket?: string;    // Tracking issue for remediation
  expiresAt?: string; // ISO date for periodic review (e.g., '2026-07-01')
}

// Initially empty: no known violations
export const A11Y_BASELINE: BaselineEntry[] = [];
```
Baseline review process: Baseline entries should be periodically reviewed. Use the optional `expiresAt` field to flag entries for re-evaluation. Entries past their expiration date should be investigated and either remediated or renewed with justification.
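The review step could be supported by a small check that lists expired entries. The sketch below is only an illustration: `findExpiredBaselineEntries` is a hypothetical helper that the spec does not define, and it assumes `expiresAt` holds an ISO date string as in the interface above.

```typescript
interface BaselineEntry {
  ruleId: string;
  pages: string[];
  reason: string;
  ticket?: string;
  expiresAt?: string;
}

// Hypothetical helper: returns the entries whose `expiresAt` date lies before `now`.
function findExpiredBaselineEntries(
  baseline: BaselineEntry[],
  now: Date = new Date(),
): BaselineEntry[] {
  return baseline.filter((entry) => {
    if (!entry.expiresAt) return false; // no expiry set, so never flagged
    const expiry = new Date(entry.expiresAt);
    // Unparseable dates are skipped here; a stricter check could report them instead.
    return !Number.isNaN(expiry.getTime()) && expiry.getTime() < now.getTime();
  });
}
```

Such a check could run in a unit test or a scheduled job, so that stale entries surface without blocking unrelated pull requests.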
3.5 axe-core Configuration
| Setting | Value | Rationale |
|---|---|---|
| `withTags` | `['wcag2a', 'wcag2aa', 'wcag22aa']` | Targets WCAG 2.2 Level AA conformance per project a11y instructions |
| Fail threshold | `critical` + `serious` impacts | Blocks CI on high-impact violations only |
| `moderate` / `minor` | Reported but non-blocking | Allows gradual improvement |
| Global excludes | Third-party canvases (Chart.js), Toaster containers | Cannot be fixed in application code |
3.6 Reporter Integration
Axe results will be surfaced in the Playwright HTML report via:
- `test.info().attach()`: Attach violation details as JSON artifacts to each test
- Formatted assertion messages: `expect(failingViolations).toEqual([])` with a descriptive message showing rule ID, impact, affected nodes, and fix suggestions
- Traces: Standard `on-first-retry` trace capture applies to a11y tests too
3.7 Pages to Scan
Priority Tier 1 (most user-facing, first commit):
| Route | Description |
|---|---|
| `/login` | Login page (unauthenticated; requires `storageState: { cookies: [], origins: [] }` to prevent redirect) |
| `/` | Dashboard |
| `/proxy-hosts` | Proxy host management |
| `/certificates` | Certificate management |
| `/dns/providers` | DNS provider management |
| `/settings` | System settings |
| `/settings/users` | User management |
Priority Tier 2 (second commit):
| Route | Description |
|---|---|
| `/security` | Security dashboard |
| `/security/access-lists` | Access list management |
| `/security/crowdsec` | CrowdSec configuration |
| `/security/waf` | WAF configuration |
| `/security/rate-limiting` | Rate limiting |
| `/security/headers` | Security headers |
| `/security/encryption` | Encryption management |
| `/security/audit-logs` | Audit logs |
| `/uptime` | Uptime monitoring |
Priority Tier 3 (third commit):
| Route | Description |
|---|---|
| `/tasks/backups` | Backup management |
| `/tasks/logs` | Log viewer |
| `/tasks/import/caddyfile` | Caddyfile import |
| `/tasks/import/crowdsec` | CrowdSec import |
| `/tasks/import/npm` | NPM import |
| `/tasks/import/json` | JSON import |
| `/domains` | Domain management |
| `/remote-servers` | Remote server management |
| `/settings/notifications` | Notification settings |
| `/settings/smtp` | SMTP configuration |
| `/setup` | Initial setup page (unauthenticated; requires `storageState: { cookies: [], origins: [] }`) |
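The three tiers could also be captured as data, which may help keep the spec files and the acceptance count in sync. The manifest below is a hypothetical sketch: `A11yRoute`, `A11Y_ROUTES`, and `routesForTier` are illustrative names that are not part of the planned file layout, and the entries simply mirror the tier tables in this section.

```typescript
type Tier = 1 | 2 | 3;

interface A11yRoute {
  path: string;
  tier: Tier;
  /** false for pages that must be scanned with an empty storageState */
  authenticated: boolean;
}

// Small constructor so each table row stays on one line.
const route = (path: string, tier: Tier, authenticated = true): A11yRoute => ({
  path,
  tier,
  authenticated,
});

// Hypothetical manifest mirroring the Tier 1-3 tables.
const A11Y_ROUTES: A11yRoute[] = [
  route('/login', 1, false),
  route('/', 1),
  route('/proxy-hosts', 1),
  route('/certificates', 1),
  route('/dns/providers', 1),
  route('/settings', 1),
  route('/settings/users', 1),
  route('/security', 2),
  route('/security/access-lists', 2),
  route('/security/crowdsec', 2),
  route('/security/waf', 2),
  route('/security/rate-limiting', 2),
  route('/security/headers', 2),
  route('/security/encryption', 2),
  route('/security/audit-logs', 2),
  route('/uptime', 2),
  route('/tasks/backups', 3),
  route('/tasks/logs', 3),
  route('/tasks/import/caddyfile', 3),
  route('/tasks/import/crowdsec', 3),
  route('/tasks/import/npm', 3),
  route('/tasks/import/json', 3),
  route('/domains', 3),
  route('/remote-servers', 3),
  route('/settings/notifications', 3),
  route('/settings/smtp', 3),
  route('/setup', 3, false),
];

// Routes belonging to one rollout tier, in table order.
function routesForTier(tier: Tier): A11yRoute[] {
  return A11Y_ROUTES.filter((r) => r.tier === tier);
}
```

Counting the entries gives 7, 9, and 11 routes for Tiers 1 to 3, i.e. 27 routes in total across the three tables.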
4. Implementation Plan
Phase 1: Infrastructure Setup
Commit 1: Install dependency and create shared fixtures/helpers
Files created/modified:
| File | Action | Description |
|---|---|---|
| `package.json` | Modified | Add `@axe-core/playwright` to `devDependencies` |
| `package-lock.json` | Modified | Lockfile update |
| `tests/fixtures/a11y.ts` | Created | Shared a11y test fixture with `makeAxeBuilder` factory |
| `tests/utils/a11y-helpers.ts` | Created | `getFailingViolations()`, `formatViolation()`, `expectNoA11yViolations()` |
| `tests/a11y/a11y-baseline.ts` | Created | Empty baseline array (initial state: no known violations) |
Validation gate: npm ci succeeds; npx tsc --noEmit passes on new files; imports resolve correctly.
Phase 2: Tier 1 A11y Specs
Commit 2: Add accessibility tests for Tier 1 pages (login, dashboard, proxy-hosts, certificates, dns, settings, users)
Files created:
| File | Description |
|---|---|
| `tests/a11y/login.a11y.spec.ts` | Scans `/login` (unauthenticated; uses `test.use({ storageState: { cookies: [], origins: [] } })`) |
| `tests/a11y/dashboard.a11y.spec.ts` | Scans `/` |
| `tests/a11y/proxy-hosts.a11y.spec.ts` | Scans `/proxy-hosts` |
| `tests/a11y/certificates.a11y.spec.ts` | Scans `/certificates` |
| `tests/a11y/dns-providers.a11y.spec.ts` | Scans `/dns/providers` |
| `tests/a11y/settings.a11y.spec.ts` | Scans `/settings` and `/settings/users` |
Test structure (each authenticated spec file follows this pattern):
Authenticated pages rely on the stored auth state from storageState: STORAGE_STATE (configured in playwright.config.js via the setup project), matching the pattern used by all existing non-security tests. No manual loginUser() call is needed.
```typescript
import { test, expect } from '../fixtures/a11y';
import { waitForLoadingComplete } from '../utils/wait-helpers';
import { expectNoA11yViolations } from '../utils/a11y-helpers';

test.describe('Accessibility: Dashboard', () => {
  test.describe.configure({ mode: 'parallel' });

  test.beforeEach(async ({ page }) => {
    await page.goto('/');
    await waitForLoadingComplete(page);
  });

  test('dashboard has no critical a11y violations', async ({ page, makeAxeBuilder }) => {
    const results = await makeAxeBuilder().analyze();

    // attach() is asynchronous; await it so the attachment is written before the test ends
    await test.info().attach('a11y-results', {
      body: JSON.stringify(results.violations, null, 2),
      contentType: 'application/json',
    });

    expectNoA11yViolations(results);
  });
});
```
Unauthenticated page pattern (/login, /setup, /accept-invite):
```typescript
import { test, expect } from '../fixtures/a11y';
import { waitForLoadingComplete } from '../utils/wait-helpers';
import { expectNoA11yViolations } from '../utils/a11y-helpers';

// Clear stored auth state to prevent redirect to dashboard
test.use({ storageState: { cookies: [], origins: [] } });

test.describe('Accessibility: Login', () => {
  test.describe.configure({ mode: 'parallel' });

  test('login page has no critical a11y violations', async ({ page, makeAxeBuilder }) => {
    await page.goto('/login');
    await waitForLoadingComplete(page);

    const results = await makeAxeBuilder().analyze();

    // attach() is asynchronous; await it so the attachment is written before the test ends
    await test.info().attach('a11y-results', {
      body: JSON.stringify(results.violations, null, 2),
      contentType: 'application/json',
    });

    expectNoA11yViolations(results);
  });
});
```
Parallel mode: Each a11y test is an independent page scan with no shared state, so `test.describe.configure({ mode: 'parallel' })` should be used in all a11y describe blocks to maximize throughput.
**Validation gate**: Tests run locally against Docker container. All pass or fail with only baseline-allowed violations. Verify HTML report contains a11y result attachments.
### Phase 3: Tier 2 A11y Specs
**Commit 3**: Add accessibility tests for security and monitoring pages
**Files created**:
| File | Description |
|------|-------------|
| `tests/a11y/security.a11y.spec.ts` | Scans `/security`, `/security/access-lists`, `/security/crowdsec`, `/security/waf`, `/security/rate-limiting`, `/security/headers`, `/security/encryption`, `/security/audit-logs` |
| `tests/a11y/uptime.a11y.spec.ts` | Scans `/uptime` |
**Validation gate**: All new tests pass locally. Verify cross-browser with `--project=firefox --project=chromium --project=webkit`.
### Phase 4: Tier 3 A11y Specs
**Commit 4**: Add accessibility tests for tasks, domains, remote servers, notifications, SMTP, setup pages
**Files created**:
| File | Description |
|------|-------------|
| `tests/a11y/tasks.a11y.spec.ts` | Scans `/tasks/backups`, `/tasks/logs`, `/tasks/import/caddyfile`, `/tasks/import/crowdsec`, `/tasks/import/npm`, `/tasks/import/json` |
| `tests/a11y/domains.a11y.spec.ts` | Scans `/domains`, `/remote-servers` |
| `tests/a11y/notifications.a11y.spec.ts` | Scans `/settings/notifications`, `/settings/smtp` |
| `tests/a11y/setup.a11y.spec.ts` | Scans `/setup` (unauthenticated — uses `test.use({ storageState: { cookies: [], origins: [] } })`; requires fresh state or skips if already set up) |
**Validation gate**: Full local run with all 3 browsers.
### Phase 5: CI Integration
**Commit 5**: Add `tests/a11y/` to CI workflow non-security shard test paths
**Files modified**:
| File | Change |
|------|--------|
| `.github/workflows/e2e-tests-split.yml` | Add `tests/a11y` to the non-security test directory list in all three browser jobs (`e2e-chromium`, `e2e-firefox`, `e2e-webkit`) |
The change in each browser job's Playwright invocation adds `tests/a11y` to the directory list:
```bash
# Before
npx playwright test \
  --project=chromium \
  --shard=${{ matrix.shard }}/${{ matrix.total-shards }} \
  tests/core tests/dns-provider-crud.spec.ts tests/dns-provider-types.spec.ts \
  tests/integration tests/manual-dns-provider.spec.ts tests/monitoring \
  tests/settings tests/tasks

# After
npx playwright test \
  --project=chromium \
  --shard=${{ matrix.shard }}/${{ matrix.total-shards }} \
  tests/a11y tests/core tests/dns-provider-crud.spec.ts tests/dns-provider-types.spec.ts \
  tests/integration tests/manual-dns-provider.spec.ts tests/monitoring \
  tests/settings tests/tasks
```
Validation gate: Push to a feature branch. Verify all 15 CI jobs pass (or fail only on genuine a11y issues). Verify a11y tests appear in uploaded Playwright HTML report artifacts.
Shard timing monitoring: After rollout, monitor shard execution times across the 12 non-security jobs. If a11y tests create significant imbalance (one shard consistently slower), consider a dedicated a11y CI job with its own sharding. This is a "watch and react" item — no preemptive action needed.
Phase 6: Documentation
Commit 6: Add documentation for the a11y testing setup
Files created/modified:
| File | Action | Description |
|---|---|---|
| `tests/a11y/README.md` | Created | Documents how to run a11y tests, add new pages, manage the baseline, interpret results |
5. CI Integration Details
5.1 Where A11y Tests Run
A11y tests join the non-security shard jobs. They:
- Run across all 3 browsers (Chromium, Firefox, WebKit)
- Are distributed across the 4 shards per browser via Playwright's `--shard` flag
- Use the same Docker container (Cerberus OFF)
- Share the same auth setup dependency
5.2 Sharding Impact
Adding ~10 spec files to the non-security pool (currently ~50 spec files sharded 4 ways per browser) increases the per-shard workload by ~20%. Each axe scan takes 2-5 seconds per page, so the total added time per shard is approximately 10-30 seconds — within acceptable tolerance given the 60-minute timeout.
5.3 Failure Behavior
| Impact Level | CI Behavior |
|---|---|
| `critical` | Fails CI: test assertion fails, shard exits non-zero |
| `serious` | Fails CI: test assertion fails, shard exits non-zero |
| `moderate` | Reported only: attached to HTML report as JSON, does not fail |
| `minor` | Reported only: attached to HTML report as JSON, does not fail |
5.4 Baseline Workflow
When a new genuine violation is discovered that cannot be immediately fixed:
- Create a GitHub issue tracking the remediation
- Add the rule ID + page pattern to `tests/a11y/a11y-baseline.ts` with the issue reference
- The `expectNoA11yViolations()` helper filters out baselined violations
- When remediation is complete, remove the baseline entry; CI will then enforce the fix
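Because the helper's `knownViolations` option takes rule IDs while baseline entries are scoped to pages, something has to resolve the baseline for the route under test. One possible bridge is sketched below. The pattern syntax is an assumption (exact match, or prefix match when the pattern ends in `*`), since the spec does not define how route patterns are written, and `matchesPage` and `baselinedRuleIds` are hypothetical names.

```typescript
interface BaselineEntry {
  ruleId: string;
  pages: string[];
  reason: string;
  ticket?: string;
  expiresAt?: string;
}

// Assumed pattern syntax: exact route, or a prefix when the pattern ends in '*'.
function matchesPage(pattern: string, route: string): boolean {
  return pattern.endsWith('*')
    ? route.startsWith(pattern.slice(0, -1))
    : pattern === route;
}

// Rule IDs that are baselined for the given route.
function baselinedRuleIds(baseline: BaselineEntry[], route: string): string[] {
  return baseline
    .filter((entry) => entry.pages.some((pattern) => matchesPage(pattern, route)))
    .map((entry) => entry.ruleId);
}
```

A spec could then call `expectNoA11yViolations(results, { knownViolations: baselinedRuleIds(A11Y_BASELINE, '/proxy-hosts') })`, so a rule baselined on one page still fails on every other page.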
6. Edge Cases and Considerations
6.1 Performance
- axe-core injects a script (~500KB) into each page; this happens per `analyze()` call
- Expected overhead per scan: 2-5 seconds
- Total overhead for ~33 pages across 3 browsers: ~5-8 minutes of additional CI time distributed across 12 non-security shards
- Mitigation: Tests use `waitForLoadingComplete()` to ensure pages are fully rendered before scanning, avoiding incomplete DOM analysis
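The estimate can be reproduced with a quick calculation. The sketch below counts scan time only (navigation and loading waits are excluded) and assumes scans are spread evenly across jobs, which sharding does not guarantee; `estimateScanOverhead` is an illustrative helper, not part of the test suite.

```typescript
interface OverheadEstimate {
  totalScans: number;
  totalSeconds: [number, number];  // [best case, worst case]
  perJobSeconds: [number, number]; // average per CI job, assuming an even split
}

function estimateScanOverhead(
  pages: number,
  browsers: number,
  shardsPerBrowser: number,
  secondsPerScan: [number, number],
): OverheadEstimate {
  const totalScans = pages * browsers;
  const jobs = browsers * shardsPerBrowser;
  const [min, max] = secondsPerScan;
  return {
    totalScans,
    totalSeconds: [totalScans * min, totalScans * max],
    perJobSeconds: [(totalScans * min) / jobs, (totalScans * max) / jobs],
  };
}

// 33 pages, 3 browsers, 4 shards per browser, 2-5 seconds per scan
const estimate = estimateScanOverhead(33, 3, 4, [2, 5]);
console.log(estimate);
// totalScans: 99, totalSeconds: [198, 495], perJobSeconds: [16.5, 41.25]
```

That is roughly 3.3 to 8.3 minutes of scan time in total, or about 17 to 41 seconds per job, which is the same order of magnitude as the figures quoted in this document.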
6.2 Dynamic Content and Loading States
- All scans MUST wait for loading states to complete (`waitForLoadingComplete()`)
- Pages with lazy-loaded content (modals, dropdowns) should be scanned in their default state first; modal-specific scans can be added as follow-up
- The `waitForTableLoad()` helper should be used for pages with data tables (proxy hosts, certificates, etc.)
- Async data pages: Pages that fetch data asynchronously (proxy-hosts, certificates, DNS providers, uptime monitors) should use `waitForTableLoad()` or equivalent waits to ensure the data-populated DOM is scanned, not the loading skeleton
6.3 Browser-Specific Behavior
axe-core produces consistent results across browsers because it analyzes the DOM/ARIA tree, not rendered pixels. However:
- WebKit may have minor differences in ARIA attribute support
- Running across all 3 browsers catches rendering-layer a11y issues (e.g., focus visibility) that axe cannot detect
- If a violation appears in one browser but not others, investigate before baselining
6.4 Third-Party Components
| Component | Strategy |
|---|---|
| Chart.js canvases | Exclude via .exclude('.chartjs-canvas') or equivalent selector — canvas elements have inherent a11y limitations |
| React Hot Toast | Exclude toaster container — controlled by library, has built-in ARIA |
| Code editors (if any) | Exclude via selector — third-party code editors have known a11y gaps |
6.5 Gradual Rollout Strategy
To avoid blocking CI with a flood of violations on first merge:
- Commit 1-4: Build the infrastructure and specs. Run locally, observe results.
- Before Commit 5 (CI integration): Populate `a11y-baseline.ts` with any critical/serious violations found during local testing. Create tracking issues for each.
- Commit 5: CI integration. All tests pass because known violations are baselined.
- Post-merge: Remediate baselined violations one by one. As each is fixed, remove from baseline. CI enforces the fix from that point forward.
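Populating the baseline before Commit 5 could be partly automated from the local scan output. The following sketch is one hypothetical way to do it: `ObservedViolation` and `draftBaselineEntries` are illustrative names, and the generated `reason` is a placeholder that a person still has to fill in alongside the tracking issue.

```typescript
type ViolationImpact = 'critical' | 'serious' | 'moderate' | 'minor';

// One violation seen on one route during a local run (illustrative shape).
interface ObservedViolation {
  route: string;
  ruleId: string;
  impact: ViolationImpact;
}

interface BaselineEntry {
  ruleId: string;
  pages: string[];
  reason: string;
  ticket?: string;
  expiresAt?: string;
}

// Groups blocking (critical/serious) violations by rule ID into draft baseline entries.
function draftBaselineEntries(observed: ObservedViolation[]): BaselineEntry[] {
  const pagesByRule = new Map<string, Set<string>>();
  for (const v of observed) {
    if (v.impact !== 'critical' && v.impact !== 'serious') continue; // non-blocking
    const pages = pagesByRule.get(v.ruleId) ?? new Set<string>();
    pages.add(v.route);
    pagesByRule.set(v.ruleId, pages);
  }
  return Array.from(pagesByRule.entries())
    .sort(([a], [b]) => a.localeCompare(b))
    .map(([ruleId, pages]) => ({
      ruleId,
      pages: Array.from(pages).sort(),
      reason: 'TODO: explain why this cannot be fixed before CI integration',
    }));
}
```

Drafting entries this way keeps the baseline limited to violations that were actually observed, rather than disabling rules globally.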
7. Files Requiring Review for Updates
| File | Check | Action Required |
|---|---|---|
| `.gitignore` | No new generated files outside existing patterns | No change needed |
| `codecov.yml` | a11y tests are Playwright specs, already covered by E2E patterns | No change needed |
| `.dockerignore` | `tests/` is not copied into Docker image | No change needed |
| `Dockerfile` | Tests are not part of the Docker build | No change needed |
| `lefthook.yml` | No pre-commit a11y hooks needed | No change needed |
| `playwright.config.js` | a11y specs match existing `testMatch` and `testIgnore` patterns. The `tests/a11y/` directory is NOT in any ignore pattern. | No change needed |
| `tsconfig.json` (if any) | Ensure `@axe-core/playwright` types resolve | Verify: `@axe-core/playwright` ships with TypeScript declarations |
8. Commit Slicing Strategy
Approach: Single PR with 6 ordered logical commits.
Trigger reasons: Single feature scope, low cross-domain risk, incremental validation.
Commit 1: Infrastructure — install dependency and create shared fixtures
- Scope: Package installation, fixture creation, helper module, baseline file
- Files: `package.json`, `package-lock.json`, `tests/fixtures/a11y.ts`, `tests/utils/a11y-helpers.ts`, `tests/a11y/a11y-baseline.ts`
- Dependencies: None
- Validation: `npm ci`, `npx tsc --noEmit`, import resolution
Commit 2: Tier 1 a11y specs — core pages
- Scope: A11y tests for login, dashboard, proxy-hosts, certificates, DNS, settings
- Files: `tests/a11y/login.a11y.spec.ts`, `tests/a11y/dashboard.a11y.spec.ts`, `tests/a11y/proxy-hosts.a11y.spec.ts`, `tests/a11y/certificates.a11y.spec.ts`, `tests/a11y/dns-providers.a11y.spec.ts`, `tests/a11y/settings.a11y.spec.ts`
- Dependencies: Commit 1
- Validation: `npx playwright test tests/a11y/ --project=firefox`
Commit 3: Tier 2 a11y specs — security and monitoring pages
- Scope: A11y tests for security suite and uptime monitoring
- Files: `tests/a11y/security.a11y.spec.ts`, `tests/a11y/uptime.a11y.spec.ts`
- Dependencies: Commit 1
- Validation: `npx playwright test tests/a11y/ --project=firefox`
Commit 4: Tier 3 a11y specs — tasks, domains, notifications, setup
- Scope: Remaining page coverage
- Files: `tests/a11y/tasks.a11y.spec.ts`, `tests/a11y/domains.a11y.spec.ts`, `tests/a11y/notifications.a11y.spec.ts`, `tests/a11y/setup.a11y.spec.ts`
- Dependencies: Commit 1
- Validation: Full a11y suite: `npx playwright test tests/a11y/ --project=chromium --project=firefox --project=webkit`
Commit 5: CI integration — add a11y tests to workflow
- Scope: Add `tests/a11y` to non-security shard test paths in CI workflow
- Files: `.github/workflows/e2e-tests-split.yml`
- Dependencies: Commits 1-4 (all specs must pass first)
- Validation: Push to feature branch; all 15 CI jobs pass
Commit 6: Documentation
- Scope: README for a11y test directory
- Files: `tests/a11y/README.md`
- Dependencies: Commits 1-5
- Validation: Markdown lint passes
Rollback
If the PR causes CI instability post-merge:
- Immediate: Revert Commit 5 only (removes `tests/a11y` from CI paths); a11y tests still exist but don't run in CI
- Investigation: Run a11y tests locally to identify flaky or environment-dependent failures
- Resolution: Fix failures, re-add to CI
9. Acceptance Criteria
| # | Criterion | Verification |
|---|---|---|
| 1 | `@axe-core/playwright` installed as devDependency | `npm ls @axe-core/playwright` returns version |
| 2 | Shared fixture provides `makeAxeBuilder` factory | `tests/fixtures/a11y.ts` exports correctly |
| 3 | A11y scans cover all ~33 navigable pages | Count tests in `tests/a11y/` matches page list |
| 4 | WCAG 2.2 AA tags configured | `withTags(['wcag2a', 'wcag2aa', 'wcag22aa'])` in fixture |
| 5 | CI fails on critical/serious violations | Inject a test violation, verify CI fails |
| 6 | Results visible in Playwright HTML report | Open report, verify JSON attachments on a11y tests |
| 7 | Works across Chromium, Firefox, WebKit | All 3 browser projects pass in CI |
| 8 | Baseline mechanism for known violations | Baselined violations do not fail CI |
| 9 | CI workflow updated to include `tests/a11y` | Verify in `.github/workflows/e2e-tests-split.yml` |
| 10 | No existing tests broken | All non-a11y CI jobs still pass |
10. Risks and Mitigations
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| High number of initial violations blocks merge | High | Medium | Baseline mechanism (Section 6.5); populate before CI integration |
| axe scan flakiness in CI | Low | Medium | Retries (already configured: 2 in CI); waitForLoadingComplete before scans |
| Performance degradation of CI | Low | Low | ~10-30s additional per shard; well within 60min timeout |
| WebKit axe-core compatibility | Low | Low | axe-core is DOM-based, browser-agnostic; monitor for edge cases |
| Third-party component violations | Medium | Low | Global exclusions in fixture; documented in baseline |