---
post_title: "E2E Test Remediation Plan"
author1: "Charon Team"
post_slug: "e2e-test-remediation-plan"
microsoft_alias: "charon-team"
featured_image: "https://wikid82.github.io/charon/assets/images/featured/charon.png"
categories: ["testing"]
tags: ["playwright", "e2e", "remediation", "security"]
ai_note: "true"
summary: "Phased remediation plan for Charon Playwright E2E tests, covering
  inventory, dependencies, runtime estimates, and quick start commands."
post_date: "2026-01-28"
---

## 1. Introduction

This plan replaces the current spec with a comprehensive, phased remediation
strategy for the Playwright E2E test suite under [tests](tests). The goal is to
stabilize execution, align dependencies, and sequence remediation work so that
core management flows, security controls, and integration workflows become
reliable in Docker-based E2E runs.

## 2. Research Findings

### 2.1 Test Harness and Global Dependencies

- Global setup and teardown are enforced by
  [tests/global-setup.ts](tests/global-setup.ts),
  [tests/auth.setup.ts](tests/auth.setup.ts), and
  [tests/security-teardown.setup.ts](tests/security-teardown.setup.ts).
- Global setup validates the emergency token, checks health endpoints, and
  resets security settings, which impacts all security-enforcement suites.
- Multiple suites depend on the emergency server (port 2020) and Cerberus
  modules with explicit admin whitelist configuration.

### 2.2 Test Inventory and Feature Areas

- Core management flows: authentication, navigation, dashboard, proxy hosts,
  certificates, access lists in [tests/core](tests/core).
- DNS providers and ACME workflows:
  [tests/dns-provider-crud.spec.ts](tests/dns-provider-crud.spec.ts),
  [tests/dns-provider-types.spec.ts](tests/dns-provider-types.spec.ts),
  [tests/manual-dns-provider.spec.ts](tests/manual-dns-provider.spec.ts).
- Monitoring: uptime and log streaming in
  [tests/monitoring](tests/monitoring).
- Settings: system, account, SMTP, notifications, encryption, user management
  in [tests/settings](tests/settings).
- Tasks and imports: backups, Caddyfile import flows, CrowdSec import, and log
  viewing in [tests/tasks](tests/tasks).
- Security UI: dashboard, WAF, CrowdSec, headers, rate limiting, and audit logs
  in [tests/security](tests/security).
- Security enforcement: ACL, WAF, rate limits, CrowdSec, emergency token, and
  break-glass recovery in [tests/security-enforcement](tests/security-enforcement).
- Integration workflows: cross-feature scenarios in
  [tests/integration](tests/integration).
- Browser-specific regressions for import flows in
  [tests/webkit-specific](tests/webkit-specific) and
  [tests/firefox-specific](tests/firefox-specific).
- Debug and diagnostics: certificates and Caddy import debug coverage in
  [tests/debug/certificates-debug.spec.ts](tests/debug/certificates-debug.spec.ts),
  [tests/tasks/caddy-import-gaps.spec.ts](tests/tasks/caddy-import-gaps.spec.ts),
  [tests/tasks/caddy-import-cross-browser.spec.ts](tests/tasks/caddy-import-cross-browser.spec.ts),
  and [tests/debug](tests/debug).
- UI triage and regression coverage: dropdown/modal coverage in
  [tests/modal-dropdown-triage.spec.ts](tests/modal-dropdown-triage.spec.ts) and
  [tests/proxy-host-dropdown-fix.spec.ts](tests/proxy-host-dropdown-fix.spec.ts).
- Shared utilities validation: wait helpers in
  [tests/utils/wait-helpers.spec.ts](tests/utils/wait-helpers.spec.ts).

### 2.3 Dependency and Ordering Constraints

- The security-enforcement suite assumes Cerberus can be toggled on, and its
  final tests intentionally restore admin whitelist state (see
  [tests/security-enforcement/zzzz-break-glass-recovery.spec.ts](tests/security-enforcement/zzzz-break-glass-recovery.spec.ts)).
- Admin whitelist blocking is designed to run at the end of the enforcement
  suite using a `zzz-` prefix, so that only the `zzzz-` break-glass recovery
  spec sorts after it (see
  [tests/security-enforcement/zzz-admin-whitelist-blocking.spec.ts](tests/security-enforcement/zzz-admin-whitelist-blocking.spec.ts)).
- Emergency server tests depend on port 2020 availability (see
  [tests/security-enforcement/emergency-server](tests/security-enforcement/emergency-server)).
- Some import suites use real APIs and TestDataManager cleanup; others mock
  requests. Remediation must avoid mixing mocked and real flows in a single
  phase without clear isolation.

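The prefix convention depends on file-name sorting: with parallelism disabled, Playwright runs test files in alphabetical order, so `zzz-` sorts after ordinary spec names and `zzzz-` sorts after `zzz-`. The following is a minimal sketch of that sort. The two `zzz`-prefixed names come from this plan, while the other two file names are hypothetical placeholders.

```typescript
// Spec files in the security-enforcement directory.
// The two zzz-prefixed names are from this plan; the others are hypothetical.
const specFiles: string[] = [
  "zzzz-break-glass-recovery.spec.ts",
  "waf-enforcement.spec.ts", // hypothetical
  "zzz-admin-whitelist-blocking.spec.ts",
  "acl-enforcement.spec.ts", // hypothetical
];

// Sort by plain code-unit comparison, which matches alphabetical
// ordering for these lowercase ASCII names.
function executionOrder(files: string[]): string[] {
  return [...files].sort();
}

const ordered = executionOrder(specFiles);
console.log(ordered.join("\n"));
```

The sketch only models the ordering. It says nothing about what happens when several workers pick up files concurrently, which is why the serial `--workers=1` run matters for this suite.
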
### 2.4 Runtime and Flake Hotspots

- Security-enforcement suites include extended retries, network propagation
  delays, and rate limit loops.
- Import debug and gap-coverage suites perform real uploads, data creation, and
  commit flows, making them sensitive to backend state and Caddy reload timing.
- Monitoring WebSocket tests require stable log streaming state.

## 3. Technical Specifications

### 3.1 Test Grouping and Shards

- **Foundation:** global setup, auth storage state, security teardown.
- **Core UI:** authentication, navigation, dashboard, proxy hosts, certificates,
  access lists.
- **Settings:** system, account, SMTP, notifications, encryption, users.
- **Tasks:** backups, logs, Caddyfile import, CrowdSec import.
- **Monitoring:** uptime monitoring and real-time logs.
- **Security UI:** Cerberus dashboard, WAF config, headers, rate limiting,
  CrowdSec config, audit logs.
- **Security Enforcement:** ACL/WAF/CrowdSec/rate limit enforcement, emergency
  token and break-glass recovery, admin whitelist blocking.
- **Integration:** proxy + cert, proxy + DNS, backup restore, import workflows,
  multi-feature workflows.
- **Browser-specific:** WebKit and Firefox import regressions.
- **Debug/POC:** diagnostics and investigation suites (Caddy import debug).

### 3.2 Dependency Graph (High-Level)

```mermaid
flowchart TD
    A[global-setup + auth.setup] --> B[Core UI + Settings]
    A --> C[Tasks + Monitoring]
    A --> D[Security UI]
    D --> E[Security Enforcement]
    E --> F[Break-Glass Recovery]
    B --> G[Integration Workflows]
    C --> G
    G --> H[Browser-specific Suites]
```

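To check that a proposed run order respects this graph, the edges can be encoded as data and sorted topologically. The sketch below uses Kahn's algorithm over the edges shown above. The function name and data layout are illustrative assumptions, not part of the Charon codebase.

```typescript
// Edges copied from the dependency graph: [prerequisite, dependent].
const edges: Array<[string, string]> = [
  ["global-setup + auth.setup", "Core UI + Settings"],
  ["global-setup + auth.setup", "Tasks + Monitoring"],
  ["global-setup + auth.setup", "Security UI"],
  ["Security UI", "Security Enforcement"],
  ["Security Enforcement", "Break-Glass Recovery"],
  ["Core UI + Settings", "Integration Workflows"],
  ["Tasks + Monitoring", "Integration Workflows"],
  ["Integration Workflows", "Browser-specific Suites"],
];

// Kahn's algorithm: repeatedly emit nodes whose prerequisites are all done.
function topologicalOrder(graph: Array<[string, string]>): string[] {
  const indegree: Record<string, number> = {};
  const dependents: Record<string, string[]> = {};
  for (const [from, to] of graph) {
    indegree[from] = indegree[from] ?? 0;
    indegree[to] = (indegree[to] ?? 0) + 1;
    dependents[from] = (dependents[from] ?? []).concat(to);
  }
  const ready = Object.keys(indegree).filter((name) => indegree[name] === 0);
  const order: string[] = [];
  while (ready.length > 0) {
    const node = ready.shift() as string;
    order.push(node);
    for (const next of dependents[node] ?? []) {
      indegree[next] -= 1;
      if (indegree[next] === 0) ready.push(next);
    }
  }
  // Fewer emitted nodes than known nodes means a cycle blocked progress.
  if (order.length !== Object.keys(indegree).length) {
    throw new Error("cycle detected in dependency graph");
  }
  return order;
}

const runOrder = topologicalOrder(edges);
console.log(runOrder.join(" -> "));
```

Any order that respects the edges is valid. The phased plan in this document is one such order, and the one this function prints may differ from it.
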
### 3.3 Runtime Estimates (Docker Mode)

| Group | Suite Examples | Expected Runtime | Prerequisites |
| --- | --- | --- | --- |
| Foundation | global setup + auth | 1-2 min | Docker E2E container, emergency token |
| Core UI | core specs | 6-10 min | Auth storage state, clean data |
| Settings | settings specs | 6-10 min | Auth storage state |
| Tasks | backups/import/logs | 10-16 min | Auth storage state, API mocks and real flows |
| Monitoring | monitoring specs | 5-8 min | WebSocket stability |
| Security UI | security specs | 10-14 min | Cerberus enabled, admin whitelist |
| Security Enforcement | enforcement specs | 15-25 min | Emergency token, port 2020, admin whitelist |
| Integration | integration specs | 12-20 min | Stable core + settings + tasks |
| Browser-specific | firefox/webkit | 8-12 min | Import baseline stable |
| Debug/POC | caddy import debug | 4-6 min | Docker logs available |

Assumed worker count: 4 (default) except security-enforcement which requires
`--workers=1`. Serial execution increases runtime for enforcement suites.

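For planning a full pass, the per-group ranges can be summed. The sketch below does that arithmetic under the simplifying assumption that the groups run back to back with no overlap. The type and function names are illustrative only.

```typescript
// Runtime ranges in minutes, copied from the estimates table.
interface Estimate {
  group: string;
  min: number;
  max: number;
}

const estimates: Estimate[] = [
  { group: "Foundation", min: 1, max: 2 },
  { group: "Core UI", min: 6, max: 10 },
  { group: "Settings", min: 6, max: 10 },
  { group: "Tasks", min: 10, max: 16 },
  { group: "Monitoring", min: 5, max: 8 },
  { group: "Security UI", min: 10, max: 14 },
  { group: "Security Enforcement", min: 15, max: 25 },
  { group: "Integration", min: 12, max: 20 },
  { group: "Browser-specific", min: 8, max: 12 },
  { group: "Debug/POC", min: 4, max: 6 },
];

// Sum the lower and upper bounds, assuming groups run one after another.
function totalRuntime(rows: Estimate[]): { min: number; max: number } {
  return rows.reduce(
    (acc, row) => ({ min: acc.min + row.min, max: acc.max + row.max }),
    { min: 0, max: 0 }
  );
}

const total = totalRuntime(estimates);
console.log(`Full serial pass: ${total.min}-${total.max} min`);
// prints "Full serial pass: 77-123 min"
```
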
### 3.4 Environment Preconditions

- E2E container built and healthy via
  `.github/skills/scripts/skill-runner.sh docker-rebuild-e2e`.
- Ports 8080 (UI/API) and 2020 (emergency server) reachable.
- `CHARON_EMERGENCY_TOKEN` configured and valid.
- Admin whitelist includes test runner ranges when Cerberus is enabled.
- Caddy admin health endpoints reachable for import workflows.

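Some of these preconditions can be checked mechanically before a run. The sketch below is a hypothetical preflight helper, not code from the Charon repository. It takes the environment and a list of reachable ports from the caller (how the ports are probed is left open) and reports what is missing. It only checks that the token is set, not that it is valid.

```typescript
// Inputs gathered by the caller; how ports are probed is left open here.
interface PreflightInput {
  env: Record<string, string | undefined>;
  reachablePorts: number[];
}

// Ports named in the preconditions above.
const REQUIRED_PORTS = [
  { port: 8080, role: "UI/API" },
  { port: 2020, role: "emergency server" },
];

// Returns human-readable problems; an empty list means the checks passed.
function preflightProblems(input: PreflightInput): string[] {
  const problems: string[] = [];
  const token = input.env["CHARON_EMERGENCY_TOKEN"];
  if (!token || token.trim() === "") {
    problems.push("CHARON_EMERGENCY_TOKEN is not set");
  }
  for (const { port, role } of REQUIRED_PORTS) {
    if (input.reachablePorts.indexOf(port) === -1) {
      problems.push(`port ${port} (${role}) is not reachable`);
    }
  }
  return problems;
}

// Example: token present, but the emergency server port is down.
const found = preflightProblems({
  env: { CHARON_EMERGENCY_TOKEN: "example-token" },
  reachablePorts: [8080],
});
console.log(found);
```
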
### 3.5 Emergency Server and Security Prerequisites

- Port 2020 (emergency server) available and reachable for
  [tests/security-enforcement/emergency-server](tests/security-enforcement/emergency-server).
- Port 2019 is reserved for the Caddy admin API; use 2020 for emergency server
  tests to avoid conflicts.
- Basic Auth credentials required for emergency server tests. Defaults in test
  fixtures are `admin` / `changeme` and should match the E2E compose config.
- Admin whitelist bypass must be configured before enforcement tests that
  toggle Cerberus settings.

## 4. Implementation Plan

### Phase 1: Foundation and Test Harness Reliability

Objective: Ensure the shared test harness is stable before touching feature
flows.

- Validate global setup and storage state creation
  (see [tests/global-setup.ts](tests/global-setup.ts) and
  [tests/auth.setup.ts](tests/auth.setup.ts)).
- Confirm emergency server availability and credentials for break-glass suites.
- Establish baseline run for core login/navigation suites.

Estimated runtime: 2-4 minutes

Success criteria:

- Storage state created once and reused without re-auth flake.
- Emergency token validation passes and security reset executes.

### Phase 2: Core UI, Settings, Monitoring, and Task Flows

Objective: Remediate the highest-traffic user journeys and tasks.

- Core UI: authentication, navigation, dashboard, proxy hosts, certificates,
  access lists (core CRUD and navigation).
- Settings: system, account, SMTP, notifications, encryption, users.
- Monitoring: uptime and real-time logs.
- Tasks: backups, log viewing, and base Caddyfile import flows.
- Include modal/dropdown triage coverage and wait helpers validation.

Estimated runtime: 25-40 minutes

Success criteria:

- Core CRUD and navigation pass without retries.
- Monitoring WebSocket tests pass without timeouts.
- Backups and log viewing flows pass with mocks and deterministic waits.

### Phase 3: Security UI and Enforcement

Objective: Stabilize Cerberus UI configuration and enforcement workflows.

- Security dashboard and configuration pages.
- WAF, headers, rate limiting, CrowdSec, audit logs.
- Enforcement suites, including emergency token and whitelist blocking order.

Estimated runtime: 30-45 minutes

Success criteria:

- Security UI toggles and pages load without state leakage.
- Enforcement suites pass with Cerberus enabled and whitelist configured.
- Break-glass recovery restores bypass state for subsequent suites.

### Phase 4: Integration, Browser-Specific, and Debug Suites

Objective: Close cross-feature and browser-specific regressions.

- Integration workflows: proxy + cert, proxy + DNS, backup restore, import to
  production, multi-feature workflows.
- Browser-specific Caddy import regressions (Firefox/WebKit).
- Debug/POC suites (Caddy import debug, diagnostics) run as opt-in,
  including caddy-import-gaps and cross-browser import coverage.

Estimated runtime: 25-40 minutes

Success criteria:

- Integration workflows pass with stable TestDataManager cleanup.
- Browser-specific import tests show consistent API request handling.
- Debug suites remain optional and do not block core pipelines.

## 5. Acceptance Criteria (EARS)

- WHEN the E2E harness initializes, THE SYSTEM SHALL validate the emergency
  token and create a reusable auth state without flake.
- WHEN core management tests execute, THE SYSTEM SHALL complete CRUD flows
  without manual retries or timeouts.
- WHEN security enforcement suites execute, THE SYSTEM SHALL apply Cerberus
  settings with admin whitelist bypass and SHALL restore security state after
  completion.
- WHEN integration workflows execute, THE SYSTEM SHALL complete cross-feature
  journeys without data collisions or residual state.

## 6. Quick Start Commands

```bash
# Run everything from the repository root (paths below are relative to it)
cd /projects/Charon

# Rebuild and start E2E container
.github/skills/scripts/skill-runner.sh docker-rebuild-e2e

# PHASE 1: Foundation
npx playwright test tests/global-setup.ts tests/auth.setup.ts --project=firefox

# PHASE 2: Core UI, Settings, Tasks, Monitoring
# NOTE: PLAYWRIGHT_SKIP_SECURITY_DEPS=1 is automatically set in E2E scripts,
# so security suites will NOT execute as dependencies
npx playwright test tests/core --project=firefox
npx playwright test tests/settings --project=firefox
npx playwright test tests/tasks --project=firefox
npx playwright test tests/monitoring --project=firefox

# PHASE 3: Security UI and Enforcement (enforcement suites run serially)
npx playwright test tests/security --project=firefox
npx playwright test tests/security-enforcement --project=firefox --workers=1

# PHASE 4: Integration, Browser-Specific, Debug (Optional)
npx playwright test tests/integration --project=firefox
npx playwright test tests/firefox-specific --project=firefox
npx playwright test tests/webkit-specific --project=webkit
npx playwright test tests/debug --project=firefox
npx playwright test tests/tasks/caddy-import-gaps.spec.ts --project=firefox
```

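The note about `PLAYWRIGHT_SKIP_SECURITY_DEPS=1` suggests that the Playwright configuration picks project dependencies based on that variable. How Charon's config actually does this is not shown in this plan, so the following is only a sketch of one way it could work. The project names `setup` and `security-tests` are hypothetical.

```typescript
// Decide which projects a browser project should depend on.
// The project names "setup" and "security-tests" are hypothetical.
function browserDependencies(
  env: Record<string, string | undefined>
): string[] {
  const skipSecurity = env["PLAYWRIGHT_SKIP_SECURITY_DEPS"] === "1";
  return skipSecurity ? ["setup"] : ["setup", "security-tests"];
}

// In a playwright.config.ts, the result could feed a project's
// `dependencies` field, for example:
//   { name: "firefox", dependencies: browserDependencies(process.env) }
const withSkip = browserDependencies({ PLAYWRIGHT_SKIP_SECURITY_DEPS: "1" });
const withoutSkip = browserDependencies({});
console.log(withSkip, withoutSkip);
```

Keeping the decision in a small pure function makes the skip behaviour easy to verify without starting a browser.
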
## 7. Risks and Mitigations

- Risk: Security suite state leaks across tests. Mitigation: enforce admin
  whitelist reset and break-glass recovery ordering.
- Risk: File-name ordering (`zzz-`) not enforced without `--workers=1`.
  Mitigation: document `--workers=1` requirement and make it mandatory in
  CI and quick-start commands.
- Risk: Emergency server unavailable. Mitigation: gate enforcement suites on
  health checks and document port 2020 requirements.
- Risk: Import suites combine mocked and real flows. Mitigation: isolate by
  phase and keep debug suites opt-in.
- Risk: Missing test suites hide regressions. Mitigation: inventory now
  includes all suites and maps them to phases.

## 8. Dependencies and Impacted Files

- Harness: [tests/global-setup.ts](tests/global-setup.ts),
  [tests/auth.setup.ts](tests/auth.setup.ts),
  [tests/security-teardown.setup.ts](tests/security-teardown.setup.ts).
- Core UI: [tests/core](tests/core).
- Settings: [tests/settings](tests/settings).
- Tasks: [tests/tasks](tests/tasks).
- Monitoring: [tests/monitoring](tests/monitoring).
- Security UI: [tests/security](tests/security).
- Security enforcement: [tests/security-enforcement](tests/security-enforcement).
- Integration: [tests/integration](tests/integration).
- Browser-specific: [tests/firefox-specific](tests/firefox-specific),
  [tests/webkit-specific](tests/webkit-specific).

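The suite directories map directly onto per-phase `npx playwright test` commands. As a sketch, assuming the phase assignments and `--project` values given in this plan, the command lines could be generated from a small table so that the `--workers=1` requirement for enforcement suites is never dropped by hand. The `Suite` type and `commandsForPhase` function are illustrative names.

```typescript
// One entry per suite directory, following the phase grouping in this plan.
interface Suite {
  phase: number;
  path: string;
  project: "firefox" | "webkit";
  serial?: boolean; // true when the suite needs --workers=1
}

const suites: Suite[] = [
  { phase: 2, path: "tests/core", project: "firefox" },
  { phase: 2, path: "tests/settings", project: "firefox" },
  { phase: 2, path: "tests/tasks", project: "firefox" },
  { phase: 2, path: "tests/monitoring", project: "firefox" },
  { phase: 3, path: "tests/security", project: "firefox" },
  { phase: 3, path: "tests/security-enforcement", project: "firefox", serial: true },
  { phase: 4, path: "tests/integration", project: "firefox" },
  { phase: 4, path: "tests/firefox-specific", project: "firefox" },
  { phase: 4, path: "tests/webkit-specific", project: "webkit" },
];

// Build the `npx playwright test` command lines for one phase.
function commandsForPhase(phase: number): string[] {
  return suites
    .filter((suite) => suite.phase === phase)
    .map((suite) => {
      const base = `npx playwright test ${suite.path} --project=${suite.project}`;
      return suite.serial ? `${base} --workers=1` : base;
    });
}

console.log(commandsForPhase(3).join("\n"));
```
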
## 9. Confidence Score

Confidence: 79 percent

Rationale: The suite inventory and dependencies are well understood. The main
unknowns are timing-sensitive security propagation and emergency server
availability in varied environments.