chore: add CI-parity Playwright tasks for Firefox non-security shard execution

This commit is contained in:
GitHub Actions
2026-02-15 07:32:20 +00:00
parent 86528433c1
commit bd2d336abe
2 changed files with 201 additions and 458 deletions

View File

@@ -1,498 +1,181 @@
---
post_title: Full E2E Green Execution Spec
post_title: Local CI-Parity Playwright Shard Task Set Spec
author1: "Charon Team"
post_slug: full-e2e-green-execution-spec
post_slug: local-ci-parity-playwright-task-spec
categories:
- testing
- quality
- remediation
- ci
- actions
tags:
- playwright
- e2e
- ci
- go
- react
summary: "Execution-ready specification to drive Charon E2E to fully green across browser and security projects, including test remediation, backend/frontend fixes, deterministic state controls, and CI/config hygiene updates."
post_date: "2026-02-13"
- vscode-tasks
- firefox
- ci-parity
summary: "Concise implementation plan to add five VS Code tasks for CI-like Firefox non-security shard execution: one sequential 1/4..4/4 runner and four per-shard triage tasks, all with CI-parity environment variables and explicit test paths."
post_date: "2026-02-15"
---
## 1. Introduction
This document replaces the previous skip-focused plan and defines the end-to-end execution strategy to get Charon Playwright suites fully green with no hidden skip debt.
Add five minimal VS Code tasks in `.vscode/tasks.json` that reproduce the CI Firefox
non-security shard execution locally for `/projects/Charon`.
Objective:
Goals:
- QA unblock scope is a frozen target matrix that MUST finish with `0 failed`, `0 skipped`, and `0 did-not-run`.
- All frozen-matrix E2E suites execute and pass in the exact mapped Playwright projects.
- Security suites run only in `security-tests` where required.
- Browser suites (`chromium`, `firefox`, `webkit`) are deterministic and flake-resistant.
- Configuration files (`.gitignore`, `codecov.yml`, `.dockerignore`, `Dockerfile`) are aligned to reduce CI noise and improve repeatability.
- Match CI env parity for these runs: `CI=true`,
`PLAYWRIGHT_BASE_URL=http://127.0.0.1:8080`,
- `CHARON_SECURITY_TESTS_ENABLED=false`, and shard-specific
`TEST_WORKER_INDEX` values (`1..4`).
- Add one sequential task that executes Firefox shards `1/4` through `4/4` in order.
- Add four triage tasks, one per shard (`1/4`, `2/4`, `3/4`, `4/4`).
- Use deterministic Playwright output paths:
`playwright-output/firefox-shard-1` through
`playwright-output/firefox-shard-4`.
- Use explicit test path arguments from the active CI workflow non-security list.
- Keep task definition style aligned with existing Playwright tasks.
- Keep scope minimal to `.vscode/tasks.json` only.
Primary source artifacts reviewed:
## 1.1 Prerequisites
- `docs/plans/CI_REMEDIATION_MASTER_PLAN.md`
- `docs/reports/e2e_skip_registry_2026-02-13.md`
- `docs/reports/e2e_fail_skip_ledger_2026-02-13.md`
Before task execution, the E2E runtime MUST be healthy per
`.github/instructions/testing.instructions.md`:
- Use existing healthy `charon-e2e` container, OR
- Rebuild/start it with:
```bash
.github/skills/scripts/skill-runner.sh docker-rebuild-e2e
```
## 2. Research Findings
### 2.1 Existing architecture and execution topology
Source of truth for CI-like command shape and env:
Backend and route wiring (`backend/internal/api/routes/routes.go`) confirms:
- `.github/workflows/e2e-tests-split.yml` (`e2e-firefox`, non-security shard job).
- Auth/session APIs used heavily by E2E:
- `POST /api/v1/auth/login`
- `POST /api/v1/auth/logout`
- `POST /api/v1/auth/refresh`
- `GET /api/v1/auth/me`
- Security feature APIs relied on by settings/security workflows:
- `GET /api/v1/security/status`
- `PATCH /api/v1/security/acl`
- `PATCH /api/v1/security/waf`
- `PATCH /api/v1/security/crowdsec`
- `PATCH /api/v1/security/rate-limit`
- Access list APIs used by workflow/data consistency tests:
- `GET/POST/PUT/DELETE /api/v1/access-lists`
CI-like non-security Firefox path list (must be explicit in every shard command):
Frontend pages/components directly mapped to failing suites:
- `frontend/src/pages/DNSProviders.tsx`
- `frontend/src/components/dns-providers/ManualDNSChallenge.tsx`
- `frontend/src/pages/UsersPage.tsx`
- `frontend/src/components/ProxyHostForm.tsx`
- `frontend/src/pages/Certificates.tsx`
- `frontend/src/components/RequireAuth.tsx`
### 2.2 High-impact suite clusters (evidence-based)
Primary remediation suites:
- `tests/settings/user-lifecycle.spec.ts`
- `tests/core/multi-component-workflows.spec.ts`
- `tests/core/data-consistency.spec.ts`
- `tests/core`
- `tests/dns-provider-crud.spec.ts`
- `tests/dns-provider-types.spec.ts`
- `tests/integration`
- `tests/manual-dns-provider.spec.ts`
- `tests/core/admin-onboarding.spec.ts`
- `tests/modal-dropdown-triage.spec.ts`
- `tests/core/certificates.spec.ts`
- `tests/core/authentication.spec.ts`
- `tests/core/navigation.spec.ts`
- `tests/monitoring`
- `tests/settings`
- `tests/tasks`
Authentication/session fixture dependency surface:
Current task style in `.vscode/tasks.json` for Playwright tasks uses:
- `tests/fixtures/auth-fixtures.ts`
- Core helpers: `getAuthToken`, `loginWithCredentials`, `refreshTokenIfNeeded`, `loginUser`, `logoutUser`, `createUserViaApi`
- `type: "shell"`
- `group: "test"`
- `problemMatcher: []`
- `presentation` with `reveal: "always"`, `panel: "dedicated"`, `close: false`
Observed failure patterns from code and baseline reports:
## 3. Technical Specification
- Race/readiness failures around `auth/me` and post-login state transitions.
- Mixed security toggle propagation timing (settings updates not immediately reflected in UI/API assertions).
- Manual DNS tests relying on route mocks and challenge visibility that can drift by timing/state.
- Dropdown/modal selectors in triage suites using brittle targeting patterns.
- Certificate suite includes permissive expectations that mask true failures.
### 3.1 EARS Requirements
### 2.3 Config and pipeline hygiene findings
- WHEN the developer runs the sequential VS Code task, THE SYSTEM SHALL execute
four `npx playwright test` shard commands with `--project=firefox` in order
(`1/4`, `2/4`, `3/4`, `4/4`) using `&&` chaining.
- WHEN the developer runs any per-shard triage task, THE SYSTEM SHALL execute
exactly one matching shard command with `--project=firefox` and
`--shard=N/4`.
- WHEN any task executes, THE SYSTEM SHALL set
`CI=true`, `PLAYWRIGHT_BASE_URL=http://127.0.0.1:8080`, and
`CHARON_SECURITY_TESTS_ENABLED=false`, and shard-appropriate
`TEST_WORKER_INDEX` for command execution.
- WHEN any shard command executes, THE SYSTEM SHALL pass deterministic Playwright
output path `--output=playwright-output/firefox-shard-N` for that shard.
- WHEN any task executes, THE SYSTEM SHALL pass the explicit CI-like
non-security test paths and SHALL NOT include security-only directories.
- `playwright.config.js` already separates `security-tests` and browser projects correctly.
- `.gitignore` currently allows root-level scan/report artifacts to linger and pollute workspace/PR diffs.
- `.dockerignore` should prioritize excluding test/output/docs-heavy artifacts from build context.
- `codecov.yml` is strict on patch/project coverage and needs explicit handling of E2E/generated noise.
- `Dockerfile` supports runtime, but E2E reproducibility depends on deterministic env/runtime contracts and avoiding accidental context bloat.
### 3.2 Planned Task Additions
## 3. Technical Specifications
File to modify:
### 3.1 EARS requirements
- `.vscode/tasks.json`
- WHEN Playwright executes browser projects, THE SYSTEM SHALL run only browser-targeted suites and produce deterministic results with no retry masking.
- WHEN Playwright executes `security-tests`, THE SYSTEM SHALL run security-only suites with Cerberus-enabled state and explicit preconditions.
- WHEN an auth-dependent test starts, THE SYSTEM SHALL confirm valid auth token/session and successful `GET /api/v1/auth/me` before asserting UI state.
- WHEN wildcard/manual DNS scenarios are tested, THE SYSTEM SHALL provide deterministic challenge state using scoped mocks and verified cleanup.
- IF a security toggle is changed from UI/API, THEN THE SYSTEM SHALL expose a deterministic synchronization point before downstream assertions.
- IF a test cannot satisfy preconditions, THEN THE SYSTEM SHALL fail with explicit diagnostics and SHALL NOT use skip masking, quarantine, or did-not-run allowances.
Planned labels and exact command strings:
### 3.7 Frozen QA Unblock Matrix Artifact
- Label: `Test: E2E Playwright (FireFox) - CI Parity Non-Security Shards 1-4 (Sequential)`
Command:
`cd /projects/Charon && CI=true PLAYWRIGHT_BASE_URL=http://127.0.0.1:8080 CHARON_SECURITY_TESTS_ENABLED=false TEST_WORKER_INDEX=1 npx playwright test --project=firefox --shard=1/4 --output=playwright-output/firefox-shard-1 tests/core tests/dns-provider-crud.spec.ts tests/dns-provider-types.spec.ts tests/integration tests/manual-dns-provider.spec.ts tests/monitoring tests/settings tests/tasks && CI=true PLAYWRIGHT_BASE_URL=http://127.0.0.1:8080 CHARON_SECURITY_TESTS_ENABLED=false TEST_WORKER_INDEX=2 npx playwright test --project=firefox --shard=2/4 --output=playwright-output/firefox-shard-2 tests/core tests/dns-provider-crud.spec.ts tests/dns-provider-types.spec.ts tests/integration tests/manual-dns-provider.spec.ts tests/monitoring tests/settings tests/tasks && CI=true PLAYWRIGHT_BASE_URL=http://127.0.0.1:8080 CHARON_SECURITY_TESTS_ENABLED=false TEST_WORKER_INDEX=3 npx playwright test --project=firefox --shard=3/4 --output=playwright-output/firefox-shard-3 tests/core tests/dns-provider-crud.spec.ts tests/dns-provider-types.spec.ts tests/integration tests/manual-dns-provider.spec.ts tests/monitoring tests/settings tests/tasks && CI=true PLAYWRIGHT_BASE_URL=http://127.0.0.1:8080 CHARON_SECURITY_TESTS_ENABLED=false TEST_WORKER_INDEX=4 npx playwright test --project=firefox --shard=4/4 --output=playwright-output/firefox-shard-4 tests/core tests/dns-provider-crud.spec.ts tests/dns-provider-types.spec.ts tests/integration tests/manual-dns-provider.spec.ts tests/monitoring tests/settings tests/tasks`
Artifact ID: `QA_UNBLOCK_MATRIX_FROZEN_2026-02-13`
- Label: `Test: E2E Playwright (FireFox) - CI Parity Non-Security Shard 1/4`
Command:
`cd /projects/Charon && CI=true PLAYWRIGHT_BASE_URL=http://127.0.0.1:8080 CHARON_SECURITY_TESTS_ENABLED=false TEST_WORKER_INDEX=1 npx playwright test --project=firefox --shard=1/4 --output=playwright-output/firefox-shard-1 tests/core tests/dns-provider-crud.spec.ts tests/dns-provider-types.spec.ts tests/integration tests/manual-dns-provider.spec.ts tests/monitoring tests/settings tests/tasks`
Policy:
- Label: `Test: E2E Playwright (FireFox) - CI Parity Non-Security Shard 2/4`
Command:
`cd /projects/Charon && CI=true PLAYWRIGHT_BASE_URL=http://127.0.0.1:8080 CHARON_SECURITY_TESTS_ENABLED=false TEST_WORKER_INDEX=2 npx playwright test --project=firefox --shard=2/4 --output=playwright-output/firefox-shard-2 tests/core tests/dns-provider-crud.spec.ts tests/dns-provider-types.spec.ts tests/integration tests/manual-dns-provider.spec.ts tests/monitoring tests/settings tests/tasks`
- This matrix is frozen for QA unblock and cannot be expanded or reduced during execution.
- Every row must execute and finish with expected status `pass`.
- Any `failed`, `skipped`, `timedOut`, `interrupted`, or `did-not-run` result fails the gate.
- Label: `Test: E2E Playwright (FireFox) - CI Parity Non-Security Shard 3/4`
Command:
`cd /projects/Charon && CI=true PLAYWRIGHT_BASE_URL=http://127.0.0.1:8080 CHARON_SECURITY_TESTS_ENABLED=false TEST_WORKER_INDEX=3 npx playwright test --project=firefox --shard=3/4 --output=playwright-output/firefox-shard-3 tests/core tests/dns-provider-crud.spec.ts tests/dns-provider-types.spec.ts tests/integration tests/manual-dns-provider.spec.ts tests/monitoring tests/settings tests/tasks`
| Suite | Project | Expected Status |
|---|---|---|
| `tests/settings/user-lifecycle.spec.ts` | `chromium` | `pass` |
| `tests/settings/user-lifecycle.spec.ts` | `firefox` | `pass` |
| `tests/settings/user-lifecycle.spec.ts` | `webkit` | `pass` |
| `tests/core/multi-component-workflows.spec.ts` | `chromium` | `pass` |
| `tests/core/multi-component-workflows.spec.ts` | `firefox` | `pass` |
| `tests/core/multi-component-workflows.spec.ts` | `webkit` | `pass` |
| `tests/core/data-consistency.spec.ts` | `chromium` | `pass` |
| `tests/core/data-consistency.spec.ts` | `firefox` | `pass` |
| `tests/core/data-consistency.spec.ts` | `webkit` | `pass` |
| `tests/manual-dns-provider.spec.ts` | `chromium` | `pass` |
| `tests/manual-dns-provider.spec.ts` | `firefox` | `pass` |
| `tests/manual-dns-provider.spec.ts` | `webkit` | `pass` |
| `tests/core/admin-onboarding.spec.ts` | `chromium` | `pass` |
| `tests/core/admin-onboarding.spec.ts` | `firefox` | `pass` |
| `tests/core/admin-onboarding.spec.ts` | `webkit` | `pass` |
| `tests/modal-dropdown-triage.spec.ts` | `chromium` | `pass` |
| `tests/modal-dropdown-triage.spec.ts` | `firefox` | `pass` |
| `tests/modal-dropdown-triage.spec.ts` | `webkit` | `pass` |
| `tests/core/certificates.spec.ts` | `chromium` | `pass` |
| `tests/core/certificates.spec.ts` | `firefox` | `pass` |
| `tests/core/certificates.spec.ts` | `webkit` | `pass` |
| `tests/core/authentication.spec.ts` | `chromium` | `pass` |
| `tests/core/authentication.spec.ts` | `firefox` | `pass` |
| `tests/core/authentication.spec.ts` | `webkit` | `pass` |
| `tests/core/navigation.spec.ts` | `chromium` | `pass` |
| `tests/core/navigation.spec.ts` | `firefox` | `pass` |
| `tests/core/navigation.spec.ts` | `webkit` | `pass` |
- Label: `Test: E2E Playwright (FireFox) - CI Parity Non-Security Shard 4/4`
Command:
`cd /projects/Charon && CI=true PLAYWRIGHT_BASE_URL=http://127.0.0.1:8080 CHARON_SECURITY_TESTS_ENABLED=false TEST_WORKER_INDEX=4 npx playwright test --project=firefox --shard=4/4 --output=playwright-output/firefox-shard-4 tests/core tests/dns-provider-crud.spec.ts tests/dns-provider-types.spec.ts tests/integration tests/manual-dns-provider.spec.ts tests/monitoring tests/settings tests/tasks`
### 3.2 API and contract requirements (no new endpoint required unless explicitly noted)
Planned task structure:
Required stable contracts (must be treated as blockers if unstable):
- `GET /api/v1/auth/me`: must return 200 with consistent user payload after login refresh boundaries.
- `GET /api/v1/security/status`: must reflect toggle changes within bounded synchronization window.
- `PATCH /api/v1/security/*`: must return deterministic success/failure and invalidate relevant cache.
- `GET/POST /api/v1/access-lists` and related endpoints: must be strongly consistent for immediate read-after-write assertions used by multi-component workflows.
Optional contract hardening (only if required by failures):
- Add explicit operation-complete payload fields for security patch endpoints (for deterministic UI waiters).
### 3.3 Database schema expectations
No schema migration is planned by default.
Escalation rule:
- IF auth/security consistency issues are traced to persistence-layer defaults or stale records, THEN create a separate migration spec before code changes.
### 3.4 Component-level design responsibilities
#### Frontend focus areas
- `frontend/src/components/RequireAuth.tsx`
- Ensure auth gate uses a single source of truth for token + user state readiness.
- `frontend/src/pages/DNSProviders.tsx`
- Stabilize manual challenge visibility/load path and fallback behavior.
- `frontend/src/components/dns-providers/ManualDNSChallenge.tsx`
- Stabilize status transitions (`pending``verifying` → terminal states) for testability.
- `frontend/src/pages/UsersPage.tsx`
- Modal/selection reliability and deterministic host permission rendering.
- `frontend/src/components/ProxyHostForm.tsx`
- Selector stability and predictable async behavior for domain/provider/dropdowns.
- `frontend/src/pages/Certificates.tsx` + dependent list components
- Deterministic list/loading states and no permissive pass conditions.
#### Backend focus areas
- `backend/internal/api/handlers/auth_handler.go`
- Session cookie/token lifecycle consistency (`login`, `refresh`, `me`, `logout`).
- `backend/internal/api/handlers/security_handler.go`
- Toggle/cache invalidation and observable state transition timing.
- `backend/internal/api/handlers/access_list_handler.go`
- Stable CRUD/test behavior under immediate read-after-write.
### 3.5 Data flow and synchronization design
```mermaid
flowchart TD
A[Playwright setup auth fixture] --> B[Login/token acquisition]
B --> C[Auth readiness gate: auth/me 200 + UI ready]
C --> D{Suite type}
D -->|Browser projects| E[Core/UI flows: DNS, users, certificates, navigation]
D -->|security-tests| F[Security enforcement and emergency flows]
E --> G[Deterministic UI/API assertions]
F --> G
G --> H[Targeted rerun x2 gate]
H --> I[Pre-QA green approved]
```
Synchronization requirements:
- Replace ad-hoc sleeps with API-backed waiters and stable UI readiness signals.
- Keep route mocking test-scoped and paired cleanup (`route`/`unroute`).
### 3.6 Error handling and edge-case matrix
| Area | Edge Case | Required Handling |
|---|---|---|
| Auth | token present but stale user state | force refresh path then re-check `auth/me` |
| Auth | cookie vs localStorage divergence | unify guard and fixture refresh behavior |
| Security toggles | API success but stale status read | explicit poll window with fail-fast timeout |
| Manual DNS | no active challenge found | deterministic challenge seed or scoped fallback mock |
| Modals/dropdowns | element attached but not interactable | role-based locator and visible+enabled precondition |
| Certificates | permissive expectation masks fail | replace permissive assertions with strict contract checks |
- Follow existing Playwright task JSON keys and ordering pattern already used in
`.vscode/tasks.json`.
- Keep change scope strictly to task objects in `.vscode/tasks.json`.
## 4. Implementation Plan
### Phase 0: Pre-run environment gate (mandatory)
Owner: DevOps
Work packets:
1. Apply testing protocol rebuild decision before any matrix execution:
- Rebuild E2E container if app/runtime/build inputs changed, or if container state is not healthy/trusted.
- Reuse running container only for test-only changes when health is already confirmed.
2. Verify runtime health before matrix runs:
- Management UI health endpoint reachable (`:8080`).
- Emergency endpoint reachable (`:2020`) when required by targeted tests.
- Container health status is `healthy`.
3. Persist environment-gate verdict in execution log: `rebuild-required` or `reuse-allowed` with evidence.
Gate:
- No Phase 1 start until rebuild decision and health verification both pass.
Handoff criteria:
- DevOps provides a pass/fail environment gate record consumed by QA Security in Phase 5.
Complexity: Low
### Phase 1: Playwright behavior contract and baseline capture (mandatory first)
1. Capture fresh fail/skip ledger for target suites only.
2. Freeze target suite list and expected project mapping.
3. Define precondition contract in tests before feature-level edits.
Deliverables:
- Updated fail/skip matrix appended to `docs/reports/e2e_fail_skip_ledger_2026-02-13.md`.
- Explicit project-routing map per suite.
Complexity: Medium
### Phase 2: Backend remediation (auth + security + ACL consistency)
Work packets:
1. Auth reliability:
- Files: `backend/internal/api/handlers/auth_handler.go`, auth service dependencies.
- Goal: eliminate intermittent `auth/me` readiness failures post-login/refresh/logout cycles.
2. Security state propagation:
- Files: `backend/internal/api/handlers/security_handler.go`.
- Goal: deterministic status observability after patch/enable/disable actions.
3. Access list consistency:
- Files: `backend/internal/api/handlers/access_list_handler.go` and service layer.
- Goal: immediate read-after-write consistency for tests.
Validation:
- Targeted Go tests for changed packages.
- Targeted Playwright suites that consume these APIs.
Complexity: High
### Phase 3: Frontend remediation (state, selectors, deterministic UX)
Work packets:
1. Auth guard and lifecycle:
- Files: `frontend/src/components/RequireAuth.tsx`, auth hooks/store dependencies.
2. Manual DNS flow stabilization:
- Files: `frontend/src/pages/DNSProviders.tsx`, `frontend/src/components/dns-providers/ManualDNSChallenge.tsx`.
3. Modal/dropdown hardening:
- Files: `frontend/src/pages/UsersPage.tsx`, `frontend/src/components/ProxyHostForm.tsx`.
4. Certificates UX contract:
- Files: `frontend/src/pages/Certificates.tsx` and certificate list dependencies.
Validation:
- Frontend lint + TS checks.
- Targeted Playwright runs on affected suites.
Complexity: High
### Phase 4: Test suite hardening and flake elimination
Work packets:
1. Auth fixture hardening:
- File: `tests/fixtures/auth-fixtures.ts`.
- Goal: centralize token refresh/readiness checks and remove duplicate race-prone paths.
2. Manual DNS test alignment:
- File: `tests/manual-dns-provider.spec.ts`.
- Goal: deterministic challenge setup, strict assertions, no skip masking.
3. Workflow/data consistency synchronization:
- Files: `tests/core/multi-component-workflows.spec.ts`, `tests/core/data-consistency.spec.ts`.
- Goal: API-backed sync points, eliminate timing flake.
4. Triage and strictness:
- Files: `tests/modal-dropdown-triage.spec.ts`, `tests/core/certificates.spec.ts`.
- Goal: robust locators, remove permissive success conditions.
Validation:
- Execute targeted suites across all browser projects.
- Repeat run twice; both runs must be green.
Complexity: High
### Phase 5: QA Security ownership, gate validation, and unblock sign-off
Owner: QA Security
Work packets:
1. Validate execution strictly against `QA_UNBLOCK_MATRIX_FROZEN_2026-02-13`.
2. Verify determinism policy enforcement:
- No retry masking (`--retries=0` for gate runs).
- No quarantine lists or temporary excludes.
- No did-not-run allowance for any frozen matrix row.
3. Confirm frozen matrix success scope for unblock:
- Aggregate result is exactly `0 failed / 0 skipped / 0 did-not-run`.
Gate:
- QA unblock is denied unless the frozen matrix exactly matches expected `pass` for all rows.
Handoff criteria:
- QA Security publishes signed gate verdict: `QA_UNBLOCK_APPROVED` or `QA_UNBLOCK_REJECTED`, with matrix evidence.
Complexity: Medium
### Phase 6: DevOps ownership for CI parity and handoff
Owner: DevOps
Work packets:
1. Reconcile outputs with:
- `docs/plans/CI_REMEDIATION_MASTER_PLAN.md`
- `docs/reports/e2e_skip_registry_2026-02-13.md`
- `docs/reports/e2e_fail_skip_ledger_2026-02-13.md`
2. Confirm no reintroduced skip debt in targeted suites.
3. Verify CI command parity with local execution.
4. Ensure CI gate commands validate against the same frozen matrix and determinism policy.
Gate:
- No Supervisor handoff until CI parity and frozen-matrix enforcement are confirmed.
Handoff criteria:
- DevOps provides final execution package with environment-gate record, QA Security verdict, and CI parity evidence.
Complexity: Medium
## 5. Config Review and Required Recommendations
### 5.1 `.gitignore`
Recommendation:
- Add/normalize ignores for root-level generated outputs that should never be committed:
- `playwright-report/`, `test-results/`, `.playwright-artifacts/` (if used), `coverage/e2e/` artifacts policy-defined.
- security scan outputs and temporary SARIF/JSON/TXT reports generated during local runs.
Rationale:
- Reduce PR noise and prevent stale artifact interference with triage.
### 5.2 `codecov.yml`
Recommendation:
- Keep strict patch coverage policy; do not relax thresholds.
- Ensure generated E2E artifacts and transient files are excluded consistently from coverage paths.
- Add explicit patch triage process in plan execution notes (copy missing lines from Codecov Patch view to task list).
Rationale:
- Preserve quality gate while preventing false negatives from non-source artifacts.
### 5.3 `.dockerignore`
Recommendation:
- Exclude non-runtime directories from build context where safe:
- large docs/report outputs, Playwright artifacts, local test outputs, and temporary scan files.
- Keep only build/runtime-essential files in Docker context for reproducibility and speed.
Rationale:
- Faster deterministic builds and reduced accidental cache invalidation.
### 5.4 `Dockerfile`
Recommendation:
- Keep image behavior stable; avoid introducing test-only variability.
- Validate that runtime env defaults required by E2E are explicit and reproducible.
- Ensure no unnecessary build context dependencies remain after `.dockerignore` tightening.
Rationale:
- E2E reliability depends on predictable runtime behavior, not ad-hoc local state.
## 6. Subagent Execution Matrix
| Subagent | Scope | File Focus | Exit Criteria |
|---|---|---|---|
| Playwright | test hardening + deterministic waits | `tests/**`, `tests/fixtures/auth-fixtures.ts` | target suites green x2 |
| Backend | auth/security/ACL consistency | `backend/internal/api/handlers/**`, service deps | API contracts stable under targeted runs |
| Frontend | state and interaction reliability | `frontend/src/pages/**`, `frontend/src/components/**` | deterministic UI behavior in target suites |
| QA Security | frozen-matrix gate enforcement + unblock decision | `docs/plans/current_spec.md`, Playwright run artifacts, matrix evidence | `0 failed / 0 skipped / 0 did-not-run` on frozen matrix and signed QA verdict |
| DevOps | environment gate + CI parity + release handoff | `.docker/compose/**`, `playwright.config.js`, `.gitignore`, `.dockerignore`, `codecov.yml`, `Dockerfile`, docs reports | environment gate pass + CI parity pass + handoff package delivered |
## 7. Validation Strategy
Execution order:
1. Run Phase 0 environment gate (rebuild decision + health verification).
2. Execute frozen matrix artifact rows in mapped projects with `--retries=0`.
3. Run security-targeted set in `security-tests`.
4. Repeat full frozen matrix a second time (must also pass).
5. Run lint/typecheck and relevant backend tests.
Determinism gate rule:
- No retry masking, no quarantine, no did-not-run allowance.
QA gate rule:
- No QA handoff until two consecutive frozen-matrix green runs are achieved with exact scope match.
## 8. Acceptance Criteria
- [ ] Frozen matrix (`QA_UNBLOCK_MATRIX_FROZEN_2026-02-13`) completes with `0 failed / 0 skipped / 0 did-not-run`.
- [ ] All frozen matrix rows execute and pass in exact suite-to-project mapping.
- [ ] `auth/me` readiness failures are eliminated in user lifecycle flows.
- [ ] Manual DNS provider tests run deterministically without skip masking.
- [ ] Security toggle propagation is deterministic for workflow/data consistency suites.
- [ ] Dropdown/modal triage scenarios are stable with robust selectors/interactions.
- [ ] Certificate tests use strict assertions (no permissive masking patterns).
- [ ] Determinism policy is enforced: no retries for gate runs, no quarantine, no did-not-run allowance.
- [ ] Phase 0 pre-run environment gate evidence is present and valid.
- [ ] QA Security gate verdict is recorded and approved for unblock.
- [ ] DevOps CI parity gate verdict is recorded before Supervisor handoff.
- [ ] `.gitignore`, `.dockerignore`, `codecov.yml`, and `Dockerfile` recommendations are implemented and validated.
- [ ] Baseline docs/reports are updated to reflect final green state.
- [ ] Pre-QA green gate passes twice consecutively.
## 9. Risks and Mitigations
- Risk: Hidden coupling between fixtures and UI state causes intermittent regressions.
- Mitigation: centralize readiness gates and remove duplicated auth logic.
- Risk: Security state propagation latency causes false negatives.
- Mitigation: bounded poll contracts and backend cache invalidation checks.
- Risk: Overfitting tests to implementation details.
- Mitigation: prefer user-facing role/label locators and API-level readiness only.
## 10. Handoff
Decision summary (for Supervisor review):
- Decision: Replace skip-retarget-only plan with full green-suite execution spec spanning backend, frontend, tests, and config hygiene.
- Rationale: Current blockers are not only skip/routing issues; they include product behavior and determinism gaps.
- Impact: Enables parallel subagent execution with explicit ownership and measurable gates.
- Review target: Supervisor agent validates task sequencing, ownership, and gate criteria before implementation begins.
Next action:
- Submit this plan to Supervisor for approval, then execute phases in order with strict gate enforcement.
1. Add one sequential shard task object in `.vscode/tasks.json` adjacent to
existing Playwright test tasks.
2. Add four per-shard triage task objects (`1/4`..`4/4`) in the same section.
3. Copy existing Playwright task structure (`group`, `problemMatcher`,
`presentation`) to maintain consistency for all five tasks.
4. Insert CI-parity env vars and explicit non-security path list in each task
`command`.
5. Keep all other files unchanged.
## 5. Validation
Validation commands (direct shell equivalents):
```bash
cd /projects/Charon && CI=true PLAYWRIGHT_BASE_URL=http://127.0.0.1:8080 CHARON_SECURITY_TESTS_ENABLED=false TEST_WORKER_INDEX=1 npx playwright test --project=firefox --shard=1/4 --output=playwright-output/firefox-shard-1 tests/core tests/dns-provider-crud.spec.ts tests/dns-provider-types.spec.ts tests/integration tests/manual-dns-provider.spec.ts tests/monitoring tests/settings tests/tasks
cd /projects/Charon && CI=true PLAYWRIGHT_BASE_URL=http://127.0.0.1:8080 CHARON_SECURITY_TESTS_ENABLED=false TEST_WORKER_INDEX=2 npx playwright test --project=firefox --shard=2/4 --output=playwright-output/firefox-shard-2 tests/core tests/dns-provider-crud.spec.ts tests/dns-provider-types.spec.ts tests/integration tests/manual-dns-provider.spec.ts tests/monitoring tests/settings tests/tasks
cd /projects/Charon && CI=true PLAYWRIGHT_BASE_URL=http://127.0.0.1:8080 CHARON_SECURITY_TESTS_ENABLED=false TEST_WORKER_INDEX=3 npx playwright test --project=firefox --shard=3/4 --output=playwright-output/firefox-shard-3 tests/core tests/dns-provider-crud.spec.ts tests/dns-provider-types.spec.ts tests/integration tests/manual-dns-provider.spec.ts tests/monitoring tests/settings tests/tasks
cd /projects/Charon && CI=true PLAYWRIGHT_BASE_URL=http://127.0.0.1:8080 CHARON_SECURITY_TESTS_ENABLED=false TEST_WORKER_INDEX=4 npx playwright test --project=firefox --shard=4/4 --output=playwright-output/firefox-shard-4 tests/core tests/dns-provider-crud.spec.ts tests/dns-provider-types.spec.ts tests/integration tests/manual-dns-provider.spec.ts tests/monitoring tests/settings tests/tasks
```
Expected behavior:
- Sequential task runs Firefox shards `1/4` to `4/4` in order, stopping on first
failing shard due to `&&` chaining.
- Per-shard tasks run only their respective shard for triage speed.
- Only the listed non-security paths are included in the run.
- Execution targets `http://127.0.0.1:8080`.
- Security toggle is disabled for all runs
(`CHARON_SECURITY_TESTS_ENABLED=false`).
- Worker index is explicitly set per shard (`TEST_WORKER_INDEX=1..4`).
- Playwright artifacts are written to deterministic per-shard paths
(`playwright-output/firefox-shard-1` through
`playwright-output/firefox-shard-4`).
- Process exits `0` when invoked shard(s) pass; non-zero on test failure.
## 6. Acceptance Criteria
- [ ] `.vscode/tasks.json` includes exactly five new CI-parity Firefox
non-security tasks: one sequential runner and four per-shard triage tasks.
- [ ] VS Code task execution succeeds (`exit code 0`) in a healthy E2E runtime,
with prerequisites satisfied (healthy container or `docker-rebuild-e2e`).
- [ ] Every task command includes all explicit parity fields:
`CI=true`, `PLAYWRIGHT_BASE_URL=http://127.0.0.1:8080`,
`CHARON_SECURITY_TESTS_ENABLED=false`, matching `--shard=N/4`,
matching `--output=playwright-output/firefox-shard-N`, matching
`TEST_WORKER_INDEX=N`, and explicit non-security test paths.
- [ ] Task format and style match existing Playwright tasks in the repository.
- [ ] Plan scope remains minimal: implementation changes are limited to
`.vscode/tasks.json`.
- [ ] Manual run (task or equivalent command) behaves as expected.