CI Remediation Master Plan
Status: 🔴 BLOCKED - CI failures preventing releases
Created: February 12, 2026
Last Updated: February 13, 2026
Priority: CRITICAL (P0)
Status Overview
Target: 100% Pass Rate (0 failures, 0 skipped)
Current (latest full rerun): 1500 passed, 62 failed, 50 skipped
Current (Phase 2 targeted Chromium rerun): 17 passed, 1 failed
Blockers: Cross-browser E2E instability + unresolved skip debt + Phase 2 user lifecycle regression
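For reference, the distance between the latest full rerun and the 100% target can be computed directly from the three counts. The following is a minimal TypeScript sketch; the RunSummary shape and the function names are illustrative and not part of the repository.

```typescript
// Hypothetical shape for a test run summary; field names are illustrative.
interface RunSummary {
  passed: number;
  failed: number;
  skipped: number;
}

// Pass rate over all collected tests. Skips count as "not passing" because
// the target requires 0 failures and 0 skipped.
function passRate(s: RunSummary): number {
  const total = s.passed + s.failed + s.skipped;
  return total === 0 ? 0 : s.passed / total;
}

// The gate is met only when nothing failed and nothing was skipped.
function meetsTarget(s: RunSummary): boolean {
  return s.failed === 0 && s.skipped === 0 && s.passed > 0;
}

const latest: RunSummary = { passed: 1500, failed: 62, skipped: 50 };
const total = latest.passed + latest.failed + latest.skipped;
console.log(`${(passRate(latest) * 100).toFixed(2)}% of ${total} tests passing`);
```

With the latest counts this reports 93.05% of 1612 tests, which leaves 112 tests (62 failed + 50 skipped) to fix or re-enable.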
Progress Tracker
- Phase 1: Security Fixes (8 items) - PRIORITY 0 - Est. 7-10 hours
- Phase 2: High-Impact E2E (17 items) - PRIORITY 1 - Est. 7-10 hours
- Phase 3: Medium-Impact E2E (6 items) - PRIORITY 2 - Est. 3-5 hours
- Phase 4: Low-Impact E2E (5 items) - PRIORITY 3 - Est. 2-3 hours
- Phase 5: Final Validation & CI Approval - MANDATORY - Est. 2-3 hours
- Phase 6: Fail & Skip Census (Research) - MANDATORY - Est. 2-4 hours (in progress)
- Phase 7: Failure Cluster Remediation (Execution) - MANDATORY - Est. 8-16 hours
- Phase 8: Skip Debt Burn-down & Re-enable - MANDATORY - Est. 4-8 hours
- Phase 9: Final Re-baseline & CI Gate Freeze - MANDATORY - Est. 2-4 hours
Current Phase: Phase 6 - Fail & Skip Census (skip registry created; full skip enumeration pending)
Estimated Total Time: 37-63 hours (sum of the per-phase estimates above, including new Phases 6-9)
Target Completion: Within 7-10 business days (split across team)
Phase 1: Security Fixes (PRIORITY 0)
Overview
Total Items: 8 (4 ACL API endpoints + 4 broken imports)
Current Pass Rate: 94.2% (65/69 tests passing)
Target: 100% (69/69 tests passing)
Owner: Backend Dev (API) + Frontend Dev (Imports)
Status: 🟡 In Progress
Task 1.1: Fix ACL Security Status Endpoint
File: backend/internal/routes/security.go
Issue: GET /api/v1/security/status returns 404
Tests Failing: 2 tests in tests/security-enforcement/acl-enforcement.spec.ts
Owner: Backend Dev
Priority: HIGH
Estimated Time: 2 hours
Root Cause: API endpoint missing or not exposed. Frontend ACL UI tests pass (22/22), but API enforcement tests fail because the backend endpoint doesn't exist.
Implementation Steps:
- Create route handler in backend/internal/routes/security.go:

    // GetSecurityStatus reports the enabled state (and mode, where applicable) of each security module.
    func GetSecurityStatus(c *gin.Context) {
        // Retrieve current security module states from config
        status := map[string]interface{}{
            "cerberus":  map[string]bool{"enabled": getCerberusEnabled()},
            "acl":       map[string]interface{}{"enabled": getACLEnabled(), "mode": getACLMode()},
            "waf":       map[string]bool{"enabled": getWAFEnabled()},
            "rateLimit": map[string]bool{"enabled": getRateLimitEnabled()},
            "crowdsec":  map[string]interface{}{"enabled": getCrowdSecEnabled(), "mode": getCrowdSecMode()},
        }
        c.JSON(200, status)
    }

- Register route in router setup:

    authorized.GET("/security/status", GetSecurityStatus)

- Add authentication middleware (already required by the authorized group)
- Write unit tests in backend/internal/routes/security_test.go
Validation Command:
# Run the 2 failing tests
npx playwright test tests/security-enforcement/acl-enforcement.spec.ts --project=chromium --grep "should verify ACL is enabled"
npx playwright test tests/security-enforcement/acl-enforcement.spec.ts --project=chromium --grep "should return security status"
Acceptance Criteria:
- API endpoint returns 200 status code
- JSON response contains all security module states (cerberus, acl, waf, rateLimit, crowdsec)
- Response includes ACL mode ("allow" or "deny")
- Authentication middleware enforced (401 without valid token)
- 2 ACL enforcement tests pass
- No new test failures introduced
- Backend unit tests written and passing
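The response-shape criteria above can be checked mechanically on the test side. The following is a minimal TypeScript sketch of such a check; the function name checkSecurityStatus is hypothetical, and the only assumptions about the payload are the ones stated in the acceptance criteria (five module keys, a boolean enabled flag per module, and an ACL mode of "allow" or "deny").

```typescript
// Module keys taken from the acceptance criteria for GET /api/v1/security/status.
const MODULE_KEYS = ["cerberus", "acl", "waf", "rateLimit", "crowdsec"] as const;

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Returns a list of problems; an empty list means the body satisfies the criteria.
function checkSecurityStatus(body: unknown): string[] {
  if (!isRecord(body)) return ["response body is not a JSON object"];
  const problems: string[] = [];
  for (const key of MODULE_KEYS) {
    const moduleState = body[key];
    if (!isRecord(moduleState)) {
      problems.push(`missing module: ${key}`);
    } else if (typeof moduleState["enabled"] !== "boolean") {
      problems.push(`${key}.enabled is not a boolean`);
    }
  }
  const acl = body["acl"];
  if (isRecord(acl) && acl["mode"] !== "allow" && acl["mode"] !== "deny") {
    problems.push('acl.mode must be "allow" or "deny"');
  }
  return problems;
}
```

In a spec file, one possible use is to assert that checkSecurityStatus applied to the parsed JSON body returns an empty list, so that a failure message names the exact missing field.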
Task 1.2: Fix ACL Access Lists Endpoint
File: backend/internal/routes/access_lists.go
Issue: GET /api/v1/access-lists returns 404
Tests Failing: 2 tests in tests/security-enforcement/acl-enforcement.spec.ts
Owner: Backend Dev
Priority: HIGH
Estimated Time: 2 hours
Root Cause: API endpoint missing. Tests expect to list access lists and test IP addresses against ACL rules, but endpoint doesn't exist.
Implementation Steps:
- Create route handler in backend/internal/routes/access_lists.go:

    func GetAccessLists(c *gin.Context) {
        // Query database for ACL entries
        var accessLists []models.AccessList
        result := db.Find(&accessLists)
        if result.Error != nil {
            c.JSON(500, gin.H{"error": "Failed to fetch access lists"})
            return
        }
        c.JSON(200, accessLists)
    }

- Register route in router setup:

    authorized.GET("/access-lists", GetAccessLists)

- Add optional filtering by proxy_host_id (query param)
- Write unit tests in backend/internal/routes/access_lists_test.go
Validation Command:
# Run the 2 failing tests
npx playwright test tests/security-enforcement/acl-enforcement.spec.ts --project=chromium --grep "should list access lists when ACL enabled"
npx playwright test tests/security-enforcement/acl-enforcement.spec.ts --project=chromium --grep "should test IP against access list"
Acceptance Criteria:
- API endpoint returns 200 status code
- JSON response is array of access list objects
- Each object includes: id, name, mode, ips, proxy_hosts
- Empty array returned when no ACLs exist (not 404)
- Authentication middleware enforced
- 2 ACL enforcement tests pass
- No new test failures introduced
- Backend unit tests written and passing
Task 1.3: Fix ACL Test IP Endpoint (Optional)
File: backend/internal/routes/access_lists.go
Issue: POST /api/v1/access-lists/:id/test may be needed for IP testing
Tests Potentially Needing This: Part of "test IP against access list" test
Owner: Backend Dev
Priority: MEDIUM
Estimated Time: 1 hour
Note: This may not be a separate endpoint - the test might just be checking if GET /access-lists works. Investigate Task 1.2 first to determine if this is needed.
Implementation Steps (if needed):
- Create route handler:

    func TestIPAgainstACL(c *gin.Context) {
        aclID := c.Param("id")
        var req struct {
            // The "ip" validator tag rejects malformed addresses, so the 400 below
            // covers invalid IP formats and not only a missing field
            IP string `json:"ip" binding:"required,ip"`
        }
        if err := c.ShouldBindJSON(&req); err != nil {
            c.JSON(400, gin.H{"error": "Invalid IP format"})
            return
        }
        // Test IP against ACL rules using CIDR matching
        allowed, reason := testIPAgainstACL(aclID, req.IP)
        c.JSON(200, gin.H{"allowed": allowed, "reason": reason})
    }

- Implement CIDR matching logic for IP testing
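The CIDR matching step could look like the following. This is a minimal IPv4-only sketch written in TypeScript to match the test suite; the backend would implement the equivalent in Go. The names matchesCidr and evaluateAcl are hypothetical, and the mode semantics (allow = only listed ranges pass, deny = listed ranges are blocked) are an assumption about how ACL modes behave.

```typescript
type AclMode = "allow" | "deny";

// Parse a dotted-quad IPv4 address into an unsigned 32-bit number, or null if malformed.
function ipv4ToInt(ip: string): number | null {
  const parts = ip.split(".");
  if (parts.length !== 4) return null;
  let value = 0;
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return null;
    const octet = Number(part);
    if (octet > 255) return null;
    value = value * 256 + octet;
  }
  return value;
}

// True when ip falls inside cidr. A bare address is treated as a /32.
function matchesCidr(ip: string, cidr: string): boolean {
  const slash = cidr.indexOf("/");
  const base = slash === -1 ? cidr : cidr.slice(0, slash);
  const prefixText = slash === -1 ? "32" : cidr.slice(slash + 1);
  if (!/^\d{1,2}$/.test(prefixText)) return false;
  const prefix = Number(prefixText);
  const ipInt = ipv4ToInt(ip);
  const baseInt = ipv4ToInt(base);
  if (ipInt === null || baseInt === null || prefix > 32) return false;
  // Drop the host bits with integer division and compare the network parts.
  const blockSize = 2 ** (32 - prefix);
  return Math.floor(ipInt / blockSize) === Math.floor(baseInt / blockSize);
}

// Assumed semantics: "allow" is an allowlist, "deny" is a blocklist.
function evaluateAcl(mode: AclMode, ranges: string[], ip: string): { allowed: boolean; reason: string } {
  if (ipv4ToInt(ip) === null) return { allowed: false, reason: "invalid IP" };
  const hit = ranges.find((range) => matchesCidr(ip, range));
  if (mode === "allow") {
    return hit !== undefined
      ? { allowed: true, reason: `matched ${hit}` }
      : { allowed: false, reason: "not in allow list" };
  }
  return hit !== undefined
    ? { allowed: false, reason: `blocked by ${hit}` }
    : { allowed: true, reason: "not in deny list" };
}
```

The result shape mirrors the allowed/reason pair returned by the handler sketch, so the same cases can serve as unit test inputs on the Go side.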
Validation Command:
# Run after Task 1.2 to see if this is needed
npx playwright test tests/security-enforcement/acl-enforcement.spec.ts --project=chromium --grep "should test IP against access list"
Acceptance Criteria:
- Determine if endpoint is actually needed (may be covered by Task 1.2)
- If needed: Endpoint validates IP format (400 for invalid)
- If needed: Returns allow/deny result with reason
- Test passes without this endpoint, OR endpoint implemented if required
Task 1.4: Fix Broken Import Paths in zzz-caddy-imports
Files:
- tests/security-enforcement/zzz-caddy-imports/caddy-import-cross-browser.spec.ts
- tests/security-enforcement/zzz-caddy-imports/caddy-import-firefox.spec.ts
- tests/security-enforcement/zzz-caddy-imports/caddy-import-gaps.spec.ts
- tests/security-enforcement/zzz-caddy-imports/caddy-import-webkit.spec.ts
Issue: All 4 files import from '../fixtures/auth-fixtures' (wrong path)
Owner: Frontend Dev / QA
Priority: MEDIUM
Estimated Time: 0.5 hours (30 minutes)
Root Cause:
Import paths are missing one level. Files are in tests/security-enforcement/zzz-caddy-imports/, but fixtures are in tests/fixtures/, requiring ../../fixtures/ instead of ../fixtures/.
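The off-by-one in the relative path can be seen by resolving both specifiers against the directory of the spec files. The following is a small dependency-free TypeScript sketch; resolveRelative is a hypothetical helper written for illustration and only handles plain "/"-separated relative specifiers.

```typescript
// Resolve a relative import specifier against the directory that contains the importing file.
function resolveRelative(fromDir: string, specifier: string): string {
  const segments = fromDir.split("/").filter((s) => s.length > 0);
  for (const part of specifier.split("/")) {
    if (part === "..") {
      segments.pop(); // go up one directory
    } else if (part !== "." && part !== "") {
      segments.push(part);
    }
  }
  return segments.join("/");
}

const specDir = "tests/security-enforcement/zzz-caddy-imports";

// Current (broken) import: only climbs to tests/security-enforcement/
console.log(resolveRelative(specDir, "../fixtures/auth-fixtures"));
// Corrected import: climbs to tests/ and reaches tests/fixtures/
console.log(resolveRelative(specDir, "../../fixtures/auth-fixtures"));
```

The first call yields tests/security-enforcement/fixtures/auth-fixtures, which does not exist, while the second yields tests/fixtures/auth-fixtures.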
Implementation Steps:
- Fix import paths in all 4 files:

    - import { test, expect, loginUser } from '../fixtures/auth-fixtures';
    + import { test, expect, loginUser } from '../../fixtures/auth-fixtures';

- Verify import resolution (files should load without errors)
- Run tests to ensure no new failures introduced
Validation Command:
# Run all 4 caddy-import tests
npx playwright test tests/security-enforcement/zzz-caddy-imports/ --project=chromium
Acceptance Criteria:
- All 4 files have corrected import paths to ../../fixtures/auth-fixtures
- TypeScript compilation successful (no import errors)
- Tests run without import resolution errors
- No new test failures introduced by path fixes
- Clean npm run type-check output
Phase 1 Summary
Total Tasks: 4
Total Estimated Time: 5.5-7 hours
Critical Path: Tasks 1.1 → 1.2 (API endpoints) must complete before Task 1.4 (imports) can be fully validated
Phase 1 Validation Command:
# Run all security tests to verify 100% pass rate
npx playwright test tests/security/ tests/security-enforcement/ --project=chromium
# Expected: 69/69 tests passing (100%)
Phase 1 Exit Criteria:
- All 4 ACL API endpoint tests passing
- All 4 caddy-import tests running without import errors
- Total security test pass rate: 100% (69/69)
- No new failures introduced in other test suites
- Backend unit tests passing for new API endpoints
- Git commit: fix(security): implement missing ACL API endpoints + fix import paths
Phase 2: High-Impact E2E (PRIORITY 1)
Overview
Total Failures: 17 (7 + 5 + 5)
Categories: User Lifecycle (7) + Multi-Component Workflows (5) + Data Consistency (5)
Impact: CRITICAL - Security, Authentication, Core CRUD Operations
Owner: Playwright Dev + QA Engineer
Status: 🔴 Not Started
Task 2.1: Settings - User Lifecycle (7 failures)
File: tests/core/settings-user-lifecycle.spec.ts (assumed path)
Browser: Chromium only (Firefox/WebKit: 0 failures ✅)
Impact: CRITICAL - Security, Authentication, Authorization, Audit Logging
Owner: Playwright Dev
Estimated Time: 3 hours
Root Cause Hypothesis: Browser-specific timing issues. Chromium's faster JavaScript execution may trigger race conditions in authentication state, session management, or permission checks that don't occur in Firefox/WebKit.
Investigation Steps:
- Run headed to observe behavior:

    npx playwright test tests/core/settings-user-lifecycle.spec.ts --project=chromium --headed

- Generate trace for analysis:

    npx playwright test tests/core/settings-user-lifecycle.spec.ts --project=chromium --trace on

- Compare timing vs Firefox (which has 0 failures):

    npx playwright test tests/core/settings-user-lifecycle.spec.ts --project=firefox --headed

- Check for common patterns:
  - Authentication state not fully propagated before assertions
  - Session cookies not set before navigation
  - Permission checks executing before role assignment completes
  - Audit log writes not flushed before reads
Failing Tests (7):
- Deleted user cannot login
  - Expected: 401 or login failure
  - May need explicit wait for user deletion to propagate to auth middleware
- Session persistence after logout and re-login
  - Expected: New session created, old session invalidated
  - May need page.waitForLoadState('networkidle') after logout
- Users see only their own data
  - Expected: User A cannot see User B's resources
  - May need explicit wait after user creation before data isolation check
- User cannot promote self to admin
  - Expected: 403 Forbidden when non-admin tries role escalation
  - May need explicit wait for permission check API call
- Permissions apply immediately on user refresh
  - Expected: Role change → refresh → new permissions active
  - May need explicit wait for role update to propagate to session
- Permissions propagate from creation to resource access
  - Expected: New user → assigned role → can access allowed resources
  - May need explicit wait after role assignment before resource access
- Audit log records user lifecycle events
  - Expected: User create/update/delete events in audit log
  - May need explicit wait for async audit log write to complete
Likely Fix Pattern: Add explicit waits after state-changing operations:
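// Start waiting for the response BEFORE triggering the deletion;
// a wait registered afterwards can miss a fast response
const deleteResponse = page.waitForResponse(resp => resp.url().includes('/api/v1/users') && resp.status() === 200);
// ... trigger the user deletion in the UI here ...
await deleteResponse;
await page.waitForTimeout(500); // Allow propagation in Chromium (stopgap; prefer polling for the expected state)
// Same ordering for role assignment
const roleResponse = page.waitForResponse(resp => resp.url().includes('/api/v1/users') && resp.request().method() === 'PUT');
// ... trigger the role change in the UI here ...
await roleResponse;
await page.context().storageState(); // Returns a snapshot of cookies/localStorage for inspection; it does not refresh the session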
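Fixed waitForTimeout delays tend to be either too short on a slow CI runner or wasteful on a fast one. An alternative is to poll for the expected state until a deadline. Playwright provides expect.poll and expect(...).toPass for this; the helper below is a dependency-free TypeScript sketch of the same idea, and the name pollUntil and its options are hypothetical.

```typescript
// Repeatedly run `probe` until `isReady` accepts its result or the timeout elapses.
async function pollUntil<T>(
  probe: () => Promise<T>,
  isReady: (value: T) => boolean,
  options: { timeoutMs?: number; intervalMs?: number } = {},
): Promise<T> {
  const timeoutMs = options.timeoutMs ?? 5000;
  const intervalMs = options.intervalMs ?? 100;
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const value = await probe();
    if (isReady(value)) return value;
    // Give up if another sleep would take us past the deadline.
    if (Date.now() + intervalMs > deadline) {
      throw new Error(`condition not met within ${timeoutMs} ms`);
    }
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

For the "Deleted user cannot login" case, the probe could be a login request for the deleted user and the readiness check could be a 401 status, which waits exactly as long as propagation takes and fails with a clear message otherwise.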
Validation Command:
# Run all 7 tests
npx playwright test tests/core/settings-user-lifecycle.spec.ts --project=chromium
# Expected: 7/7 passing
Acceptance Criteria:
- All 7 tests pass in Chromium
- 0 failures remain in Firefox/WebKit (no regressions)
- No test timeout increases beyond 15s per test
- Fix applied consistently across all 7 tests (same pattern)
- Trace analysis confirms timing issues resolved
Task 2.2: Core - Multi-Component Workflows (5 failures)
File: tests/core/multi-component-workflows.spec.ts
Browser: Chromium only (Firefox/WebKit: 0 failures ✅)
Impact: HIGH - Security Module Integration, User Permissions, Backup/Restore
Owner: Playwright Dev
Estimated Time: 2 hours
Root Cause Hypothesis: Complex test scenarios involving multiple async operations (security module toggles, resource creation, permission checks) are timing-sensitive in Chromium.
Investigation Steps:
- Run headed with debug:

    npx playwright test tests/core/multi-component-workflows.spec.ts --project=chromium --headed --debug

- Check previous baseline notes:
  - Previous failures showed 8.8-8.9s timeouts
  - May need timeout increases or better synchronization
- Validate security module state propagation:
  - Ensure waitForSecurityModuleEnabled() helper is used
  - Check Caddy reload completion before assertions
Failing Tests (5):
- WAF enforcement applies to newly created proxy
  - Expected: Create proxy → enable WAF → proxy blocked by WAF
  - May need wait for Caddy reload after WAF enable
- User with proxy creation role can create and manage proxies
  - Expected: Role assigned → can create proxy → can manage proxy
  - May need explicit wait for permission propagation
- Backup restore recovers deleted user data
  - Expected: Backup → delete data → restore → data recovered
  - May need explicit wait for backup completion before restore
- Security modules apply to subsequently created resources
  - Expected: Enable ACL → create proxy → ACL enforced on proxy
  - May need wait for security module activation before resource creation
- Security enforced even on previously created resources
  - Expected: Create proxy → enable ACL → ACL enforced on existing proxy
  - May need wait for Caddy reload to apply rules to existing resources
Likely Fix Pattern: Add explicit waits for async security operations:
// After security module toggle
await waitForSecurityModuleEnabled(page, 'waf', true);
await page.waitForTimeout(1000); // Caddy reload + propagation
// Backup: register the wait before triggering the backup so the response is not missed
const backupResponse = page.waitForResponse(resp => resp.url().includes('/api/v1/backup') && resp.status() === 200);
// ... trigger the backup in the UI here ...
await backupResponse;
await page.waitForTimeout(500); // Ensure file written
Validation Command:
# Run all 5 tests
npx playwright test tests/core/multi-component-workflows.spec.ts --project=chromium
# Expected: 5/5 passing
Acceptance Criteria:
- All 5 tests pass in Chromium
- 0 failures remain in Firefox/WebKit (no regressions)
- Security module state checked before assertions
- Caddy reload completion verified before enforcement checks
- No timeout increases beyond 30s per test (complex workflows)
Task 2.3: Core - Data Consistency (5 failures)
File: tests/core/data-consistency.spec.ts
Browser: Chromium only (Firefox/WebKit: 0 failures ✅)
Impact: HIGH - Core CRUD Operations, API/UI Synchronization
Owner: Playwright Dev
Estimated Time: 2 hours
Root Cause Hypothesis: Data synchronization delays between API operations and UI updates. Chromium may render UI faster than Firefox, causing assertions to execute before data fully propagated.
Investigation Steps:
- Run headed to observe data propagation:

    npx playwright test tests/core/data-consistency.spec.ts --project=chromium --headed

- Check previous baseline notes:
  - Previous failures showed 90s timeout on validation test
  - Likely needs better data synchronization waits
- Validate API/UI sync pattern:
  - Ensure waitForLoadState('networkidle') used after mutations
  - Check for explicit waits after CRUD operations
Failing Tests (5):
- Pagination and sorting produce consistent results
  - Expected: Sort order and page boundaries match across requests
  - May need explicit wait for table render after sort/pagination change
- Client-side and server-side validation consistent
  - Expected: Both UI and API reject invalid data with same messages
  - May need explicit wait for server validation response
- Data stored via API is readable via UI
  - Expected: POST /api/v1/resource → refresh UI → see new data
  - May need explicit wait for UI data refresh after API mutation
- Data deleted via UI is removed from API
  - Expected: Delete in UI → GET /api/v1/resource → 404
  - May need explicit wait for deletion propagation
- Real-time events reflect partial data updates
  - Expected: WebSocket events show incremental changes
  - May need explicit wait for WebSocket message receipt
Likely Fix Pattern: Add explicit waits for data synchronization:
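// API mutation made from the page: register the wait before triggering it
const createResponse = page.waitForResponse(resp => resp.url().includes('/api/v1/') && resp.request().method() === 'POST');
// ... trigger the create action here ...
await createResponse;
await page.reload({ waitUntil: 'networkidle' });
// UI mutation: same ordering for the DELETE request
const deleteResponse = page.waitForResponse(resp => resp.url().includes('/api/v1/') && resp.request().method() === 'DELETE');
// ... click delete in the UI here ...
await deleteResponse;
await page.waitForLoadState('networkidle');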
Validation Command:
# Run all 5 tests
npx playwright test tests/core/data-consistency.spec.ts --project=chromium
# Expected: 5/5 passing
Acceptance Criteria:
- All 5 tests pass in Chromium
- 0 failures remain in Firefox/WebKit (no regressions)
- Network idle state checked before assertions
- API/UI synchronization verified with explicit waits
- No timeout increases beyond 30s per test
Phase 2 Summary
Total Tasks: 3 (covering 17 test failures)
Total Estimated Time: 7 hours
Critical Path: All tasks can run in parallel if multiple devs available
Phase 2 Validation Command:
# Run all high-impact tests
npx playwright test tests/core/settings-user-lifecycle.spec.ts --project=chromium
npx playwright test tests/core/multi-component-workflows.spec.ts --project=chromium
npx playwright test tests/core/data-consistency.spec.ts --project=chromium
# Expected: 17/17 tests passing
Phase 2 Exit Criteria:
- All 17 high-impact tests passing in Chromium
- Firefox/WebKit remain at 0 failures (no regressions)
- Root cause analysis documented for each category
- Common timing pattern identified and fix applied consistently
- Git commit: fix(e2e): resolve Chromium timing issues in user lifecycle, workflows, and data consistency
Phase 3: Medium-Impact E2E (PRIORITY 2)
Overview
Total Failures: 6 (2 + 2 + 2)
Categories: User Management (2) + Modal Dropdowns (2) + Certificates (2)
Impact: MEDIUM - User Workflows, Certificate Display
Owner: Playwright Dev + Frontend Dev
Status: 🔴 Not Started
Task 3.1: Settings - User Management (2 failures)
File: tests/settings/user-management.spec.ts
Browser: Chromium only
Impact: MEDIUM - User Invitation Workflows
Owner: Playwright Dev
Estimated Time: 1 hour
Failing Tests (2):
- User should copy invite link
  - Expected: Copy button copies invite URL to clipboard
  - May need clipboard permission or different clipboard API in Chromium
- User should remove permitted hosts
  - Expected: Remove host from user permissions → host no longer accessible
  - May need explicit wait for permission update
Investigation:
npx playwright test tests/settings/user-management.spec.ts --project=chromium --grep "copy invite link|remove permitted hosts"
Likely Fix: Clipboard API may differ in Chromium:
// Grant clipboard permissions first (these permission names are supported in Chromium)
await context.grantPermissions(['clipboard-read', 'clipboard-write']);
// Then read the clipboard inside the page via the browser's Clipboard API
const clipboardText = await page.evaluate(() => navigator.clipboard.readText());
Validation Command:
npx playwright test tests/settings/user-management.spec.ts --project=chromium --grep "copy invite link|remove permitted hosts"
Acceptance Criteria:
- Both tests pass in Chromium
- Clipboard operations work without manual permission grant
- No regressions in Firefox/WebKit
Task 3.2: Modal - Dropdown Triage (2 failures)
File: tests/modal-dropdown-triage.spec.ts
Browser: Chromium only
Impact: MEDIUM - User Workflows (Invite, Proxy Creation)
Owner: Frontend Dev
Estimated Time: 1 hour
Failing Tests (2):
- InviteUserModal Role Dropdown
  - Expected: Role dropdown opens and allows selection
  - May need role-based locator fix from DNS provider work
- ProxyHostForm ACL Dropdown
  - Expected: ACL dropdown opens and allows selection
  - May need role-based locator fix from DNS provider work
Known Issue: This is part of the dropdown triage effort completed for DNS providers. Same fix pattern should apply.
Investigation:
npx playwright test tests/modal-dropdown-triage.spec.ts --project=chromium
Likely Fix: Apply role-based locators:
// Before (brittle)
await page.locator('#role-dropdown').click();
// After (robust)
await page.getByRole('combobox', { name: 'Role' }).click();
await page.getByRole('option', { name: 'admin' }).click();
Validation Command:
npx playwright test tests/modal-dropdown-triage.spec.ts --project=chromium
Acceptance Criteria:
- Both dropdown tests pass in Chromium
- Locators use getByRole('combobox') instead of CSS selectors
- No regressions in Firefox/WebKit
Task 3.3: Core - Certificates SSL (2 failures)
File: tests/core/certificates.spec.ts
Browser: Chromium only
Impact: MEDIUM - Certificate Visibility
Owner: Playwright Dev
Estimated Time: 1 hour
Failing Tests (2):
- Display certificate domain in table
  - Expected: Certificate list shows domain name column
  - May need explicit wait for table render in Chromium
- Display certificate issuer
  - Expected: Certificate list shows issuer column (Let's Encrypt, etc.)
  - May need explicit wait for API data to populate columns
Investigation:
npx playwright test tests/core/certificates.spec.ts --project=chromium --grep "Display certificate"
Likely Fix: Add explicit wait for table data:
// Register the wait for the certificates API response before navigating to the page
const certsResponse = page.waitForResponse(resp => resp.url().includes('/api/v1/certificates'));
// ... navigate to (or reload) the certificates page here ...
await certsResponse;
// Wait for table to render
await page.locator('table tbody tr').first().waitFor({ state: 'visible' });
// Then assert column presence
await expect(page.locator('th:has-text("Domain")')).toBeVisible();
Validation Command:
npx playwright test tests/core/certificates.spec.ts --project=chromium --grep "Display certificate"
Acceptance Criteria:
- Both certificate display tests pass in Chromium
- Table columns render correctly after API data loads
- No regressions in Firefox/WebKit
Phase 3 Summary
Total Tasks: 3 (covering 6 test failures)
Total Estimated Time: 3 hours
Critical Path: All tasks can run in parallel
Phase 3 Validation Command:
# Run all medium-impact tests
npx playwright test tests/settings/user-management.spec.ts --project=chromium --grep "copy invite link|remove permitted hosts"
npx playwright test tests/modal-dropdown-triage.spec.ts --project=chromium
npx playwright test tests/core/certificates.spec.ts --project=chromium --grep "Display certificate"
# Expected: 6/6 tests passing
Phase 3 Exit Criteria:
- All 6 medium-impact tests passing in Chromium
- Firefox/WebKit remain at 0 failures
- Dropdown locators use robust role-based selectors
- Git commit: fix(e2e): resolve user management, dropdown, and certificate display issues
Phase 4: Low-Impact E2E (PRIORITY 3)
Overview
Total Failures: 5 (2 + 2 + 1)
Categories: Authentication (2) + Admin Onboarding (2) + Navigation (1)
Impact: LOW - Edge Cases, Mobile UI
Owner: Playwright Dev
Status: 🔴 Not Started
Task 4.1: Core - Authentication (2 failures)
File: tests/core/authentication.spec.ts
Browser: Chromium only
Impact: LOW - Error Handling Edge Cases
Owner: Playwright Dev
Estimated Time: 1 hour
Failing Tests (2):
- Redirect with error message and redirect to login page
  - Expected: Invalid session → error message → redirect to login
  - May need explicit wait for redirect or error message element
- Force login when session expires
  - Expected: Expired session → forced logout → redirect to login
  - May need explicit wait for session expiration check
Investigation:
npx playwright test tests/core/authentication.spec.ts --project=chromium --grep "Redirect with error|Force login"
Validation Command:
npx playwright test tests/core/authentication.spec.ts --project=chromium --grep "Redirect with error|Force login"
Acceptance Criteria:
- Both authentication edge case tests pass
- No regressions in Firefox/WebKit
Task 4.2: Core - Admin Onboarding (2 failures)
File: tests/core/admin-onboarding.spec.ts
Browser: Chromium only
Impact: LOW - First-time Setup Workflow
Owner: Playwright Dev
Estimated Time: 1 hour
Failing Tests (2):
- Setup Logout clears session
  - Expected: First-time admin setup → logout → session cleared
  - May need explicit wait for session clear
- First login after logout successful
  - Expected: Setup → logout → login again → successful
  - May need explicit wait for login redirect after logout
Investigation:
npx playwright test tests/core/admin-onboarding.spec.ts --project=chromium --grep "Setup Logout|First login after logout"
Validation Command:
npx playwright test tests/core/admin-onboarding.spec.ts --project=chromium --grep "Setup Logout|First login after logout"
Acceptance Criteria:
- Both admin onboarding tests pass
- Session management correct during first-time setup
- No regressions in Firefox/WebKit
Task 4.3: Core - Navigation (1 failure)
File: tests/core/navigation.spec.ts
Browser: Chromium only
Impact: LOW - Mobile UI Interaction
Owner: Playwright Dev
Estimated Time: 0.5 hours (30 minutes)
Failing Test (1):
- Responsive Navigation should toggle mobile menu
- Expected: Small viewport → hamburger menu → click → menu opens
- May need explicit viewport size or mobile emulation in Chromium
Investigation:
npx playwright test tests/core/navigation.spec.ts --project=chromium --grep "toggle mobile menu"
Likely Fix: Ensure viewport explicitly set for mobile:
await page.setViewportSize({ width: 375, height: 667 }); // iPhone SE
await page.getByRole('button', { name: 'Toggle menu' }).click();
await expect(page.locator('nav.mobile-menu')).toBeVisible();
Validation Command:
npx playwright test tests/core/navigation.spec.ts --project=chromium --grep "toggle mobile menu"
Acceptance Criteria:
- Mobile menu toggle test passes in Chromium
- Viewport size explicitly set for mobile tests
- No regressions in Firefox/WebKit
Phase 4 Summary
Total Tasks: 3 (covering 5 test failures)
Total Estimated Time: 2.5 hours
Critical Path: All tasks can run in parallel
Phase 4 Validation Command:
# Run all low-impact tests
npx playwright test tests/core/authentication.spec.ts --project=chromium --grep "Redirect with error|Force login"
npx playwright test tests/core/admin-onboarding.spec.ts --project=chromium --grep "Setup Logout|First login after logout"
npx playwright test tests/core/navigation.spec.ts --project=chromium --grep "toggle mobile menu"
# Expected: 5/5 tests passing
Phase 4 Exit Criteria:
- All 5 low-impact tests passing in Chromium
- Firefox/WebKit remain at 0 failures
- Authentication and onboarding edge cases handled
- Git commit: fix(e2e): resolve authentication, onboarding, and navigation edge cases
Phase 5: Final Validation & CI Approval
Overview
Status: 🔴 Not Started
Owner: QA Lead + CI/CD Engineer
Estimated Time: 2-3 hours
Prerequisite: Phases 1-4 complete with 0 failures
Pre-Merge Validation Checklist (MANDATORY)
1. E2E Playwright Tests
# Run full suite across all browsers
npx playwright test --project=firefox --project=chromium --project=webkit
Expected Result: 1624/1624 passing (100%)
Acceptance Criteria:
- Firefox: 0 failures (542/542 passing)
- Chromium: 0 failures (540/540 passing) - was 28 failures
- WebKit: 0 failures (542/542 passing)
- No test skips (test.skip() = 0)
- No test timeouts (all tests < 30s)
- Trace generated for any flaky tests
2. Backend Coverage
# Run backend tests with coverage
scripts/go-test-coverage.sh
Expected Result: ≥85% coverage with 100% patch coverage
Acceptance Criteria:
- Overall coverage ≥85%
- Patch coverage = 100% (all modified lines covered)
- No coverage regressions from previous run
- All Go unit tests passing
- go test ./... exits with code 0
3. Frontend Coverage
# Run frontend tests with coverage
scripts/frontend-test-coverage.sh
Expected Result: ≥85% coverage with 100% patch coverage
Acceptance Criteria:
- Overall coverage ≥85%
- Patch coverage = 100% (all modified lines covered)
- No coverage regressions from previous run
- All Vitest unit tests passing
- npm test exits with code 0
4. Type Safety
# TypeScript type checking
npm run type-check
Expected Result: 0 TypeScript errors
Acceptance Criteria:
- tsc --noEmit exits with code 0
- No @ts-ignore or @ts-expect-error added
- All import paths resolve correctly
- No implicit any types introduced
5. Pre-commit Hooks
# Run all pre-commit hooks
pre-commit run --all-files
Expected Result: All hooks passing
Acceptance Criteria:
- Linting (ESLint, golangci-lint) passes
- Formatting (Prettier, gofmt) passes
- Security scans pass (no new issues)
- GORM security scanner passes (manual stage)
- All hooks exit with code 0
6. Security Scans
Trivy Docker Image Scan:
.github/skills/scripts/skill-runner.sh security-scan-docker-image
Expected Result: 0 CRITICAL/HIGH vulnerabilities
CodeQL Scan:
.github/skills/scripts/skill-runner.sh security-scan-codeql
Expected Result: 0 alerts (Critical/High/Medium)
Acceptance Criteria:
- Trivy: 0 CRITICAL vulnerabilities
- Trivy: 0 HIGH vulnerabilities
- CodeQL Go: 0 alerts
- CodeQL JavaScript: 0 alerts
- SBOM generated and verified
- All security workflows pass in CI
7. CI Workflows (GitHub Actions)
Required Workflows:
- E2E Tests - All browsers passing
- Go Tests - Coverage ≥85%, patch 100%
- Frontend Tests - Coverage ≥85%, patch 100%
- Security Scans - Trivy + CodeQL clean
- Codecov - Patch coverage 100%
- Build - Docker image builds successfully
- Lint - All linters passing
Validation:
# Trigger all workflows by pushing to PR branch
git push origin fix/ci-remediation
# Monitor CI status at:
# https://github.com/<org>/<repo>/actions
Acceptance Criteria:
- All CI workflows show green checkmarks
- No workflow failures or cancellations
- Codecov comment shows patch coverage 100%
- No new security alerts introduced
- Build time < 15 minutes (performance check)
Phase 6: Fail & Skip Census (RESEARCH TRACKING)
Overview
Purpose: Create a deterministic inventory of all failures and skips from the latest full rerun and map each to an owner and remediation path.
Owner: QA Lead + Playwright Dev
Status: 🟡 In Progress (ledger and skip registry created; full skip enumeration pending)
Estimated Time: 2-4 hours
Inputs (Latest Evidence)
- Full rerun command: npx playwright test --project=firefox --project=chromium --project=webkit
- Latest result snapshot:
  - Passed: 1500
  - Failed: 62
  - Skipped: 50
- Phase 2 focused Chromium result:
  - Passed: 17
  - Failed: 1 (tests/settings/user-lifecycle.spec.ts full lifecycle test)
Task 6.1: Build Fail/Skip Ledger
Output File: docs/reports/e2e_fail_skip_ledger_2026-02-13.md
Progress: ✅ Ledger created and committed locally.
For each failing or skipped test, record:
- Project/browser (chromium, firefox, webkit)
- Test file + test title
- Failure/skip reason category
- Repro command
- Suspected root cause
- Owner (Backend Dev, Frontend Dev, Playwright Dev, QA)
- Priority (P0, P1, P2)
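One way to keep the ledger consistent is to give each row a fixed shape and check it for unmapped entries before the Phase 6 exit review. The following is a minimal TypeScript sketch; the LedgerEntry type, its field names, and both helper functions are hypothetical and only mirror the fields listed above.

```typescript
type Browser = "chromium" | "firefox" | "webkit";
type Owner = "Backend Dev" | "Frontend Dev" | "Playwright Dev" | "QA";
type Priority = "P0" | "P1" | "P2";

// One row of the fail/skip ledger. Owner and priority are optional so that
// rows can be recorded first and triaged later.
interface LedgerEntry {
  browser: Browser;
  file: string;
  title: string;
  status: "failed" | "skipped";
  reasonCategory: string;
  reproCommand: string;
  suspectedRootCause: string;
  owner?: Owner;
  priority?: Priority;
}

// Default repro command built from flags already used in this plan.
// Assumes the title contains no double quotes or regex metacharacters.
function reproCommand(file: string, browser: Browser, title: string): string {
  return `npx playwright test ${file} --project=${browser} --grep "${title}"`;
}

// Rows that still lack an owner or a priority.
function unmapped(entries: LedgerEntry[]): LedgerEntry[] {
  return entries.filter((e) => e.owner === undefined || e.priority === undefined);
}
```

An empty result from unmapped corresponds to the exit criterion that every fail/skip is mapped to an owner and priority.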
Task 6.2: Categorize into Clusters
Minimum clusters to track:
- Auth/session stability (auth-long-session, authentication, onboarding)
- Locator strictness & selector ambiguity (modal-dropdown-triage, long-running tasks)
- Navigation/load reliability (navigation, account settings)
- Data/empty-state assertions (certificates, list rendering)
- Browser-engine specific flakiness (webkit internal error, detached elements)
- Skip debt (test.skip or project-level skipped suites)
Progress: 🟡 Skip cause registry created: docs/reports/e2e_skip_registry_2026-02-13.md.
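A first-pass cluster assignment can be derived from the spec file path, leaving the remaining rows for manual review. The following is a minimal TypeScript sketch; the cluster identifiers and the keyword table are assumptions derived from the file names mentioned in this plan. Browser-engine flakiness and skip debt are not inferred here because they depend on the error message or test status, not on the file name.

```typescript
type Cluster =
  | "auth-session"
  | "locator-modal"
  | "navigation-load"
  | "data-empty-state"
  | "unclassified";

// Keyword table (assumed); the first matching cluster wins.
const CLUSTER_KEYWORDS: ReadonlyArray<readonly [Cluster, readonly string[]]> = [
  ["auth-session", ["auth-long-session", "authentication", "onboarding", "user-lifecycle"]],
  ["locator-modal", ["modal-dropdown-triage", "long-running"]],
  ["navigation-load", ["navigation", "account-settings"]],
  ["data-empty-state", ["certificates"]],
];

function clusterFor(file: string): Cluster {
  for (const [cluster, keywords] of CLUSTER_KEYWORDS) {
    if (keywords.some((keyword) => file.includes(keyword))) return cluster;
  }
  return "unclassified";
}

// Count how many files fall into each cluster.
function countByCluster(files: string[]): Map<Cluster, number> {
  const counts = new Map<Cluster, number>();
  for (const file of files) {
    const cluster = clusterFor(file);
    counts.set(cluster, (counts.get(cluster) ?? 0) + 1);
  }
  return counts;
}
```

Files that land in "unclassified" are the ones that need a manual look at the failure message before they can be assigned to a cluster.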
Task 6.3: Prioritized Queue
- Generate top 15 failing tests by impact/frequency.
- Mark blockers for release path separately.
- Identify tests safe for immediate stabilization vs requiring product/contract decisions.
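The queue can be produced with a simple scoring function. The following TypeScript sketch is one possible ranking, not an agreed formula: the FailureStat shape, the 1-3 impact scale, the score of impact multiplied by failure count, and the choice to list release blockers first are all assumptions made for illustration.

```typescript
// Hypothetical per-test statistics aggregated from the ledger.
interface FailureStat {
  test: string;
  impact: 1 | 2 | 3;      // 3 = highest impact (assumed scale)
  failures: number;       // number of failing runs across browsers
  blocksRelease: boolean; // marked separately, as required by the queue rules
}

// Release blockers first, then by impact x frequency, then by name for stable output.
function prioritize(stats: FailureStat[], limit: number = 15): FailureStat[] {
  return [...stats]
    .sort(
      (a, b) =>
        Number(b.blocksRelease) - Number(a.blocksRelease) ||
        b.impact * b.failures - a.impact * a.failures ||
        a.test.localeCompare(b.test),
    )
    .slice(0, limit);
}
```

Copying the input before sorting keeps the ledger order intact, so the same data can be re-ranked if the team settles on a different score.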
Phase 6 Exit Criteria
- Ledger created and committed
- Every fail/skip mapped to an owner and priority
- Clusters documented with root-cause hypotheses
- Top-15 queue approved for Phase 7
Phase 7: Failure Cluster Remediation (EXECUTION TRACKING)
Overview
Purpose: Resolve failures by cluster, not by ad-hoc file edits, and prevent regression spread.
Owner: Playwright Dev + Frontend Dev + Backend Dev
Status: 🔴 Not Started
Estimated Time: 8-16 hours
Execution Order
- P0 Auth/Session Cluster
  - Target files: tests/core/auth-long-session.spec.ts, tests/core/authentication.spec.ts, tests/core/admin-onboarding.spec.ts, tests/settings/user-lifecycle.spec.ts
  - First action: fix context/session API misuse and make the re-auth flow deterministic.
- P1 Locator/Modal Cluster
  - Target files: tests/modal-dropdown-triage.spec.ts, tests/tasks/long-running-operations.spec.ts, related UI forms
  - First action: replace broad strict-mode locators with role/name-scoped unique locators.
- P1 Navigation/Load Cluster
  - Target files: tests/core/navigation.spec.ts, tests/settings/account-settings.spec.ts, tests/integration/import-to-production.spec.ts
  - First action: enforce stable route-ready checks before assertions.
- P2 Data/Empty-State Cluster
  - Target files: tests/core/certificates.spec.ts
  - First action: align empty-state assertions to actual UI contract.
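The first action for the P1 Locator/Modal cluster can be illustrated with a small helper. Playwright's getByRole accepts a role plus name and exact options, and chaining it from a dialog locator narrows the match to that dialog, so a second button with the same label elsewhere on the page no longer triggers a strictness error. The RoleQueryable interface below is a minimal structural stand-in, intended to accept a Playwright page or locator, so the helper can be exercised without a browser. The helper name and the dialog and button labels in the usage note are hypothetical.

```typescript
// Minimal structural slice of the Page/Locator surface this sketch relies on.
interface RoleQueryable {
  getByRole(
    role: string,
    options?: { name?: string | RegExp; exact?: boolean }
  ): RoleQueryable;
}

// Scope to the named dialog first, then to the exactly named button inside it,
// instead of looking up the button across the whole page.
function buttonInDialog(
  root: RoleQueryable,
  dialogName: string,
  buttonName: string
): RoleQueryable {
  return root
    .getByRole("dialog", { name: dialogName })
    .getByRole("button", { name: buttonName, exact: true });
}
```

In a spec, this might be called as buttonInDialog(page, "Edit user", "Save") and followed by a click. The same scoping idea applies to dropdown options inside a specific listbox.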
Validation Rule (Per Cluster)
- Run only affected files first.
- Then run browser matrix for those files (chromium, firefox, webkit).
- Then run nightly full rerun checkpoint.
Phase 7 Exit Criteria
- P0 cluster fully green in all browsers
- P1 clusters fully green in all browsers
- P2 cluster resolved or explicitly deferred with approved issue
- No new failures introduced in previously green files
Phase 8: Skip Debt Burn-down & Re-enable (TRACKING)
Overview
Purpose: Eliminate non-justified skipped tests and restore full execution coverage.
Owner: QA Lead + Playwright Dev
Status: 🔴 Not Started
Estimated Time: 4-8 hours
Task 8.1: Enumerate Skip Sources
- test.skip annotations
- conditional skips by browser/env
- project-level skip patterns
- temporarily disabled suites
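Annotation-based skips can be enumerated with a simple text scan before the slower manual review of project-level patterns. The sketch below is a line-based heuristic, not a parser: it looks for test.skip, test.fixme, test.describe.skip, and test.describe.fixme calls and ignores lines that begin with a line comment. The SkipHit shape and findSkipAnnotations helper are illustrative names, and reading the spec files from disk is left to the caller.

```typescript
interface SkipHit {
  line: number; // 1-based line number within the scanned source
  kind: string; // "skip" or "fixme"
  text: string; // trimmed source line, for the registry
}

// Matches test.skip(, test.fixme(, test.describe.skip(, test.describe.fixme(
const SKIP_PATTERN = /\btest\.(?:describe\.)?(skip|fixme)\s*\(/;

function findSkipAnnotations(source: string): SkipHit[] {
  const hits: SkipHit[] = [];
  source.split(/\r?\n/).forEach((rawLine, index) => {
    const text = rawLine.trim();
    // Heuristic: a line that starts with "//" is treated as commented-out code.
    if (text.startsWith("//")) return;
    const match = SKIP_PATTERN.exec(text);
    if (match) {
      hits.push({ line: index + 1, kind: String(match[1]), text });
    }
  });
  return hits;
}
```

Conditional skips such as a browser check passed as the first argument are caught by the same pattern, since they also go through a test.skip call. Skips configured at project level do not appear in spec files and still need a separate pass over the Playwright configuration.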
Task 8.2: Classify Skips
- Valid contractual skip (document reason and expiry)
- Technical debt skip (must remediate)
- Obsolete test (replace/remove via approved change)
Task 8.3: Re-enable Plan
For each technical-debt skip:
- define unblock task
- assign owner
- assign ETA
- define re-enable command
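The classification and re-enable rules above translate into a completeness check on each registry row. The sketch below assumes a hypothetical SkipRecord shape; the required fields per class follow Tasks 8.2 and 8.3, and treating obsolete tests as needing no extra fields is an assumption, since those are handled through an approved change.

```typescript
type SkipClassification = "contractual" | "technical-debt" | "obsolete";

// Hypothetical registry row; field names are illustrative.
interface SkipRecord {
  testId: string;
  classification: SkipClassification;
  reason?: string;
  expiry?: string; // date after which a contractual skip must be re-reviewed
  unblockTask?: string;
  owner?: string;
  eta?: string;
  reenableCommand?: string;
}

const REQUIRED_FIELDS: Record<SkipClassification, Array<keyof SkipRecord>> = {
  contractual: ["reason", "expiry"],
  "technical-debt": ["unblockTask", "owner", "eta", "reenableCommand"],
  obsolete: [], // assumption: tracked through the approved change instead
};

// Returns the names of required fields that are absent or blank for this record.
function missingFields(record: SkipRecord): string[] {
  return REQUIRED_FIELDS[record.classification].filter((field) => {
    const value = record[field];
    return value === undefined || value.trim() === "";
  });
}
```

A registry is ready for the Phase 8 exit review when missingFields returns an empty list for every row.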
Phase 8 Exit Criteria
- Skip registry created (docs/reports/e2e_skip_registry_2026-02-13.md)
- All technical-debt skips have remediation tasks
- No silent skips remain in critical suites
- Critical-path suites run with zero skips
Phase 9: Final Re-baseline & CI Gate Freeze
Overview
Purpose: Produce a clean baseline proving remediation completion and freeze test gates for merge.
Owner: QA Lead
Status: 🔴 Not Started
Estimated Time: 2-4 hours
Required Runs
npx playwright test --project=firefox --project=chromium --project=webkit
scripts/go-test-coverage.sh
scripts/frontend-test-coverage.sh
npm run type-check
pre-commit run --all-files
Gate Criteria
- E2E: 0 fails, 0 skips in required suites
- Coverage thresholds met + patch coverage 100%
- Typecheck/lint/security scans green
- CI workflows fully green on PR
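The gate criteria can be expressed as a single check that returns every violated rule, which makes a failed gate easier to report than a bare pass/fail. This is a sketch under stated assumptions: the GateInput shape and gateViolations helper are hypothetical, and the 85% coverage threshold is taken from the Success Criteria Summary rather than from any script in the repository.

```typescript
// Hypothetical aggregate of the Required Runs results.
interface GateInput {
  e2eFailed: number;
  e2eSkipped: number; // skips in required suites only
  backendCoveragePercent: number;
  frontendCoveragePercent: number;
  patchCoveragePercent: number;
  typeCheckPassed: boolean;
  lintPassed: boolean;
  securityScansPassed: boolean;
  ciWorkflowsPassed: number;
  ciWorkflowsTotal: number;
}

const COVERAGE_THRESHOLD_PERCENT = 85; // from the Success Criteria Summary

// Returns one message per violated criterion; an empty list means the gate passes.
function gateViolations(input: GateInput): string[] {
  const violations: string[] = [];
  if (input.e2eFailed > 0) violations.push(`E2E failures: ${input.e2eFailed}`);
  if (input.e2eSkipped > 0) violations.push(`E2E skips in required suites: ${input.e2eSkipped}`);
  if (input.backendCoveragePercent < COVERAGE_THRESHOLD_PERCENT) {
    violations.push(`Backend coverage below ${COVERAGE_THRESHOLD_PERCENT}%`);
  }
  if (input.frontendCoveragePercent < COVERAGE_THRESHOLD_PERCENT) {
    violations.push(`Frontend coverage below ${COVERAGE_THRESHOLD_PERCENT}%`);
  }
  if (input.patchCoveragePercent < 100) violations.push("Patch coverage below 100%");
  if (!input.typeCheckPassed) violations.push("Type check failed");
  if (!input.lintPassed) violations.push("Lint failed");
  if (!input.securityScansPassed) violations.push("Security scans failed");
  if (input.ciWorkflowsPassed < input.ciWorkflowsTotal) {
    violations.push(`CI workflows green: ${input.ciWorkflowsPassed}/${input.ciWorkflowsTotal}`);
  }
  return violations;
}
```

Collecting all violations in one pass avoids the fix-one-rerun-find-the-next cycle during the final re-baseline.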
Freeze Criteria
- No test-definition changes after baseline without QA approval
- New failures automatically routed to ledger process (Phase 6 template)
Success Criteria Summary
✅ All checkboxes above must be checked before PR approval
Numbers:
- E2E: 1624/1624 passing (100%) ← was 1592/1620 (98.3%)
- Backend: ≥85% coverage, 100% patch
- Frontend: ≥85% coverage, 100% patch
- Security: 0 CRITICAL/HIGH vulnerabilities
- CI: 7/7 workflows passing
Quality Gates:
- No test skips, no failures, no compromises
- No security vulnerabilities introduced
- No coverage regressions
- No type errors
- All linters passing
Ready to Merge:
- PR approved by 2+ reviewers
- All conversations resolved
- Branch up-to-date with main
- Squash commits with descriptive message
- Merge to main → Trigger release pipeline
Quick Reference: Test Commands by Category
Security Tests
# All security tests (Phase 1 validation)
npx playwright test tests/security/ tests/security-enforcement/ --project=chromium
# ACL enforcement only (Task 1.1 + 1.2)
npx playwright test tests/security-enforcement/acl-enforcement.spec.ts --project=chromium
# Broken imports only (Task 1.4)
npx playwright test tests/security-enforcement/zzz-caddy-imports/ --project=chromium
E2E Tests by Priority
# High-Impact (Phase 2 - 17 tests)
npx playwright test tests/core/settings-user-lifecycle.spec.ts --project=chromium
npx playwright test tests/core/multi-component-workflows.spec.ts --project=chromium
npx playwright test tests/core/data-consistency.spec.ts --project=chromium
# Medium-Impact (Phase 3 - 6 tests)
npx playwright test tests/settings/user-management.spec.ts --project=chromium --grep "copy invite link|remove permitted hosts"
npx playwright test tests/modal-dropdown-triage.spec.ts --project=chromium
npx playwright test tests/core/certificates.spec.ts --project=chromium --grep "Display certificate"
# Low-Impact (Phase 4 - 5 tests)
npx playwright test tests/core/authentication.spec.ts --project=chromium --grep "Redirect with error|Force login"
npx playwright test tests/core/admin-onboarding.spec.ts --project=chromium --grep "Setup Logout|First login after logout"
npx playwright test tests/core/navigation.spec.ts --project=chromium --grep "toggle mobile menu"
Debug Commands
# Headed mode (watch test in browser)
npx playwright test [test-file] --project=chromium --headed
# Debug mode (step through with inspector)
npx playwright test [test-file] --project=chromium --debug
# Generate trace (for later analysis)
npx playwright test [test-file] --project=chromium --trace on
# View trace file
npx playwright show-trace trace.zip
Full Validation (Phase 5)
# E2E all browsers
npx playwright test --project=firefox --project=chromium --project=webkit
# Backend coverage
scripts/go-test-coverage.sh
# Frontend coverage
scripts/frontend-test-coverage.sh
# Type check
npm run type-check
# Pre-commit
pre-commit run --all-files
# Security scans
.github/skills/scripts/skill-runner.sh security-scan-docker-image
.github/skills/scripts/skill-runner.sh security-scan-codeql
Delegation Matrix
| Phase | Task | Owner | Est. Time | Status | Dependencies |
|---|---|---|---|---|---|
| 1.1 | ACL Security Status API | Backend Dev | 2h | 🔴 Not Started | None |
| 1.2 | ACL Access Lists API | Backend Dev | 2h | 🔴 Not Started | None |
| 1.3 | ACL Test IP API (Optional) | Backend Dev | 1h | 🔴 Not Started | Task 1.2 |
| 1.4 | Fix Broken Import Paths | Frontend Dev | 0.5h | 🔴 Not Started | None |
| 2.1 | User Lifecycle Tests | Playwright Dev | 3h | 🔴 Not Started | Phase 1 Complete |
| 2.2 | Multi-Component Workflows | Playwright Dev | 2h | 🔴 Not Started | Phase 1 Complete |
| 2.3 | Data Consistency Tests | Playwright Dev | 2h | 🔴 Not Started | Phase 1 Complete |
| 3.1 | User Management Tests | Playwright Dev | 1h | 🔴 Not Started | Phase 2 Complete |
| 3.2 | Modal Dropdown Tests | Frontend Dev | 1h | 🔴 Not Started | Phase 2 Complete |
| 3.3 | Certificate Display Tests | Playwright Dev | 1h | 🔴 Not Started | Phase 2 Complete |
| 4.1 | Authentication Edge Cases | Playwright Dev | 1h | 🔴 Not Started | Phase 3 Complete |
| 4.2 | Admin Onboarding Tests | Playwright Dev | 1h | 🔴 Not Started | Phase 3 Complete |
| 4.3 | Navigation Mobile Test | Playwright Dev | 0.5h | 🔴 Not Started | Phase 3 Complete |
| 5.0 | Final Validation & CI | QA Lead | 2-3h | 🔴 Not Started | Phases 1-4 Complete |
| 6.0 | Fail & Skip Census | QA Lead + Playwright Dev | 2-4h | 🟡 In Progress | Full rerun evidence |
| 7.0 | Failure Cluster Remediation | Playwright/Frontend/Backend | 8-16h | 🔴 Not Started | Phase 6 Complete |
| 8.0 | Skip Debt Burn-down | QA Lead + Playwright Dev | 4-8h | 🔴 Not Started | Phase 7 Complete |
| 9.0 | Final Re-baseline Freeze | QA Lead | 2-4h | 🔴 Not Started | Phase 8 Complete |
Total Estimated Time: 37-68 hours
Critical Path: Phase 1 → Phase 2 → Phase 3 → Phase 4 → Phase 5 → Phase 6 → Phase 7 → Phase 8 → Phase 9
Team Resource Allocation
Backend Dev (5.5 hours):
- Task 1.1: ACL Security Status API (2h)
- Task 1.2: ACL Access Lists API (2h)
- Task 1.3: ACL Test IP API (1h - optional)
- Task 1.4: Code review for frontend import fixes (0.5h)
Frontend Dev (1.5 hours):
- Task 1.4: Fix Broken Import Paths (0.5h)
- Task 3.2: Modal Dropdown Tests (1h)
Playwright Dev (11.5 hours):
- Task 2.1: User Lifecycle Tests (3h)
- Task 2.2: Multi-Component Workflows (2h)
- Task 2.3: Data Consistency Tests (2h)
- Task 3.1: User Management Tests (1h)
- Task 3.3: Certificate Display Tests (1h)
- Task 4.1: Authentication Edge Cases (1h)
- Task 4.2: Admin Onboarding Tests (1h)
- Task 4.3: Navigation Mobile Test (0.5h)
QA Lead (3 hours):
- Phase 5: Final Validation & CI (2-3h)
- Cross-browser testing validation (included above)
- CI workflow monitoring (included above)
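The per-owner totals can be recomputed from the task list whenever an estimate changes. The sketch below sums the fixed per-task estimates for Phases 1-4 as listed above; the QA Lead's Phase 5 work is given as a range and is left out. The PlannedTask shape and hoursByOwner helper are illustrative names.

```typescript
interface PlannedTask {
  id: string;
  owner: string;
  hours: number;
}

// Per-task estimates copied from the allocation list above.
const PHASE_1_TO_4_TASKS: PlannedTask[] = [
  { id: "1.1", owner: "Backend Dev", hours: 2 },
  { id: "1.2", owner: "Backend Dev", hours: 2 },
  { id: "1.3", owner: "Backend Dev", hours: 1 },
  { id: "1.4 (code review)", owner: "Backend Dev", hours: 0.5 },
  { id: "1.4", owner: "Frontend Dev", hours: 0.5 },
  { id: "3.2", owner: "Frontend Dev", hours: 1 },
  { id: "2.1", owner: "Playwright Dev", hours: 3 },
  { id: "2.2", owner: "Playwright Dev", hours: 2 },
  { id: "2.3", owner: "Playwright Dev", hours: 2 },
  { id: "3.1", owner: "Playwright Dev", hours: 1 },
  { id: "3.3", owner: "Playwright Dev", hours: 1 },
  { id: "4.1", owner: "Playwright Dev", hours: 1 },
  { id: "4.2", owner: "Playwright Dev", hours: 1 },
  { id: "4.3", owner: "Playwright Dev", hours: 0.5 },
];

// Sums estimated hours per owner.
function hoursByOwner(tasks: PlannedTask[]): Record<string, number> {
  const totals: Record<string, number> = {};
  for (const task of tasks) {
    totals[task.owner] = (totals[task.owner] || 0) + task.hours;
  }
  return totals;
}
```

Keeping the totals derived from the task list, rather than typed into the headings, avoids drift when a single task estimate is revised.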
Parallel Execution Strategy
Day 1-2: Phase 1 (Security Fixes)
- Backend Dev: Tasks 1.1 + 1.2 + 1.3 (parallel)
- Frontend Dev: Task 1.4 (parallel with backend)
- Blocker: Must complete before Phase 2 starts
Day 2-3: Phase 2 (High-Impact E2E)
- Playwright Dev: Tasks 2.1 + 2.2 + 2.3 (serial recommended for pattern identification)
- Blocker: Must complete before Phase 3 starts
Day 3-4: Phase 3 (Medium-Impact E2E)
- Playwright Dev: Task 3.1 + 3.3 (parallel)
- Frontend Dev: Task 3.2 (parallel)
- Blocker: Must complete before Phase 4 starts
Day 4: Phase 4 (Low-Impact E2E)
- Playwright Dev: Tasks 4.1 + 4.2 + 4.3 (serial or parallel)
Day 4-5: Phase 5 (Final Validation)
- QA Lead: Full validation suite
- All Devs: Fix any regressions discovered
Risk Assessment & Mitigation
| Risk | Severity | Likelihood | Mitigation Strategy | Contingency Plan |
|---|---|---|---|---|
| Phase 1 API changes break existing frontend | HIGH | MEDIUM | Verify frontend ACL UI (22 tests) still passes after API implementation | Rollback API, implement with feature flag |
| Chromium timing fixes cause Firefox/WebKit failures | HIGH | LOW | Run full test suite after each fix; validate no regressions | Revert timing changes, use browser-specific waits |
| Phase 2 fixes take longer than estimated | MEDIUM | HIGH | Start with Task 2.1 (highest impact); identify common pattern early | Extend timeline by 1-2 days, deprioritize Phase 4 |
| CI fails after all local tests pass | MEDIUM | MEDIUM | Test in CI environment before final merge; use CI timeout multipliers | Debug in CI logs, add CI-specific waits |
| New test failures introduced during fixes | MEDIUM | MEDIUM | Run full suite after each phase; use git bisect to identify regression | Revert breaking commit, apply fix more surgically |
| Phase 5 validation discovers edge cases | LOW | MEDIUM | Thorough testing at each phase; don't skip intermediate validation | Add a follow-up phase for edge case fixes, extend timeline by 1 day |
| Team capacity insufficient for timeline | MEDIUM | LOW | Parallelize tasks where possible; prioritize critical path | Deprioritize Phase 4 (low-impact), focus on Phases 1-3 first |
Success Metrics & KPIs
Before Remediation (Baseline)
- E2E Pass Rate: 98.3% (1592/1620)
- Security Pass Rate: 94.2% (65/69)
- Chromium Failures: 28
- Firefox Failures: 0
- WebKit Failures: 0
- CI Status: 🔴 BLOCKED
After Remediation (Target)
- E2E Pass Rate: 100% (1624/1624) ← +32 passing
- Security Pass Rate: 100% (69/69) ← +4 passing
- Chromium Failures: 0 ← -28 failures
- Firefox Failures: 0 ← maintained
- WebKit Failures: 0 ← maintained
- CI Status: ✅ PASSING
Improvement Metrics
- Failure Reduction: 36 → 0 (100% reduction)
- Pass Rate Improvement: +1.7 percentage points (98.3% → 100%)
- Tests Fixed: 36 tests
- New Backend APIs: 2 endpoints
- Code Quality: 100% patch coverage maintained
Communication & Reporting
Daily Standup Updates (Required)
Format:
**CI Remediation Status - [Date]**
- Current Phase: [X]
- Tasks Completed Today: [List]
- Tests Fixed: [X/36]
- Blockers: [None / List]
- Next 24h Plan: [Tasks]
- ETA to Phase 5: [X days]
Phase Completion Reports (Required)
Format:
**Phase [X] Complete - [Date]**
✅ Tasks Completed: [List with times]
✅ Tests Fixed: [X]
✅ Pass Rate: [%]
⚠️ Issues Encountered: [None / List with resolutions]
📊 Time Actual vs Estimated: [Xh vs Yh]
➡️ Next Phase: [Name - Starting [Date]]
Final Report (Required at Phase 5)
Format:
**CI Remediation Complete - [Date]**
✅ All 36 failures resolved
✅ 100% E2E pass rate achieved
✅ CI unblocked - ready to release
📊 Total Time: [Xh] (Est: 21-31h)
📊 Tests Fixed Breakdown:
- Security: 8
- High-Impact E2E: 17
- Medium-Impact E2E: 6
- Low-Impact E2E: 5
🎉 Ready for PR merge and release!
Appendix: Related Documentation
Source Documents
- Security Test Suite Remediation Plan - 8 security issues
- E2E Baseline Fresh Run - 28 Chromium failures
Testing Documentation
- Testing Instructions - Test execution protocols
- Playwright TypeScript Instructions - Test writing guidelines
Architecture Documentation
- Architecture - System architecture overview
- Contributing - Development guidelines
Test Files Referenced
- tests/security-enforcement/acl-enforcement.spec.ts - 4 API failures
- tests/security-enforcement/zzz-caddy-imports/*.spec.ts - 4 broken imports
- tests/core/settings-user-lifecycle.spec.ts - 7 Chromium failures
- tests/core/multi-component-workflows.spec.ts - 5 Chromium failures
- tests/core/data-consistency.spec.ts - 5 Chromium failures
- tests/settings/user-management.spec.ts - 2 Chromium failures
- tests/modal-dropdown-triage.spec.ts - 2 Chromium failures
- tests/core/certificates.spec.ts - 2 Chromium failures
- tests/core/authentication.spec.ts - 2 Chromium failures
- tests/core/admin-onboarding.spec.ts - 2 Chromium failures
- tests/core/navigation.spec.ts - 1 Chromium failure
Version History
| Version | Date | Changes | Author |
|---|---|---|---|
| 1.0 | 2026-02-12 | Initial plan creation | GitHub Copilot (Planning Agent) |
| 1.1 | 2026-02-13 | Added Phases 6-9 for fail/skip research, remediation tracking, skip debt burn-down, and final gate freeze; refreshed latest rerun metrics | GitHub Copilot (Management) |
End of Master Plan