# E2E Test Failure Remediation Plan - Emergency Token ACL Bypass **Status**: ACTIVE - READY FOR IMPLEMENTATION **Priority**: ๐Ÿ”ด CRITICAL - CI Blocking **Created**: 2026-01-28 **CI Run**: [#53 (e892669)](https://github.com/Wikid82/Charon/actions/runs/21456401944) **Branch**: feature/beta-release **PR**: #574 (Merge pull request from renovate/feature/beta-release-we...) --- ## Executive Summary | Metric | Value | |--------|-------| | **Total Tests** | 362 (across 4 shards) | | **Passed** | 139 per shard (~556 total runs) | | **Failed** | 1 unique test (runs in each shard) | | **Skipped** | 22 per shard | | **Total Duration** | 8m 55s | ### Failure Summary | Test File | Test Name | Error | Category | |-----------|-----------|-------|----------| | `emergency-token.spec.ts:44` | Test 1: Emergency token bypasses ACL | Expected 403, Received 200 | ๐Ÿ”ด CRITICAL | ### Root Cause Identified **The test fails because the `beforeAll` hook enables ACL but does NOT re-enable Cerberus (the security master switch).** The emergency security reset (run in `global-setup.ts`) disables ALL security modules including `feature.cerberus.enabled`. When the test's `beforeAll` only enables `security.acl.enabled = true`, the Cerberus middleware short-circuits at line 162-165 because `IsEnabled()` returns `false`. ```go // cerberus.go:162-165 if !c.IsEnabled() { ctx.Next() // โ† Request passes through without ACL check return } ``` **Impact**: - ๐Ÿ”ด **CI pipeline blocked** - Cannot merge to feature/beta-release - ๐Ÿ”ด **1 E2E test fails** - emergency-token.spec.ts Test 1 - ๐ŸŸข **No production code issue** - Cerberus ACL logic is correct - ๐ŸŸข **Test issue only** - beforeAll hook missing Cerberus enable step **The Fix**: Update the test's `beforeAll` hook to enable `feature.cerberus.enabled = true` BEFORE enabling `security.acl.enabled = true`. **Complexity**: LOW - Single test file fix, ~15 minutes --- ## Phase 1: Critical Fix (๐Ÿ”ด BLOCKING) ### Failure Analysis #### Test: `emergency-token.spec.ts:44:3` - "Test 1: Emergency token bypasses ACL" **File:** [tests/security-enforcement/emergency-token.spec.ts](../../../tests/security-enforcement/emergency-token.spec.ts#L44) **Error Message:** ``` Error: expect(received).toBe(expected) // Object.is equality Expected: 403 Received: 200 52 | const blockedResponse = await unauthenticatedRequest.get('/api/v1/security/status'); 53 | await unauthenticatedRequest.dispose(); 54 | expect(blockedResponse.status()).toBe(403); ``` **Expected Behavior:** - When ACL is enabled, unauthenticated requests to `/api/v1/security/status` should return 403 **Actual Behavior:** - Request returns 200 (ACL check is bypassed) **Root Cause Chain:** 1. `global-setup.ts` calls `/api/v1/emergency/security-reset` 2. Emergency reset disables: `feature.cerberus.enabled`, `security.acl.enabled`, `security.waf.enabled`, `security.rate_limit.enabled`, `security.crowdsec.enabled` 3. Test's `beforeAll` only enables: `security.acl.enabled = true` 4. Cerberus middleware checks `IsEnabled()` which reads `feature.cerberus.enabled` (still false) 5. Cerberus returns early without checking ACL โ†’ request passes through **Issue Type:** Test Issue (incomplete setup) --- ## EARS Requirements ### REQ-001: Cerberus Master Switch Precondition **Priority:** ๐Ÿ”ด CRITICAL **EARS Notation:** > WHEN security test suite `beforeAll` hook enables any individual security module (ACL, WAF, rate limiting), > THE SYSTEM SHALL first ensure `feature.cerberus.enabled` is set to `true` before enabling the specific module. **Acceptance Criteria:** - [ ] Test's `beforeAll` enables `feature.cerberus.enabled = true` BEFORE `security.acl.enabled` - [ ] Wait for security propagation between setting changes - [ ] Verify Cerberus is active by checking `/api/v1/security/status` response --- ### REQ-002: Security Module Dependency Validation **Priority:** ๐ŸŸก MEDIUM **EARS Notation:** > WHILE the Cerberus master switch (`feature.cerberus.enabled`) is disabled, > THE SYSTEM SHALL ignore individual security module settings (ACL, WAF, rate limiting) and allow all requests through. **Documentation Note:** This is DOCUMENTED behavior, but tests must respect this precondition. --- ### REQ-003: ACL Blocking Verification **Priority:** ๐Ÿ”ด CRITICAL **EARS Notation:** > WHEN ACL is enabled AND Cerberus is enabled AND there are no active access lists AND the request is NOT from an authenticated admin, > THE SYSTEM SHALL return HTTP 403 with error message containing "Blocked by access control". **Verification:** ```go // cerberus.go:233-238 if activeCount == 0 { if isAdmin && !adminWhitelistConfigured { ctx.Next() return } ctx.AbortWithStatusJSON(http.StatusForbidden, gin.H{"error": "Blocked by access control list"}) return } ``` --- ## Implementation Plan ### Task 1: Fix Test `beforeAll` Hook (๐Ÿ”ด CRITICAL) **File:** [tests/security-enforcement/emergency-token.spec.ts](../../../tests/security-enforcement/emergency-token.spec.ts#L18-L40) **Current Code (Lines 18-40):** ```typescript test.beforeAll(async ({ request }) => { console.log('๐Ÿ”ง Setting up test suite: Ensuring ACL is enabled...'); const emergencyToken = process.env.CHARON_EMERGENCY_TOKEN; if (!emergencyToken) { throw new Error('CHARON_EMERGENCY_TOKEN not set - cannot configure test environment'); } // Use emergency token to enable ACL (bypasses any existing security) const enableResponse = await request.patch('/api/v1/settings', { data: { key: 'security.acl.enabled', value: 'true' }, headers: { 'X-Emergency-Token': emergencyToken, }, }); if (!enableResponse.ok()) { throw new Error(`Failed to enable ACL for test suite: ${enableResponse.status()}`); } // Wait for security propagation await new Promise(resolve => setTimeout(resolve, 2000)); console.log('โœ… ACL enabled for test suite'); }); ``` **Fixed Code:** ```typescript test.beforeAll(async ({ request }) => { console.log('๐Ÿ”ง Setting up test suite: Ensuring Cerberus and ACL are enabled...'); const emergencyToken = process.env.CHARON_EMERGENCY_TOKEN; if (!emergencyToken) { throw new Error('CHARON_EMERGENCY_TOKEN not set - cannot configure test environment'); } // CRITICAL: Must enable Cerberus master switch FIRST // The emergency reset disables feature.cerberus.enabled, which makes // all other security modules ineffective (IsEnabled() returns false). const cerberusResponse = await request.patch('/api/v1/settings', { data: { key: 'feature.cerberus.enabled', value: 'true' }, headers: { 'X-Emergency-Token': emergencyToken }, }); if (!cerberusResponse.ok()) { throw new Error(`Failed to enable Cerberus: ${cerberusResponse.status()}`); } console.log(' โœ“ Cerberus master switch enabled'); // Wait for Cerberus to activate await new Promise(resolve => setTimeout(resolve, 1000)); // Now enable ACL (will be effective since Cerberus is active) const aclResponse = await request.patch('/api/v1/settings', { data: { key: 'security.acl.enabled', value: 'true' }, headers: { 'X-Emergency-Token': emergencyToken }, }); if (!aclResponse.ok()) { throw new Error(`Failed to enable ACL: ${aclResponse.status()}`); } console.log(' โœ“ ACL enabled'); // Wait for security propagation (settings cache TTL is 60s, but changes are immediate) await new Promise(resolve => setTimeout(resolve, 2000)); // VALIDATION: Verify security is actually blocking before proceeding console.log(' ๐Ÿ” Verifying ACL is active...'); const statusResponse = await request.get('/api/v1/security/status'); if (statusResponse.ok()) { const status = await statusResponse.json(); if (!status.acl?.enabled) { throw new Error('ACL verification failed: ACL not reported as enabled in status'); } console.log(' โœ“ ACL verified as enabled'); } console.log('โœ… Cerberus and ACL enabled for test suite'); }); ``` **Estimated Time:** 15 minutes --- ### Task 2: Add `afterAll` Cleanup Hook **File:** [tests/security-enforcement/emergency-token.spec.ts](../../../tests/security-enforcement/emergency-token.spec.ts) **New Code (add after `beforeAll`):** ```typescript test.afterAll(async ({ request }) => { console.log('๐Ÿงน Cleaning up test suite: Resetting security state...'); const emergencyToken = process.env.CHARON_EMERGENCY_TOKEN; if (!emergencyToken) { console.warn('โš ๏ธ No emergency token for cleanup'); return; } // Reset to safe state for other tests const response = await request.post('/api/v1/emergency/security-reset', { headers: { 'X-Emergency-Token': emergencyToken }, }); if (response.ok()) { console.log('โœ… Security state reset for next test suite'); } else { console.warn(`โš ๏ธ Security reset returned: ${response.status()}`); } }); ``` **Estimated Time:** 5 minutes --- ## Dependency Diagram ``` โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ” โ”‚ Global Setup โ”‚ โ”‚ (global-setup.ts โ†’ emergency security reset โ†’ ALL modules OFF) โ”‚ โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜ โ”‚ โ–ผ โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ” โ”‚ Auth Setup โ”‚ โ”‚ (auth.setup.ts โ†’ authenticates test user) โ”‚ โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜ โ”‚ โ–ผ โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ” โ”‚ Test Suite: beforeAll โ”‚ โ”‚ โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ” โ”‚ โ”‚ โ”‚ STEP 1: enable feature.cerberus.enabled = true โ”‚ โ”‚ โ”‚ โ”‚ (Cerberus master switch - REQUIRED FIRST!) โ”‚ โ”‚ โ”‚ โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜ โ”‚ โ”‚ โ”‚ โ”‚ โ”‚ โ–ผ โ”‚ โ”‚ โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ” โ”‚ โ”‚ โ”‚ STEP 2: enable security.acl.enabled = true โ”‚ โ”‚ โ”‚ โ”‚ (Now effective because Cerberus is ON) โ”‚ โ”‚ โ”‚ โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜ โ”‚ โ”‚ โ”‚ โ”‚ โ”‚ โ–ผ โ”‚ โ”‚ โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ” โ”‚ โ”‚ โ”‚ STEP 3: Verify /api/v1/security/status shows ACL enabled โ”‚ โ”‚ โ”‚ โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜ โ”‚ โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜ โ”‚ โ–ผ โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ” โ”‚ Test 1 Execution โ”‚ โ”‚ โ€ข Unauthenticated request โ†’ should get 403 โ”‚ โ”‚ โ€ข Emergency token request โ†’ should get 200 โ”‚ โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜ โ”‚ โ–ผ โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ” โ”‚ Test Suite: afterAll โ”‚ โ”‚ โ€ข Call emergency security reset โ”‚ โ”‚ โ€ข Restore clean state for next suite โ”‚ โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜ ``` --- ## Success Criteria ### Phase 1 Complete When: - [ ] `emergency-token.spec.ts` passes locally with `npx playwright test tests/security-enforcement/emergency-token.spec.ts` - [ ] All 4 CI shards pass the emergency token test - [ ] No regressions in other security enforcement tests - [ ] Test properly cleans up in `afterAll` ### Verification Commands: ```bash # Local verification npx playwright test tests/security-enforcement/emergency-token.spec.ts --project=chromium # Full security test suite npx playwright test tests/security-enforcement/ --project=chromium # View report after run npx playwright show-report ``` --- ## Estimated Total Remediation Time | Task | Time | |------|------| | Task 1: Fix beforeAll hook | 15 min | | Task 2: Add afterAll cleanup | 5 min | | Local testing & verification | 15 min | | **Total** | **35 min** | --- ## Related Files | File | Purpose | |------|---------| | [tests/security-enforcement/emergency-token.spec.ts](../../../tests/security-enforcement/emergency-token.spec.ts) | Failing test (FIX HERE) | | [tests/global-setup.ts](../../../tests/global-setup.ts) | Global setup with emergency reset | | [tests/fixtures/security.ts](../../../tests/fixtures/security.ts) | Security test helpers | | [backend/internal/cerberus/cerberus.go](../../../backend/internal/cerberus/cerberus.go) | Cerberus middleware | --- ## Appendix: Other Observations ### Skipped Test Analysis **File:** `emergency-reset.spec.ts:69` - "should rate limit after 5 attempts" This test is marked `.skip` in the source. The skip is intentional and documented: ```typescript // Rate limiting is covered in emergency-token.spec.ts (Test 2) test.skip('should rate limit after 5 attempts', ...) ``` **No action required** - this is expected behavior. ### CI Workflow Observations 1. **4-shard parallel execution** - Each shard runs the same failing test independently 2. **139 passing tests per shard** - Good overall health 3. **22 skipped tests** - Expected (tagged tests, conditional skips) 4. **Merge Test Reports failed** - Cascading failure from E2E test failure --- ## Remediation Status | Phase | Status | Assignee | ETA | |-------|--------|----------|-----| | Phase 1: Critical Fix | ๐ŸŸก Ready for Implementation | - | ~35 min | --- *Generated by Planning Agent on 2026-01-28*