Charon/docs/plans/archive/e2e-test-triage-plan.md
2026-02-19 16:34:10 +00:00


E2E Test Triage Plan

Cross-Browser Playwright Test Suite Analysis

Generated: February 3, 2026
Test Run Context: Post-Docker service updates
Environment: E2E container rebuilt with latest code
Browsers: Chromium, Firefox, WebKit


Executive Summary

This document is a triage plan for Playwright E2E tests that fail or are skipped at runtime without being explicitly marked for skipping. The test suite contains 2,737 tests in total, with mixed results that require systematic investigation.

CRITICAL FINDINGS (2026-02-03):

  • Root Cause Identified: Emergency reset (/emergency/security-reset) disables Cerberus framework
  • Design Intent: Cerberus ON for testing, modules OFF to avoid ACL blocking
  • Current Bug: Emergency reset disables feature.cerberus.enabled instead of just modules
  • Impact: Toggle buttons become disabled, 13 tests skip conditionally
  • Solution: Modify emergency reset to disable MODULES but keep feature.cerberus.enabled = true
  • Files to Modify: backend/internal/api/handlers/emergency_handler.go
  • Test Order: Global setup → Cerberus tests → Break glass test (LAST)

Key Findings:

  • Multiple conditionally-skipped tests (runtime decisions based on feature state)
  • Explicitly skipped tests (marked with test.skip() or test.describe.skip()) that should NOT be triaged
  • Tests dependent on Cerberus security being enabled
  • Tests dependent on CrowdSec running and configured

Testing Infrastructure:

  • ✓ E2E container running and healthy
  • ✓ Emergency server responding (port 2020)
  • ✓ Application server responding (port 8080)
  • ✗ CrowdSec NOT running (expected - integration tests only)
  • ⚠️ Cerberus state unknown (emergency server has no settings endpoint)

Triage Categories

Category 1: Conditional Skips (Runtime Environment Dependent)

Priority: HIGH
Root Cause: Emergency reset disables Cerberus framework, not just modules
Impact: Toggle buttons become disabled, tests skip at runtime
Status: SOLVED with Universal Admin Whitelist Bypass

Design Intent (Confirmed):

  • Cerberus should be ENABLED to test break glass feature
  • Security modules should be ENABLED for realistic testing
  • Tests should bypass security using admin whitelist (0.0.0.0/0)
  • Break glass test runs, then recovery test restores with bypass

Solution Implemented:

  1. Break Glass Test (emergency-reset.spec.ts) - Tests emergency reset, disables Cerberus
  2. Break Glass Recovery (zzzz-break-glass-recovery.spec.ts) - NEW TEST that:
    • Sets admin_whitelist = "0.0.0.0/0" (universal bypass for ANY IP)
    • Re-enables feature.cerberus.enabled = true
    • Enables ALL security modules (ACL, WAF, Rate Limit, CrowdSec)
    • Verifies full security stack is ON but bypassed
  3. Security Teardown (security-teardown.setup.ts) - Verifies state (no longer modifies)
  4. Browser Tests - Run with full security enabled, bypassed via whitelist
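
The verification-only teardown in step 3 can be sketched as a pure check over the expected state. This is a minimal sketch, not the project's implementation: the `SecurityState` shape and its field names are assumptions made for illustration, since this plan only names `feature.cerberus.enabled`, `admin_whitelist`, and the four modules.

```typescript
// Hypothetical shape: field names are assumptions, not the real API response.
interface SecurityState {
  cerberusEnabled: boolean;
  adminWhitelist: string;
  modules: { acl: boolean; waf: boolean; rateLimit: boolean; crowdsec: boolean };
}

// Returns warnings only and never modifies anything. An empty list means the
// state matches "Cerberus ON, modules ON, whitelist = 0.0.0.0/0".
function verifySecurityState(state: SecurityState): string[] {
  const warnings: string[] = [];
  if (!state.cerberusEnabled) {
    warnings.push('Cerberus framework is disabled');
  }
  if (state.adminWhitelist !== '0.0.0.0/0') {
    warnings.push(`admin_whitelist is "${state.adminWhitelist}", expected "0.0.0.0/0"`);
  }
  const names = Object.keys(state.modules) as Array<keyof SecurityState['modules']>;
  names.forEach((name) => {
    if (!state.modules[name]) {
      warnings.push(`Security module "${name}" is disabled`);
    }
  });
  return warnings;
}
```

A teardown step could log each returned warning and carry on, which matches the "verifies state (no longer modifies)" behavior described above.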

Why 0.0.0.0/0 works well:

  • Bypasses security for ANY IP address (CI-friendly, environment-agnostic)
  • Tests the admin whitelist bypass feature itself
  • More realistic testing (full security stack actually enabled)
  • Simpler state management than selective module disabling
  • Works in Docker, localhost, CI, anywhere
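
To make the "any IP" claim concrete, here is a small sketch of IPv4 CIDR matching. It is not Charon's whitelist code. It only illustrates that a `/0` prefix leaves no bits to compare, so every IPv4 address matches. The function names are hypothetical.

```typescript
// Convert dotted-quad IPv4 text to an unsigned 32-bit number.
function ipv4ToInt(ip: string): number {
  const parts = ip.split('.');
  const valid =
    parts.length === 4 && parts.every((p) => /^\d{1,3}$/.test(p) && Number(p) <= 255);
  if (!valid) throw new Error(`Invalid IPv4 address: ${ip}`);
  return parts.reduce((acc, p) => acc * 256 + Number(p), 0);
}

// True when `ip` falls inside the IPv4 block described by `cidr`.
function ipv4InCidr(ip: string, cidr: string): boolean {
  const slash = cidr.indexOf('/');
  if (slash === -1) throw new Error(`Invalid CIDR: ${cidr}`);
  const base = cidr.slice(0, slash);
  const prefixText = cidr.slice(slash + 1);
  if (!/^\d{1,2}$/.test(prefixText) || Number(prefixText) > 32) {
    throw new Error(`Invalid prefix length in: ${cidr}`);
  }
  const prefix = Number(prefixText);
  // JavaScript shift counts wrap modulo 32, so the /0 mask is special-cased.
  const mask = prefix === 0 ? 0 : (0xffffffff << (32 - prefix)) >>> 0;
  return ((ipv4ToInt(ip) & mask) >>> 0) === ((ipv4ToInt(base) & mask) >>> 0);
}
```

With a zero mask both sides of the comparison are 0, which is why `0.0.0.0/0` acts as a universal bypass for IPv4 clients regardless of where the tests run.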

Files Modified:

  • tests/security-enforcement/zzzz-break-glass-recovery.spec.ts (NEW - recovery test)
  • tests/security-teardown.setup.ts (MODIFIED - now verification only)

Tests Affected:

  • Security Dashboard - Module Toggle Actions (Tests 77-80, 214)

    • ACL toggle (Test 77)
    • WAF toggle (Test 78)
    • Rate Limiting toggle (Test 79)
    • Persist state after reload (Test 80/214)
  • Security Dashboard - Navigation (Tests 81, 83-84)

    • Navigate to CrowdSec config (Test 81/250)
    • Navigate to WAF config (Test 83/309)
    • Navigate to Rate Limiting config (Test 84/335)
  • Rate Limiting Configuration (Test 57/70)

    • Toggle rate limiting on/off

Investigation Steps:

  1. Verify Test Environment Configuration

    # Check if Cerberus is enabled in test environment
    # (the emergency server has no settings endpoint, so query the main application API)
    curl http://localhost:8080/api/v1/security/config | jq '.'
    
  2. Review Emergency Server Reset Logic

    • File: tests/global-setup.ts
    • Check if security reset is disabling Cerberus completely
    • Current behavior: Disables all security modules BUT may be disabling Cerberus framework itself
  3. Determine Expected Behavior (CONFIRMED)

    • Cerberus SHOULD be enabled during E2E tests (to test break glass)
    • Security modules SHOULD be enabled for realistic testing
    • Tests toggle modules on/off as needed (interactive testing)
    • Universal admin whitelist (0.0.0.0/0) bypasses security for all IPs
  4. Solution Implemented: COMPLETE

    • Created Break Glass Recovery Test (tests/security-enforcement/zzzz-break-glass-recovery.spec.ts)
      • Step 1: Set admin_whitelist = "0.0.0.0/0" (universal bypass)
      • Step 2: Re-enable feature.cerberus.enabled = true
      • Step 3: Enable ALL security modules (ACL, WAF, Rate Limit, CrowdSec)
      • Step 4: Verify full security stack enabled with universal bypass
    • Modified Security Teardown (tests/security-teardown.setup.ts)
      • Now verification-only (no longer modifies configuration)
      • Checks Cerberus ON, modules ON, whitelist = 0.0.0.0/0
      • Logs warnings if state is incorrect
  5. Execution Order:

    1. Global setup → auth.setup.ts
    2. Security-tests project (sequential, workers: 1):
       - All enforcement tests (ACL, WAF, Rate Limit, etc.)
       - emergency-reset.spec.ts (break glass test)
       - zzz-admin-whitelist-blocking.spec.ts (tests blocking)
       - zzzz-break-glass-recovery.spec.ts (NEW - restores with bypass)
    3. Security-teardown → verify state
    4. Browser tests (chromium/firefox/webkit) → Run with full security bypassed
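
The execution order above can be modeled as a chain of project dependencies. The sketch below is not the repository's `playwright.config.ts`. It uses a local type that mirrors the `name` and `dependencies` fields of a Playwright project entry and resolves the order those dependencies imply. The project name `setup` is an assumption. The other names follow the list above.

```typescript
// Local stand-in for the `name` and `dependencies` fields of a Playwright project.
interface ProjectSpec {
  name: string;
  dependencies: string[];
}

const projects: ProjectSpec[] = [
  { name: 'setup', dependencies: [] }, // hypothetical name for the auth.setup.ts project
  { name: 'security-tests', dependencies: ['setup'] },
  { name: 'security-teardown', dependencies: ['security-tests'] },
  { name: 'chromium', dependencies: ['security-teardown'] },
  { name: 'firefox', dependencies: ['security-teardown'] },
  { name: 'webkit', dependencies: ['security-teardown'] },
];

// Depth-first topological sort: a project is emitted only after its dependencies.
function resolveRunOrder(specs: ProjectSpec[]): string[] {
  const byName: Record<string, ProjectSpec> = {};
  specs.forEach((spec) => {
    byName[spec.name] = spec;
  });
  const order: string[] = [];
  const visiting: string[] = [];
  const visit = (name: string): void => {
    if (order.indexOf(name) !== -1) return;
    if (visiting.indexOf(name) !== -1) throw new Error(`Dependency cycle at ${name}`);
    const spec = byName[name];
    if (!spec) throw new Error(`Unknown project ${name}`);
    visiting.push(name);
    spec.dependencies.forEach((dep) => visit(dep));
    visiting.pop();
    order.push(name);
  };
  specs.forEach((spec) => visit(spec.name));
  return order;
}
```

Making the browser projects depend on the teardown project is one way to ensure they start only after the recovery test has restored the security stack.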
    

Category 2: CrowdSec Dependency Tests

Priority: MEDIUM
Root Cause: Tests require CrowdSec to be fully running and configured
Status: Explicitly skipped with test.describe.skip()

Tests Affected (Tests 42-53):

  • Banned IPs Data Operations (Tests 42-43)

    • Show active decisions
    • Display decision columns (IP, type, duration, reason)
  • Add Decision (Ban IP) (Tests 44-46)

    • Add ban button
    • Open ban modal
    • Validate IP address format
  • Remove Decision (Unban) (Tests 47-48)

    • Show unban action
    • Confirm before unbanning
  • Filtering and Search (Tests 49-50)

    • Search/filter input
    • Filter decisions by type
  • Refresh and Sync (Test 51)

    • Refresh button functionality
  • Navigation (Test 52)

    • Navigate back to CrowdSec config
  • Accessibility (Test 53)

    • Keyboard navigation

Investigation Steps:

  1. Determine CrowdSec Test Strategy

    • These tests are marked test.describe.skip() with comment "Requires CrowdSec Running"
    • Is CrowdSec intended to run in E2E environment?
    • Should these be integration tests instead?
  2. Review CrowdSec Architecture

    • File: backend/internal/security/crowdsec/
    • Check if CrowdSec can be mocked for E2E tests
    • Review CrowdSec initialization in Docker container
  3. Fix Options:

    • Option A: Keep skipped - move to integration tests
    • Option B: Enable CrowdSec in E2E environment with test data
    • Option C: Mock CrowdSec API responses for UI testing only
  4. Decision Criteria:

    • Keep Skipped If: CrowdSec requires external dependencies, takes long to start, or is resource-intensive
    • Enable If: CrowdSec can run in lightweight mode for E2E testing
    • Mock If: Only testing UI interactions, not actual CrowdSec functionality
  5. Files to Review:

    tests/security/crowdsec-decisions.spec.ts  # Skipped tests
    .docker/docker-entrypoint.sh               # CrowdSec startup
    backend/internal/security/crowdsec/        # Implementation
    docs/implementation/CROWDSEC_*.md          # Architecture docs
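
The decision criteria in step 4 can be written down as a small function so the choice is explicit and reviewable. This is only a sketch: the field names are hypothetical, and the precedence (mock first, then enable, otherwise keep skipped) is an assumption that the plan does not state.

```typescript
type CrowdSecStrategy = 'keep-skipped' | 'enable' | 'mock';

// Hypothetical assessment inputs, one per criterion listed in step 4.
interface CrowdSecAssessment {
  uiOnly: boolean; // tests only exercise UI interactions
  lightweightModeAvailable: boolean; // CrowdSec can run in a lightweight mode
  needsExternalDependencies: boolean;
  slowToStart: boolean;
  resourceIntensive: boolean;
}

function chooseCrowdSecStrategy(a: CrowdSecAssessment): CrowdSecStrategy {
  // Assumed precedence: mocking wins when only the UI is under test.
  if (a.uiOnly) return 'mock';
  const heavy = a.needsExternalDependencies || a.slowToStart || a.resourceIntensive;
  if (!heavy && a.lightweightModeAvailable) return 'enable';
  return 'keep-skipped';
}
```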
    

Category 3: Explicitly Skipped Tests (NO TRIAGE NEEDED)

Priority: N/A (Intentionally Skipped)
Action: Document skip reason, track in backlog

Tests in This Category:

  • Caddy Import - Session Restoration (Tests in caddy-import-gaps.spec.ts)

    • Test 4.1: Show pending session banner
    • Test 4.2: Restore review table with previous content
    • Reason: Known functionality gaps pending implementation
  • Emergency Server Tests (Tests in emergency-server.spec.ts)

    • Test 3: Emergency server bypasses main app security
    • Test 4: Emergency server security reset works
    • Reason: May be redundant with other emergency server tests

Recommendation:

  • Create GitHub issues for each explicitly skipped test
  • Link issues to implementation plans
  • Schedule for future sprint/milestone
  • No immediate triage needed

Triage Workflow

Phase 1: Data Collection (COMPLETE)

  • Run complete cross-browser test suite
  • Identify all failing and skipped tests
  • Categorize skips (explicit vs conditional)
  • Document test patterns and dependencies

Phase 2: Environment Analysis (NEXT STEPS)

Timeline: 1-2 hours

  1. Analyze Emergency Server Reset

    # Check current emergency reset behavior
    npm run test:e2e:setup -- --grep "emergency reset"
    
    # Locate the emergency reset logic in global setup
    grep -n "Emergency reset" tests/global-setup.ts
    
  2. Check Cerberus Configuration

    # Inspect test environment settings
    docker exec charon-e2e cat /config/settings.json | jq '.feature.cerberus'
    
    # Check security state via the main application API
    # (the emergency server exposes only /health and /emergency/security-reset)
    curl http://localhost:8080/api/v1/security/config
    
  3. Document Current State

    • What is enabled/disabled in test environment?
    • What SHOULD be enabled/disabled?
    • What are the gaps between current and desired state?

Phase 3: Fix Planning (AFTER ANALYSIS)

Timeline: 2-4 hours

For each category, create detailed fix plan with:

  • Root cause
  • Proposed solution
  • Implementation estimate
  • Testing approach
  • Rollback plan

Phase 4: Implementation (PER FIX)

Timeline: Varies by fix

  1. Implement fixes in priority order:

    • HIGH priority first (Category 1 - Conditional Skips)
    • MEDIUM priority second (Category 2 - CrowdSec)
    • Document skip reasons (Category 3 - Explicit Skips)
  2. Validation approach:

    # Test specific category
    npm run test:e2e -- tests/security/security-dashboard.spec.ts --project=chromium
    
    # Verify fix across all browsers
    npm run test:e2e:all -- tests/security/security-dashboard.spec.ts
    
    # Full regression test
    npm run test:e2e:all
    

Phase 5: Documentation (CONTINUOUS)

Timeline: Ongoing

  • Update test documentation with skip reasons
  • Add comments to conditionally-skipped tests explaining when they should run
  • Create decision log for each triage decision
  • Update CI/CD pipeline configuration if needed

Investigation Priorities

Immediate Actions (Hour 1)

  1. COMPLETED: Created diagnostic script at scripts/diagnose-test-env.sh

    ./scripts/diagnose-test-env.sh
    
  2. COMPLETED: Identified Root Cause

    • Emergency server API has LIMITED endpoints:
      • GET /health (no auth)
      • POST /emergency/security-reset (with auth + token)
    • NO /emergency/settings endpoint exists
    • Cannot query Cerberus state via emergency server
    • Must use main application API (http://localhost:8080/api/v1/security/config)
  3. KEY FINDING: Emergency Reset Disables Cerberus (CONFIRMED)

    • The /emergency/security-reset endpoint disables Cerberus framework itself
    • This causes toggle buttons/configure buttons to become disabled
    • Tests skip when toggle.isDisabled() returns true
    • Design Intent: Cerberus ON + Modules OFF (safe testing, toggles work)
    • Current Bug: Emergency reset disables Cerberus framework too
    • Test Flow: Global setup → All Cerberus tests → Break glass test (LAST)

Short-Term Actions (Hours 2-4)

  1. Decide on Cerberus enablement strategy for tests
  2. Implement fix for Category 1 (Conditional Skips)
  3. Run targeted test validation

Medium-Term Actions (This Week)

  1. Evaluate CrowdSec testing strategy (Category 2)
  2. Create GitHub issues for explicitly skipped tests (Category 3)
  3. Update test documentation
  4. Add CI/CD checks for skip patterns
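
A CI check for skip patterns could start from a simple source scan like the sketch below. It is a line-based heuristic, not a parser, and the classification rules are assumptions: a `test.skip(` call whose first argument is a string literal is treated as an explicit skip declaration, and any other `test.skip(` call as a conditional (runtime) skip.

```typescript
type SkipKind = 'explicit-describe' | 'explicit-test' | 'conditional';

interface SkipHit {
  line: number; // 1-based line number within the scanned source
  kind: SkipKind;
  text: string;
}

// Scan spec source text and classify each skip call by its shape.
function findSkips(source: string): SkipHit[] {
  const hits: SkipHit[] = [];
  source.split('\n').forEach((raw, index) => {
    const text = raw.trim();
    let kind: SkipKind | undefined;
    if (/\btest\.describe\.skip\(/.test(text)) {
      kind = 'explicit-describe';
    } else if (/\btest\.skip\(\s*['"`]/.test(text)) {
      kind = 'explicit-test';
    } else if (/\btest\.skip\(/.test(text)) {
      kind = 'conditional';
    }
    if (kind) hits.push({ line: index + 1, kind, text });
  });
  return hits;
}
```

A pipeline step could fail the build when the number of `conditional` hits grows, or when an explicit skip has no linked issue in a nearby comment.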

Success Criteria

Definition of Done for Triage:

  • All conditionally-skipped tests have clear run conditions documented
  • Tests run successfully when conditions are met
  • Tests skip gracefully with clear skip messages when conditions are not met
  • Decision documented for each explicitly-skipped test category
  • CI/CD pipeline updated to handle skip scenarios
  • Test coverage maintained or improved

Metrics to Track:

  • Before Triage: X tests skipped (conditional + explicit)
  • After Triage: Y tests skipped (explicit only) + Z tests passing
  • Target: Minimize conditional skips, maintain explicit skips with issues
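
The before/after metrics can be tallied from per-test outcomes. The sketch below assumes each test result has already been labeled with one of four hypothetical outcome values. How a run is mapped to those labels is left open.

```typescript
// Hypothetical outcome labels used only for this tally.
type Outcome = 'passed' | 'failed' | 'skipped-explicit' | 'skipped-conditional';

interface TriageMetrics {
  passed: number;
  failed: number;
  skippedExplicit: number;
  skippedConditional: number;
}

// Count outcomes for one test run.
function summarize(outcomes: Outcome[]): TriageMetrics {
  const metrics: TriageMetrics = { passed: 0, failed: 0, skippedExplicit: 0, skippedConditional: 0 };
  outcomes.forEach((outcome) => {
    if (outcome === 'passed') metrics.passed += 1;
    else if (outcome === 'failed') metrics.failed += 1;
    else if (outcome === 'skipped-explicit') metrics.skippedExplicit += 1;
    else metrics.skippedConditional += 1;
  });
  return metrics;
}

// Positive result means conditional skips went down between two runs.
function conditionalSkipsRemoved(before: TriageMetrics, after: TriageMetrics): number {
  return before.skippedConditional - after.skippedConditional;
}
```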

Risk Assessment

High Risk:

  • Enabling Cerberus in tests - May cause cascade of failures if not properly configured
  • Modifying emergency reset logic - Could break other tests or test isolation

Medium Risk:

  • Changing test environment variables - May affect multiple test suites
  • Enabling CrowdSec - Resource intensive, may slow test execution

Low Risk:

  • Adding explicit skip annotations - No functional impact
  • Creating GitHub issues - Tracking only

Rollback Plan

If implementation causes regression:

  1. Immediate Rollback:

    git checkout HEAD^ -- tests/global-setup.ts
    npm run test:e2e:all -- --project=chromium
    
  2. Emergency Reset to Known Good State:

    .github/skills/scripts/skill-runner.sh docker-rebuild-e2e
    git stash
    npm run test:e2e:all
    
  3. Document Failure:

    • Capture test output
    • Document what went wrong
    • Update triage plan with lessons learned

Next Steps

  1. Run Diagnostic Script (created above)
  2. Analyze Results - Fill in data collection gaps
  3. Make Decision - Cerberus enablement strategy
  4. Implement Fix - Start with Category 1
  5. Validate - Run targeted tests
  6. Iterate - Move to next category

Appendix A: Test Output Patterns

Pattern 1: Conditional Skip with Cerberus Check

const isDisabled = await toggle.isDisabled();
if (isDisabled) {
  test.info().annotations.push({
    type: 'skip-reason',
    description: 'Toggle is disabled because Cerberus security is not enabled',
  });
  test.skip();
  return;
}

Recommendation: Add feature flag check before test execution instead of during test.
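
One way to act on this recommendation is to derive the skip decision from the security configuration before the test body runs. The sketch below is illustrative only: the `SecurityConfig` shape and the `toSecurityConfig` helper mentioned in the comment are hypothetical, and the response format of `/api/v1/security/config` is not documented in this plan.

```typescript
// Hypothetical, simplified view of the security configuration.
interface SecurityConfig {
  cerberusEnabled: boolean;
}

// Returns a skip reason, or null when the test should run.
function cerberusSkipReason(config: SecurityConfig): string | null {
  return config.cerberusEnabled
    ? null
    : 'Cerberus security is not enabled, so module toggles are disabled';
}

// Possible use inside a Playwright spec (not executed here; assumes baseURL is
// configured and that a toSecurityConfig() mapping exists for the real response):
//
//   test.beforeEach(async ({ request }) => {
//     const response = await request.get('/api/v1/security/config');
//     const reason = cerberusSkipReason(toSecurityConfig(await response.json()));
//     test.skip(reason !== null, reason ?? '');
//   });
```

Checking in `beforeEach` keeps the skip reason in one place and avoids loading the page just to discover that the toggle is disabled.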

Pattern 2: Explicit Skip with Description

test.describe.skip('Banned IPs Data Operations (Requires CrowdSec Running)', () => {
  // Tests here
});

Recommendation: Keep as-is, create tracking issue.


Appendix B: Useful Commands

Test Execution

# Run specific test file
npm run test:e2e -- tests/security/security-dashboard.spec.ts

# Run with debug output
DEBUG=pw:api npm run test:e2e -- tests/security/security-dashboard.spec.ts

# Run in headed mode
npm run test:e2e:headed -- tests/security/security-dashboard.spec.ts

# Run specific test by name
npm run test:e2e -- -g "should toggle ACL"

Environment Inspection

# Check container logs
docker logs charon-e2e --tail 100

# Check settings
docker exec charon-e2e cat /config/settings.json | jq '.'

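# Check emergency server health (it has no settings endpoint)
curl http://localhost:2020/health

# Check security configuration via the main application API
curl http://localhost:8080/api/v1/security/config | jq '.'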

# Force security reset
curl -X POST http://localhost:2020/emergency/security-reset \
  -H "X-Emergency-Token: $(grep '^EMERGENCY_TOKEN=' .env | cut -d= -f2-)"

Test Reporting

# View HTML report
npx playwright show-report

# Generate custom report
npx playwright test --reporter=html,json

Change Log

Date | Author | Changes
2026-02-03 | GitHub Copilot | Initial triage plan created

References