GitHub Actions f4ef79def3 chore: repo cleanup by archiving plans / reports

2026-02-19 16:34:10 +00:00

E2E Test Triage Plan

Cross-Browser Playwright Test Suite Analysis

Generated: February 3, 2026
Test Run Context: Post-Docker service updates
Environment: E2E container rebuilt with latest code
Browsers: Chromium, Firefox, WebKit

Executive Summary

This document is a triage plan for failing and skipped Playwright E2E tests that are NOT explicitly marked as skipped. The suite contains 2,737 tests; results are mixed and need systematic investigation.

CRITICAL FINDINGS (2026-02-03):

Root Cause Identified: Emergency reset (/emergency/security-reset) disables Cerberus framework
Design Intent: Cerberus ON for testing, modules OFF to avoid ACL blocking
Current Bug: Emergency reset disables feature.cerberus.enabled instead of just modules
Impact: Toggle buttons become disabled, 13 tests skip conditionally
Initial Proposal: Modify emergency reset to disable MODULES but keep feature.cerberus.enabled = true (superseded by the admin whitelist bypass described under Category 1)
Files to Modify (initial proposal): backend/internal/api/handlers/emergency_handler.go
Test Order: Global setup → Cerberus tests → Break glass test (LAST)

Key Findings:

Multiple conditionally-skipped tests (runtime decisions based on feature state)
Explicitly skipped tests (marked with test.skip() or test.describe.skip()) that should NOT be triaged
Tests dependent on Cerberus security being enabled
Tests dependent on CrowdSec running and configured

Testing Infrastructure:

✓ E2E container running and healthy
✓ Emergency server responding (port 2020)
✓ Application server responding (port 8080)
✗ CrowdSec NOT running (expected - integration tests only)
⚠️ Cerberus state unknown (emergency server has no settings endpoint)

Triage Categories

Category 1: Conditional Skips (Runtime Environment Dependent)

Priority: HIGH
Root Cause: Emergency reset disables Cerberus framework, not just modules
Impact: Toggle buttons become disabled, tests skip at runtime
Status: ✅ SOLVED with Universal Admin Whitelist Bypass

Design Intent (Confirmed):

Cerberus should be ENABLED to test break glass feature
Security modules should be ENABLED for realistic testing
Tests should bypass security using admin whitelist (0.0.0.0/0)
Break glass test runs, then recovery test restores with bypass

Solution Implemented:

Break Glass Test (emergency-reset.spec.ts) - Tests emergency reset, disables Cerberus
Break Glass Recovery (zzzz-break-glass-recovery.spec.ts) - NEW TEST that:
- Sets admin_whitelist = "0.0.0.0/0" (universal bypass for ANY IP)
- Re-enables feature.cerberus.enabled = true
- Enables ALL security modules (ACL, WAF, Rate Limit, CrowdSec)
- Verifies full security stack is ON but bypassed
Security Teardown (security-teardown.setup.ts) - Verifies state (no longer modifies)
Browser Tests - Run with full security enabled, bypassed via whitelist

Why 0.0.0.0/0 works well here:

✅ Bypasses security for ANY IP address (CI-friendly, environment-agnostic)
✅ Tests the admin whitelist bypass feature itself
✅ More realistic testing (full security stack actually enabled)
✅ Simpler state management than selective module disabling
✅ Works in Docker, localhost, CI, anywhere
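
To make the "any IP" claim concrete, here is a minimal sketch of IPv4 CIDR matching. It is illustrative only and is not the project's whitelist implementation; the function names `ipv4ToInt` and `cidrContains` are hypothetical. A prefix length of 0 means no bits of the address are compared, so every IPv4 address matches.

```typescript
// Illustrative IPv4 CIDR matcher (not the project's implementation; IPv4 only).

/** Convert dotted-quad IPv4 text to an unsigned 32-bit integer. */
function ipv4ToInt(ip: string): number {
  const parts = ip.split(".");
  const valid =
    parts.length === 4 && parts.every((p) => /^\d{1,3}$/.test(p) && Number(p) <= 255);
  if (!valid) {
    throw new Error(`Invalid IPv4 address: ${ip}`);
  }
  const [a, b, c, d] = parts.map(Number);
  // Multiplication avoids the signed 32-bit overflow that `<<` would produce.
  return ((a * 256 + b) * 256 + c) * 256 + d;
}

/** Return true when `ip` falls inside `cidr` (for example "10.0.0.0/8"). */
function cidrContains(cidr: string, ip: string): boolean {
  const [base, prefixText] = cidr.split("/");
  const prefix = Number(prefixText);
  if (!Number.isInteger(prefix) || prefix < 0 || prefix > 32) {
    throw new Error(`Invalid prefix length in: ${cidr}`);
  }
  // Number of addresses covered by the prefix: 2^(32 - prefix).
  const blockSize = 2 ** (32 - prefix);
  // Two addresses share a block when their block indexes are equal.
  return Math.floor(ipv4ToInt(ip) / blockSize) === Math.floor(ipv4ToInt(base) / blockSize);
}
```

With a /0 prefix the block covers the whole IPv4 space, so the comparison is true for any valid input address.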

Files Modified:

tests/security-enforcement/zzzz-break-glass-recovery.spec.ts (NEW - recovery test)
tests/security-teardown.setup.ts (MODIFIED - now verification only)

Tests Affected:

Security Dashboard - Module Toggle Actions (Tests 77-80, 214)
- ACL toggle (Test 77)
- WAF toggle (Test 78)
- Rate Limiting toggle (Test 79)
- Persist state after reload (Test 80/214)
Security Dashboard - Navigation (Tests 81, 83-84)
- Navigate to CrowdSec config (Test 81/250)
- Navigate to WAF config (Test 83/309)
- Navigate to Rate Limiting config (Test 84/335)
Rate Limiting Configuration (Test 57/70)
- Toggle rate limiting on/off

Investigation Steps:

Verify Test Environment Configuration

# Check if Cerberus is enabled in test environment
# (the emergency server has no /emergency/settings endpoint; query the main application API instead)
curl http://localhost:8080/api/v1/security/config | jq '.'

Review Emergency Server Reset Logic
- File: tests/global-setup.ts
- Check if security reset is disabling Cerberus completely
- Current behavior: Disables all security modules BUT may be disabling Cerberus framework itself
Determine Expected Behavior ✅ CONFIRMED
- ✅ Cerberus SHOULD be enabled during E2E tests (to test break glass)
- ✅ Security modules SHOULD be enabled for realistic testing
- ✅ Tests toggle modules on/off as needed (interactive testing)
- ✅ Universal admin whitelist (0.0.0.0/0) bypasses security for all IPs
Solution Implemented: ✅ COMPLETE
- Created Break Glass Recovery Test (tests/security-enforcement/zzzz-break-glass-recovery.spec.ts)
  - Step 1: Set admin_whitelist = "0.0.0.0/0" (universal bypass)
  - Step 2: Re-enable feature.cerberus.enabled = true
  - Step 3: Enable ALL security modules (ACL, WAF, Rate Limit, CrowdSec)
  - Step 4: Verify full security stack enabled with universal bypass
- Modified Security Teardown (tests/security-teardown.setup.ts)
  - Now verification-only (no longer modifies configuration)
  - Checks Cerberus ON, modules ON, whitelist = 0.0.0.0/0
  - Logs warnings if state is incorrect

Execution Order:

1. Global setup → auth.setup.ts
2. Security-tests project (sequential, workers: 1):
   - All enforcement tests (ACL, WAF, Rate Limit, etc.)
   - emergency-reset.spec.ts (break glass test)
   - zzz-admin-whitelist-blocking.spec.ts (tests blocking)
   - zzzz-break-glass-recovery.spec.ts (NEW - restores with bypass)
3. Security-teardown → verify state
4. Browser tests (chromium/firefox/webkit) → Run with full security bypassed
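
The execution order above can be modelled as a small dependency graph. The sketch below is a hypothetical model, not the repository's Playwright configuration: the `Stage` type and `executionOrder` function are invented for illustration, and the stage names simply mirror the list. Playwright itself expresses this kind of ordering through the `dependencies` field on projects.

```typescript
// Hypothetical model of the stages; names mirror the list above, not the real config file.
interface Stage {
  name: string;
  dependsOn: string[];
}

const stages: Stage[] = [
  { name: "chromium", dependsOn: ["security-teardown"] },
  { name: "firefox", dependsOn: ["security-teardown"] },
  { name: "webkit", dependsOn: ["security-teardown"] },
  { name: "security-teardown", dependsOn: ["security-tests"] },
  { name: "security-tests", dependsOn: ["setup"] },
  { name: "setup", dependsOn: [] },
];

/** Depth-first topological sort: every stage appears after its dependencies. */
function executionOrder(all: Stage[]): string[] {
  const byName = new Map(all.map((s) => [s.name, s] as const));
  const done = new Set<string>();
  const visiting = new Set<string>();
  const order: string[] = [];

  const visit = (name: string): void => {
    if (done.has(name)) return;
    if (visiting.has(name)) throw new Error(`Dependency cycle at ${name}`);
    const stage = byName.get(name);
    if (!stage) throw new Error(`Unknown stage: ${name}`);
    visiting.add(name);
    for (const dep of stage.dependsOn) visit(dep);
    visiting.delete(name);
    done.add(name);
    order.push(name);
  };

  for (const stage of all) visit(stage.name);
  return order;
}

console.log(executionOrder(stages).join(" -> "));
```

The ordering inside the security-tests stage (break glass test before the recovery test) is not captured by this graph; per the list above it relies on sequential execution with a single worker.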

Category 2: CrowdSec Dependency Tests

Priority: MEDIUM
Root Cause: Tests require CrowdSec to be fully running and configured
Status: Explicitly skipped with test.describe.skip()

Tests Affected (Tests 42-53):

Banned IPs Data Operations (Tests 42-43)
- Show active decisions
- Display decision columns (IP, type, duration, reason)
Add Decision (Ban IP) (Tests 44-46)
- Add ban button
- Open ban modal
- Validate IP address format
Remove Decision (Unban) (Tests 47-48)
- Show unban action
- Confirm before unbanning
Filtering and Search (Tests 49-50)
- Search/filter input
- Filter decisions by type
Refresh and Sync (Test 51)
- Refresh button functionality
Navigation (Test 52)
- Navigate back to CrowdSec config
Accessibility (Test 53)
- Keyboard navigation

Investigation Steps:

Determine CrowdSec Test Strategy
- These tests are marked test.describe.skip() with comment "Requires CrowdSec Running"
- Is CrowdSec intended to run in E2E environment?
- Should these be integration tests instead?
Review CrowdSec Architecture
- File: backend/internal/security/crowdsec/
- Check if CrowdSec can be mocked for E2E tests
- Review CrowdSec initialization in Docker container
Fix Options:
- Option A: Keep skipped - move to integration tests
- Option B: Enable CrowdSec in E2E environment with test data
- Option C: Mock CrowdSec API responses for UI testing only
Decision Criteria:
- Keep Skipped If: CrowdSec requires external dependencies, takes long to start, or is resource-intensive
- Enable If: CrowdSec can run in lightweight mode for E2E testing
- Mock If: Only testing UI interactions, not actual CrowdSec functionality
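
The decision criteria can be written down as a small function so the outcome is reproducible once the questions are answered. This is only a sketch: the `CrowdSecAssessment` fields are hypothetical names for the criteria above, and the precedence (UI-only first, then lightweight mode, otherwise keep skipped) is an assumption, since the plan does not state which criterion wins when several apply.

```typescript
type CrowdSecStrategy = "keep-skipped" | "enable" | "mock";

// Hypothetical inputs summarising the answers to the decision criteria.
interface CrowdSecAssessment {
  uiOnly: boolean; // tests exercise UI interactions only
  needsExternalDeps: boolean;
  slowStartup: boolean;
  resourceIntensive: boolean;
  hasLightweightMode: boolean;
}

/** Map an assessment to Option A (keep-skipped), B (enable), or C (mock). */
function chooseCrowdSecStrategy(a: CrowdSecAssessment): CrowdSecStrategy {
  // Assumed precedence: mocking is enough when only the UI is under test.
  if (a.uiOnly) return "mock";
  const heavy = a.needsExternalDeps || a.slowStartup || a.resourceIntensive;
  if (!heavy && a.hasLightweightMode) return "enable";
  return "keep-skipped";
}
```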

Files to Review:

tests/security/crowdsec-decisions.spec.ts  # Skipped tests
.docker/docker-entrypoint.sh               # CrowdSec startup
backend/internal/security/crowdsec/        # Implementation
docs/implementation/CROWDSEC_*.md          # Architecture docs

Category 3: Explicitly Skipped Tests (NO TRIAGE NEEDED)

Priority: N/A (Intentionally Skipped)
Action: Document skip reason, track in backlog

Tests in This Category:

Caddy Import - Session Restoration (Tests in caddy-import-gaps.spec.ts)
- Test 4.1: Show pending session banner
- Test 4.2: Restore review table with previous content
- Reason: Known functionality gaps pending implementation
Emergency Server Tests (Tests in emergency-server.spec.ts)
- Test 3: Emergency server bypasses main app security
- Test 4: Emergency server security reset works
- Reason: May be redundant with other emergency server tests

Recommendation:

Create GitHub issues for each explicitly skipped test
Link issues to implementation plans
Schedule for future sprint/milestone
No immediate triage needed

Triage Workflow

Phase 1: Data Collection (COMPLETE)

Run complete cross-browser test suite
Identify all failing and skipped tests
Categorize skips (explicit vs conditional)
Document test patterns and dependencies

Phase 2: Environment Analysis (NEXT STEPS)

Timeline: 1-2 hours

Analyze Emergency Server Reset

# Check current emergency reset behavior
npm run test:e2e:setup -- --grep "emergency reset"

# Find emergency reset references in the global setup source
grep -n "Emergency reset" tests/global-setup.ts

Check Cerberus Configuration

# Inspect test environment settings
docker exec charon-e2e cat /config/settings.json | jq '.feature.cerberus'

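# Check emergency server health (GET /health needs no auth; /emergency/settings does not exist)
curl http://localhost:2020/health

# Query security state through the main application API
curl http://localhost:8080/api/v1/security/config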

Document Current State
- What is enabled/disabled in test environment?
- What SHOULD be enabled/disabled?
- What are the gaps between current and desired state?

Phase 3: Fix Planning (AFTER ANALYSIS)

Timeline: 2-4 hours

For each category, create detailed fix plan with:

Root cause
Proposed solution
Implementation estimate
Testing approach
Rollback plan

Phase 4: Implementation (PER FIX)

Timeline: Varies by fix

Implement fixes in priority order:
- HIGH priority first (Category 1 - Conditional Skips)
- MEDIUM priority second (Category 2 - CrowdSec)
- Document skip reasons (Category 3 - Explicit Skips)

Validation approach:

# Test specific category
npm run test:e2e -- tests/security/security-dashboard.spec.ts --project=chromium

# Verify fix across all browsers
npm run test:e2e:all -- tests/security/security-dashboard.spec.ts

# Full regression test
npm run test:e2e:all

Phase 5: Documentation (CONTINUOUS)

Timeline: Ongoing

Update test documentation with skip reasons
Add comments to conditionally-skipped tests explaining when they should run
Create decision log for each triage decision
Update CI/CD pipeline configuration if needed

Investigation Priorities

Immediate Actions (Hour 1)

COMPLETED: Created diagnostic script at scripts/diagnose-test-env.sh
```
./scripts/diagnose-test-env.sh
```
COMPLETED: Identified Root Cause
- Emergency server API has LIMITED endpoints:
  - GET /health (no auth)
  - POST /emergency/security-reset (with auth + token)
- NO /emergency/settings endpoint exists
- Cannot query Cerberus state via emergency server
- Must use main application API (http://localhost:8080/api/v1/security/config)
KEY FINDING: Emergency Reset Disables Cerberus ✅ CONFIRMED
- The /emergency/security-reset endpoint disables Cerberus framework itself
- This causes toggle buttons/configure buttons to become disabled
- Tests skip when toggle.isDisabled() returns true
- Design Intent: Cerberus ON + Modules OFF (safe testing, toggles work)
- Current Bug: Emergency reset disables Cerberus framework too
- Test Flow: Global setup → All Cerberus tests → Break glass test (LAST)

Short-Term Actions (Hours 2-4)

Decide on Cerberus enablement strategy for tests
Implement fix for Category 1 (Conditional Skips)
Run targeted test validation

Medium-Term Actions (This Week)

Evaluate CrowdSec testing strategy (Category 2)
Create GitHub issues for explicitly skipped tests (Category 3)
Update test documentation
Add CI/CD checks for skip patterns
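
One possible shape for a skip-pattern check is a plain text scan of spec sources, sketched below. The function and field names are hypothetical, and a regex scan is an approximation: it does not parse TypeScript, so commented-out code or skips hidden behind helper functions would be miscounted.

```typescript
interface SkipCounts {
  describeSkips: number; // test.describe.skip(...)
  titledSkips: number; // test.skip('title', ...) declares a skipped test
  runtimeSkips: number; // test.skip() or test.skip(condition, ...) inside a test
}

/** Count skip patterns in the text of one spec file. */
function countSkipPatterns(source: string): SkipCounts {
  const count = (re: RegExp): number => (source.match(re) ?? []).length;
  const allTestSkips = count(/\btest\.skip\s*\(/g);
  // A string literal as the first argument means the call declares a skipped test.
  const titledSkips = count(/\btest\.skip\s*\(\s*['"`]/g);
  return {
    describeSkips: count(/\btest\.describe\.skip\s*\(/g),
    titledSkips,
    runtimeSkips: allTestSkips - titledSkips,
  };
}
```

A CI step could run this over the tests directory and fail when the runtime skip count grows, which would surface new conditional skips for review.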

Success Criteria

Definition of Done for Triage:

All conditionally-skipped tests have clear run conditions documented
Tests run successfully when conditions are met
Tests skip gracefully with clear skip messages when conditions are not met
Decision documented for each explicitly-skipped test category
CI/CD pipeline updated to handle skip scenarios
Test coverage maintained or improved

Metrics to Track:

Before Triage: X tests skipped (conditional + explicit)
After Triage: Y tests skipped (explicit only) + Z tests passing
Target: Minimize conditional skips, maintain explicit skips with issues
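
The metrics could be computed from test results along these lines. This is a sketch under assumptions: `TestResultSummary` is a hypothetical flattened record (real reporter output would need mapping into it), and conditional skips are recognised by the 'skip-reason' annotation used in Appendix A, Pattern 1.

```typescript
// Hypothetical flattened result record; not a real reporter schema.
interface TestResultSummary {
  title: string;
  status: "passed" | "failed" | "skipped";
  annotations: { type: string; description?: string }[];
}

interface TriageMetrics {
  passing: number;
  failing: number;
  conditionalSkips: number; // skipped at runtime with a 'skip-reason' annotation
  explicitSkips: number; // skipped without that annotation
}

/** Tally results into the before/after numbers tracked by the triage. */
function summarizeResults(results: TestResultSummary[]): TriageMetrics {
  const metrics: TriageMetrics = { passing: 0, failing: 0, conditionalSkips: 0, explicitSkips: 0 };
  for (const result of results) {
    if (result.status === "passed") {
      metrics.passing += 1;
    } else if (result.status === "failed") {
      metrics.failing += 1;
    } else if (result.annotations.some((a) => a.type === "skip-reason")) {
      metrics.conditionalSkips += 1;
    } else {
      metrics.explicitSkips += 1;
    }
  }
  return metrics;
}
```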

Risk Assessment

High Risk:

Enabling Cerberus in tests - May cause cascade of failures if not properly configured
Modifying emergency reset logic - Could break other tests or test isolation

Medium Risk:

Changing test environment variables - May affect multiple test suites
Enabling CrowdSec - Resource intensive, may slow test execution

Low Risk:

Adding explicit skip annotations - No functional impact
Creating GitHub issues - Tracking only

Rollback Plan

If implementation causes regression:

Immediate Rollback:

git checkout HEAD^ -- tests/global-setup.ts
npm run test:e2e:all -- --project=chromium

Emergency Reset to Known Good State:

.github/skills/scripts/skill-runner.sh docker-rebuild-e2e
git stash
npm run test:e2e:all

Document Failure:
- Capture test output
- Document what went wrong
- Update triage plan with lessons learned

Next Steps

Run Diagnostic Script (created above)
Analyze Results - Fill in data collection gaps
Make Decision - Cerberus enablement strategy
Implement Fix - Start with Category 1
Validate - Run targeted tests
Iterate - Move to next category

Appendix A: Test Output Patterns

Pattern 1: Conditional Skip with Cerberus Check

const isDisabled = await toggle.isDisabled();
if (isDisabled) {
  test.info().annotations.push({
    type: 'skip-reason',
    description: 'Toggle is disabled because Cerberus security is not enabled',
  });
  test.skip();
  return;
}

Recommendation: Add feature flag check before test execution instead of during test.
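
A minimal sketch of that recommendation: decide the skip from configuration before touching the UI. The `SecurityConfig` shape and its `cerberusEnabled` field are assumptions for illustration; the actual field names returned by the security config API are not shown in this plan.

```typescript
// Hypothetical config shape; the field name is an assumption.
interface SecurityConfig {
  cerberusEnabled: boolean;
}

/** Return a skip reason, or null when the toggle tests can run. */
function cerberusSkipReason(config: SecurityConfig | null): string | null {
  if (config === null) {
    return "Security config could not be loaded";
  }
  if (!config.cerberusEnabled) {
    return "Toggle is disabled because Cerberus security is not enabled";
  }
  return null;
}
```

In a spec, the result could be passed to Playwright's `test.skip(condition, description)` at the start of the test body, for example `test.skip(reason !== null, reason ?? '')`, so the skip decision and its message come from one place instead of from the state of a toggle.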

Pattern 2: Explicit Skip with Description

test.describe.skip('Banned IPs Data Operations (Requires CrowdSec Running)', () => {
  // Tests here
});

Recommendation: Keep as-is, create tracking issue.

Appendix B: Useful Commands

Test Execution

# Run specific test file
npm run test:e2e -- tests/security/security-dashboard.spec.ts

# Run with debug output
DEBUG=pw:api npm run test:e2e -- tests/security/security-dashboard.spec.ts

# Run in headed mode
npm run test:e2e:headed -- tests/security/security-dashboard.spec.ts

# Run specific test by name
npm run test:e2e -- -g "should toggle ACL"

Environment Inspection

# Check container logs
docker logs charon-e2e --tail 100

# Check settings
docker exec charon-e2e cat /config/settings.json | jq '.'

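# Check emergency server health (no /emergency/settings endpoint exists)
curl http://localhost:2020/health

# Check security config via the main application API
curl http://localhost:8080/api/v1/security/config | jq '.'

# Force security reset
curl -X POST http://localhost:2020/emergency/security-reset \
  -H "X-Emergency-Token: $(grep '^EMERGENCY_TOKEN=' .env | cut -d= -f2-)"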

Test Reporting

# View HTML report
npx playwright show-report

# Generate custom report
npx playwright test --reporter=html,json

Change Log

Date	Author	Changes
2026-02-03	GitHub Copilot	Initial triage plan created
