E2E CI Failure Diagnosis - 100% Failure vs 90% Pass Local
Date: February 4, 2026
Status: 🔴 CRITICAL - 100% CI failure rate vs 90% local pass rate
Urgency: HIGH - Blocking all PRs and CI/CD pipeline
Executive Summary
Problem: E2E tests exhibit a critical environmental discrepancy:
- Local Environment: 90% of E2E tests PASS when running via skill-runner.sh test-e2e-playwright
- CI Environment: 100% of E2E jobs FAIL in the GitHub Actions workflow (e2e-tests-split.yml)
Root Cause Hypothesis: Multiple critical configuration differences between local and CI environments create an inconsistent test execution environment, leading to systematic failures in CI.
Impact:
- ❌ All PRs blocked due to failing E2E checks
- ❌ Cannot merge to main or development
- ❌ CI/CD pipeline completely stalled
- ⚠️ Development velocity severely impacted
Configuration Comparison Matrix
Docker Compose Configuration Differences
| Configuration | Local (docker-compose.playwright-local.yml) | CI (docker-compose.playwright-ci.yml) | Impact |
|---|---|---|---|
| Environment | CHARON_ENV=e2e | CHARON_ENV=test | 🔴 HIGH - Different runtime behavior |
| Credential Source | env_file: ../../.env | Environment variables from $GITHUB_ENV | 🟡 MEDIUM - Potential missing vars |
| Encryption Key | Loaded from .env file | Generated ephemeral: openssl rand -base64 32 | 🟢 LOW - Both valid |
| Emergency Token | Loaded from .env file | From GitHub Secrets (CHARON_EMERGENCY_TOKEN) | 🟡 MEDIUM - Potential missing/invalid token |
| Security Tests Flag | ❌ NOT SET | ✅ CHARON_SECURITY_TESTS_ENABLED=true | 🔴 CRITICAL - May enable security modules |
| Data Storage | tmpfs: /app/data (in-memory, ephemeral) | Named volumes (playwright_data, etc.) | 🟡 MEDIUM - Different persistence behavior |
| Security Profile | ❌ Not enabled by default | ✅ --profile security-tests (enables CrowdSec) | 🔴 CRITICAL - Different security modules active |
| Image Source | charon:local (fresh local build) | charon:e2e-test (loaded from artifact) | 🟢 LOW - Both should be identical builds |
| Container Name | charon-e2e | charon-playwright | 🟢 LOW - Cosmetic difference |
GitHub Actions Workflow Environment
| Variable | CI Value | Local Equivalent | Impact |
|---|---|---|---|
| CI | true | Not set | 🟡 MEDIUM - Playwright retries, workers, etc. |
| PLAYWRIGHT_BASE_URL | http://localhost:8080 | http://localhost:8080 | 🟢 LOW - Identical |
| PLAYWRIGHT_COVERAGE | 0 (disabled by default) | 0 | 🟢 LOW - Identical |
| CHARON_EMERGENCY_SERVER_ENABLED | true | true | 🟢 LOW - Identical |
| CHARON_EMERGENCY_BIND | 0.0.0.0:2020 | 0.0.0.0:2020 | 🟢 LOW - Identical |
| NODE_VERSION | 20 | User-dependent | 🟡 MEDIUM - May differ |
| GO_VERSION | 1.25.6 | User-dependent | 🟡 MEDIUM - May differ |
Local Test Execution Flow
User runs E2E tests locally:
# Step 1: Rebuild E2E container (CRITICAL: user must do this)
.github/skills/scripts/skill-runner.sh docker-rebuild-e2e
# Default behavior: NO security profile enabled
# Result: CrowdSec NOT running
# CHARON_SECURITY_TESTS_ENABLED: NOT SET
# Step 2: Run tests
.github/skills/scripts/skill-runner.sh test-e2e-playwright
What's missing locally:
- ❌ No --profile security-tests (CrowdSec not running)
- ❌ No CHARON_SECURITY_TESTS_ENABLED environment variable
- ❌ CHARON_ENV=e2e instead of CHARON_ENV=test
- ✅ Uses .env file (requires user to have created it)
CI Test Execution Flow
GitHub Actions runs E2E tests:
# Step 1: Generate ephemeral encryption key
- name: Generate ephemeral encryption key
  run: echo "CHARON_ENCRYPTION_KEY=$(openssl rand -base64 32)" >> $GITHUB_ENV
# Step 2: Validate emergency token
- name: Validate Emergency Token Configuration
  # Checks CHARON_EMERGENCY_TOKEN from secrets
# Step 3: Start with security-tests profile
- name: Start test environment
  run: |
    docker compose -f .docker/compose/docker-compose.playwright-ci.yml --profile security-tests up -d
# Environment variables in workflow:
env:
  CHARON_EMERGENCY_TOKEN: ${{ secrets.CHARON_EMERGENCY_TOKEN }}
  CHARON_EMERGENCY_SERVER_ENABLED: "true"
  CHARON_SECURITY_TESTS_ENABLED: "true" # ← SET IN CI
  CHARON_E2E_IMAGE_TAG: charon:e2e-test
# Step 4: Wait for health check (30 attempts, 2s interval)
# Step 5: Run tests with sharding
npx playwright test --project=chromium --shard=1/4
What's different in CI:
- ✅ --profile security-tests enabled (CrowdSec running)
- ✅ CHARON_SECURITY_TESTS_ENABLED=true explicitly set
- ✅ CHARON_ENV=test (not e2e)
- ✅ Named volumes (persistent data within workflow run)
- ✅ Sharding enabled (4 shards per browser)
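Step 4 of the CI flow (health check, 30 attempts at a 2s interval) can be reproduced locally with a small retry helper. This is a minimal sketch, not the workflow's actual step: the retry function name and the health endpoint path in the usage comment are assumptions.

```shell
# retry ATTEMPTS INTERVAL_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS is exhausted.
retry() {
  _retry_attempts=$1
  _retry_interval=$2
  shift 2
  _retry_i=1
  while [ "$_retry_i" -le "$_retry_attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Do not sleep after the final failed attempt.
    if [ "$_retry_i" -lt "$_retry_attempts" ]; then
      sleep "$_retry_interval"
    fi
    _retry_i=$((_retry_i + 1))
  done
  echo "gave up after ${_retry_attempts} attempts: $*" >&2
  return 1
}

# Usage (the /api/v1/health path is an assumption, not taken from the workflow):
# retry 30 2 curl -fsS -o /dev/null http://localhost:8080/api/v1/health
```

Running the same wait locally against the security-tests profile shows whether the container becomes healthy at all within the CI budget of roughly 60 seconds.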
Root Cause Analysis
Critical Difference #1: CHARON_ENV (e2e vs test)
Evidence: Local uses CHARON_ENV=e2e, CI uses CHARON_ENV=test
Behavior Difference:
Looking at backend/internal/caddy/config.go:92:
isE2E := os.Getenv("CHARON_ENV") == "e2e"
if acmeEmail != "" || isE2E {
// E2E environment allows certificate generation without email
}
Impact: The snippet confirms that certificate generation behaves differently under e2e than under test. Rate limiting and other environment-specific logic may also branch on this variable, but that has not been verified.
Severity: 🔴 HIGH - Fundamental environment difference
Hypothesis: If there's rate limiting logic checking for CHARON_ENV == "e2e" to provide lenient limits, the CI environment with CHARON_ENV=test may enforce stricter limits, causing test failures.
Critical Difference #2: CHARON_SECURITY_TESTS_ENABLED
Evidence: NOT set locally, explicitly set to "true" in CI
Where it's set:
- CI Workflow: CHARON_SECURITY_TESTS_ENABLED: "true" in env block
- CI Compose: CHARON_SECURITY_TESTS_ENABLED=${CHARON_SECURITY_TESTS_ENABLED:-true}
- Local Compose: ❌ NOT PRESENT
Impact: UNKNOWN - This variable is NOT used anywhere in the backend Go code (confirmed by grep search). However, it may:
- Be checked in the frontend TypeScript code
- Control test fixture behavior
- Be a vestigial variable that was removed from code but left in compose files
Severity: 🟡 MEDIUM - Present in CI but not local, unexplained purpose
Action Required: Search frontend and test fixtures for usage of this variable.
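One detail worth noting about the CI Compose entry: the :-true default in Compose interpolation applies whenever the variable is unset or empty, which is the same rule as POSIX shell parameter expansion. Removing the variable from the workflow env block alone would therefore still leave it set to true in the container. The sketch below mirrors that rule in shell; resolve_flag is a hypothetical helper used only for illustration.

```shell
# Mirrors the Compose interpolation ${CHARON_SECURITY_TESTS_ENABLED:-true}
# using the equivalent POSIX shell expansion.
resolve_flag() {
  # $1 is the value the workflow passes; call with no argument for "unset".
  printf '%s\n' "${1:-true}"
}

resolve_flag          # unset: prints true
resolve_flag ""       # empty: prints true
resolve_flag false    # explicit override: prints false
```

To switch the flag off in an experiment, pass an explicit non-empty value such as false, or edit the compose file itself.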
Critical Difference #3: Security Profile (CrowdSec)
Evidence: CI runs with --profile security-tests, local does NOT (unless manually specified)
Impact:
- CI: CrowdSec container running alongside charon-app
- Local: No CrowdSec (unless user runs docker-rebuild-e2e --profile=security-tests)
CrowdSec Service Configuration:
crowdsec:
  image: crowdsecurity/crowdsec:latest
  profiles:
    - security-tests
  environment:
    - COLLECTIONS=crowdsecurity/nginx crowdsecurity/http-cve
    - BOUNCER_KEY_charon=test-bouncer-key-for-e2e
    - DISABLE_ONLINE_API=true
Severity: 🔴 CRITICAL - Entire security module missing locally
Hypothesis: Tests may be failing in CI because:
- CrowdSec is blocking requests that should pass
- CrowdSec has configuration issues in CI environment
- Tests are written assuming CrowdSec is NOT running
- Network routing through CrowdSec causes latency or timeouts
Critical Difference #4: Data Storage (tmpfs vs named volumes)
Evidence:
- Local: tmpfs: /app/data:size=100M,mode=1777 (in-memory, cleared on restart)
- CI: Named volumes playwright_data, playwright_caddy_data, playwright_caddy_config
Impact:
- Local: True ephemeral storage - every restart is 100% fresh
- CI: Volumes persist across container restarts within the same workflow run
Severity: 🟡 MEDIUM - Could cause state pollution in CI
Hypothesis: If CI containers are restarted mid-workflow (e.g., between shards), the volumes retain data, potentially causing state pollution that doesn't exist locally.
Critical Difference #5: Credential Management
Evidence:
- Local: Uses env_file: ../../.env to load all credentials
- CI: Passes credentials explicitly via $GITHUB_ENV and secrets
Failure Scenario:
- User creates .env file with CHARON_ENCRYPTION_KEY and CHARON_EMERGENCY_TOKEN
- Local tests pass because both variables are loaded from .env
- CI generates ephemeral CHARON_ENCRYPTION_KEY (always fresh)
- CI loads CHARON_EMERGENCY_TOKEN from GitHub Secrets
Potential Issues:
- ❓ Is CHARON_EMERGENCY_TOKEN correctly configured in GitHub Secrets?
- ❓ Is the token length validation passing in CI? (requires ≥64 characters)
- ❓ Are there any other variables loaded from .env locally that are missing in CI?
Severity: 🔴 HIGH - Credential mismatches can cause authentication failures
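The ≥64 character requirement can be checked against the local .env value before comparing with CI. The sketch below assumes the validation is purely a length check; the CI step may test more than that. check_token_length is a hypothetical helper name.

```shell
# check_token_length TOKEN
# Returns 0 when TOKEN is at least 64 characters long, 1 otherwise.
# Prints only the length, never the token itself.
check_token_length() {
  _len=${#1}
  if [ "$_len" -ge 64 ]; then
    echo "OK: token length ${_len}"
  else
    echo "FAIL: token length ${_len}, need at least 64" >&2
    return 1
  fi
}

# Usage: check_token_length "$CHARON_EMERGENCY_TOKEN"
```

Printing the length rather than the value keeps the secret out of terminal history and CI logs.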
Suspected Failure Scenarios
Scenario A: CrowdSec Blocking Legitimate Test Requests
Hypothesis: CrowdSec in CI is blocking test requests that would pass locally without CrowdSec.
Evidence Needed:
- Docker logs from CrowdSec container in failed CI runs
- Charon application logs showing blocked requests
- Test failure patterns (are they authentication/authorization related?)
Test: Run locally with security-tests profile:
.github/skills/scripts/skill-runner.sh docker-rebuild-e2e --profile=security-tests
.github/skills/scripts/skill-runner.sh test-e2e-playwright
Expected: If this is the root cause, tests will fail locally with the profile enabled.
Scenario B: CHARON_ENV=test Enforces Stricter Limits
Hypothesis: The test environment enforces production-like limits (rate limiting, timeouts) that break tests designed for lenient e2e environment.
Evidence Needed:
- Search backend code for all uses of CHARON_ENV
- Identify rate limiting, timeout, or other behavior differences
- Check if tests make rapid API calls that would hit rate limits
Test:
Modify local compose to use CHARON_ENV=test:
# .docker/compose/docker-compose.playwright-local.yml
environment:
  - CHARON_ENV=test # Change from e2e
Expected: If this is the root cause, tests will fail locally with CHARON_ENV=test.
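The first evidence item for this scenario (finding every backend use of CHARON_ENV) can be sketched as a small grep wrapper. find_env_usage is a hypothetical helper; it relies on the -r and --include options available in GNU and BSD grep.

```shell
# find_env_usage VAR DIR
# Lists file:line:text matches for VAR in Go sources under DIR.
find_env_usage() {
  grep -rn --include='*.go' -- "$1" "$2"
}

# Usage: find_env_usage CHARON_ENV backend/
```

Each hit is a branch point worth reading: any comparison against "e2e" that has no matching case for "test" is a candidate for behavior that exists locally but not in CI.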
Scenario C: Missing Environment Variable in CI
Hypothesis: The CI environment is missing a critical environment variable that's loaded from .env locally but not set in CI compose/workflow.
Evidence Needed:
- Compare .env.example with all variables explicitly set in docker-compose.playwright-ci.yml and the workflow
- Check application startup logs for warnings about missing environment variables
- Review test failure messages for configuration errors
Test: Audit all environment variables:
# Local container
docker exec charon-e2e env | sort > local-env.txt
# CI container (from failed run logs)
# Download docker logs artifact and extract env vars
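Once both environment dumps exist, the comparison can be limited to variable names so that secret values never end up in a diff. This is a sketch under the assumption that each dump is a plain KEY=VALUE file with single-line values; env_names and env_only_in are hypothetical helper names, as is ci-env.txt in the usage comment.

```shell
# env_names FILE: print the sorted, unique variable names from a KEY=VALUE dump.
env_names() {
  cut -d= -f1 "$1" | LC_ALL=C sort -u
}

# env_only_in A B: print names present in dump A but absent from dump B.
env_only_in() {
  _a=$(mktemp)
  _b=$(mktemp)
  env_names "$1" > "$_a"
  env_names "$2" > "$_b"
  # comm -23 keeps only the lines unique to the first file.
  LC_ALL=C comm -23 "$_a" "$_b"
  rm -f "$_a" "$_b"
}

# Usage:
# env_only_in local-env.txt ci-env.txt   # set locally, missing in CI
# env_only_in ci-env.txt local-env.txt   # set in CI, missing locally
```

Names that appear only on the local side are the candidates for this scenario.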
Scenario D: Image Build Differences (Local vs CI Artifact)
Hypothesis: The Docker image built locally (charon:local) differs from the CI artifact (charon:e2e-test) in some way that causes test failures.
Evidence Needed:
- Compare Dockerfile build args between local and CI
- Inspect image layers to identify differences
- Check if CI cache is corrupted
Test: Load the CI artifact locally and run tests against it:
# Download artifact from failed CI run
# Load image: docker load -i charon-e2e-image.tar
# Run tests against CI artifact locally
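A quick first check after loading the artifact is to compare image IDs. The sketch below uses docker image inspect with a Go template; image_id and same_image are hypothetical helper names. Matching IDs mean the two tags point at identical content and rule this scenario out. Differing IDs are expected for separately built images and only indicate that a layer-level comparison is still needed.

```shell
# image_id REF: print the content-addressed ID of a local image.
image_id() {
  docker image inspect --format '{{.Id}}' "$1"
}

# same_image A B: succeed only when both references resolve to the same ID.
# Returns 2 if either image cannot be inspected.
same_image() {
  _id_a=$(image_id "$1") || return 2
  _id_b=$(image_id "$2") || return 2
  echo "$1 -> $_id_a"
  echo "$2 -> $_id_b"
  [ "$_id_a" = "$_id_b" ]
}

# Usage, after docker load of the CI artifact:
# same_image charon:local charon:e2e-test
```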
Diagnostic Action Plan
Phase 1: Evidence Collection (Immediate)
Task 1.1: Download recent failed CI run artifacts
- Download Docker logs from latest failed run
- Download test traces and videos
- Download HTML test reports
Task 1.2: Capture local environment baseline
# With default settings (passing tests)
docker exec charon-e2e env | sort > local-env-baseline.txt
docker logs charon-e2e > local-logs-baseline.txt
Task 1.3: Search for CHARON_SECURITY_TESTS_ENABLED usage
# Frontend
grep -r "CHARON_SECURITY_TESTS_ENABLED" frontend/
# Tests
grep -r "CHARON_SECURITY_TESTS_ENABLED" tests/
# Backend (already confirmed: NOT USED)
Task 1.4: Document test failure patterns in CI
- Review last 10 failed CI runs
- Identify common error messages
- Check if specific tests always fail
- Check if failures are random or deterministic
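To separate deterministic failures from random ones, the failed test titles from each run can be tallied across runs. The sketch assumes one text file per CI run with one failed test title per line; that file layout and the failure_counts name are hypothetical, not something the workflow produces today.

```shell
# failure_counts FILE...
# Each FILE lists the failed test titles of one CI run, one title per line.
# Prints "<number of runs failed> <title>", most frequent first.
failure_counts() {
  for _f in "$@"; do
    # Count each test at most once per run, even if it was retried.
    LC_ALL=C sort -u "$_f"
  done | LC_ALL=C sort | uniq -c | sed 's/^ *//' | LC_ALL=C sort -k1,1nr -k2
}

# Usage: failure_counts run-*.txt
```

A title whose count equals the number of runs reviewed fails deterministically; lower counts point toward timing or state-dependent failures.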
Phase 2: Controlled Experiments (Next)
Experiment 2.1: Enable security-tests profile locally
.github/skills/scripts/skill-runner.sh docker-rebuild-e2e --profile=security-tests --clean
.github/skills/scripts/skill-runner.sh test-e2e-playwright
Expected Outcome: If CrowdSec is the root cause, tests will fail locally.
Experiment 2.2: Change CHARON_ENV to "test" locally
# Edit .docker/compose/docker-compose.playwright-local.yml
# Change: CHARON_ENV=e2e → CHARON_ENV=test
.github/skills/scripts/skill-runner.sh docker-rebuild-e2e --clean
.github/skills/scripts/skill-runner.sh test-e2e-playwright
Expected Outcome: If environment-specific behavior differs, tests will fail locally.
Experiment 2.3: Add CHARON_SECURITY_TESTS_ENABLED locally
# Edit .docker/compose/docker-compose.playwright-local.yml
# Add: - CHARON_SECURITY_TESTS_ENABLED=true
.github/skills/scripts/skill-runner.sh docker-rebuild-e2e --clean
.github/skills/scripts/skill-runner.sh test-e2e-playwright
Expected Outcome: If this flag controls critical behavior, tests may fail locally.
Experiment 2.4: Use named volumes instead of tmpfs locally
# Edit .docker/compose/docker-compose.playwright-local.yml
# Replace tmpfs with named volumes matching CI config
.github/skills/scripts/skill-runner.sh docker-rebuild-e2e --clean
.github/skills/scripts/skill-runner.sh test-e2e-playwright
Expected Outcome: If volume persistence causes state pollution, tests may behave differently.
Phase 3: CI Simplification (Final)
If experiments identify the root cause, apply corresponding fix to CI:
Fix 3.1: Remove security-tests profile from CI (if CrowdSec is the culprit)
# .github/workflows/e2e-tests-split.yml
- name: Start test environment
  run: |
    docker compose -f .docker/compose/docker-compose.playwright-ci.yml up -d
    # Remove: --profile security-tests
Fix 3.2: Align CI environment to match local (if CHARON_ENV is the issue)
# .docker/compose/docker-compose.playwright-ci.yml
environment:
  - CHARON_ENV=e2e # Change from test to e2e
Fix 3.3: Remove CHARON_SECURITY_TESTS_ENABLED (if unused)
# Remove from workflow and compose if truly unused
Fix 3.4: Use tmpfs in CI (if volume persistence is the issue)
# .docker/compose/docker-compose.playwright-ci.yml
tmpfs:
  - /app/data:size=100M,mode=1777
# Remove: playwright_data volume
Investigation Priorities
🔴 CRITICAL - Investigate First
- CrowdSec Profile Difference
  - CI runs with CrowdSec, local does not (by default)
  - Most likely root cause of 100% failure rate
  - Action: Run Experiment 2.1 immediately
- CHARON_ENV Difference (e2e vs test)
  - Known to affect certificate generation (backend/internal/caddy/config.go:92); any effect on rate limiting is still a hypothesis
  - Action: Run Experiment 2.2 immediately
- Emergency Token Validation
  - CI validates token length (≥64 chars)
  - Local loads from .env (unchecked)
  - Action: Review CI logs for token validation failures
🟡 MEDIUM - Investigate Next
- CHARON_SECURITY_TESTS_ENABLED Purpose
  - Set in CI, not in local
  - Not used in backend Go code
  - Action: Search frontend/tests for usage
- Named Volumes vs tmpfs
  - CI uses persistent volumes
  - Local uses ephemeral tmpfs
  - Action: Run Experiment 2.4 to test state pollution theory
- Image Build Differences
  - Local builds fresh, CI loads from artifact
  - Action: Load CI artifact locally and compare
🟢 LOW - Investigate Last
- Node.js/Go Version Differences
  - Unlikely to cause 100% failure
  - More likely to cause flaky tests, not systematic failures
- Sharding Differences
  - CI uses sharding (4 shards per browser)
  - Local runs all tests in single process
  - Action: Test with sharding locally
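The sharding check can be run locally by invoking Playwright once per shard with the same --project and --shard flags the CI flow uses. This is a minimal sketch that runs shards one after another rather than in parallel as CI does; shard_flags and run_shards are hypothetical helper names.

```shell
# shard_flags TOTAL: print one --shard=i/TOTAL flag per line.
shard_flags() {
  _i=1
  while [ "$_i" -le "$1" ]; do
    echo "--shard=${_i}/$1"
    _i=$((_i + 1))
  done
}

# run_shards TOTAL: run the chromium project once per shard, sequentially.
# Every shard runs even if an earlier one fails; the exit status is 1 if any failed.
run_shards() {
  _rc=0
  for _flag in $(shard_flags "$1"); do
    npx playwright test --project=chromium "$_flag" || _rc=1
  done
  return "$_rc"
}

# Usage: run_shards 4
```

If a test passes in the single-process local run but fails in its shard, it probably depends on state created by a test that landed in a different shard.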
Success Criteria for Resolution
Definition of Done: CI environment matches local environment in all critical configuration aspects, resulting in:
- ✅ CI E2E tests pass at ≥90% rate (matching local)
- ✅ Root cause identified and documented
- ✅ Configuration differences eliminated or explained
- ✅ Reproducible test environment (local = CI)
- ✅ All experiments documented with results
- ✅ Runbook created for future E2E debugging
Rollback Plan: If fixes introduce new issues, revert changes and document findings for deeper investigation.
References
Files to Review:
- .github/workflows/e2e-tests-split.yml - CI workflow configuration
- .docker/compose/docker-compose.playwright-ci.yml - CI docker compose
- .docker/compose/docker-compose.playwright-local.yml - Local docker compose
- .github/skills/scripts/skill-runner.sh - Skill runner orchestration
- .github/skills/test-e2e-playwright-scripts/run.sh - Local test execution
- .github/skills/docker-rebuild-e2e-scripts/run.sh - Local container rebuild
- backend/internal/caddy/config.go - CHARON_ENV usage
- playwright.config.js - Playwright test configuration
Related Documentation:
- .github/instructions/testing.instructions.md - Test protocols
- .github/instructions/playwright-typescript.instructions.md - Playwright guidelines
- docs/reports/gh_actions_diagnostic.md - Previous CI failure analysis
GitHub Actions Runs (recent failures):
- Check Actions tab for latest failed runs on e2e-tests-split.yml
- Download artifacts: Docker logs, test reports, traces
Next Action: Execute Phase 1 evidence collection, focusing on CrowdSec profile and CHARON_ENV differences as primary suspects.
Assigned To: Supervisor Agent (for review and approval of diagnostic experiments)
Timeline:
- Phase 1 (Evidence): 1-2 hours
- Phase 2 (Experiments): 2-4 hours
- Phase 3 (Fixes): 1-2 hours
- Total Estimated Time: 4-8 hours to resolution
Diagnostic Plan Generated: February 4, 2026
Author: GitHub Copilot (Planning Mode)