E2E Test Troubleshooting
Common issues and solutions for Playwright E2E tests.
Recent Improvements (2026-02)
Test Timeout Issues - RESOLVED
Symptoms: Tests timing out after 30 seconds, config reload overlay blocking interactions
Resolution:
- Extended timeout from 30s to 60s for feature flag propagation
- Added automatic detection and waiting for config reload overlay
- Improved test isolation with proper cleanup in afterEach hooks
If you still experience timeouts:
- Rebuild the E2E container: .github/skills/scripts/skill-runner.sh docker-rebuild-e2e
- Check Docker logs for health check failures
- Verify the emergency token is set in the .env file
API Key Format Mismatch - RESOLVED
Symptoms: Feature flag tests failing with propagation timeout
Resolution:
- Added key normalization to handle both feature.cerberus.enabled and cerberus.enabled formats
- Tests now automatically detect and adapt to API response format
Configuration: No manual configuration is needed; normalization is automatic.
Quick Diagnostics
Run these commands first:
# Check emergency token is set
grep CHARON_EMERGENCY_TOKEN .env
# Verify token length
echo -n "$(grep CHARON_EMERGENCY_TOKEN .env | cut -d= -f2)" | wc -c
# Should output: 64
# Check Docker container is running
docker ps | grep charon
# Check health endpoint
curl -f http://localhost:8080/api/v1/health || echo "Health check failed"
Error: "CHARON_EMERGENCY_TOKEN is not set"
Symptoms
- Tests fail immediately with environment configuration error
- Error appears in global setup before any tests run
Cause
Emergency token not configured in .env file.
Solution
- Generate token:
    openssl rand -hex 32
- Add to .env file:
    echo "CHARON_EMERGENCY_TOKEN=<paste_token_here>" >> .env
- Verify:
    grep CHARON_EMERGENCY_TOKEN .env
- Run tests:
    npx playwright test --project=chromium
📖 More Info: See Getting Started - Emergency Token Configuration
Error: "CHARON_EMERGENCY_TOKEN is too short"
Symptoms
- Global setup fails with message about token length
- Current token length shown in error (e.g., "32 chars, minimum 64")
Cause
Token is shorter than 64 characters (security requirement).
Solution
- Regenerate token with correct length:
    openssl rand -hex 32  # Generates 64-char hex string
- Update .env file:
    sed -i "s/CHARON_EMERGENCY_TOKEN=.*/CHARON_EMERGENCY_TOKEN=<new_token>/" .env
- Verify length:
    echo -n "$(grep CHARON_EMERGENCY_TOKEN .env | cut -d= -f2)" | wc -c  # Should output: 64
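The length check above can be wrapped in a small function so it fails loudly instead of relying on reading the number by eye. This is only a sketch: `check_token_length` is a hypothetical helper, not part of the test suite, and it assumes that when the key appears more than once in the file, the last occurrence is the one that matters.

```shell
# Hypothetical helper: report the token length found in an env file and
# fail if it is below the 64-character minimum described above.
check_token_length() {
  local env_file="${1:-.env}" token length
  # Assumption: if the key is defined more than once, check the last one.
  token=$(grep '^CHARON_EMERGENCY_TOKEN=' "$env_file" | tail -n 1 | cut -d= -f2-)
  length=${#token}
  if [ "$length" -lt 64 ]; then
    echo "CHARON_EMERGENCY_TOKEN is too short: $length chars, minimum 64" >&2
    return 1
  fi
  echo "CHARON_EMERGENCY_TOKEN length OK: $length chars"
}
```

Usage: `check_token_length .env && npx playwright test --project=chromium`. A missing key is reported as a length of 0.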
Error: "Failed to reset security modules using emergency token"
Symptoms
- Security teardown fails
- Causes 20+ cascading test failures
- Error message about emergency reset
Possible Causes
- Token too short (< 64 chars)
- Token doesn't match backend configuration
- Backend not running or unreachable
- Network/container issues
Solution
Step 1: Verify token configuration
# Check token exists and is 64 chars
echo -n "$(grep CHARON_EMERGENCY_TOKEN .env | cut -d= -f2)" | wc -c
# Check backend env matches (if using Docker)
docker exec charon env | grep CHARON_EMERGENCY_TOKEN
Step 2: Verify backend is running
curl http://localhost:8080/api/v1/health
# Should return: {"status":"ok"}
Step 3: Test emergency endpoint directly
curl -X POST http://localhost:8080/api/v1/emergency/security-reset \
-H "X-Emergency-Token: $(grep CHARON_EMERGENCY_TOKEN .env | cut -d= -f2)" \
-H "Content-Type: application/json" \
-d '{"reason":"manual test"}' | jq
Step 4: Check backend logs
# Docker Compose
docker compose logs charon | tail -50
# Docker Run
docker logs charon | tail -50
Step 5: Regenerate token if needed
# Generate new token
NEW_TOKEN=$(openssl rand -hex 32)
# Update .env
sed -i "s/CHARON_EMERGENCY_TOKEN=.*/CHARON_EMERGENCY_TOKEN=${NEW_TOKEN}/" .env
# Restart backend with new token
docker restart charon
# Wait for health
sleep 5 && curl http://localhost:8080/api/v1/health
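The `sed -i` form used in Step 5 is GNU syntax; BSD/macOS `sed` expects an explicit backup suffix argument after `-i`. Where that matters, the update can be done without in-place editing. The function below is a sketch, and `set_env_var` is a hypothetical name introduced here for illustration.

```shell
# Hypothetical helper: set KEY=VALUE in an env file without `sed -i`.
# Existing assignments of KEY are dropped and the new one is appended.
set_env_var() {
  local env_file="$1" key="$2" value="$3" tmp
  tmp=$(mktemp) || return 1
  if [ -f "$env_file" ]; then
    # grep -v exits non-zero when nothing is left to print; that is fine here.
    grep -v "^${key}=" "$env_file" > "$tmp" || true
  fi
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  # Write back through redirection so the original file's permissions are kept.
  cat "$tmp" > "$env_file" && rm -f "$tmp"
}
```

Usage: `set_env_var .env CHARON_EMERGENCY_TOKEN "$(openssl rand -hex 32)"`, followed by the restart and health check from Step 5.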
Error: "Blocked by access control list" (403)
Symptoms
- Most tests fail with 403 Forbidden errors
- Error message contains "Blocked by access control"
Cause
Security teardown did not successfully disable ACL before tests ran.
Solution
- Run teardown script manually:
    npx playwright test tests/security-teardown.setup.ts
- Check teardown output for errors:
    - Look for "Emergency reset successful" message
    - Verify no error messages about missing token
- Verify ACL is disabled:
    curl http://localhost:8080/api/v1/security/status | jq  # acl.enabled should be false
- If still blocked, manually disable via API:
    # Using emergency token
    curl -X POST http://localhost:8080/api/v1/emergency/security-reset \
      -H "X-Emergency-Token: $(grep CHARON_EMERGENCY_TOKEN .env | cut -d= -f2)" \
      -H "Content-Type: application/json" \
      -d '{"reason":"manual disable before tests"}'
- Run tests again:
    npx playwright test --project=chromium
Tests Pass Locally but Fail in CI/CD
Symptoms
- Tests work on your machine
- Same tests fail in GitHub Actions
- Error about missing emergency token in CI logs
Cause
CHARON_EMERGENCY_TOKEN not configured in GitHub Secrets.
Solution
- Navigate to repository settings:
    - Go to: https://github.com/<your-org>/<your-repo>/settings/secrets/actions
    - Or: Repository → Settings → Secrets and Variables → Actions
- Create secret:
    - Click "New repository secret"
    - Name: CHARON_EMERGENCY_TOKEN
    - Value: Generate with openssl rand -hex 32
    - Click "Add secret"
- Verify secret is set:
    - Secret should appear in list (value is masked)
    - Cannot view value after creation (security)
- Re-run workflow:
    - Navigate to Actions tab
    - Re-run failed workflow
    - Check "Validate Emergency Token Configuration" step passes
📖 Detailed Instructions: See GitHub Setup Guide
Error: "ECONNREFUSED" or "ENOTFOUND"
Symptoms
- Tests fail with connection refused errors
- Cannot reach localhost:8080 or configured base URL
Cause
Backend container not running or not accessible.
Solution
- Check container status:
    docker ps | grep charon
- If not running, start it:
    # Docker Compose
    docker compose up -d
    # Docker Run
    docker start charon
- Wait for health:
    timeout 60 bash -c 'until curl -f http://localhost:8080/api/v1/health; do sleep 2; done'
- Check logs if still failing:
    docker logs charon | tail -50
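The "wait for health" step depends on the `timeout` command from GNU coreutils, which may not be installed on every system (macOS, for example, does not ship it by default). A deadline-based polling loop in plain shell is one alternative. This is a sketch; `wait_until` is a hypothetical helper name, not something the project provides.

```shell
# Hypothetical helper: retry a command every INTERVAL seconds until it
# succeeds or TIMEOUT seconds have elapsed.
# Usage: wait_until TIMEOUT INTERVAL command [args...]
wait_until() {
  local timeout="$1" interval="$2" deadline
  shift 2
  deadline=$(( $(date +%s) + timeout ))
  # Output of the probed command is discarded; only its exit status matters.
  until "$@" >/dev/null 2>&1; do
    if [ "$(date +%s)" -ge "$deadline" ]; then
      echo "Timed out after ${timeout}s waiting for: $*" >&2
      return 1
    fi
    sleep "$interval"
  done
}
```

Usage: `wait_until 60 2 curl -f http://localhost:8080/api/v1/health`. Raising the first argument to 120 mirrors the longer wait suggested for slow container startup.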
Error: Token appears to be a placeholder value
Symptoms
- Global setup validation fails
- Error mentions "placeholder value"
Cause
Token contains common placeholder strings like:
- test-emergency-token
- your_64_character
- replace_this
- 0000000000000000
Solution
- Generate a unique token:
    openssl rand -hex 32
- Replace placeholder in .env:
    sed -i "s/CHARON_EMERGENCY_TOKEN=.*/CHARON_EMERGENCY_TOKEN=<new_token>/" .env
- Verify it's not a placeholder:
    grep CHARON_EMERGENCY_TOKEN .env  # Should show a random hex string
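To check a token against the placeholder strings listed under Cause before running the suite, a substring match is enough. The function below is a sketch: `is_placeholder_token` is a hypothetical helper, and the real validation in global setup may cover more patterns than the four named here.

```shell
# Hypothetical helper: succeed (exit 0) if the token contains one of the
# placeholder strings listed above, fail (exit 1) otherwise.
is_placeholder_token() {
  case "$1" in
    *test-emergency-token*|*your_64_character*|*replace_this*|*0000000000000000*)
      return 0 ;;
  esac
  return 1
}
```

Usage: `is_placeholder_token "$(grep CHARON_EMERGENCY_TOKEN .env | cut -d= -f2)" && echo "Replace the placeholder token"`.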
Debug Mode
Run tests with full debugging for deeper investigation:
With Playwright Inspector
npx playwright test --debug
Interactive UI for stepping through tests.
With Full Traces
npx playwright test --trace=on
Capture execution traces for each test.
View Trace After Test
npx playwright show-trace test-results/traces/*.zip
Opens trace viewer in browser.
With Enhanced Logging
DEBUG=charon:*,charon-test:* PLAYWRIGHT_DEBUG=1 npx playwright test --project=chromium
Enables all debug output.
Performance Issues
Tests Running Slowly
Symptoms: Tests take > 5 minutes for full suite.
Solutions:
- Use sharding (splits the suite into parts; this command runs only the first of four shards):
    npx playwright test --shard=1/4 --project=chromium
- Run specific test files:
    npx playwright test tests/manual-dns-provider.spec.ts
- Skip slow tests during development:
    npx playwright test --grep-invert "@slow"
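Because `--shard=1/4` covers only a quarter of the suite, full coverage needs one invocation per shard, typically one per CI job. The sketch below only prints the commands; `print_shard_commands` is a hypothetical helper, and whether several shards can safely run at the same time against a single backend is not something this guide states.

```shell
# Hypothetical helper: print one Playwright command per shard.
# Usage: print_shard_commands TOTAL [PROJECT]
print_shard_commands() {
  local total="$1" project="${2:-chromium}" i=1
  while [ "$i" -le "$total" ]; do
    echo "npx playwright test --shard=${i}/${total} --project=${project}"
    i=$(( i + 1 ))
  done
}
```

Usage: `print_shard_commands 4` lists the four commands to distribute across jobs or terminals.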
Feature Flag Toggle Tests Timing Out
Symptoms:
- Tests in tests/settings/system-settings.spec.ts fail with timeout errors
- Error messages mention feature flag toggles (Cerberus, CrowdSec, Uptime, Persist)
Cause:
- Backend N+1 query pattern causing 300-600ms latency in CI
- Hard-coded waits insufficient for slower CI environments
Solution (Fixed in v2.x):
- Backend now uses batch query pattern (3-6x faster: 600ms → 200ms P99)
- Tests use condition-based polling with waitForFeatureFlagPropagation()
- Retry logic with exponential backoff handles transient failures
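The retry logic itself lives in the Playwright test helpers and is not reproduced here. For manual probing from a terminal, the same pattern (retry on failure, doubling the delay each time) can be sketched in shell. `retry_with_backoff` is a hypothetical name, and the attempt count and starting delay are illustrative values, not the ones the tests use.

```shell
# Hypothetical helper: run a command up to MAX_ATTEMPTS times, doubling
# the delay (in seconds) after each failure.
# Usage: retry_with_backoff MAX_ATTEMPTS INITIAL_DELAY command [args...]
retry_with_backoff() {
  local max_attempts="$1" delay="$2" attempt=1
  shift 2
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "Giving up after ${attempt} attempts: $*" >&2
      return 1
    fi
    sleep "$delay"
    delay=$(( delay * 2 ))
    attempt=$(( attempt + 1 ))
  done
}
```

Usage: `retry_with_backoff 5 1 curl -f http://localhost:8080/api/v1/health` waits 1, 2, 4, and 8 seconds between the five attempts.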
If you still experience issues:
- Check backend latency: docker logs charon 2>&1 | grep -F "[METRICS]"
- Verify batch query is being used (should see WHERE key IN (...) in logs)
- Ensure you're running the latest version with the optimization
📖 See Also: Feature Flags Performance Documentation
Container Startup Slow
Symptoms: Health check timeouts, tests fail before running.
Solutions:
- Increase health check timeout:
    timeout 120 bash -c 'until curl -f http://localhost:8080/api/v1/health; do sleep 2; done'
- Pre-pull Docker image:
    docker pull wikid82/charon:latest
- Check Docker resource limits:
    docker stats charon  # Ensure adequate CPU/memory
Getting Help
If you're still stuck after trying these solutions:
- Check known issues:
    - Review E2E Triage Report
    - Search GitHub Issues
- Collect diagnostic info:
    # Environment
    echo "OS: $(uname -a)"
    echo "Docker: $(docker --version)"
    echo "Node: $(node --version)"
    # Configuration
    echo "Base URL: ${PLAYWRIGHT_BASE_URL:-http://localhost:8080}"
    echo "Token set: $([ -n "$CHARON_EMERGENCY_TOKEN" ] && echo "Yes" || echo "No")"
    # Logs (2>&1 also captures what the container wrote to stderr)
    docker logs charon > charon-logs.txt 2>&1
    npx playwright test --project=chromium > test-output.txt 2>&1
- Open GitHub issue:
    - Include diagnostic info above
    - Attach charon-logs.txt and test-output.txt
    - Describe steps to reproduce
    - Tag with testing and e2e labels
- Ask in community:
    - GitHub Discussions
    - Include relevant error messages (mask any secrets!)
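Before attaching logs or pasting error output, the emergency token can be redacted with a small filter. This is a sketch: `mask_secrets` is a hypothetical helper that covers only the two forms in which the token appears in this guide (the `CHARON_EMERGENCY_TOKEN=` assignment and the `X-Emergency-Token` header), so review the result for other secrets before sharing.

```shell
# Hypothetical helper: read text on stdin and redact the emergency token
# where it appears as an env assignment or as a request header value.
mask_secrets() {
  sed -E \
    -e 's/(CHARON_EMERGENCY_TOKEN=)[^[:space:]]+/\1***REDACTED***/g' \
    -e 's/(X-Emergency-Token: *)[^[:space:]"]+/\1***REDACTED***/g'
}
```

Usage: `mask_secrets < charon-logs.txt > charon-logs.masked.txt`, then attach the masked file instead of the original.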
Related Documentation
- Getting Started Guide
- GitHub Setup Guide
- Feature Flags Performance Documentation
- E2E Triage Report
- Playwright Documentation
Last Updated: 2026-02-02