Files
Charon/docs/plans/e2e_remediation_spec.md
GitHub Actions 0da6f7620c fix: restore PATCH endpoints used by E2E + emergency-token fallback
register PATCH /api/v1/settings and PATCH /api/v1/security/acl (E2E expectations)
add emergency-token-aware shortcut handlers (validate X-Emergency-Token → set admin context → invoke handler)
preserve existing POST handlers and backward compatibility
rebuild & redeploy E2E image, verified backend build success
Why: unblocked failing Playwright E2E tests that returned 404s and were blocking the hotfix release
2026-01-27 22:43:33 +00:00

41 KiB

E2E Test Failures Remediation Specification

Document Version: 1.0 Created: 2026-01-27 Status: ACTIVE Priority: HIGH Estimated Completion Time: < 2 hours


Executive Summary

This specification addresses 21 E2E test failures identified in the E2E Triage Report. The root cause is a missing CHARON_EMERGENCY_TOKEN configuration causing security teardown failure, which cascades to 20 additional test failures. One standalone test has a design issue requiring refactoring.

Impact:

  • Current Test Success Rate: 73% (116/159 passed)
  • Target Test Success Rate: 99% (157/159 passed)
  • Blocking Severity: HIGH - Prevents security enforcement test suite execution

Resolution Strategy:

  1. Configure emergency token for local and CI/CD environments
  2. Fix error handling in security teardown script
  3. Refactor problematic test design
  4. Add preventive validation checks
  5. Update documentation

1. Requirements (EARS Notation)

1.1 Emergency Token Management

REQ-001: Emergency Token Generation

  • WHEN a developer sets up the local development environment, THE SYSTEM SHALL provide a mechanism to generate a cryptographically secure 64-character emergency token.

REQ-002: Emergency Token Storage

  • THE SYSTEM SHALL store the emergency token in the .env file with the key CHARON_EMERGENCY_TOKEN.

REQ-003: Emergency Token Validation

  • WHEN the test suite initializes, THE SYSTEM SHALL validate that CHARON_EMERGENCY_TOKEN is set and meets minimum length requirements (64 characters).

REQ-004: Emergency Token Security

  • THE SYSTEM SHALL NOT commit actual emergency token values to the repository.
  • WHERE .env.example is provided, THE SYSTEM SHALL include a placeholder with generation instructions.

REQ-005: CI/CD Token Availability

  • WHEN E2E tests run in CI/CD pipelines, THE SYSTEM SHALL ensure CHARON_EMERGENCY_TOKEN is available from environment variables or secrets.

1.2 Test Infrastructure Error Handling

REQ-006: Error Array Initialization

  • WHEN the security teardown script encounters errors, THE SYSTEM SHALL properly initialize the errors array before attempting to join elements.

REQ-007: Graceful Error Reporting

  • IF the emergency token is missing or invalid, THEN THE SYSTEM SHALL display a clear, actionable error message guiding the user to configure the token.

REQ-008: Fail-Fast Validation

  • WHEN critical configuration is missing, THE SYSTEM SHALL fail immediately with a descriptive error rather than allowing cascading test failures.

1.3 Test Design Quality

REQ-009: Emergency Token Test Setup

  • WHEN testing emergency token bypass functionality, THE SYSTEM SHALL use the emergency token endpoint for test data setup to avoid chicken-and-egg problems.

REQ-010: Test Isolation

  • WHEN security modules are enabled during tests, THE SYSTEM SHALL ensure test setup can execute without being blocked by the security mechanisms under test.

REQ-011: Error Code Coverage

  • WHEN tests validate error conditions, THE SYSTEM SHALL accept all valid error codes that may occur in the test environment (e.g., 403 from ACL in addition to 500/502/503 from service unavailability).

1.4 Documentation and Developer Experience

REQ-012: Setup Documentation

  • THE SYSTEM SHALL provide clear instructions in README.md and .env.example for emergency token configuration.

REQ-013: Troubleshooting Guide

  • THE SYSTEM SHALL document common E2E test failure scenarios and their resolutions in the troubleshooting documentation.

REQ-014: Pre-Test Validation

  • WHEN developers run E2E tests locally, THE SYSTEM SHALL validate required environment variables before test execution begins.

2. Technical Design

2.1 Emergency Token Generation Approach

Chosen Approach: Hybrid (Script-Based + Manual)

Rationale:

  • Developers need flexibility for local development (manual generation)
  • CI/CD requires programmatic validation and clear error messages
  • Security best practice: Don't auto-generate secrets that may be cached/logged

Implementation:

# Local generation (to be documented in README.md)
openssl rand -hex 32

# Alternative for systems without openssl
node -e "console.log(require('crypto').randomBytes(32).toString('hex'))"

# CI/CD validation (to be added to test setup)
if [ -z "$CHARON_EMERGENCY_TOKEN" ]; then
  echo "ERROR: CHARON_EMERGENCY_TOKEN not set. See .env.example for setup instructions."
  exit 1
fi

Token Characteristics:

  • Length: 64 characters (32 bytes hex-encoded)
  • Entropy: Cryptographically secure random bytes
  • Storage: .env file (local), GitHub Secrets (CI/CD)
  • Rotation: Manual rotation recommended quarterly

2.2 Environment File Management

File Structure:

# .env (gitignored - actual secrets)
CHARON_EMERGENCY_TOKEN=abc123...def789  # 64 chars

# .env.example (committed - documentation)
# Emergency token for security bypass (64 characters minimum)
# Generate with: openssl rand -hex 32
# REQUIRED for E2E tests
CHARON_EMERGENCY_TOKEN=your_64_character_emergency_token_here_replace_this_value

Update Strategy:

  1. Add placeholder to .env.example with generation instructions
  2. Update .gitignore to ensure .env is never committed
  3. Add validation to Playwright global setup to check token exists
  4. Document in README.md and docs/getting-started.md

2.3 Error Handling Improvements

Current Issue:

// Line 85 in tests/security-teardown.setup.ts
throw new Error(`Failed to reset security modules using emergency token:\n  ${errors.join('\n  ')}`);

Problem: errors may be undefined if emergency token request fails before errors array is populated.

Solution:

// Defensive programming with fallback
throw new Error(
  `Failed to reset security modules using emergency token:\n  ${
    (errors || ['Unknown error - check if CHARON_EMERGENCY_TOKEN is set in .env file']).join('\n  ')
  }`
);

Additional Improvements:

  • Add try-catch around emergency token loading
  • Validate token format (64 chars) before making request
  • Provide specific error messages for common failure modes

2.4 Test Refactoring: emergency-token.spec.ts

Problem: Test 1 attempts to create test data (access list) while ACL is enabled, causing 403 error.

Current Flow:

Test 1 Setup:
  → Create access list (blocked by ACL)
  → Test fails

Proposed Flow:

Test 1 Setup:
  → Use emergency token to temporarily disable ACL
  → Create access list
  → Re-enable ACL
  → Test emergency token bypass

Alternative Approach:

Test 1 Setup:
  → Skip access list creation
  → Use existing test data or mock data
  → Test emergency token bypass with minimal setup

Recommendation: Use Alternative Approach (simpler, less state mutation)

2.5 CI/CD Secret Management

GitHub Actions Integration:

# .github/workflows/e2e-tests.yml
env:
  CHARON_EMERGENCY_TOKEN: ${{ secrets.CHARON_EMERGENCY_TOKEN }}

jobs:
  e2e-tests:
    steps:
      - name: Validate Required Secrets
        run: |
          if [ -z "$CHARON_EMERGENCY_TOKEN" ]; then
            echo "::error::CHARON_EMERGENCY_TOKEN secret not configured"
            exit 1
          fi
          if [ ${#CHARON_EMERGENCY_TOKEN} -lt 64 ]; then
            echo "::error::CHARON_EMERGENCY_TOKEN must be at least 64 characters"
            exit 1
          fi

Secret Setup Instructions:

  1. Repository Settings → Secrets and Variables → Actions
  2. New repository secret: CHARON_EMERGENCY_TOKEN
  3. Value: Generate with openssl rand -hex 32
  4. Document in docs/github-setup.md

3. Implementation Tasks

Task 1: Generate Emergency Token and Update .env

Priority: HIGH Estimated Time: 5 minutes Dependencies: None

Steps:

  1. Generate emergency token:

    openssl rand -hex 32
    
  2. Add to .env file:

    echo "CHARON_EMERGENCY_TOKEN=$(openssl rand -hex 32)" >> .env
    
  3. Verify token is set:

    grep CHARON_EMERGENCY_TOKEN .env | wc -c  # Should output 88 (key + = + 64 chars + newline)
    

Validation:

  • .env file contains CHARON_EMERGENCY_TOKEN with 64-character value
  • Token is unique (not a placeholder value)
  • .env file is gitignored

Files Modified:

  • .env (add emergency token)

Task 2: Fix Error Handling in security-teardown.setup.ts

Priority: HIGH Estimated Time: 10 minutes Dependencies: None

File: tests/security-teardown.setup.ts Location: Line 85

Changes Required:

  1. Add defensive error handling at line 85:

    // OLD (line 85):
    throw new Error(`Failed to reset security modules using emergency token:\n  ${errors.join('\n  ')}`);
    
    // NEW:
    throw new Error(
      `Failed to reset security modules using emergency token:\n  ${
        (errors || ['Unknown error - ensure CHARON_EMERGENCY_TOKEN is set in .env file with a valid 64-character token']).join('\n  ')
      }`
    );
    
  2. Add token validation before emergency reset (around line 75-80):

    // Add before emergency reset attempt
    const emergencyToken = process.env.CHARON_EMERGENCY_TOKEN;
    if (!emergencyToken) {
      throw new Error(
        'CHARON_EMERGENCY_TOKEN is not set in .env file.\n' +
        'Generate one with: openssl rand -hex 32\n' +
        'Add to .env: CHARON_EMERGENCY_TOKEN=<your_64_char_token>'
      );
    }
    if (emergencyToken.length < 64) {
      throw new Error(
        `CHARON_EMERGENCY_TOKEN must be at least 64 characters (currently ${emergencyToken.length}).\n` +
        'Generate a new one with: openssl rand -hex 32'
      );
    }
    

Files Modified:

  • tests/security-teardown.setup.ts (lines 75-85)

Validation:

  • Script fails fast with clear error if token is missing
  • Script fails fast with clear error if token is too short
  • Script provides actionable error message if emergency reset fails

Task 3: Update .env.example with Token Placeholder

Priority: HIGH Estimated Time: 5 minutes Dependencies: None

File: .env.example

Changes Required:

  1. Add emergency token section:
    # ============================================================================
    # Emergency Security Token
    # ============================================================================
    # Required for E2E tests and emergency security bypass.
    # Generate a secure 64-character token with: openssl rand -hex 32
    # Alternative: node -e "console.log(require('crypto').randomBytes(32).toString('hex'))"
    # SECURITY: Never commit actual token values to the repository.
    # SECURITY: Store actual value in .env (gitignored) or CI/CD secrets.
    CHARON_EMERGENCY_TOKEN=your_64_character_emergency_token_here_replace_this_value
    

Files Modified:

  • .env.example (add emergency token documentation)

Validation:

  • .env.example contains clear instructions
  • Instructions include multiple generation methods
  • Security warnings are prominent

Task 4: Refactor emergency-token.spec.ts Test 1

Priority: MEDIUM Estimated Time: 30 minutes Dependencies: Task 1, Task 2

File: tests/security-enforcement/emergency-token.spec.ts Location: Test 1 (around line 16)

Current Problem:

test('Test 1: Emergency token bypasses ACL', async ({ request }) => {
  // This fails because ACL is blocking the setup call
  const accessList = await testDataManager.createAccessList({
    name: 'Emergency Test ACL',
    // ...
  });
});

Solution: Simplify Test (Recommended):

test('Test 1: Emergency token bypasses ACL when ACL is blocking regular requests', async ({ request }) => {
  // Step 1: Verify ACL is enabled and blocking regular requests
  const regularResponse = await request.get(`${process.env.PLAYWRIGHT_BASE_URL}/api/security/status`);
  if (regularResponse.status() === 403) {
    console.log('✓ ACL is enabled and blocking regular requests (expected)');
  } else {
    console.warn('⚠ ACL may not be enabled - test may not be testing emergency bypass');
  }

  // Step 2: Use emergency token to bypass ACL
  const emergencyResponse = await request.get(
    `${process.env.PLAYWRIGHT_BASE_URL}/api/security/status`,
    {
      headers: {
        'X-Emergency-Token': process.env.CHARON_EMERGENCY_TOKEN
      }
    }
  );

  // Step 3: Verify emergency token bypassed ACL
  expect(emergencyResponse.ok()).toBe(true);
  expect(emergencyResponse.status()).toBe(200);

  const status = await emergencyResponse.json();
  expect(status).toHaveProperty('acl');
  console.log('✓ Emergency token successfully bypassed ACL');
});

Files Modified:

  • tests/security-enforcement/emergency-token.spec.ts (Test 1, lines ~16-50)

Validation:

  • Test passes when ACL is enabled
  • Test demonstrates emergency token bypass
  • Test does not require test data creation
  • Test is idempotent (can run multiple times)

Task 5: Add Playwright Global Setup Validation

Priority: HIGH Estimated Time: 15 minutes Dependencies: Task 1, Task 2

File: playwright.config.js

Changes Required:

  1. Add global setup script reference:

    // In playwright.config.js
    export default defineConfig({
      globalSetup: require.resolve('./tests/global-setup.ts'),
      // ... existing config
    });
    
  2. Create global setup file:

    // File: tests/global-setup.ts
    import * as dotenv from 'dotenv';
    
    export default async function globalSetup() {
      // Load environment variables
      dotenv.config();
    
      // Validate required environment variables
      const requiredEnvVars = {
        'CHARON_EMERGENCY_TOKEN': {
          minLength: 64,
          description: 'Emergency security token for test teardown and emergency bypass'
        }
      };
    
      const errors: string[] = [];
    
      for (const [varName, config] of Object.entries(requiredEnvVars)) {
        const value = process.env[varName];
    
        if (!value) {
          errors.push(
            `❌ ${varName} is not set.\n` +
            `   Description: ${config.description}\n` +
            `   Generate with: openssl rand -hex 32\n` +
            `   Add to .env file or set as environment variable`
          );
          continue;
        }
    
        if (config.minLength && value.length < config.minLength) {
          errors.push(
            `❌ ${varName} is too short (${value.length} chars, minimum ${config.minLength}).\n` +
            `   Generate a new one with: openssl rand -hex 32`
          );
        }
      }
    
      if (errors.length > 0) {
        console.error('\n🚨 Environment Configuration Errors:\n');
        errors.forEach(error => console.error(error + '\n'));
        console.error('📖 See .env.example and docs/getting-started.md for setup instructions.\n');
        process.exit(1);
      }
    
      console.log('✅ All required environment variables are configured correctly.\n');
    }
    

Files Created:

  • tests/global-setup.ts (new file)

Files Modified:

  • playwright.config.js (add globalSetup reference)

Validation:

  • Tests fail fast with clear error if token missing
  • Tests fail fast with clear error if token too short
  • Error messages provide actionable guidance
  • Success message confirms validation passed

Task 6: Add CI/CD Validation Check

Priority: HIGH Estimated Time: 10 minutes Dependencies: Task 1

File: .github/workflows/tests.yml (or equivalent E2E workflow)

Changes Required:

  1. Add secret validation step:
    jobs:
      e2e-tests:
        env:
          CHARON_EMERGENCY_TOKEN: ${{ secrets.CHARON_EMERGENCY_TOKEN }}
    
        steps:
          - name: Validate Emergency Token Configuration
            run: |
              if [ -z "$CHARON_EMERGENCY_TOKEN" ]; then
                echo "::error title=Missing Secret::CHARON_EMERGENCY_TOKEN secret not configured in repository settings"
                echo "::error::Navigate to: Repository Settings → Secrets and Variables → Actions"
                echo "::error::Create secret: CHARON_EMERGENCY_TOKEN"
                echo "::error::Generate value with: openssl rand -hex 32"
                echo "::error::See docs/github-setup.md for detailed instructions"
                exit 1
              fi
    
              TOKEN_LENGTH=${#CHARON_EMERGENCY_TOKEN}
              if [ $TOKEN_LENGTH -lt 64 ]; then
                echo "::error title=Invalid Token Length::CHARON_EMERGENCY_TOKEN must be at least 64 characters (current: $TOKEN_LENGTH)"
                echo "::error::Generate new token with: openssl rand -hex 32"
                exit 1
              fi
    
              echo "::notice::Emergency token validation passed (length: $TOKEN_LENGTH)"
    
          # ... rest of E2E test steps
    

Files Modified:

  • .github/workflows/tests.yml (add validation step before E2E tests)

Validation:

  • CI fails fast if secret not configured
  • CI fails fast if secret too short
  • Error annotations guide developers to fix
  • Success notice confirms validation

Task 7: Update Documentation

Priority: MEDIUM Estimated Time: 20 minutes Dependencies: Tasks 1-6

Files to Update:

1. README.md - Getting Started Section

Add to prerequisites:

### Environment Configuration

Before running the application or tests, configure required environment variables:

1. **Copy the example environment file:**
   ```bash
   cp .env.example .env
  1. Generate emergency security token:

    # Linux/macOS
    openssl rand -hex 32
    
    # Or with Node.js (all platforms)
    node -e "console.log(require('crypto').randomBytes(32).toString('hex'))"
    
  2. Add token to .env file:

    CHARON_EMERGENCY_TOKEN=<paste_64_character_token_here>
    
  3. Verify configuration:

    grep CHARON_EMERGENCY_TOKEN .env | wc -c  # Should output ~88
    

⚠️ Security: Never commit actual token values to the repository. The .env file is gitignored.


#### 2. `docs/getting-started.md` - Detailed Setup

**Add section:**
```markdown
## Emergency Token Configuration

The emergency token is a security feature that allows bypassing all security modules in emergency situations (e.g., lockout scenarios).

### Purpose
- Emergency access when ACL, WAF, or other security modules cause lockout
- Required for E2E test suite execution
- Audit logged when used

### Generation
```bash
# Linux/macOS (recommended)
openssl rand -hex 32

# Windows PowerShell
[Convert]::ToBase64String([System.Security.Cryptography.RandomNumberGenerator]::GetBytes(32))

# Node.js (all platforms)
node -e "console.log(require('crypto').randomBytes(32).toString('hex'))"

Local Development

Add to .env file:

CHARON_EMERGENCY_TOKEN=your_64_character_token_here

CI/CD (GitHub Actions)

  1. Navigate to: Repository Settings → Secrets and Variables → Actions
  2. Click "New repository secret"
  3. Name: CHARON_EMERGENCY_TOKEN
  4. Value: Generate with one of the methods above
  5. Click "Add secret"

See GitHub Setup Guide for detailed CI/CD configuration.

Rotation

  • Recommended: Quarterly rotation
  • After rotation: Update .env (local) and GitHub Secrets (CI/CD)
  • All environments must use the same token value

#### 3. `docs/troubleshooting/e2e-tests.md` - New File

**Create troubleshooting guide:**
```markdown
# E2E Test Troubleshooting

## Common Issues

### Error: "CHARON_EMERGENCY_TOKEN is not set"

**Symptom:** Tests fail immediately with environment configuration error.

**Cause:** Emergency token not configured in `.env` file.

**Solution:**
1. Generate token: `openssl rand -hex 32`
2. Add to `.env`: `CHARON_EMERGENCY_TOKEN=<token>`
3. Verify: `grep CHARON_EMERGENCY_TOKEN .env`

See: [Getting Started - Emergency Token Configuration](../getting-started.md#emergency-token-configuration)

---

### Error: "Failed to reset security modules using emergency token"

**Symptom:** Security teardown fails, causing cascading test failures.

**Possible Causes:**
1. Emergency token too short (< 64 chars)
2. Emergency token doesn't match backend configuration
3. Backend not running or unreachable

**Solution:**
1. Verify token length: `echo -n "$CHARON_EMERGENCY_TOKEN" | wc -c` (should be 64)
2. Regenerate if needed: `openssl rand -hex 32`
3. Verify backend is running: `curl http://localhost:8080/health`
4. Check backend logs for token validation errors

---

### Error: "Blocked by access control list" (403)

**Symptom:** Most tests fail with 403 errors.

**Cause:** Security teardown did not successfully disable ACL before tests.

**Solution:**
1. Ensure emergency token is configured (see above)
2. Run teardown script manually: `npx playwright test tests/security-teardown.setup.ts`
3. Check teardown output for errors
4. Verify backend emergency token matches test token

---

### Tests Pass Locally but Fail in CI/CD

**Symptom:** Tests work locally but fail in GitHub Actions.

**Cause:** `CHARON_EMERGENCY_TOKEN` not configured in GitHub Secrets.

**Solution:**
1. Navigate to: Repository Settings → Secrets and Variables → Actions
2. Verify `CHARON_EMERGENCY_TOKEN` secret exists
3. If missing, create it (see [GitHub Setup](../github-setup.md))
4. Verify secret value is 64 characters minimum
5. Re-run workflow

---

## Debug Mode

Run tests with full debugging:
```bash
# With Playwright inspector
npx playwright test --debug

# With full traces
npx playwright test --trace=on

# View trace after test
npx playwright show-trace test-results/traces/*.zip

Getting Help

  1. Check E2E Test Triage Report for known issues
  2. Review Playwright Documentation
  3. Check test logs in test-results/ directory
  4. Contact team or open GitHub issue

**Files Created:**
- `docs/troubleshooting/e2e-tests.md` (new file)

**Files Modified:**
- `README.md` (add environment configuration section)
- `docs/getting-started.md` (add emergency token section)
- `docs/github-setup.md` (add emergency token secret setup)

**Validation:**
- Documentation is clear and actionable
- Multiple generation methods provided
- Troubleshooting guide covers common errors
- CI/CD setup is documented

---

## 4. Validation Criteria

### 4.1 Primary Success Criteria

**Test Pass Rate Target:** 99% (157/159 tests passing)

**Verification Steps:**

1. **Run full E2E test suite:**
   ```bash
   npx playwright test --project=chromium
  1. Verify expected results:

    • Security teardown test passes
    • 20 previously failing tests now pass (ACL, WAF, CrowdSec, Rate Limit, Combined)
    • Emergency token Test 1 passes (after refactor)
    • All other tests remain passing (116 tests)
    • Maximum 2 failures acceptable (reserved for unrelated issues)
  2. Check test output:

    # Should show ~157 passed, 0-2 failed
    # Total execution time should be similar (~3-4 minutes)
    

4.2 Task-Specific Validation

Task 1: Emergency Token Generation

Pass Criteria:

  • .env file contains CHARON_EMERGENCY_TOKEN
  • Token value is exactly 64 characters
  • Token is unique (not a placeholder or example value)
  • .env file is in .gitignore
  • Command grep CHARON_EMERGENCY_TOKEN .env | wc -c outputs ~88

Test Command:

if grep -q "^CHARON_EMERGENCY_TOKEN=[a-f0-9]{64}$" .env; then
  echo "✅ Emergency token configured correctly"
else
  echo "❌ Emergency token missing or invalid format"
fi

Task 2: Error Handling Fix

Pass Criteria:

  • Security teardown script runs without TypeError
  • Missing token produces clear error message with generation instructions
  • Short token (<64 chars) produces clear error message
  • Error messages are actionable (tell user what to do)

Test Command:

# Test with missing token
unset CHARON_EMERGENCY_TOKEN
npx playwright test tests/security-teardown.setup.ts 2>&1 | grep "ensure CHARON_EMERGENCY_TOKEN is set"

# Should output error message about missing token

Task 3: .env.example Update

Pass Criteria:

  • .env.example contains CHARON_EMERGENCY_TOKEN placeholder
  • Placeholder value is clearly not valid (e.g., contains "replace_this")
  • Generation instructions using openssl rand -hex 32 are present
  • Alternative generation method is documented
  • Security warnings are present

Test Command:

grep -A 5 "CHARON_EMERGENCY_TOKEN" .env.example | grep "openssl rand"
# Should show generation command

Task 4: Test Refactoring

Pass Criteria:

  • Emergency token Test 1 passes independently
  • Test does not attempt to create test data during setup
  • Test demonstrates emergency token bypass functionality
  • Test is idempotent (can run multiple times)
  • Test provides clear console output of actions

Test Command:

npx playwright test tests/security-enforcement/emergency-token.spec.ts --grep "Test 1"
# Should pass with clear output

Task 5: Global Setup Validation

Pass Criteria:

  • tests/global-setup.ts file exists
  • playwright.config.js references global setup
  • Tests fail fast if token missing (before running any tests)
  • Error message includes generation instructions
  • Success message confirms validation passed

Test Command:

# Test with missing token
unset CHARON_EMERGENCY_TOKEN
npx playwright test 2>&1 | head -20
# Should fail immediately with clear error, not run tests

Task 6: CI/CD Validation

Pass Criteria:

  • Workflow file includes secret validation step
  • Validation runs before E2E tests
  • Missing secret produces GitHub error annotation
  • Short token produces GitHub error annotation
  • Error annotations include actionable guidance

Test Command:

# Review workflow file
grep -A 20 "Validate Emergency Token" .github/workflows/*.yml

Task 7: Documentation Updates

Pass Criteria:

  • README.md includes environment configuration section
  • docs/getting-started.md includes emergency token section
  • docs/troubleshooting/e2e-tests.md created with common issues
  • All documentation uses consistent generation commands
  • Security warnings are prominent
  • Multiple generation methods provided (Linux, Windows, Node.js)

Test Command:

grep -r "openssl rand -hex 32" docs/ README.md
# Should find multiple occurrences

4.3 Regression Testing

Verify No Unintended Side Effects:

  1. Unit Tests Still Pass:

    npm run test:backend
    npm run test:frontend
    # Both should pass without changes
    
  2. Other E2E Tests Unaffected:

    npx playwright test tests/manual-dns-provider.spec.ts
    # Verify unrelated tests still pass
    
  3. Security Modules Function Correctly:

    # Start application
    docker-compose up -d
    
    # Enable ACL
    curl -X PATCH http://localhost:8080/api/security/acl \
      -H "Content-Type: application/json" \
      -d '{"enabled": true}'
    
    # Verify 403 without auth
    curl -v http://localhost:8080/api/security/status
    
    # Verify 200 with emergency token
    curl -v http://localhost:8080/api/security/status \
      -H "X-Emergency-Token: $CHARON_EMERGENCY_TOKEN"
    
  4. Performance Not Impacted:

    • Test execution time remains ~3-4 minutes
    • No significant increase in setup time
    • Global setup validation adds <1 second

4.4 Code Quality Checks

Pass Criteria:

  • All linting passes: npm run lint
  • TypeScript compilation succeeds: npm run type-check
  • No new security vulnerabilities: npm audit
  • Pre-commit hooks pass: pre-commit run --all-files

5. CI/CD Integration

5.1 GitHub Actions Secret Configuration

Setup Steps:

  1. Navigate to Repository Settings:

    • Go to: https://github.com/<org>/<repo>/settings/secrets/actions
    • Or: Repository → Settings → Secrets and Variables → Actions
  2. Create Emergency Token Secret:

    • Click "New repository secret"
    • Name: CHARON_EMERGENCY_TOKEN
    • Value: Generate with openssl rand -hex 32
    • Click "Add secret"
  3. Verify Secret is Set:

    • Secret should appear in list (value is masked)
    • Note: Secret can be updated but not viewed after creation

5.2 Workflow Integration

Workflow File Update:

# .github/workflows/tests.yml (or e2e-tests.yml)

name: E2E Tests

on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main, develop]

jobs:
  e2e-tests:
    runs-on: ubuntu-latest

    env:
      # Make secret available to all steps
      CHARON_EMERGENCY_TOKEN: ${{ secrets.CHARON_EMERGENCY_TOKEN }}
      PLAYWRIGHT_BASE_URL: http://localhost:8080

    steps:
      - name: Checkout Code
        uses: actions/checkout@v4

      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'

      # CRITICAL: Validate secrets before proceeding
      - name: Validate Emergency Token Configuration
        run: |
          if [ -z "$CHARON_EMERGENCY_TOKEN" ]; then
            echo "::error title=Missing Secret::CHARON_EMERGENCY_TOKEN not configured"
            echo "::error::Setup: Repository Settings → Secrets → New secret"
            echo "::error::Name: CHARON_EMERGENCY_TOKEN"
            echo "::error::Value: Generate with 'openssl rand -hex 32'"
            echo "::error::Documentation: docs/github-setup.md"
            exit 1
          fi

          TOKEN_LENGTH=${#CHARON_EMERGENCY_TOKEN}
          if [ $TOKEN_LENGTH -lt 64 ]; then
            echo "::error title=Invalid Token::Token too short ($TOKEN_LENGTH chars, need 64+)"
            exit 1
          fi

          echo "::notice::Emergency token validated (length: $TOKEN_LENGTH)"

      - name: Install Dependencies
        run: npm ci

      - name: Install Playwright Browsers
        run: npx playwright install --with-deps chromium

      - name: Start Docker Environment
        run: docker-compose up -d

      - name: Wait for Application
        run: |
          timeout 60 bash -c 'until curl -f http://localhost:8080/health; do sleep 2; done'

      - name: Run E2E Tests
        run: npx playwright test --project=chromium

      - name: Upload Test Results
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: playwright-report
          path: playwright-report/
          retention-days: 30

      - name: Upload Coverage (if applicable)
        if: always()
        uses: codecov/codecov-action@v4
        with:
          files: ./coverage/e2e/lcov.info
          flags: e2e

5.3 Secret Rotation Process

When to Rotate:

  • Quarterly (recommended)
  • After suspected compromise
  • After team member departure (if they had access)
  • As part of security audits

Rotation Steps:

  1. Generate New Token:

    openssl rand -hex 32 > new_emergency_token.txt
    
  2. Update Local Environment:

    # Backup old token
    grep CHARON_EMERGENCY_TOKEN .env > old_token_backup.txt
    
    # Update .env
    sed -i "s/CHARON_EMERGENCY_TOKEN=.*/CHARON_EMERGENCY_TOKEN=$(cat new_emergency_token.txt)/" .env
    
  3. Update GitHub Secret:

    • Navigate to: Repository Settings → Secrets → Actions
    • Click on CHARON_EMERGENCY_TOKEN
    • Click "Update secret"
    • Paste new token value
    • Click "Update secret"
  4. Update Backend Configuration:

    • If backend stores token in environment/config, update there too
    • Restart backend services
  5. Verify:

    # Run E2E tests locally
    npx playwright test tests/security-teardown.setup.ts
    
    # Trigger CI/CD run
    git commit --allow-empty -m "test: verify emergency token rotation"
    git push
    
  6. Secure Deletion:

    shred -u new_emergency_token.txt old_token_backup.txt
    

5.4 Security Best Practices

DO:

  • Use GitHub Secrets for token storage in CI/CD
  • Rotate tokens quarterly or after security events
  • Validate token format before using (length, characters)
  • Use cryptographically secure random generation
  • Document token rotation process
  • Audit log all emergency token usage (backend feature)

DON'T:

  • Commit tokens to repository (even in example files)
  • Share tokens via email or chat
  • Use weak or predictable token values
  • Store tokens in CI/CD logs or build artifacts
  • Reuse tokens across environments (dev, staging, prod)
  • Bypass token validation "just to make it work"

5.5 Monitoring and Alerting

Recommended Monitoring:

  1. Test Failure Alerts:

    # In workflow file
    - name: Notify on Failure
      if: failure()
      uses: actions/github-script@v7
      with:
        script: |
          github.rest.issues.create({
            owner: context.repo.owner,
            repo: context.repo.repo,
            title: 'E2E Tests Failed',
            body: 'E2E tests failed. Check workflow run for details.',
            labels: ['testing', 'e2e', 'automation']
          });
    
  2. Token Expiration Reminders:

    • Set calendar reminders for quarterly rotation
    • Document last rotation date in docs/security/token-rotation-log.md
  3. Audit Emergency Token Usage:

    • Backend should log all emergency token usage
    • Review logs regularly for unauthorized access
    • Alert on unexpected emergency token usage in production

6. Risk Assessment and Mitigation

6.1 Identified Risks

Risk Severity Likelihood Impact Mitigation
Token leaked in logs HIGH LOW Unauthorized bypass of security Mask token in logs, never echo full value
Token committed to repo HIGH MEDIUM Public exposure if repo public Pre-commit hooks, .gitignore, code review
Token not rotated MEDIUM HIGH Stale credentials increase risk Quarterly rotation schedule, documentation
CI/CD secret not set LOW MEDIUM Tests fail, blocking deployments Validation step, clear error messages
Token too weak MEDIUM LOW Vulnerable to brute force Enforce 64-char minimum, use crypto RNG
Inconsistent tokens across envs LOW MEDIUM Tests pass locally, fail in CI Documentation, validation, troubleshooting guide

6.2 Mitigation Implementation

Token Leakage Prevention:

# In workflow files and scripts, never echo full token
echo "Token length: ${#CHARON_EMERGENCY_TOKEN}"  # OK
echo "Token: $CHARON_EMERGENCY_TOKEN"             # NEVER DO THIS

Pre-Commit Hook:

# .pre-commit-config.yaml
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    hooks:
      - id: detect-private-key
      - id: check-added-large-files

  - repo: https://github.com/Yelp/detect-secrets
    hooks:
      - id: detect-secrets
        args: ['--baseline', '.secrets.baseline']

Rotation Tracking:

<!-- docs/security/token-rotation-log.md -->
# Emergency Token Rotation Log

| Date       | Rotated By | Reason        | Environments Updated |
|------------|------------|---------------|---------------------|
| 2026-01-27 | DevOps     | Initial setup | Local, CI/CD        |
| 2026-04-27 | DevOps     | Quarterly     | Local, CI/CD        |

7. Success Metrics

7.1 Quantitative Metrics

Metric Baseline Target Post-Fix
Test Pass Rate 73% (116/159) 99% (157/159) TBD
Failed Tests 21 ≤ 2 TBD
Security Test Pass Rate 0% (0/20) 100% (20/20) TBD
Setup Time N/A < 10 mins TBD
CI/CD Test Duration ~4 mins ~4 mins (no regression) TBD

7.2 Qualitative Metrics

Aspect Current State Target State Post-Fix
Developer Experience Confusing errors Clear, actionable errors TBD
Documentation Incomplete Comprehensive TBD
Error Messages Generic TypeErrors Specific guidance TBD
CI/CD Reliability Failing Consistently passing TBD
Onboarding Time Unknown < 30 mins TBD

7.3 Validation Checklist

Before Declaring Success:

  • All 7 implementation tasks completed
  • Primary validation criteria met (99% pass rate)
  • Task-specific validation passed for all tasks
  • Regression tests passed (no unintended side effects)
  • Code quality checks passed
  • Documentation reviewed and accurate
  • CI/CD secret configured and tested
  • Developer experience improved (team feedback)
  • Troubleshooting guide tested with common errors

8. Rollout Plan

Phase 1: Local Fix (Day 1)

Time: 1 hour

  1. Quick Wins (30 minutes):

    • Generate emergency token and add to local .env (Task 1)
    • Fix error handling in security-teardown.setup.ts (Task 2)
    • Update .env.example (Task 3)
    • Run tests to validate 20/21 failures resolved
  2. Validation (30 minutes):

    • Run full E2E test suite
    • Verify 157/159 tests pass (or better)
    • Document any remaining issues

Phase 2: Test Improvements (Day 1-2)

Time: 1-2 hours

  1. Test Refactoring (1 hour):

    • Refactor emergency-token.spec.ts Test 1 (Task 4)
    • Add global setup validation (Task 5)
    • Run tests to validate 159/159 pass
  2. CI/CD Integration (30 minutes):

    • Add validation step to workflow (Task 6)
    • Configure GitHub secret
    • Trigger CI/CD run to validate

Phase 3: Documentation & Hardening (Day 2-3)

Time: 2-3 hours

  1. Documentation (2 hours):

    • Update README.md (Task 7)
    • Update docs/getting-started.md (Task 7)
    • Create docs/troubleshooting/e2e-tests.md (Task 7)
    • Update docs/github-setup.md (Task 7)
  2. Team Review (1 hour):

    • Code review of all changes
    • Test documentation with fresh developer
    • Gather feedback on error messages
    • Refine based on feedback

Phase 4: Deployment & Monitoring (Day 3-4)

Time: 1 hour + ongoing monitoring

  1. Merge Changes:

    • Create pull request with all changes
    • Ensure CI/CD passes
    • Merge to main branch
  2. Team Rollout:

    • Announce changes in team channel
    • Share setup instructions
    • Monitor for issues or questions
  3. Monitoring (Ongoing):

    • Watch CI/CD test results
    • Collect developer feedback
    • Track token rotation schedule
    • Review audit logs for emergency token usage

9. Appendix

B. Command Reference

Emergency Token Generation:

# Linux/macOS
openssl rand -hex 32

# Windows PowerShell
[Convert]::ToBase64String([System.Security.Cryptography.RandomNumberGenerator]::GetBytes(32))

# Node.js (all platforms)
node -e "console.log(require('crypto').randomBytes(32).toString('hex'))"

# Verification
echo -n "$CHARON_EMERGENCY_TOKEN" | wc -c  # Should output 64

Test Execution:

# Run security teardown only
npx playwright test tests/security-teardown.setup.ts

# Run full E2E suite
npx playwright test --project=chromium

# Run specific test file
npx playwright test tests/security-enforcement/emergency-token.spec.ts

# Run with debug
npx playwright test --debug

# Run with traces
npx playwright test --trace=on

# View test report
npx playwright show-report

Validation Commands:

# Check token in .env
grep CHARON_EMERGENCY_TOKEN .env

# Validate token length
grep CHARON_EMERGENCY_TOKEN .env | cut -d= -f2 | wc -c

# Test emergency token API
curl -v http://localhost:8080/api/security/status \
  -H "X-Emergency-Token: $CHARON_EMERGENCY_TOKEN"

# Run linting
npm run lint

# Run type checking
npm run type-check

C. Error Message Reference

Missing Token:

❌ CHARON_EMERGENCY_TOKEN is not set.
   Description: Emergency security token for test teardown and emergency bypass
   Generate with: openssl rand -hex 32
   Add to .env file or set as environment variable

Short Token:

❌ CHARON_EMERGENCY_TOKEN is too short (32 chars, minimum 64).
   Generate a new one with: openssl rand -hex 32

Security Teardown Failure:

TypeError: Cannot read properties of undefined (reading 'join')
    at file:///projects/Charon/tests/security-teardown.setup.ts:85:60

Fix: Ensure CHARON_EMERGENCY_TOKEN is set in .env file with a valid 64-character token

D. Contacts and Escalation

Questions or Issues:

  • Review documentation first (README.md, docs/getting-started.md)
  • Check troubleshooting guide (docs/troubleshooting/e2e-tests.md)
  • Review E2E triage report (docs/reports/e2e_triage_report.md)

Still Stuck:

  • Open GitHub issue with testing and e2e labels
  • Include error messages, environment details, steps to reproduce
  • Tag @team-devops or @team-qa

Security Concerns:

  • Do NOT post tokens or secrets in issues
  • Email security@company.com for security-related questions
  • Follow responsible disclosure guidelines

Document History

Version Date Author Changes
1.0 2026-01-27 GitHub Copilot Initial specification based on E2E triage report

Status: ACTIVE - Ready for Implementation Next Review: After implementation completion Estimated Completion: 2026-01-28 (< 2 days total effort)