# WAF Integration Workflow Fix: wget-style curl Syntax Migration

**Plan ID**: WAF-2026-001
**Status**: 📋 PENDING
**Priority**: High
**Created**: 2026-01-25
**Scope**: Fix integration test scripts using incorrect wget-style curl syntax

---

## Problem Summary

After migrating the Docker base image from Alpine to Debian Trixie (PR #550), the WAF integration workflow is failing. The root cause is **not** a missing `wget` command, but rather several integration test scripts using **wget-style options with curl** that don't work correctly.

### Root Cause

Multiple scripts use `curl -q -O-` which is **wget syntax, not curl syntax**:

| Syntax | Tool | Meaning |
|--------|------|---------|
| `-q` | **wget** | Quiet mode |
| `-q` | **curl** | **Not quiet mode** - `-q` (`--disable`) only skips the curl config file, and only when it is the first argument |
| `-O-` | **wget** | Output to stdout |
| `-O-` | **curl** | **Wrong** - `-O` means "save with remote filename" and takes no argument, so the trailing `-` does not select stdout |

The correct curl equivalents are:

| wget | curl | Notes |
|------|------|-------|
| `wget -q` | `curl -s` | Silent mode |
| `wget -O-` | `curl -s` | stdout is curl's default output |
| `wget -q -O- URL` | `curl -s URL` | Full equivalent |
| `wget -O filename` | `curl -o filename` | Note: lowercase `-o` in curl |

---

## Files Requiring Changes

### Priority 1: Integration Test Scripts (Blocking WAF Workflow)

| File | Line | Current Code | Issue |
|------|------|--------------|-------|
| [scripts/waf_integration.sh](../../scripts/waf_integration.sh#L205) | 205 | `curl -q -O- http://${BACKEND_CONTAINER}/get` | wget syntax |
| [scripts/cerberus_integration.sh](../../scripts/cerberus_integration.sh#L214) | 214 | `curl -q -O- http://${BACKEND_CONTAINER}/get` | wget syntax |
| [scripts/rate_limit_integration.sh](../../scripts/rate_limit_integration.sh#L190) | 190 | `curl -q -O- http://${BACKEND_CONTAINER}/get` | wget syntax |
| [scripts/crowdsec_startup_test.sh](../../scripts/crowdsec_startup_test.sh#L178) | 178 | `curl -q -O- http://127.0.0.1:8085/health` | wget syntax |

### Priority 2: Utility Scripts

| File | Line | Current Code | Issue |
|------|------|--------------|-------|
| [scripts/install-go-1.25.5.sh](../../scripts/install-go-1.25.5.sh#L18) | 18 | `curl -q -O "$TMPFILE" "URL"` | Wrong syntax - `-O` doesn't take an argument in curl |

---

## Detailed Fixes

### Fix 1: scripts/waf_integration.sh (Line 205)

**Current (broken):**

```bash
if docker exec ${CONTAINER_NAME} sh -c "curl -q -O- http://${BACKEND_CONTAINER}/get 2>/dev/null || curl -s http://${BACKEND_CONTAINER}/get" >/dev/null 2>&1; then
```

**Fixed:**

```bash
if docker exec ${CONTAINER_NAME} sh -c "curl -sf http://${BACKEND_CONTAINER}/get" >/dev/null 2>&1; then
```

**Notes:**

- `-s` = silent (no progress meter)
- `-f` = fail on HTTP errors (no response body is printed and curl returns a non-zero exit code)
- Removed the redundant fallback, since the corrected command works on its own

---

### Fix 2: scripts/cerberus_integration.sh (Line 214)

**Current (broken):**

```bash
if docker exec ${CONTAINER_NAME} sh -c "curl -q -O- http://${BACKEND_CONTAINER}/get 2>/dev/null || curl -s http://${BACKEND_CONTAINER}/get" >/dev/null 2>&1; then
```

**Fixed:**

```bash
if docker exec ${CONTAINER_NAME} sh -c "curl -sf http://${BACKEND_CONTAINER}/get" >/dev/null 2>&1; then
```

---

### Fix 3: scripts/rate_limit_integration.sh (Line 190)

**Current (broken):**

```bash
if docker exec ${CONTAINER_NAME} sh -c "curl -q -O- http://${BACKEND_CONTAINER}/get 2>/dev/null || curl -s http://${BACKEND_CONTAINER}/get" >/dev/null 2>&1; then
```

**Fixed:**

```bash
if docker exec ${CONTAINER_NAME} sh -c "curl -sf http://${BACKEND_CONTAINER}/get" >/dev/null 2>&1; then
```

---

### Fix 4: scripts/crowdsec_startup_test.sh (Line 178)

**Current (broken):**

```bash
LAPI_HEALTH=$(docker exec ${CONTAINER_NAME} curl -q -O- http://127.0.0.1:8085/health 2>/dev/null || echo "FAILED")
```

**Fixed:**

```bash
LAPI_HEALTH=$(docker exec ${CONTAINER_NAME} curl -sf http://127.0.0.1:8085/health 2>/dev/null || echo "FAILED")
```

---

### Fix 5: scripts/install-go-1.25.5.sh (Line 18)

**Current (broken):**

```bash
curl -q -O "$TMPFILE" "https://go.dev/dl/${TARFILE}"
```

**Fixed:**

```bash
curl -sSfL -o "$TMPFILE" "https://go.dev/dl/${TARFILE}"
```

**Notes:**

- `-s` = silent
- `-S` = show errors even in silent mode
- `-f` = fail on HTTP errors
- `-L` = follow redirects (important for go.dev downloads)
- `-o filename` = output to specified file (lowercase `-o`)

---

## Verification Commands

After applying fixes, verify each script works:

```bash
# Test WAF integration
./scripts/waf_integration.sh

# Test Cerberus integration
./scripts/cerberus_integration.sh

# Test Rate Limit integration
./scripts/rate_limit_integration.sh

# Test CrowdSec startup
./scripts/crowdsec_startup_test.sh

# Verify Go install script syntax
bash -n ./scripts/install-go-1.25.5.sh
```

---

## Behavior Differences: wget vs curl

When migrating from wget to curl, be aware of these differences:

| Behavior | wget | curl |
|----------|------|------|
| Output destination | File by default | stdout by default |
| Follow redirects | Yes by default | Requires `-L` flag |
| Retry on failure | Built-in retry | Requires `--retry N` |
| Progress display | Text progress bar | Progress meter (use `-s` to hide) |
| HTTP error handling | Non-zero exit on 404 | Requires `-f` for non-zero exit on HTTP errors |
| Quiet mode | `-q` | `-s` (silent) |
| Output to file | `-O filename` (uppercase) | `-o filename` (lowercase) |
| Save with remote name | Default behavior (no flag needed) | `-O` (uppercase, no arg) |

---

## Execution Checklist

- [ ] **Fix 1**: Update `scripts/waf_integration.sh` line 205
- [ ] **Fix 2**: Update `scripts/cerberus_integration.sh` line 214
- [ ] **Fix 3**: Update `scripts/rate_limit_integration.sh` line 190
- [ ] **Fix 4**: Update `scripts/crowdsec_startup_test.sh` line 178
- [ ] **Fix 5**: Update `scripts/install-go-1.25.5.sh` line 18
- [ ] **Verify**: Run each integration test locally
- [ ] **CI**: Confirm WAF integration workflow passes

---

## Notes

1. **Deprecated Scripts**: Several affected scripts are marked deprecated (will be removed in v2.0.0). However, they are still used by CI workflows, so fixes are required.
2. **Skill-Based Replacements**: The `.github/skills/scripts/` directory was checked and contains no wget usage; those scripts already use correct curl syntax.
3. **Docker Compose Files**: All health checks in docker-compose files already use correct curl syntax (`curl -f`, `curl -fsS`).
4. **Dockerfile**: The main Dockerfile correctly installs `curl` and uses correct curl syntax in the HEALTHCHECK instruction.

---

# Previous Plan (Archived)

The previous Git & Workflow Recovery Plan has been archived below.

---

# Git & Workflow Recovery Plan (ARCHIVED)

**Plan ID**: GIT-2026-001
**Status**: ✅ ARCHIVED
**Priority**: High
**Created**: 2026-01-25
**Scope**: Git recovery, Renovate fix, Workflow simplification

---

## Problem Summary

1. **Git State**: Feature branch `feature/beta-release` is in a broken rebase state
2. **Renovate**: Targeting feature branches creates orphaned PRs and merge conflicts
3. **Propagate Workflow**: Overly complex cascade (`main → development → nightly → feature/*`) causes confusion
4. **Nightly Branch**: Unnecessary intermediate branch adding complexity

---

## Phase 1: Git Recovery

### Step 1.1: Abort the Rebase

```bash
# Check current state
git status

# Abort the in-progress rebase
git rebase --abort

# Verify clean state
git status
```

### Step 1.2: Fetch Latest from Origin

```bash
# Fetch all branches
git fetch origin --prune

# Ensure we're on the feature branch
git checkout feature/beta-release
```

### Step 1.3: Merge Development into Feature Branch

**Use merge, NOT rebase** to preserve commit history and avoid force-push issues.

```bash
# Merge development into feature/beta-release
git merge origin/development --no-ff -m "Merge development into feature/beta-release"
```

### Step 1.4: Resolve Conflicts (if any)

Likely conflict files based on Renovate activity:

- `package.json` / `package-lock.json` (version bumps)
- `backend/go.mod` / `backend/go.sum` (Go dependency updates)
- `.github/workflows/*.yml` (action digest pins)

**Resolution strategy:**

```bash
# For package.json - accept development's versions, then run npm install
git checkout --theirs package.json package-lock.json
npm install
git add package.json package-lock.json

# For go.mod/go.sum - accept development's versions, then tidy
git checkout --theirs backend/go.mod backend/go.sum
cd backend && go mod tidy && cd ..
git add backend/go.mod backend/go.sum

# For workflow files - usually safe to accept development
git checkout --theirs .github/workflows/
# Stage the resolved workflow files so the merge can be committed
git add .github/workflows/

# Complete the merge
git commit
```

### Step 1.5: Push the Merged Branch

```bash
git push origin feature/beta-release
```

---

## Phase 2: Renovate Fix

### Problem

Current config in `.github/renovate.json`:

```json
"baseBranches": [
  "development",
  "feature/beta-release"
]
```

This causes:

- Duplicate PRs for the same dependency (one per branch)
- Orphaned branches like `renovate/feature/beta-release-*` when feature merges
- Constant merge conflicts between branches

### Solution

Only target `development`. Changes flow naturally via propagate workflow.

### Old Config (REMOVE)

```json
{
  "baseBranches": [
    "development",
    "feature/beta-release"
  ],
  ...
}
```

### New Config (REPLACE WITH)

```json
{
  "baseBranches": [
    "development"
  ],
  ...
}
```

### File to Edit

**File**: `.github/renovate.json`
**Line**: ~12-15

---

## Phase 3: Propagate Workflow Fix

### Problem

Current workflow in `.github/workflows/propagate-changes.yml`:

```yaml
on:
  push:
    branches:
      - main
      - development
      - nightly  # <-- Unnecessary
```

Cascade logic:

- `main` → `development` ✅ (Correct)
- `development` → `nightly` ❌ (Unnecessary)
- `nightly` → `feature/*` ❌ (Overly complex)

### Solution

Simplify to **only** `main → development` propagation.

### Old Trigger (REMOVE)

```yaml
on:
  push:
    branches:
      - main
      - development
      - nightly
```

### New Trigger (REPLACE WITH)

```yaml
on:
  push:
    branches:
      - main
```

### Old Script Logic (REMOVE)

```javascript
if (currentBranch === 'main') {
  // Main -> Development
  await createPR('main', 'development');
} else if (currentBranch === 'development') {
  // Development -> Nightly
  await createPR('development', 'nightly');
} else if (currentBranch === 'nightly') {
  // Nightly -> Feature branches
  const branches = await github.paginate(github.rest.repos.listBranches, {
    owner: context.repo.owner,
    repo: context.repo.repo,
  });
  const featureBranches = branches
    .map(b => b.name)
    .filter(name => name.startsWith('feature/'));

  core.info(`Found ${featureBranches.length} feature branches: ${featureBranches.join(', ')}`);

  for (const featureBranch of featureBranches) {
    await createPR('development', featureBranch);
  }
}
```

### New Script Logic (REPLACE WITH)

```javascript
if (currentBranch === 'main') {
  // Main -> Development (only propagation needed)
  await createPR('main', 'development');
}
```

### File to Edit

**File**: `.github/workflows/propagate-changes.yml`

---

## Phase 4: Cleanup

### Step 4.1: Delete Nightly Branch

```bash
# Delete remote nightly branch (if exists)
git push origin --delete nightly 2>/dev/null || echo "nightly branch does not exist"

# Delete local tracking branch
git branch -D nightly 2>/dev/null || true
```

### Step 4.2: Delete Orphaned Renovate Branches

```bash
# List all renovate branches targeting feature/beta-release
git fetch origin
git branch -r | grep 'renovate/feature/beta-release' | while read -r branch; do
  # Strip the "origin/" prefix to get the branch name on the remote
  remote_branch="${branch#origin/}"
  echo "Deleting: $remote_branch"
  git push origin --delete "$remote_branch"
done
```

### Step 4.3: Close Orphaned Renovate PRs

After branches are deleted, any associated PRs will be automatically closed by GitHub.

---

## Execution Checklist

- [ ] **Phase 1**: Git Recovery
  - [ ] 1.1 Abort rebase
  - [ ] 1.2 Fetch latest
  - [ ] 1.3 Merge development
  - [ ] 1.4 Resolve conflicts
  - [ ] 1.5 Push merged branch
- [ ] **Phase 2**: Renovate Fix
  - [ ] Edit `.github/renovate.json` - remove `feature/beta-release` from baseBranches
  - [ ] Commit and push
- [ ] **Phase 3**: Propagate Workflow Fix
  - [ ] Edit `.github/workflows/propagate-changes.yml` - simplify triggers and logic
  - [ ] Commit and push
- [ ] **Phase 4**: Cleanup
  - [ ] 4.1 Delete nightly branch
  - [ ] 4.2 Delete orphaned `renovate/feature/beta-release-*` branches
  - [ ] 4.3 Verify orphaned PRs are closed

---

## Verification

After all phases complete:

```bash
# Confirm no rebase in progress
git status
# Expected: "On branch feature/beta-release" with clean state

# Confirm nightly deleted
git branch -r | grep nightly
# Expected: no output

# Confirm orphaned renovate branches deleted
git branch -r | grep 'renovate/feature/beta-release'
# Expected: no output

# Confirm Renovate config only targets development
grep -A2 baseBranches .github/renovate.json
# Expected: only "development"
```

---

## Rollback Plan

If issues occur:

1. **Git Recovery Failed**:

   ```bash
   git fetch origin
   git checkout feature/beta-release
   git reset --hard origin/feature/beta-release
   ```

2. **Renovate Changes Broke Something**: Revert the commit to `.github/renovate.json`
3. **Propagate Workflow Issues**: Revert the commit to `.github/workflows/propagate-changes.yml`

---

## Archived Spec (Prior Implementation)

# Security Fix: Remove Hardcoded Encryption Keys from Docker Compose Files

**Plan ID**: SEC-2026-001
**Status**: ✅ IMPLEMENTED
**Priority**: Critical (Security)
**Created**: 2026-01-25
**Implemented By**: Management Agent

---

### Summary

Removed hardcoded encryption keys from Docker Compose test files and implemented ephemeral key generation in CI workflows.

### Changes Applied

| File | Change |
|------|--------|
| `.docker/compose/docker-compose.playwright.yml` | Replaced hardcoded key with `${CHARON_ENCRYPTION_KEY:?...}` |
| `.docker/compose/docker-compose.e2e.yml` | Replaced hardcoded key with `${CHARON_ENCRYPTION_KEY:?...}` |
| `.github/workflows/e2e-tests.yml` | Added ephemeral key generation step |
| `.env.test.example` | Added prominent documentation |

### Security Notes

- The old key `ucDWy5ScLubd3QwCHhQa2SY7wL2OF48p/c9nZhyW1mA=` exists in git history
- This key should **NEVER** be used in any production environment
- Each CI run now generates a unique ephemeral key

### Testing

```bash
# Verify compose fails without key
unset CHARON_ENCRYPTION_KEY
docker compose -f .docker/compose/docker-compose.playwright.yml config 2>&1
# Expected: "CHARON_ENCRYPTION_KEY is required"

# Verify compose succeeds with key
export CHARON_ENCRYPTION_KEY=$(openssl rand -base64 32)
docker compose -f .docker/compose/docker-compose.playwright.yml config
# Expected: Valid YAML output
```

### References

- **OWASP**: [A02:2021 – Cryptographic Failures](https://owasp.org/Top10/A02_2021-Cryptographic_Failures/)

---

# Playwright Security Test Helpers

**Plan ID**: E2E-SEC-001
**Status**: ✅ COMPLETED
**Priority**: Critical (Blocking 230/707 E2E test failures)
**Created**: 2026-01-25
**Completed**: 2026-01-25
**Scope**: Add security test helpers to prevent ACL deadlock in E2E tests

---

## Completion Notes

**Implementation Summary:**

- Created `tests/utils/security-helpers.ts` with full security state management utilities
- Functions implemented: `getSecurityStatus`, `setSecurityModuleEnabled`, `captureSecurityState`, `restoreSecurityState`, `withSecurityEnabled`, `disableAllSecurityModules`
- Pattern enables guaranteed cleanup via Playwright's `test.afterAll()` hook

**Documentation:**

- See [Security Test Helpers Guide](../testing/security-helpers.md) for usage examples

---

## Problem Summary

During E2E testing, if ACL is left enabled from a previous test run (e.g., due to test failure), it can create a **deadlock**:

1. ACL blocks API requests → returns 403 Forbidden
2. Global cleanup can't run → API blocked
3. Auth setup fails → tests skip
4. Manual intervention required to reset volumes

**Root Cause Analysis:**

- `security-dashboard.spec.ts` has tests that toggle ACL, WAF, and Rate Limiting
- The tests attempt to "toggle back", but if a test fails mid-execution, that cleanup doesn't run
- Playwright's `test.afterAll` hook runs cleanup even when a test fails
- The current tests don't use hooks or fixtures for security state management

## Solution Architecture

### API Endpoints (Backend Already Supports)

| Endpoint | Method | Purpose |
|----------|--------|---------|
| `/api/v1/security/status` | GET | Returns current state of all security modules |
| `/api/v1/settings` | POST | Toggle settings with `{ key: "security.acl.enabled", value: "true/false" }` |

### Settings Keys

| Key | Values | Description |
|-----|--------|-------------|
| `security.acl.enabled` | `"true"` / `"false"` | Toggle ACL enforcement |
| `security.waf.enabled` | `"true"` / `"false"` | Toggle WAF enforcement |
| `security.rate_limit.enabled` | `"true"` / `"false"` | Toggle Rate Limiting |
| `security.crowdsec.enabled` | `"true"` / `"false"` | Toggle CrowdSec |
| `feature.cerberus.enabled` | `"true"` / `"false"` | Master toggle for all security |

---

## Implementation Plan

### File 1: `tests/utils/security-helpers.ts` (CREATE)

```typescript
/**
 * Security Test Helpers - Safe ACL/WAF/Rate Limit toggle for E2E tests
 *
 * These helpers provide safe mechanisms to temporarily enable security features
 * during tests, with guaranteed cleanup even on test failure.
 *
 * Problem: If ACL is left enabled after a test failure, it blocks all API requests
 * causing subsequent tests to fail with 403 Forbidden (deadlock).
 *
 * Solution: Use Playwright's test.afterAll() with captured original state to
 * guarantee restoration regardless of test outcome.
 *
 * @example
 * ```typescript
 * import { withSecurityEnabled, getSecurityStatus } from './utils/security-helpers';
 *
 * test.describe('ACL Tests', () => {
 *   let cleanup: () => Promise<void>;
 *
 *   test.beforeAll(async ({ request }) => {
 *     cleanup = await withSecurityEnabled(request, { acl: true });
 *   });
 *
 *   test.afterAll(async () => {
 *     await cleanup();
 *   });
 *
 *   test('should enforce ACL', async ({ page }) => {
 *     // ACL is now enabled, test enforcement
 *   });
 * });
 * ```
 */

import { APIRequestContext } from '@playwright/test';

/**
 * Security module status from GET /api/v1/security/status
 */
export interface SecurityStatus {
  cerberus: { enabled: boolean };
  crowdsec: { mode: string; api_url: string; enabled: boolean };
  waf: { mode: string; enabled: boolean };
  rate_limit: { mode: string; enabled: boolean };
  acl: { mode: string; enabled: boolean };
}

/**
 * Options for enabling specific security modules
 */
export interface SecurityModuleOptions {
  /** Enable ACL enforcement */
  acl?: boolean;
  /** Enable WAF protection */
  waf?: boolean;
  /** Enable rate limiting */
  rateLimit?: boolean;
  /** Enable CrowdSec */
  crowdsec?: boolean;
  /** Enable master Cerberus toggle (required for other modules) */
  cerberus?: boolean;
}

/**
 * Captured state for restoration
 */
export interface CapturedSecurityState {
  acl: boolean;
  waf: boolean;
  rateLimit: boolean;
  crowdsec: boolean;
  cerberus: boolean;
}

/**
 * Mapping of module names to their settings keys
 */
const SECURITY_SETTINGS_KEYS: Record<keyof SecurityModuleOptions, string> = {
  acl: 'security.acl.enabled',
  waf: 'security.waf.enabled',
  rateLimit: 'security.rate_limit.enabled',
  crowdsec: 'security.crowdsec.enabled',
  cerberus: 'feature.cerberus.enabled',
};

/**
 * Get current security status from the API
 * @param request - Playwright APIRequestContext (authenticated)
 * @returns Current security status
 */
export async function getSecurityStatus(
  request: APIRequestContext
): Promise<SecurityStatus> {
  const response = await request.get('/api/v1/security/status');

  if (!response.ok()) {
    throw new Error(
      `Failed to get security status: ${response.status()} ${await response.text()}`
    );
  }

  return response.json();
}

/**
 * Set a specific security module's enabled state
 * @param request - Playwright APIRequestContext (authenticated)
 * @param module - Which module to toggle
 * @param enabled - Whether to enable or disable
 */
export async function setSecurityModuleEnabled(
  request: APIRequestContext,
  module: keyof SecurityModuleOptions,
  enabled: boolean
): Promise<void> {
  const key = SECURITY_SETTINGS_KEYS[module];
  const value = enabled ? 'true' : 'false';

  const response = await request.post('/api/v1/settings', {
    data: { key, value },
  });

  if (!response.ok()) {
    throw new Error(
      `Failed to set ${module} to ${enabled}: ${response.status()} ${await response.text()}`
    );
  }

  // Wait a brief moment for Caddy config reload
  await new Promise((resolve) => setTimeout(resolve, 500));
}

/**
 * Capture current security state for later restoration
 * @param request - Playwright APIRequestContext (authenticated)
 * @returns Captured state object
 */
export async function captureSecurityState(
  request: APIRequestContext
): Promise<CapturedSecurityState> {
  const status = await getSecurityStatus(request);

  return {
    acl: status.acl.enabled,
    waf: status.waf.enabled,
    rateLimit: status.rate_limit.enabled,
    crowdsec: status.crowdsec.enabled,
    cerberus: status.cerberus.enabled,
  };
}

/**
 * Restore security state to previously captured values
 * @param request - Playwright APIRequestContext (authenticated)
 * @param state - Previously captured state
 */
export async function restoreSecurityState(
  request: APIRequestContext,
  state: CapturedSecurityState
): Promise<void> {
  const currentStatus = await getSecurityStatus(request);

  // Restore in reverse dependency order (features before master toggle)
  const modules: (keyof SecurityModuleOptions)[] = ['acl', 'waf', 'rateLimit', 'crowdsec', 'cerberus'];

  for (const module of modules) {
    // The status payload uses snake_case keys, so map the camelCase module names explicitly
    const currentValue =
      module === 'rateLimit'
        ? currentStatus.rate_limit.enabled
        : module === 'crowdsec'
          ? currentStatus.crowdsec.enabled
          : currentStatus[module].enabled;

    if (currentValue !== state[module]) {
      await setSecurityModuleEnabled(request, module, state[module]);
    }
  }
}

/**
 * Enable security modules temporarily with guaranteed cleanup.
 *
 * Returns a cleanup function that MUST be called in test.afterAll().
 * The cleanup function restores the original state even if tests fail.
 *
 * @param request - Playwright APIRequestContext (authenticated)
 * @param options - Which modules to enable
 * @returns Cleanup function to restore original state
 *
 * @example
 * ```typescript
 * test.describe('ACL Tests', () => {
 *   let cleanup: () => Promise<void>;
 *
 *   test.beforeAll(async ({ request }) => {
 *     cleanup = await withSecurityEnabled(request, { acl: true, cerberus: true });
 *   });
 *
 *   test.afterAll(async () => {
 *     await cleanup();
 *   });
 * });
 * ```
 */
export async function withSecurityEnabled(
  request: APIRequestContext,
  options: SecurityModuleOptions
): Promise<() => Promise<void>> {
  // Capture original state BEFORE making any changes
  const originalState = await captureSecurityState(request);

  // Enable Cerberus first (master toggle) if any security module is requested
  const needsCerberus = options.acl || options.waf || options.rateLimit || options.crowdsec;
  if ((needsCerberus || options.cerberus) && !originalState.cerberus) {
    await setSecurityModuleEnabled(request, 'cerberus', true);
  }

  // Enable requested modules
  if (options.acl) {
    await setSecurityModuleEnabled(request, 'acl', true);
  }
  if (options.waf) {
    await setSecurityModuleEnabled(request, 'waf', true);
  }
  if (options.rateLimit) {
    await setSecurityModuleEnabled(request, 'rateLimit', true);
  }
  if (options.crowdsec) {
    await setSecurityModuleEnabled(request, 'crowdsec', true);
  }

  // Return cleanup function that restores original state
  return async () => {
    try {
      await restoreSecurityState(request, originalState);
    } catch (error) {
      // Log error but don't throw - cleanup should not fail tests
      console.error('Failed to restore security state:', error);
      // Try emergency disable of ACL to prevent deadlock
      try {
        await setSecurityModuleEnabled(request, 'acl', false);
      } catch {
        console.error('Emergency ACL disable also failed - manual intervention may be required');
      }
    }
  };
}

/**
 * Disable all security modules (emergency reset).
 * Use this in global-setup.ts or when tests need a clean slate.
 *
 * @param request - Playwright APIRequestContext (authenticated)
 */
export async function disableAllSecurityModules(
  request: APIRequestContext
): Promise<void> {
  const modules: (keyof SecurityModuleOptions)[] = ['acl', 'waf', 'rateLimit', 'crowdsec'];

  for (const module of modules) {
    try {
      await setSecurityModuleEnabled(request, module, false);
    } catch (error) {
      console.warn(`Failed to disable ${module}:`, error);
    }
  }
}

/**
 * Check if ACL is currently blocking requests.
 * Useful for debugging test failures.
 *
 * @param request - Playwright APIRequestContext
 * @returns True if ACL is enabled and blocking
 */
export async function isAclBlocking(request: APIRequestContext): Promise<boolean> {
  try {
    const status = await getSecurityStatus(request);
    return status.acl.enabled && status.cerberus.enabled;
  } catch {
    // If we can't get status, ACL might be blocking
    return true;
  }
}
```

---

### File 2: `tests/security/security-dashboard.spec.ts` (MODIFY)

**Changes Required:**

1. Import the new security helpers
2. Add `test.beforeAll` to capture initial state
3. Add `test.afterAll` to guarantee cleanup
4. Remove redundant "toggle back" steps in individual tests
5. Group toggle tests in a separate describe block with isolated cleanup

**Exact Changes:**

```typescript
// ADD after existing imports (around line 12)
import {
  withSecurityEnabled,
  captureSecurityState,
  restoreSecurityState,
  CapturedSecurityState,
} from '../utils/security-helpers';
```

```typescript
// REPLACE the entire 'Module Toggle Actions' describe block (lines ~80-180)
// with this safer implementation:

test.describe('Module Toggle Actions', () => {
  // Capture state ONCE for this describe block
  let originalState: CapturedSecurityState;

  test.beforeAll(async ({ request }) => {
    originalState = await captureSecurityState(request);
  });

  // The hook requests its own `request` fixture rather than reusing the
  // instance handed to beforeAll, which may no longer be usable here.
  test.afterAll(async ({ request }) => {
    // CRITICAL: Restore original state even if tests fail
    if (originalState) {
      await restoreSecurityState(request, originalState);
    }
  });

  test('should toggle ACL enabled/disabled', async ({ page }) => {
    const toggle = page.getByTestId('toggle-acl');

    const isDisabled = await toggle.isDisabled();
    if (isDisabled) {
      test.info().annotations.push({
        type: 'skip-reason',
        description: 'Toggle is disabled because Cerberus security is not enabled',
      });
      test.skip();
      return;
    }

    await test.step('Toggle ACL state', async () => {
      await page.waitForLoadState('networkidle');
      await toggle.scrollIntoViewIfNeeded();
      await page.waitForTimeout(200);
      await toggle.click({ force: true });
      await waitForToast(page, /updated|success|enabled|disabled/i, 10000);
    });

    // NOTE: Do NOT toggle back here - afterAll handles cleanup
  });

  test('should toggle WAF enabled/disabled', async ({ page }) => {
    const toggle = page.getByTestId('toggle-waf');

    const isDisabled = await toggle.isDisabled();
    if (isDisabled) {
      test.info().annotations.push({
        type: 'skip-reason',
        description: 'Toggle is disabled because Cerberus security is not enabled',
      });
      test.skip();
      return;
    }

    await test.step('Toggle WAF state', async () => {
      await page.waitForLoadState('networkidle');
      await toggle.scrollIntoViewIfNeeded();
      await page.waitForTimeout(200);
      await toggle.click({ force: true });
      await waitForToast(page, /updated|success|enabled|disabled/i, 10000);
    });

    // NOTE: Do NOT toggle back here - afterAll handles cleanup
  });

  test('should toggle Rate Limiting enabled/disabled', async ({ page }) => {
    const toggle = page.getByTestId('toggle-rate-limit');

    const isDisabled = await toggle.isDisabled();
    if (isDisabled) {
      test.info().annotations.push({
        type: 'skip-reason',
        description: 'Toggle is disabled because Cerberus security is not enabled',
      });
      test.skip();
      return;
    }

    await test.step('Toggle Rate Limit state', async () => {
      await page.waitForLoadState('networkidle');
      await toggle.scrollIntoViewIfNeeded();
      await page.waitForTimeout(200);
      await toggle.click({ force: true });
      await waitForToast(page, /updated|success|enabled|disabled/i, 10000);
    });

    // NOTE: Do NOT toggle back here - afterAll handles cleanup
  });

  test('should persist toggle state after page reload', async ({ page }) => {
    const toggle = page.getByTestId('toggle-acl');

    const isDisabled = await toggle.isDisabled();
    if (isDisabled) {
      test.info().annotations.push({
        type: 'skip-reason',
        description: 'Toggle is disabled because Cerberus security is not enabled',
      });
      test.skip();
      return;
    }

    const initialChecked = await toggle.isChecked();

    await test.step('Toggle ACL state', async () => {
      await page.waitForLoadState('networkidle');
      await toggle.scrollIntoViewIfNeeded();
      await page.waitForTimeout(200);
      await toggle.click({ force: true });
      await waitForToast(page, /updated|success|enabled|disabled/i, 10000);
    });

    await test.step('Reload page', async () => {
      await page.reload();
      await waitForLoadingComplete(page);
    });

    await test.step('Verify state persisted', async () => {
      const newChecked = await page.getByTestId('toggle-acl').isChecked();
      expect(newChecked).toBe(!initialChecked);
    });

    // NOTE: Do NOT restore here - afterAll handles cleanup
  });
});
```

---

### File 3: `tests/global-setup.ts` (MODIFY)

**Add Emergency Security Reset:**

```typescript
// ADD to the end of the global setup function, before returning

// Import at top of file
import { request as playwrightRequest } from '@playwright/test';
import { existsSync } from 'fs';
import { STORAGE_STATE } from './constants';

// ADD in globalSetup function, after auth state is created:
async function emergencySecurityReset(baseURL: string) {
  // Only run if auth state exists (meaning we can make authenticated requests)
  if (!existsSync(STORAGE_STATE)) {
    return;
  }

  try {
    const authenticatedContext = await playwrightRequest.newContext({
      baseURL,
      storageState: STORAGE_STATE,
    });

    // Disable ACL to prevent deadlock from previous failed runs
    const response = await authenticatedContext.post('/api/v1/settings', {
      data: { key: 'security.acl.enabled', value: 'false' },
    });
    const succeeded = response.ok();
    const status = response.status();

    await authenticatedContext.dispose();

    // Only report success when the API actually accepted the change
    if (succeeded) {
      console.log('✓ Security reset: ACL disabled');
    } else {
      console.warn(`⚠️ Security reset returned HTTP ${status}`);
    }
  } catch (error) {
    console.warn('⚠️ Could not reset security state:', error);
  }
}

// Call at end of globalSetup:
await emergencySecurityReset(process.env.PLAYWRIGHT_BASE_URL || 'http://localhost:8080');
```

---

### File 4: `tests/fixtures/auth-fixtures.ts` (OPTIONAL ENHANCEMENT)

**Add security fixture for tests that need it:**

```typescript
// ADD after existing imports
import {
  withSecurityEnabled,
  SecurityModuleOptions,
  CapturedSecurityState,
  captureSecurityState,
  restoreSecurityState,
} from '../utils/security-helpers';

// ADD to AuthFixtures interface
interface AuthFixtures {
  // ... existing fixtures ...

  /**
   * Security state manager for tests that need to toggle security modules.
   * Automatically captures and restores state.
   */
  securityState: {
    enable: (options: SecurityModuleOptions) => Promise<void>;
    captured: CapturedSecurityState | null;
  };
}

// ADD fixture definition in test.extend
securityState: async ({ request }, use) => {
  let capturedState: CapturedSecurityState | null = null;

  const manager = {
    enable: async (options: SecurityModuleOptions) => {
      capturedState = await captureSecurityState(request);
      const cleanup = await withSecurityEnabled(request, options);
      // Store cleanup so it runs during fixture teardown
      manager._cleanup = cleanup;
    },
    // Getter so callers see the state captured by enable(), not the initial null
    get captured() {
      return capturedState;
    },
    _cleanup: null as (() => Promise<void>) | null,
  };

  await use(manager);

  // Cleanup after test
  if (manager._cleanup) {
    await manager._cleanup();
  }
},
```

---

## Execution Checklist

### Phase 1: Create Helper Module

- [ ] **1.1** Create `tests/utils/security-helpers.ts` with exact code from File 1 above
- [ ] **1.2** Run TypeScript check: `npx tsc --noEmit`
- [ ] **1.3** Verify helper imports correctly in a test file

### Phase 2: Update Security Dashboard Tests

- [ ] **2.1** Add imports to `tests/security/security-dashboard.spec.ts`
- [ ] **2.2** Replace 'Module Toggle Actions' describe block with new implementation
- [ ] **2.3** Run affected tests: `npx playwright test security-dashboard --project=chromium`
- [ ] **2.4** Verify tests pass AND cleanup happens (check security status after)

### Phase 3: Add Global Safety Net

- [ ] **3.1** Update `tests/global-setup.ts` with emergency security reset
- [ ] **3.2** Run full test suite: `npx playwright test --project=chromium`
- [ ] **3.3** Verify no ACL deadlock occurs across multiple runs

### Phase 4: Validation

- [ ] **4.1** Force a test failure (e.g., add `throw new Error()`) and verify cleanup still runs
- [ ] **4.2** Check security status after failed test: `curl localhost:8080/api/v1/security/status`
- [ ] **4.3** Confirm ACL is disabled after cleanup
- [ ] **4.4** Run full E2E suite 3 times consecutively to verify stability

---

## Benefits

1. **No deadlock**: Tests can safely enable/disable ACL with guaranteed cleanup
2. **Cleanup guaranteed**: `test.afterAll` runs even on failure
3. **Realistic testing**: ACL tests use the same toggle mechanism as users
4. **Isolation**: Other tests unaffected by ACL state
5. **Global safety net**: Even if individual cleanup fails, global setup resets state

## Risk Mitigation

| Risk | Mitigation |
|------|------------|
| Cleanup fails due to API error | Emergency fallback disables ACL specifically |
| Global setup can't reset state | Auth state file check prevents errors |
| Tests run in parallel | Each describe block has its own captured state |
| API changes break helpers | Settings keys are centralized in one const |

## Files Summary

| File | Action | Priority |
|------|--------|----------|
| `tests/utils/security-helpers.ts` | **CREATE** | Critical |
| `tests/security/security-dashboard.spec.ts` | **MODIFY** | Critical |
| `tests/global-setup.ts` | **MODIFY** | High |
| `tests/fixtures/auth-fixtures.ts` | **MODIFY** (Optional) | Low |
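
## Appendix: Checking the Status Payload (Sketch)

Validation steps 4.2 and 4.3 of the execution checklist ask for a manual look at the `/api/v1/security/status` response. The sketch below shows one way that check could be automated. It is an illustration only: `enabledModules`, `ModuleState`, and `SecurityStatusLike` are hypothetical names that are not part of the planned helper module, and the sample payload is made up and reduced to the `enabled` flags described under Solution Architecture.

```typescript
// Hypothetical helper (not part of tests/utils/security-helpers.ts):
// given the parsed JSON body of GET /api/v1/security/status, list the
// modules that are still enabled.
interface ModuleState {
  enabled: boolean;
}

// Assumption: every top-level key of the payload carries an `enabled` flag.
type SecurityStatusLike = Record<string, ModuleState>;

function enabledModules(status: SecurityStatusLike): string[] {
  return Object.entries(status)
    .filter(([, state]) => state.enabled)
    .map(([name]) => name)
    .sort();
}

// Made-up sample payload representing a run where ACL was left on.
const sample: SecurityStatusLike = {
  cerberus: { enabled: true },
  crowdsec: { enabled: false },
  waf: { enabled: false },
  rate_limit: { enabled: false },
  acl: { enabled: true },
};

// Logs the names of the modules that are still on: acl and cerberus.
console.log(enabledModules(sample));
```

After a cleanup that worked, `acl` should be absent from the returned list; if it is present, the deadlock described in the Problem Summary is still possible on the next run.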