# WAF Integration Workflow Fix: wget-style curl Syntax Migration

**Plan ID**: WAF-2026-001
**Status**: 📋 PENDING
**Priority**: High
**Created**: 2026-01-25
**Scope**: Fix integration test scripts using incorrect wget-style curl syntax

---

## Problem Summary

After migrating the Docker base image from Alpine to Debian Trixie (PR #550), the WAF integration workflow is failing. The root cause is **not** a missing `wget` command, but rather several integration test scripts passing **wget-style options to curl**, where they do not have the intended meaning.

### Root Cause

Multiple scripts use `curl -q -O-` which is **wget syntax, not curl syntax**:

| Syntax | Tool | Meaning |
|--------|------|---------|
| `-q` | **wget** | Quiet mode |
| `-q` | **curl** | **Not quiet mode** - as the first argument it only disables reading `.curlrc`; output is not silenced |
| `-O-` | **wget** | Output to stdout |
| `-O-` | **curl** | **Wrong** - `-O` means "save with remote filename" and takes no argument, so the trailing `-` does not select stdout |

The correct curl equivalents are:

| wget | curl | Notes |
|------|------|-------|
| `wget -q` | `curl -s` | Silent mode |
| `wget -O-` | `curl -s` | stdout is curl's default output |
| `wget -q -O- URL` | `curl -s URL` | Full equivalent |
| `wget -O filename` | `curl -o filename` | Note: lowercase `-o` in curl |
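
One way to check that no other script still carries these options is a quick pattern search. The sketch below is a hypothetical helper: the name `find_wget_style_curl` and the default `scripts` directory are assumptions, not part of the repository. It flags lines where `curl` is followed by a standalone `-q` or `-O-`, so running `find_wget_style_curl scripts` would be expected to list the affected call sites before the fix and print nothing afterwards. It is a heuristic only; combined short flags and commands split across lines are not covered.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): flag curl invocations
# that still carry wget-style options.
# Prints "file:line:text" for each match; returns non-zero when nothing matches.
find_wget_style_curl() {
  local dir="${1:-scripts}"
  # "curl", then anything up to a pipe/semicolon/ampersand, then a standalone
  # -q or -O- option followed by whitespace or end of line.
  grep -rnE --include='*.sh' \
    'curl[^|;&]*[[:space:]](-q|-O-)([[:space:]]|$)' "$dir"
}
```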

---

## Files Requiring Changes

### Priority 1: Integration Test Scripts (Blocking WAF Workflow)

| File | Line | Current Code | Issue |
|------|------|--------------|-------|
| [scripts/waf_integration.sh](../../scripts/waf_integration.sh#L205) | 205 | `curl -q -O- http://${BACKEND_CONTAINER}/get` | wget syntax |
| [scripts/cerberus_integration.sh](../../scripts/cerberus_integration.sh#L214) | 214 | `curl -q -O- http://${BACKEND_CONTAINER}/get` | wget syntax |
| [scripts/rate_limit_integration.sh](../../scripts/rate_limit_integration.sh#L190) | 190 | `curl -q -O- http://${BACKEND_CONTAINER}/get` | wget syntax |
| [scripts/crowdsec_startup_test.sh](../../scripts/crowdsec_startup_test.sh#L178) | 178 | `curl -q -O- http://127.0.0.1:8085/health` | wget syntax |

### Priority 2: Utility Scripts

| File | Line | Current Code | Issue |
|------|------|--------------|-------|
| [scripts/install-go-1.25.5.sh](../../scripts/install-go-1.25.5.sh#L18) | 18 | `curl -q -O "$TMPFILE" "URL"` | Wrong syntax - `-O` doesn't take an argument in curl |

---

## Detailed Fixes

### Fix 1: scripts/waf_integration.sh (Line 205)

**Current (broken):**
```bash
if docker exec ${CONTAINER_NAME} sh -c "curl -q -O- http://${BACKEND_CONTAINER}/get 2>/dev/null || curl -s http://${BACKEND_CONTAINER}/get" >/dev/null 2>&1; then
```

**Fixed:**
```bash
if docker exec ${CONTAINER_NAME} sh -c "curl -sf http://${BACKEND_CONTAINER}/get" >/dev/null 2>&1; then
```

**Notes:**
- `-s` = silent (no progress meter)
- `-f` = fail silently on HTTP errors (returns non-zero exit code)
- Removed redundant fallback since the fix makes the command work correctly
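
Because `-f` turns HTTP errors into a non-zero exit code, the fixed command can serve directly as a readiness probe. If the check needs to be polled while the backend starts (the surrounding script may already do this), a generic retry wrapper is one option. The sketch below is hypothetical: `retry_until_ok` is not an existing function in these scripts, and the attempt count and delay are illustrative. A call might look like `retry_until_ok 30 1 docker exec "${CONTAINER_NAME}" curl -sf "http://${BACKEND_CONTAINER}/get"`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: run a probe command until it succeeds or attempts run out.
# Usage: retry_until_ok ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry_until_ok() {
  local attempts="$1" delay="$2"
  shift 2
  local i
  for ((i = 1; i <= attempts; i++)); do
    # The probe's own exit code decides success; its output is discarded.
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    # No sleep after the final attempt; the caller handles the failure.
    if ((i < attempts)); then
      sleep "$delay"
    fi
  done
  return 1
}
```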

---

### Fix 2: scripts/cerberus_integration.sh (Line 214)

**Current (broken):**
```bash
if docker exec ${CONTAINER_NAME} sh -c "curl -q -O- http://${BACKEND_CONTAINER}/get 2>/dev/null || curl -s http://${BACKEND_CONTAINER}/get" >/dev/null 2>&1; then
```

**Fixed:**
```bash
if docker exec ${CONTAINER_NAME} sh -c "curl -sf http://${BACKEND_CONTAINER}/get" >/dev/null 2>&1; then
```

---

### Fix 3: scripts/rate_limit_integration.sh (Line 190)

**Current (broken):**
```bash
if docker exec ${CONTAINER_NAME} sh -c "curl -q -O- http://${BACKEND_CONTAINER}/get 2>/dev/null || curl -s http://${BACKEND_CONTAINER}/get" >/dev/null 2>&1; then
```

**Fixed:**
```bash
if docker exec ${CONTAINER_NAME} sh -c "curl -sf http://${BACKEND_CONTAINER}/get" >/dev/null 2>&1; then
```

---

### Fix 4: scripts/crowdsec_startup_test.sh (Line 178)

**Current (broken):**
```bash
LAPI_HEALTH=$(docker exec ${CONTAINER_NAME} curl -q -O- http://127.0.0.1:8085/health 2>/dev/null || echo "FAILED")
```

**Fixed:**
```bash
LAPI_HEALTH=$(docker exec ${CONTAINER_NAME} curl -sf http://127.0.0.1:8085/health 2>/dev/null || echo "FAILED")
```

---

### Fix 5: scripts/install-go-1.25.5.sh (Line 18)

**Current (broken):**
```bash
curl -q -O "$TMPFILE" "https://go.dev/dl/${TARFILE}"
```

**Fixed:**
```bash
curl -sSfL -o "$TMPFILE" "https://go.dev/dl/${TARFILE}"
```

**Notes:**
- `-s` = silent
- `-S` = show errors even in silent mode
- `-f` = fail on HTTP errors
- `-L` = follow redirects (important for go.dev downloads)
- `-o filename` = output to specified file (lowercase `-o`)

---

## Verification Commands

After applying fixes, verify each script works:

```bash
# Test WAF integration
./scripts/waf_integration.sh

# Test Cerberus integration
./scripts/cerberus_integration.sh

# Test Rate Limit integration
./scripts/rate_limit_integration.sh

# Test CrowdSec startup
./scripts/crowdsec_startup_test.sh

# Verify Go install script syntax (parse check only; nothing is downloaded)
bash -n ./scripts/install-go-1.25.5.sh
```

---

## Behavior Differences: wget vs curl

When migrating from wget to curl, be aware of these differences:

| Behavior | wget | curl |
|----------|------|------|
| Output destination | File by default | stdout by default |
| Follow redirects | Yes by default | Requires `-L` flag |
| Retry on failure | Built-in retry | Requires `--retry N` |
| Progress display | Text progress bar | Progress meter (use `-s` to hide) |
| HTTP error handling | Non-zero exit on 404 | Requires `-f` for non-zero exit on HTTP errors |
| Quiet mode | `-q` | `-s` (silent) |
| Output to file | `-O filename` (uppercase) | `-o filename` (lowercase) |
| Save with remote name | Default behavior (no flag needed) | `-O` (uppercase, no arg) |

---

## Execution Checklist

- [ ] **Fix 1**: Update `scripts/waf_integration.sh` line 205
- [ ] **Fix 2**: Update `scripts/cerberus_integration.sh` line 214
- [ ] **Fix 3**: Update `scripts/rate_limit_integration.sh` line 190
- [ ] **Fix 4**: Update `scripts/crowdsec_startup_test.sh` line 178
- [ ] **Fix 5**: Update `scripts/install-go-1.25.5.sh` line 18
- [ ] **Verify**: Run each integration test locally
- [ ] **CI**: Confirm WAF integration workflow passes

---

## Notes

1. **Deprecated Scripts**: Several affected scripts are marked deprecated (will be removed in v2.0.0). However, they are still used by CI workflows, so fixes are required.

2. **Skill-Based Replacements**: The `.github/skills/scripts/` directory was checked and contains no wget usage - those scripts already use correct curl syntax.

3. **Docker Compose Files**: All health checks in docker-compose files already use correct curl syntax (`curl -f`, `curl -fsS`).

4. **Dockerfile**: The main Dockerfile correctly installs `curl` and uses correct curl syntax in the HEALTHCHECK instruction.

---

# Previous Plan (Archived)

The previous Git & Workflow Recovery Plan has been archived below.

---

# Git & Workflow Recovery Plan (ARCHIVED)

**Plan ID**: GIT-2026-001
**Status**: ✅ ARCHIVED
**Priority**: High
**Created**: 2026-01-25
**Scope**: Git recovery, Renovate fix, Workflow simplification

---

## Problem Summary

1. **Git State**: Feature branch `feature/beta-release` is in a broken rebase state
2. **Renovate**: Targeting feature branches creates orphaned PRs and merge conflicts
3. **Propagate Workflow**: Overly complex cascade (`main → development → nightly → feature/*`) causes confusion
4. **Nightly Branch**: Unnecessary intermediate branch adding complexity

---

## Phase 1: Git Recovery

### Step 1.1: Abort the Rebase

```bash
# Check current state
git status

# Abort the in-progress rebase
git rebase --abort

# Verify clean state
git status
```

### Step 1.2: Fetch Latest from Origin

```bash
# Fetch all branches
git fetch origin --prune

# Ensure we're on the feature branch
git checkout feature/beta-release
```

### Step 1.3: Merge Development into Feature Branch

**Use merge, NOT rebase** to preserve commit history and avoid force-push issues.

```bash
# Merge development into feature/beta-release
git merge origin/development --no-ff -m "Merge development into feature/beta-release"
```

### Step 1.4: Resolve Conflicts (if any)

Likely conflict files based on Renovate activity:
- `package.json` / `package-lock.json` (version bumps)
- `backend/go.mod` / `backend/go.sum` (Go dependency updates)
- `.github/workflows/*.yml` (action digest pins)

**Resolution strategy:**
```bash
# For package.json - accept development's versions, then run npm install
# (during a merge, --theirs is the branch being merged in, i.e. development)
git checkout --theirs package.json package-lock.json
npm install
git add package.json package-lock.json

# For go.mod/go.sum - accept development's versions, then tidy
git checkout --theirs backend/go.mod backend/go.sum
cd backend && go mod tidy && cd ..
git add backend/go.mod backend/go.sum

# For workflow files - usually safe to accept development
git checkout --theirs .github/workflows/
git add .github/workflows/

# Complete the merge
git commit
```

### Step 1.5: Push the Merged Branch

```bash
git push origin feature/beta-release
```

---

## Phase 2: Renovate Fix

### Problem

Current config in `.github/renovate.json`:
```json
"baseBranches": [
  "development",
  "feature/beta-release"
]
```

This causes:
- Duplicate PRs for the same dependency (one per branch)
- Orphaned branches like `renovate/feature/beta-release-*` when feature merges
- Constant merge conflicts between branches

### Solution

Only target `development`. Changes flow naturally via propagate workflow.

### Old Config (REMOVE)

```json
{
  "baseBranches": [
    "development",
    "feature/beta-release"
  ],
  ...
}
```

### New Config (REPLACE WITH)

```json
{
  "baseBranches": [
    "development"
  ],
  ...
}
```

### File to Edit

**File**: `.github/renovate.json`
**Line**: ~12-15

---

## Phase 3: Propagate Workflow Fix

### Problem

Current workflow in `.github/workflows/propagate-changes.yml`:

```yaml
on:
  push:
    branches:
      - main
      - development
      - nightly  # <-- Unnecessary
```

Cascade logic:
- `main` → `development` ✅ (Correct)
- `development` → `nightly` ❌ (Unnecessary)
- `nightly` → `feature/*` ❌ (Overly complex)

### Solution

Simplify to **only** `main → development` propagation.

### Old Trigger (REMOVE)

```yaml
on:
  push:
    branches:
      - main
      - development
      - nightly
```

### New Trigger (REPLACE WITH)

```yaml
on:
  push:
    branches:
      - main
```

### Old Script Logic (REMOVE)

```javascript
if (currentBranch === 'main') {
  // Main -> Development
  await createPR('main', 'development');
} else if (currentBranch === 'development') {
  // Development -> Nightly
  await createPR('development', 'nightly');
} else if (currentBranch === 'nightly') {
  // Nightly -> Feature branches
  const branches = await github.paginate(github.rest.repos.listBranches, {
    owner: context.repo.owner,
    repo: context.repo.repo,
  });

  const featureBranches = branches
    .map(b => b.name)
    .filter(name => name.startsWith('feature/'));

  core.info(`Found ${featureBranches.length} feature branches: ${featureBranches.join(', ')}`);

  for (const featureBranch of featureBranches) {
    await createPR('development', featureBranch);
  }
}
```

### New Script Logic (REPLACE WITH)

```javascript
if (currentBranch === 'main') {
  // Main -> Development (only propagation needed)
  await createPR('main', 'development');
}
```

### File to Edit

**File**: `.github/workflows/propagate-changes.yml`

---

## Phase 4: Cleanup

### Step 4.1: Delete Nightly Branch

```bash
# Delete remote nightly branch (if exists)
git push origin --delete nightly 2>/dev/null || echo "nightly branch does not exist"

# Delete local tracking branch
git branch -D nightly 2>/dev/null || true
```

### Step 4.2: Delete Orphaned Renovate Branches

```bash
# Delete all renovate branches targeting feature/beta-release
git fetch origin
git branch -r | grep 'renovate/feature/beta-release' | while read -r branch; do
  # Strip the remote prefix so the name is valid for `git push --delete`
  remote_branch="${branch#origin/}"
  echo "Deleting: $remote_branch"
  git push origin --delete "$remote_branch"
done
```
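
Since the loop above deletes remote branches immediately, it can help to preview the list first. The sketch below separates the selection from the deletion. `list_orphaned_renovate_branches` is a hypothetical helper name, and it assumes the remote is called `origin`. It reads `git branch -r` output on stdin and prints only the matching branch names with the remote prefix removed.

```bash
#!/usr/bin/env bash
# Hypothetical helper: read `git branch -r` output on stdin and print the
# renovate branches that target feature/beta-release, without "origin/".
list_orphaned_renovate_branches() {
  local branch
  # `read -r` trims the leading indentation that `git branch -r` emits.
  while read -r branch; do
    case "$branch" in
      origin/renovate/feature/beta-release*)
        printf '%s\n' "${branch#origin/}"
        ;;
    esac
  done
}
```

For a dry run, `git branch -r | list_orphaned_renovate_branches` prints the candidates without touching the remote; the output can then be reviewed before each name is passed to `git push origin --delete`.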

### Step 4.3: Close Orphaned Renovate PRs

After branches are deleted, any associated PRs will be automatically closed by GitHub.

---

## Execution Checklist

- [ ] **Phase 1**: Git Recovery
  - [ ] 1.1 Abort rebase
  - [ ] 1.2 Fetch latest
  - [ ] 1.3 Merge development
  - [ ] 1.4 Resolve conflicts
  - [ ] 1.5 Push merged branch

- [ ] **Phase 2**: Renovate Fix
  - [ ] Edit `.github/renovate.json` - remove `feature/beta-release` from baseBranches
  - [ ] Commit and push

- [ ] **Phase 3**: Propagate Workflow Fix
  - [ ] Edit `.github/workflows/propagate-changes.yml` - simplify triggers and logic
  - [ ] Commit and push

- [ ] **Phase 4**: Cleanup
  - [ ] 4.1 Delete nightly branch
  - [ ] 4.2 Delete orphaned `renovate/feature/beta-release-*` branches
  - [ ] 4.3 Verify orphaned PRs are closed

---

## Verification

After all phases complete:

```bash
# Confirm no rebase in progress
git status
# Expected: "On branch feature/beta-release" with clean state

# Confirm nightly deleted
git branch -r | grep nightly
# Expected: no output

# Confirm orphaned renovate branches deleted
git branch -r | grep 'renovate/feature/beta-release'
# Expected: no output

# Confirm Renovate config only targets development
grep -A2 baseBranches .github/renovate.json
# Expected: only "development"
```

---

## Rollback Plan

If issues occur:

1. **Git Recovery Failed**:
   ```bash
   git fetch origin
   git checkout feature/beta-release
   git reset --hard origin/feature/beta-release
   ```

2. **Renovate Changes Broke Something**: Revert the commit to `.github/renovate.json`

3. **Propagate Workflow Issues**: Revert the commit to `.github/workflows/propagate-changes.yml`

---

## Archived Spec (Prior Implementation)

# Security Fix: Remove Hardcoded Encryption Keys from Docker Compose Files

**Plan ID**: SEC-2026-001
**Status**: ✅ IMPLEMENTED
**Priority**: Critical (Security)
**Created**: 2026-01-25
**Implemented By**: Management Agent

---

### Summary

Removed hardcoded encryption keys from Docker Compose test files and implemented ephemeral key generation in CI workflows.

### Changes Applied

| File | Change |
|------|--------|
| `.docker/compose/docker-compose.playwright.yml` | Replaced hardcoded key with `${CHARON_ENCRYPTION_KEY:?...}` |
| `.docker/compose/docker-compose.e2e.yml` | Replaced hardcoded key with `${CHARON_ENCRYPTION_KEY:?...}` |
| `.github/workflows/e2e-tests.yml` | Added ephemeral key generation step |
| `.env.test.example` | Added prominent documentation |

### Security Notes

- The old key `ucDWy5ScLubd3QwCHhQa2SY7wL2OF48p/c9nZhyW1mA=` exists in git history
- This key should **NEVER** be used in any production environment
- Each CI run now generates a unique ephemeral key

### Testing

```bash
# Verify compose fails without key
unset CHARON_ENCRYPTION_KEY
docker compose -f .docker/compose/docker-compose.playwright.yml config 2>&1
# Expected: "CHARON_ENCRYPTION_KEY is required"

# Verify compose succeeds with key
export CHARON_ENCRYPTION_KEY=$(openssl rand -base64 32)
docker compose -f .docker/compose/docker-compose.playwright.yml config
# Expected: Valid YAML output
```
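
The `${VAR:?message}` form used in the compose files borrows its meaning from POSIX shell parameter expansion: the expansion fails when the variable is unset or empty. The sketch below reproduces that behavior in plain shell so it can be tried without Docker. `require_encryption_key` is a hypothetical name used only for illustration, and the error text is taken from the expected output quoted above.

```bash
#!/usr/bin/env bash
# Hypothetical illustration: ${VAR:?message} aborts when VAR is unset or empty.
require_encryption_key() {
  # The subshell contains the failure, so a failed expansion ends only the
  # subshell and this function simply returns its non-zero status.
  ( : "${CHARON_ENCRYPTION_KEY:?CHARON_ENCRYPTION_KEY is required}" ) 2>/dev/null
}
```

With the variable unset or set to an empty string, `require_encryption_key` returns a non-zero status; once a value is assigned, it returns zero.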

### References

- **OWASP**: [A02:2021 – Cryptographic Failures](https://owasp.org/Top10/A02_2021-Cryptographic_Failures/)

---

# Playwright Security Test Helpers

**Plan ID**: E2E-SEC-001
**Status**: ✅ COMPLETED
**Priority**: Critical (Blocking 230/707 E2E test failures)
**Created**: 2026-01-25
**Completed**: 2026-01-25
**Scope**: Add security test helpers to prevent ACL deadlock in E2E tests

---

## Completion Notes

**Implementation Summary:**
- Created `tests/utils/security-helpers.ts` with full security state management utilities
- Functions implemented: `getSecurityStatus`, `setSecurityModuleEnabled`, `captureSecurityState`, `restoreSecurityState`, `withSecurityEnabled`, `disableAllSecurityModules`
- Pattern enables guaranteed cleanup via Playwright's `test.afterAll()` hook

**Documentation:**
- See [Security Test Helpers Guide](../testing/security-helpers.md) for usage examples

---

## Problem Summary

During E2E testing, if ACL is left enabled from a previous test run (e.g., due to test failure), it can create a **deadlock**:
1. ACL blocks API requests → returns 403 Forbidden
2. Global cleanup can't run → API blocked
3. Auth setup fails → tests skip
4. Manual intervention required to reset volumes

**Root Cause Analysis:**
- `security-dashboard.spec.ts` has tests that toggle ACL, WAF, and Rate Limiting
- The tests attempt to "toggle back" but if a test fails mid-execution, cleanup doesn't run
- Playwright's `test.afterAll` hook runs even when a test fails, so cleanup placed there is guaranteed to execute
- The current tests don't use `test.afterAll` or fixtures for security state management

## Solution Architecture

### API Endpoints (Backend Already Supports)

| Endpoint | Method | Purpose |
|----------|--------|---------|
| `/api/v1/security/status` | GET | Returns current state of all security modules |
| `/api/v1/settings` | POST | Toggle settings with `{ key: "security.acl.enabled", value: "true/false" }` |

### Settings Keys

| Key | Values | Description |
|-----|--------|-------------|
| `security.acl.enabled` | `"true"` / `"false"` | Toggle ACL enforcement |
| `security.waf.enabled` | `"true"` / `"false"` | Toggle WAF enforcement |
| `security.rate_limit.enabled` | `"true"` / `"false"` | Toggle Rate Limiting |
| `security.crowdsec.enabled` | `"true"` / `"false"` | Toggle CrowdSec |
| `feature.cerberus.enabled` | `"true"` / `"false"` | Master toggle for all security |
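
The same settings endpoint can also be reached from a shell, which may be handy for a manual reset when a stuck ACL needs clearing outside a test run. The sketch below is hypothetical: `settings_payload`, `set_security_setting`, the `BASE_URL` variable, and its default value are all assumptions. Authentication is deliberately left out because this plan does not describe the mechanism, so a real call would need whatever cookie or header the API expects. A reset might then look like `set_security_setting security.acl.enabled false`.

```bash
#!/usr/bin/env bash
# Hypothetical helpers; BASE_URL and its default are assumed, not project settings.

# Build the JSON body for POST /api/v1/settings.
# Values are sent as the strings "true"/"false", matching the settings table.
settings_payload() {
  printf '{"key":"%s","value":"%s"}' "$1" "$2"
}

# POST one setting. Authentication is omitted here (assumption: add the
# session cookie or token your deployment requires).
set_security_setting() {
  local key="$1" value="$2"
  curl -sf -X POST \
    -H 'Content-Type: application/json' \
    -d "$(settings_payload "$key" "$value")" \
    "${BASE_URL:-http://localhost:8080}/api/v1/settings"
}
```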

---

## Implementation Plan

### File 1: `tests/utils/security-helpers.ts` (CREATE)

```typescript
/**
 * Security Test Helpers - Safe ACL/WAF/Rate Limit toggle for E2E tests
 *
 * These helpers provide safe mechanisms to temporarily enable security features
 * during tests, with guaranteed cleanup even on test failure.
 *
 * Problem: If ACL is left enabled after a test failure, it blocks all API requests
 * causing subsequent tests to fail with 403 Forbidden (deadlock).
 *
 * Solution: Use Playwright's test.afterAll() with captured original state to
 * guarantee restoration regardless of test outcome.
 *
 * @example
 * ```typescript
 * import { withSecurityEnabled, getSecurityStatus } from './utils/security-helpers';
 *
 * test.describe('ACL Tests', () => {
 *   let cleanup: () => Promise<void>;
 *
 *   test.beforeAll(async ({ request }) => {
 *     cleanup = await withSecurityEnabled(request, { acl: true });
 *   });
 *
 *   test.afterAll(async () => {
 *     await cleanup();
 *   });
 *
 *   test('should enforce ACL', async ({ page }) => {
 *     // ACL is now enabled, test enforcement
 *   });
 * });
 * ```
 */

import { APIRequestContext } from '@playwright/test';

/**
 * Security module status from GET /api/v1/security/status
 */
export interface SecurityStatus {
  cerberus: { enabled: boolean };
  crowdsec: { mode: string; api_url: string; enabled: boolean };
  waf: { mode: string; enabled: boolean };
  rate_limit: { mode: string; enabled: boolean };
  acl: { mode: string; enabled: boolean };
}

/**
 * Options for enabling specific security modules
 */
export interface SecurityModuleOptions {
  /** Enable ACL enforcement */
  acl?: boolean;
  /** Enable WAF protection */
  waf?: boolean;
  /** Enable rate limiting */
  rateLimit?: boolean;
  /** Enable CrowdSec */
  crowdsec?: boolean;
  /** Enable master Cerberus toggle (required for other modules) */
  cerberus?: boolean;
}

/**
 * Captured state for restoration
 */
export interface CapturedSecurityState {
  acl: boolean;
  waf: boolean;
  rateLimit: boolean;
  crowdsec: boolean;
  cerberus: boolean;
}

/**
 * Mapping of module names to their settings keys
 */
const SECURITY_SETTINGS_KEYS: Record<keyof SecurityModuleOptions, string> = {
  acl: 'security.acl.enabled',
  waf: 'security.waf.enabled',
  rateLimit: 'security.rate_limit.enabled',
  crowdsec: 'security.crowdsec.enabled',
  cerberus: 'feature.cerberus.enabled',
};

/**
 * Get current security status from the API
 * @param request - Playwright APIRequestContext (authenticated)
 * @returns Current security status
 */
export async function getSecurityStatus(
  request: APIRequestContext
): Promise<SecurityStatus> {
  const response = await request.get('/api/v1/security/status');

  if (!response.ok()) {
    throw new Error(
      `Failed to get security status: ${response.status()} ${await response.text()}`
    );
  }

  return response.json();
}

/**
 * Set a specific security module's enabled state
 * @param request - Playwright APIRequestContext (authenticated)
 * @param module - Which module to toggle
 * @param enabled - Whether to enable or disable
 */
export async function setSecurityModuleEnabled(
  request: APIRequestContext,
  module: keyof SecurityModuleOptions,
  enabled: boolean
): Promise<void> {
  const key = SECURITY_SETTINGS_KEYS[module];
  const value = enabled ? 'true' : 'false';

  const response = await request.post('/api/v1/settings', {
    data: { key, value },
  });

  if (!response.ok()) {
    throw new Error(
      `Failed to set ${module} to ${enabled}: ${response.status()} ${await response.text()}`
    );
  }

  // Wait a brief moment for Caddy config reload
  await new Promise((resolve) => setTimeout(resolve, 500));
}

/**
 * Capture current security state for later restoration
 * @param request - Playwright APIRequestContext (authenticated)
 * @returns Captured state object
 */
export async function captureSecurityState(
  request: APIRequestContext
): Promise<CapturedSecurityState> {
  const status = await getSecurityStatus(request);

  return {
    acl: status.acl.enabled,
    waf: status.waf.enabled,
    rateLimit: status.rate_limit.enabled,
    crowdsec: status.crowdsec.enabled,
    cerberus: status.cerberus.enabled,
  };
}

/**
 * Restore security state to previously captured values
 * @param request - Playwright APIRequestContext (authenticated)
 * @param state - Previously captured state
 */
export async function restoreSecurityState(
  request: APIRequestContext,
  state: CapturedSecurityState
): Promise<void> {
  const currentStatus = await getSecurityStatus(request);

  // Restore in reverse dependency order (features before master toggle)
  const modules: (keyof SecurityModuleOptions)[] = ['acl', 'waf', 'rateLimit', 'crowdsec', 'cerberus'];

  for (const module of modules) {
    const currentValue = module === 'rateLimit'
      ? currentStatus.rate_limit.enabled
      : module === 'crowdsec'
        ? currentStatus.crowdsec.enabled
        : currentStatus[module].enabled;

    if (currentValue !== state[module]) {
      await setSecurityModuleEnabled(request, module, state[module]);
    }
  }
}

/**
 * Enable security modules temporarily with guaranteed cleanup.
 *
 * Returns a cleanup function that MUST be called in test.afterAll().
 * The cleanup function restores the original state even if tests fail.
 *
 * @param request - Playwright APIRequestContext (authenticated)
 * @param options - Which modules to enable
 * @returns Cleanup function to restore original state
 *
 * @example
 * ```typescript
 * test.describe('ACL Tests', () => {
 *   let cleanup: () => Promise<void>;
 *
 *   test.beforeAll(async ({ request }) => {
 *     cleanup = await withSecurityEnabled(request, { acl: true, cerberus: true });
 *   });
 *
 *   test.afterAll(async () => {
 *     await cleanup();
 *   });
 * });
 * ```
 */
export async function withSecurityEnabled(
  request: APIRequestContext,
  options: SecurityModuleOptions
): Promise<() => Promise<void>> {
  // Capture original state BEFORE making any changes
  const originalState = await captureSecurityState(request);

  // Enable Cerberus first (master toggle) if any security module is requested
  const needsCerberus = options.acl || options.waf || options.rateLimit || options.crowdsec;
  if ((needsCerberus || options.cerberus) && !originalState.cerberus) {
    await setSecurityModuleEnabled(request, 'cerberus', true);
  }

  // Enable requested modules
  if (options.acl) {
    await setSecurityModuleEnabled(request, 'acl', true);
  }
  if (options.waf) {
    await setSecurityModuleEnabled(request, 'waf', true);
  }
  if (options.rateLimit) {
    await setSecurityModuleEnabled(request, 'rateLimit', true);
  }
  if (options.crowdsec) {
    await setSecurityModuleEnabled(request, 'crowdsec', true);
  }

  // Return cleanup function that restores original state
  return async () => {
    try {
      await restoreSecurityState(request, originalState);
    } catch (error) {
      // Log error but don't throw - cleanup should not fail tests
      console.error('Failed to restore security state:', error);
      // Try emergency disable of ACL to prevent deadlock
      try {
        await setSecurityModuleEnabled(request, 'acl', false);
      } catch {
        console.error('Emergency ACL disable also failed - manual intervention may be required');
      }
    }
  };
}

/**
 * Disable all security modules (emergency reset).
 * Use this in global-setup.ts or when tests need a clean slate.
 *
 * @param request - Playwright APIRequestContext (authenticated)
 */
export async function disableAllSecurityModules(
  request: APIRequestContext
): Promise<void> {
  const modules: (keyof SecurityModuleOptions)[] = ['acl', 'waf', 'rateLimit', 'crowdsec'];

  for (const module of modules) {
    try {
      await setSecurityModuleEnabled(request, module, false);
    } catch (error) {
      console.warn(`Failed to disable ${module}:`, error);
    }
  }
}

/**
 * Check if ACL is currently blocking requests.
 * Useful for debugging test failures.
 *
 * @param request - Playwright APIRequestContext
 * @returns True if ACL is enabled and blocking
 */
export async function isAclBlocking(request: APIRequestContext): Promise<boolean> {
  try {
    const status = await getSecurityStatus(request);
    return status.acl.enabled && status.cerberus.enabled;
  } catch {
    // If we can't get status, ACL might be blocking
    return true;
  }
}
```

---

### File 2: `tests/security/security-dashboard.spec.ts` (MODIFY)

**Changes Required:**

1. Import the new security helpers
2. Add `test.beforeAll` to capture initial state
3. Add `test.afterAll` to guarantee cleanup
4. Remove redundant "toggle back" steps in individual tests
5. Group toggle tests in a separate describe block with isolated cleanup

**Exact Changes:**

```typescript
// ADD after existing imports (around line 12)
// The type import is needed for the shared `request` variable in the describe
// block; skip it if the file already imports APIRequestContext.
import type { APIRequestContext } from '@playwright/test';
import {
  withSecurityEnabled,
  captureSecurityState,
  restoreSecurityState,
  CapturedSecurityState,
} from '../utils/security-helpers';
```

```typescript
// REPLACE the entire 'Module Toggle Actions' describe block (lines ~80-180)
// with this safer implementation:

test.describe('Module Toggle Actions', () => {
  // Capture state ONCE for this describe block
  let originalState: CapturedSecurityState;
  let request: APIRequestContext;

  test.beforeAll(async ({ request: req }) => {
    request = req;
    originalState = await captureSecurityState(request);
  });

  test.afterAll(async () => {
    // CRITICAL: Restore original state even if tests fail
    if (originalState) {
      await restoreSecurityState(request, originalState);
    }
  });

  test('should toggle ACL enabled/disabled', async ({ page }) => {
    const toggle = page.getByTestId('toggle-acl');

    const isDisabled = await toggle.isDisabled();
    if (isDisabled) {
      test.info().annotations.push({
        type: 'skip-reason',
        description: 'Toggle is disabled because Cerberus security is not enabled',
      });
      test.skip();
      return;
    }

    await test.step('Toggle ACL state', async () => {
      await page.waitForLoadState('networkidle');
      await toggle.scrollIntoViewIfNeeded();
      await page.waitForTimeout(200);
      await toggle.click({ force: true });
      await waitForToast(page, /updated|success|enabled|disabled/i, 10000);
    });

    // NOTE: Do NOT toggle back here - afterAll handles cleanup
  });
|
||
|
||
  test('should toggle WAF enabled/disabled', async ({ page }) => {
    const toggle = page.getByTestId('toggle-waf');

    const isDisabled = await toggle.isDisabled();
    if (isDisabled) {
      test.info().annotations.push({
        type: 'skip-reason',
        description: 'Toggle is disabled because Cerberus security is not enabled',
      });
      test.skip();
      return;
    }

    await test.step('Toggle WAF state', async () => {
      await page.waitForLoadState('networkidle');
      await toggle.scrollIntoViewIfNeeded();
      await page.waitForTimeout(200);
      await toggle.click({ force: true });
      await waitForToast(page, /updated|success|enabled|disabled/i, 10000);
    });

    // NOTE: Do NOT toggle back here - afterAll handles cleanup
  });

  test('should toggle Rate Limiting enabled/disabled', async ({ page }) => {
    const toggle = page.getByTestId('toggle-rate-limit');

    const isDisabled = await toggle.isDisabled();
    if (isDisabled) {
      test.info().annotations.push({
        type: 'skip-reason',
        description: 'Toggle is disabled because Cerberus security is not enabled',
      });
      test.skip();
      return;
    }

    await test.step('Toggle Rate Limit state', async () => {
      await page.waitForLoadState('networkidle');
      await toggle.scrollIntoViewIfNeeded();
      await page.waitForTimeout(200);
      await toggle.click({ force: true });
      await waitForToast(page, /updated|success|enabled|disabled/i, 10000);
    });

    // NOTE: Do NOT toggle back here - afterAll handles cleanup
  });

  test('should persist toggle state after page reload', async ({ page }) => {
    const toggle = page.getByTestId('toggle-acl');

    const isDisabled = await toggle.isDisabled();
    if (isDisabled) {
      test.info().annotations.push({
        type: 'skip-reason',
        description: 'Toggle is disabled because Cerberus security is not enabled',
      });
      test.skip();
      return;
    }

    const initialChecked = await toggle.isChecked();

    await test.step('Toggle ACL state', async () => {
      await page.waitForLoadState('networkidle');
      await toggle.scrollIntoViewIfNeeded();
      await page.waitForTimeout(200);
      await toggle.click({ force: true });
      await waitForToast(page, /updated|success|enabled|disabled/i, 10000);
    });

    await test.step('Reload page', async () => {
      await page.reload();
      await waitForLoadingComplete(page);
    });

    await test.step('Verify state persisted', async () => {
      const newChecked = await page.getByTestId('toggle-acl').isChecked();
      expect(newChecked).toBe(!initialChecked);
    });

    // NOTE: Do NOT restore here - afterAll handles cleanup
  });
});
```

---

### File 3: `tests/global-setup.ts` (MODIFY)

**Add Emergency Security Reset:**

```typescript
// ADD at the top of the file, alongside the existing imports
import { request as playwrightRequest } from '@playwright/test';
import { existsSync, readFileSync } from 'fs';
import { STORAGE_STATE } from './constants';

// ADD next to the globalSetup function
async function emergencySecurityReset(baseURL: string) {
  // Only run if auth state exists (meaning we can make authenticated requests)
  if (!existsSync(STORAGE_STATE)) {
    return;
  }

  try {
    const authenticatedContext = await playwrightRequest.newContext({
      baseURL,
      storageState: STORAGE_STATE,
    });

    try {
      // Disable ACL to prevent deadlock from previous failed runs
      const response = await authenticatedContext.post('/api/v1/settings', {
        data: { key: 'security.acl.enabled', value: 'false' },
      });

      // Only report success if the API actually accepted the change
      if (response.ok()) {
        console.log('✓ Security reset: ACL disabled');
      } else {
        console.warn(`⚠️ Security reset returned HTTP ${response.status()}`);
      }
    } finally {
      // Dispose the context even if the request throws
      await authenticatedContext.dispose();
    }
  } catch (error) {
    console.warn('⚠️ Could not reset security state:', error);
  }
}

// CALL at the end of globalSetup, after auth state is created and before returning:
await emergencySecurityReset(process.env.PLAYWRIGHT_BASE_URL || 'http://localhost:8080');
```

---

### File 4: `tests/fixtures/auth-fixtures.ts` (OPTIONAL ENHANCEMENT)

**Add security fixture for tests that need it:**

```typescript
// ADD after existing imports
import {
  withSecurityEnabled,
  SecurityModuleOptions,
  CapturedSecurityState,
  captureSecurityState,
  restoreSecurityState,
} from '../utils/security-helpers';

// ADD to AuthFixtures interface
interface AuthFixtures {
  // ... existing fixtures ...

  /**
   * Security state manager for tests that need to toggle security modules.
   * Automatically captures and restores state.
   */
  securityState: {
    enable: (options: SecurityModuleOptions) => Promise<void>;
    captured: CapturedSecurityState | null;
  };
}

// ADD fixture definition in test.extend
securityState: async ({ request }, use) => {
  let capturedState: CapturedSecurityState | null = null;

  const manager = {
    enable: async (options: SecurityModuleOptions) => {
      capturedState = await captureSecurityState(request);
      const cleanup = await withSecurityEnabled(request, options);
      // Store cleanup so the fixture teardown below can run it
      manager._cleanup = cleanup;
    },
    // Getter, so callers see the state captured by enable() rather than
    // the null that capturedState held when this object was created
    get captured() {
      return capturedState;
    },
    _cleanup: null as (() => Promise<void>) | null,
  };

  await use(manager);

  // Cleanup after test
  if (manager._cleanup) {
    await manager._cleanup();
  }
},
```
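
One subtlety in a manager object like the one above is how `captured` is exposed. The following framework-free sketch contrasts a plain property, which copies the value once when the object literal is built, with a getter, which reads the current value on each access. The `State` type and both factory names are hypothetical and exist only for this illustration.

```typescript
// Illustrative only: `State`, `makeSnapshotManager` and `makeLiveManager`
// are hypothetical names, not part of the planned helpers.
type State = { acl: boolean } | null;

function makeSnapshotManager() {
  let captured: State = null;
  return {
    capture: () => {
      captured = { acl: true };
    },
    // Plain property: copies the value (null) at object creation time
    captured,
  };
}

function makeLiveManager() {
  let captured: State = null;
  return {
    capture: () => {
      captured = { acl: true };
    },
    // Getter: reads the closure variable every time it is accessed
    get captured(): State {
      return captured;
    },
  };
}

const snapshot = makeSnapshotManager();
snapshot.capture();

const live = makeLiveManager();
live.capture();

console.log(snapshot.captured, live.captured); // null { acl: true }
```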

---

## Execution Checklist

### Phase 1: Create Helper Module

- [ ] **1.1** Create `tests/utils/security-helpers.ts` with exact code from File 1 above
- [ ] **1.2** Run TypeScript check: `npx tsc --noEmit`
- [ ] **1.3** Verify helper imports correctly in a test file

### Phase 2: Update Security Dashboard Tests

- [ ] **2.1** Add imports to `tests/security/security-dashboard.spec.ts`
- [ ] **2.2** Replace 'Module Toggle Actions' describe block with new implementation
- [ ] **2.3** Run affected tests: `npx playwright test security-dashboard --project=chromium`
- [ ] **2.4** Verify tests pass AND cleanup happens (check security status after)

### Phase 3: Add Global Safety Net

- [ ] **3.1** Update `tests/global-setup.ts` with emergency security reset
- [ ] **3.2** Run full test suite: `npx playwright test --project=chromium`
- [ ] **3.3** Verify no ACL deadlock occurs across multiple runs

### Phase 4: Validation

- [ ] **4.1** Force a test failure (e.g., add `throw new Error()`) and verify cleanup still runs
- [ ] **4.2** Check security status after failed test: `curl localhost:8080/api/v1/security/status`
- [ ] **4.3** Confirm ACL is disabled after cleanup
- [ ] **4.4** Run full E2E suite 3 times consecutively to verify stability

---

## Benefits

1. **No deadlock**: Tests can safely enable/disable ACL with guaranteed cleanup
2. **Cleanup guaranteed**: `test.afterAll` runs even on failure
3. **Realistic testing**: ACL tests use the same toggle mechanism as users
4. **Isolation**: Other tests unaffected by ACL state
5. **Global safety net**: Even if individual cleanup fails, global setup resets state

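
The "cleanup guaranteed" idea can be illustrated without Playwright. The sketch below is an analogy only: it uses a hypothetical in-memory `FakeSettingsStore` in place of the real settings API and a `try`/`finally` in place of `test.afterAll`, to show that captured state is restored even when the body throws.

```typescript
// Hypothetical in-memory stand-in for the settings API; the real helpers
// talk to the backend over HTTP instead.
type SecurityState = { acl: boolean; waf: boolean; rateLimit: boolean };

class FakeSettingsStore {
  private state: SecurityState = { acl: false, waf: false, rateLimit: false };

  get(): SecurityState {
    return { ...this.state };
  }

  set(patch: Partial<SecurityState>): void {
    this.state = { ...this.state, ...patch };
  }
}

// Capture the state, run the body, and restore in `finally` so that a
// throwing body cannot leave ACL enabled.
function withRestoredState(store: FakeSettingsStore, body: () => void): void {
  const original = store.get();
  try {
    body();
  } finally {
    store.set(original);
  }
}

const store = new FakeSettingsStore();
let bodyFailed = false;

try {
  withRestoredState(store, () => {
    store.set({ acl: true });
    throw new Error('simulated test failure');
  });
} catch {
  bodyFailed = true;
}

console.log(bodyFailed, store.get().acl); // true false
```

The failure still propagates to the caller, which mirrors a failing test being reported while its cleanup runs.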
## Risk Mitigation

| Risk | Mitigation |
|------|------------|
| Cleanup fails due to API error | Emergency fallback disables ACL specifically |
| Global setup can't reset state | Auth state file check prevents errors |
| Tests run in parallel | Each describe block has its own captured state |
| API changes break helpers | Settings keys are centralized in one const |

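
As a rough illustration of the last row, centralizing the settings keys might look like the sketch below. Only `security.acl.enabled` appears in this plan (in the global-setup reset); the WAF and rate-limit key names, the `SECURITY_SETTING_KEYS` constant and the `settingPayload` helper are assumptions made for the example and may not match the actual helper module.

```typescript
// Single source of truth for settings keys, so an API rename touches one place.
const SECURITY_SETTING_KEYS = {
  acl: 'security.acl.enabled', // taken from the global-setup snippet
  waf: 'security.waf.enabled', // ASSUMED key name, by analogy
  rateLimit: 'security.rate_limit.enabled', // ASSUMED key name, by analogy
} as const;

type SecurityModule = keyof typeof SECURITY_SETTING_KEYS;

// Hypothetical helper: builds the body for POST /api/v1/settings in the
// same { key, value } shape the emergency reset uses.
function settingPayload(module: SecurityModule, enabled: boolean) {
  return { key: SECURITY_SETTING_KEYS[module], value: String(enabled) };
}

const aclOff = settingPayload('acl', false);
console.log(aclOff); // { key: 'security.acl.enabled', value: 'false' }
```

Because `SecurityModule` is derived from the constant, a typo such as `settingPayload('alc', false)` fails at compile time rather than sending an unknown key to the API.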
## Files Summary

| File | Action | Priority |
|------|--------|----------|
| `tests/utils/security-helpers.ts` | **CREATE** | Critical |
| `tests/security/security-dashboard.spec.ts` | **MODIFY** | Critical |
| `tests/global-setup.ts` | **MODIFY** | High |
| `tests/fixtures/auth-fixtures.ts` | **MODIFY** (Optional) | Low |