Charon/docs/plans/archive/ci_failure_fix.md
akanealw eec8c28fb3
changed perms
2026-04-22 18:19:14 +00:00


# CI Failure Fix Plan
## Status: RESOLVED ✅
## Problem Statement
The CI pipeline failed on the `feature/beta-release` branch because the WAF integration test failed. The failure occurred in workflow run #163 (run ID 20449607151), not in the originally referenced run 20452768958, which was cancelled rather than failed.
## Workflow Run Information
- **Failed Run**: <https://github.com/Wikid82/Charon/actions/runs/20449607151>
- **Cancelled Run** (not the issue): <https://github.com/Wikid82/Charon/actions/runs/20452768958>
- **Branch**: feature/beta-release
- **Failed Job**: Coraza WAF Integration
- **Commit**: 0543a15 (fix(security): resolve CrowdSec startup permission failures)
- **Fixed In**: 430eb85 (fix(integration): resolve WAF test authentication order)
## Root Cause Analysis
### Actual Failure (from logs)
The WAF integration test failed with **HTTP 401 Unauthorized** when attempting to create a proxy host:
```
{"client":"172.18.0.1","latency":"433.811µs","level":"info","method":"POST",
"msg":"handled request","path":"/api/v1/proxy-hosts","request_id":"26716960-4547-496b-8271-2acdcdda9872",
"status":401}
```
### Root Cause
The `scripts/coraza_integration.sh` test script had an **authentication ordering bug**:
1. The script attempted to create the proxy host **without** sending an authentication cookie
2. The `/api/v1/proxy-hosts` endpoint requires authentication, so it returned 401
3. Only afterwards did the script authenticate and obtain a session cookie, too late for the proxy host request
4. Subsequent API calls correctly used the cookie
### Why This Occurred
The proxy host creation endpoints were moved to the authenticated API group in a previous commit, but the integration test script was not updated to authenticate before creating proxy hosts.
## Fix Implementation (Already Applied)
**Commit**: 430eb85c9f020515bf4fdc5211e32c3ce5c26877
### Changes Made to `scripts/coraza_integration.sh`
1. **Moved authentication block** from line ~207 to after line 146 (after API ready check, before proxy host creation)
2. **Added `-b ${TMP_COOKIE}`** to proxy host creation curl command
3. **Added `-b ${TMP_COOKIE}`** to proxy host list curl command (for fallback logic)
4. **Added `-b ${TMP_COOKIE}`** to proxy host update curl command (for fallback logic)
5. **Removed duplicate** authentication block that was executing too late
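
The reordering can be sketched as below. This is a minimal illustration, not the contents of `scripts/coraza_integration.sh`: the `BASE_URL` default, the `/api/v1/auth/login` path, the credentials, and the JSON payload fields are all assumptions. Only `/api/v1/proxy-hosts`, `TMP_COOKIE`, and the `-b` flag come from the plan above.

```shell
#!/usr/bin/env bash
# Sketch only: BASE_URL, the login path, the credentials, and the payload
# fields are assumptions for illustration, not taken from the real script.
BASE_URL="${BASE_URL:-http://localhost:8080}"
TMP_COOKIE="${TMP_COOKIE:-$(mktemp)}"

# Step 1: log in and store the session cookie in the jar (-c writes cookies).
login() {
  curl -s -o /dev/null -c "${TMP_COOKIE}" \
    -H "Content-Type: application/json" \
    -d '{"email":"test@example.local","password":"example-password"}' \
    "${BASE_URL}/api/v1/auth/login"
}

# Step 2: create the proxy host, sending the stored cookie (-b reads cookies).
# Prints only the HTTP status code of the response.
create_proxy_host() {
  curl -s -o /dev/null -w '%{http_code}' -b "${TMP_COOKIE}" \
    -H "Content-Type: application/json" \
    -d '{"domain_names":"waf.example.local","forward_host":"httpbin","forward_port":80}' \
    "${BASE_URL}/api/v1/proxy-hosts"
}
```

Calling `login` before `create_proxy_host` mirrors the fixed order. Calling them the other way round reproduces the original bug, because the jar is still empty when `-b` reads it.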
### Fixed Flow
```
1. Build/start containers
2. Wait for API ready
3. ✅ Register user and login (create session cookie)
4. Start httpbin backend
5. ✅ Create proxy host WITH authentication
6. Create WAF ruleset with authentication
7. Enable WAF globally with authentication
8. Run WAF tests (BLOCK and MONITOR modes)
9. Cleanup
```
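
One way to make this ordering explicit in a script is a small step runner that executes named functions in sequence and stops at the first failure. The sketch below is an assumption about how such a runner could look. Neither `run_steps` nor the step names in the usage note exist in the real script.

```shell
# Hypothetical helper: run the named step functions in order and
# stop at the first one that returns a non-zero status.
run_steps() {
  local step
  for step in "$@"; do
    echo "==> ${step}"
    if ! "${step}"; then
      echo "step failed: ${step}" >&2
      return 1
    fi
  done
}
```

With hypothetical step functions defined, the fixed flow would read `run_steps start_containers wait_for_api login start_httpbin create_proxy_host`, which keeps the login step visibly ahead of every authenticated call.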
## Verification Steps
**Completed Successfully**
1. WAF Integration Tests workflow run #164 passed after the fix
2. Proxy host creation returned HTTP 201 (Created) instead of 401
3. All subsequent WAF tests (BLOCK mode and MONITOR mode) passed
4. No regressions in other CI workflows
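
A status check like the one in step 2 can be made explicit in the test script so that an unexpected 401 fails fast with a clear message. The helper below is a hypothetical sketch. The name `expect_status` and its message format are not from the real script.

```shell
# Hypothetical helper: compare an observed HTTP status with the expected one.
# Arguments: expected status, actual status, human-readable label.
expect_status() {
  local expected="$1" actual="$2" label="$3"
  if [ "${actual}" != "${expected}" ]; then
    echo "FAIL: ${label} returned HTTP ${actual}, expected ${expected}" >&2
    return 1
  fi
  echo "OK: ${label} returned HTTP ${actual}"
}
```

The actual status would typically come from `curl -s -o /dev/null -w '%{http_code}' ...`, so a call could look like `expect_status 201 "${status}" "proxy host creation"`.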
## Related Files
- `scripts/coraza_integration.sh` - Fixed authentication ordering
- `docs/plans/waf_integration_fix.md` - Detailed analysis document
- `.github/workflows/waf-integration.yml` - CI workflow definition
## Key Learnings
1. **Always check ACTUAL logs** - The initially referenced run was cancelled, not failed
2. **Authentication order matters** - Authenticate before the first request to any endpoint that requires auth, and send the credentials on every such request
3. **Integration tests must track API changes** - When routes move to authenticated groups, tests must be updated
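
The second learning can be enforced with a guard that refuses to continue when the cookie jar holds no session. This is a sketch under assumptions: `require_session` is a hypothetical name, and the check relies on curl's Netscape-format cookie jar, where comment lines start with `#` and HttpOnly cookies are stored with a `#HttpOnly_` prefix.

```shell
# Hypothetical guard: succeed only if the curl cookie jar holds at least one cookie.
# In curl's Netscape-format jar, comment lines start with "#", except HttpOnly
# cookies, which are written with a "#HttpOnly_" prefix and must be counted.
require_session() {
  local jar="$1"
  if [ ! -f "${jar}" ] || ! grep -Eq '^(#HttpOnly_|[^#[:space:]])' "${jar}"; then
    echo "ERROR: no session cookie in ${jar}; log in before calling authenticated endpoints" >&2
    return 1
  fi
}
```

Placing `require_session "${TMP_COOKIE}" || exit 1` ahead of the first authenticated request would turn a silent 401 into an immediate, descriptive failure.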
## Previous Incorrect Analysis
The initial analysis focused on Go version 1.25.5 as a potential issue. This was wrong:
- Go 1.25.5 is the current correct version (released Dec 2, 2025)
- No Go version issues existed
- The actual failure was an integration test authentication bug
- Lesson: Always examine actual error messages instead of making assumptions
---
**Resolution**: Issue fixed in commit 430eb85 and verified in subsequent CI runs.