Charon/docs/implementation/ci_remediation_summary.md
akanealw eec8c28fb3
changed perms
2026-04-22 18:19:14 +00:00

# CI Remediation Summary
**Date**: February 5, 2026
**Task**: Stabilize E2E testing pipeline and fix workflow timeouts.
## Problem
The end-to-end (E2E) testing pipeline was experiencing significant instability, characterized by:
1. **Workflow Timeouts**: Shard 4 consistently timed out (>20 minutes), blocking the rest of the CI run.
2. **Missing Dependencies**: Security jobs for Firefox and WebKit were failing because they lacked the required Chromium dependency.
3. **Flaky Tests**:
   - `certificates.spec.ts` failed intermittently due to race conditions when ensuring either an empty state or a table was visible.
   - `crowdsec-import.spec.ts` failed due to transient locks on the backend API.
## Solution
### Workflow Optimization
- **Shard Rebalancing**: Reduced the number of shards from 4 to 3. Although fewer shards might be expected to lengthen each job, the change redistributed the test load and removed the bottleneck that was causing Shard 4 to hang.
- **Dependency Fix**: Explicitly added the Chromium installation step to Firefox and WebKit security jobs to ensure all shared test utilities function correctly.
### Test Logic Improvements
- **Robust Empty State Detection**: Replaced fragile boolean checks with Playwright's `.or()` locator pattern.
  - *Old*: `isVisible().catch()` (Bypassed auto-waits, led to race conditions)
  - *New*: `expect(locatorA.or(locatorB)).toBeVisible()` (Leverages built-in retry logic)
- **Resilient API Retries**: Implemented `.toPass()` for the CrowdSec import test.
  - This allows the test to automatically retry the import request with exponential backoff if the backend is temporarily locked or busy, significantly reducing flakes.
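
The retry behaviour described for the CrowdSec import test can be illustrated with a small model. This is a simplified stand-in for the loop behind `expect(...).toPass()`, not Playwright's implementation; `RetryOptions`, `delayFor`, `sleep`, and `retryUntilPass` are hypothetical names introduced only for this sketch, and the rule that the last interval repeats is an assumption of the sketch.

```typescript
// Simplified model of a "retry until it passes" loop.
// All names are hypothetical; this is not Playwright's own code.

interface RetryOptions {
  intervals: number[]; // waits between attempts in ms; in this sketch the last value repeats
  timeoutMs: number;   // overall time budget before giving up
}

// Wait to apply before retry number `retryIndex` (0-based).
// Falls back to 0 when no intervals are configured.
function delayFor(retryIndex: number, intervals: number[]): number {
  return intervals[Math.min(retryIndex, intervals.length - 1)] ?? 0;
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Re-run `attempt` until it stops throwing or the time budget is spent.
// Resolves with the number of attempts that were made.
async function retryUntilPass(
  attempt: () => Promise<void>,
  options: RetryOptions,
): Promise<number> {
  const deadline = Date.now() + options.timeoutMs;
  for (let retryIndex = 0; ; retryIndex++) {
    try {
      await attempt();
      return retryIndex + 1;
    } catch (error) {
      const wait = delayFor(retryIndex, options.intervals);
      // Give up (rethrowing the last failure) if the next wait would exceed the budget.
      if (Date.now() + wait > deadline) throw error;
      await sleep(wait);
    }
  }
}
```

In a spec file, the equivalent would wrap the import request and its assertions in `expect(async () => { ... }).toPass({ intervals, timeout })`. The interval and timeout values actually used in `crowdsec-import.spec.ts` are not stated here, so any concrete numbers would be placeholders.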
## Results
- **Stability**: The "Empty State OR Table" flake in certificates is resolved.
- **Reliability**: CrowdSec import tests now handle transient backend states gracefully.
- **Performance**: CI jobs now complete within the allocated time budget with balanced shards.
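
The certificates race resolved above comes down to checking the page once versus polling until either target appears. The sketch below is a Playwright-free model of that difference, under the assumption that the page passes through a loading state before settling; `View`, `Timeline`, `checkOnce`, and `pollUntilEither` are hypothetical names used only for illustration.

```typescript
// Playwright-free model of the "empty state OR table" race.
// All names are hypothetical and exist only for this illustration.

type View = "loading" | "empty-state" | "table";

// What the page shows at each successive poll tick.
type Timeline = View[];

const isSettled = (view: View): boolean =>
  view === "empty-state" || view === "table";

// Old pattern: inspect the page exactly once, like a bare `isVisible()` call,
// which reports the current state without waiting.
function checkOnce(timeline: Timeline): boolean {
  return timeline.slice(0, 1).some(isSettled);
}

// New pattern: keep polling until either target shows up or the ticks run out,
// in the spirit of `expect(locatorA.or(locatorB)).toBeVisible()`.
function pollUntilEither(timeline: Timeline, maxTicks: number): boolean {
  return timeline.slice(0, maxTicks).some(isSettled);
}

// Two example page histories: one ends in a table, one in the empty state.
const slowLoad: Timeline = ["loading", "loading", "table"];
const emptyLoad: Timeline = ["loading", "empty-state"];
```

With these timelines, `checkOnce(slowLoad)` reports failure because the first look lands on the loading state, while `pollUntilEither(slowLoad, 5)` and `pollUntilEither(emptyLoad, 5)` both succeed regardless of which branch the page ends up in.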