CI Remediation Summary
Date: February 5, 2026
Task: Stabilize E2E testing pipeline and fix workflow timeouts.
Problem
The end-to-end (E2E) testing pipeline was experiencing significant instability, characterized by:
- Workflow Timeouts: Shard 4 was consistently timing out (>20 minutes), obstructing the CI process.
- Missing Dependencies: Security jobs for Firefox and WebKit were failing because they lacked the required Chromium dependency.
- Flaky Tests:
  - certificates.spec.ts failed intermittently due to race conditions when ensuring either an empty state or a table was visible.
  - crowdsec-import.spec.ts failed due to transient locks on the backend API.
Solution
Workflow Optimization
- Shard Rebalancing: Reduced the number of shards from 4 to 3. This seemingly counter-intuitive move rebalanced the test load, preventing the specific bottlenecks that were causing Shard 4 to hang.
- Dependency Fix: Explicitly added the Chromium installation step to Firefox and WebKit security jobs to ensure all shared test utilities function correctly.
Test Logic Improvements
- Robust Empty State Detection: Replaced fragile boolean checks with Playwright's .or() locator pattern.
  - Old: isVisible().catch() (Bypassed auto-waits, led to race conditions)
  - New: expect(locatorA.or(locatorB)).toBeVisible() (Leverages built-in retry logic)
- Resilient API Retries: Implemented .toPass() for the CrowdSec import test.
  - This allows the test to automatically retry the import request, backing off between attempts, if the backend is temporarily locked or busy, significantly reducing flakes.
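Both test fixes replace a one-shot check with an assertion that is retried until it passes or a deadline expires. The sketch below models that behavior in plain TypeScript so it can be run without a browser. It is not Playwright's implementation: retryUntilPass, makeSlowPage, makeLockedBackend, the 423 status code, and the interval values are all hypothetical names and numbers chosen for illustration. In the real specs the equivalent calls are expect(locatorA.or(locatorB)).toBeVisible() and expect(async () => { ... }).toPass().

```typescript
// Illustrative model of "retry until the assertion passes".
// NOT Playwright's implementation; all names and defaults are assumptions.

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

/**
 * Runs `check` until it stops throwing or `timeoutMs` would be exceeded.
 * Waits intervals[0], intervals[1], ... between attempts and reuses the
 * last interval once the list is exhausted. Resolves with the attempt count.
 */
async function retryUntilPass(
  check: () => Promise<void>,
  intervals: number[] = [100, 250, 500, 1000], // assumed defaults
  timeoutMs = 5000,
): Promise<number> {
  const deadline = Date.now() + timeoutMs;
  let attempt = 0;
  for (;;) {
    attempt += 1;
    try {
      await check();
      return attempt;
    } catch (err) {
      const wait = intervals[Math.min(attempt - 1, intervals.length - 1)] ?? 0;
      if (Date.now() + wait > deadline) throw err; // give up with the last failure
      await sleep(wait);
    }
  }
}

interface Snapshot {
  emptyStateVisible: boolean;
  tableVisible: boolean;
}

// Hypothetical page that only shows content after a number of probes,
// standing in for a render delay.
function makeSlowPage(probesUntilRendered: number, hasRows: boolean): () => Snapshot {
  let probes = 0;
  return () => {
    probes += 1;
    const rendered = probes > probesUntilRendered;
    return { emptyStateVisible: rendered && !hasRows, tableVisible: rendered && hasRows };
  };
}

// Retrying analogue of "empty state OR table is visible".
function expectEitherVisible(probe: () => Snapshot, intervals: number[], timeoutMs: number): Promise<number> {
  return retryUntilPass(async () => {
    const s = probe();
    if (!s.emptyStateVisible && !s.tableVisible) {
      throw new Error("neither empty state nor table is visible yet");
    }
  }, intervals, timeoutMs);
}

// Hypothetical backend that reports "locked" for the first few import calls.
function makeLockedBackend(lockedCalls: number): () => Promise<number> {
  let calls = 0;
  return async () => {
    calls += 1;
    return calls <= lockedCalls ? 423 : 200; // status codes are illustrative
  };
}

// Retrying analogue of wrapping the import request in a retried assertion.
function importWithRetry(importConfig: () => Promise<number>, intervals: number[], timeoutMs: number): Promise<number> {
  return retryUntilPass(async () => {
    const status = await importConfig();
    if (status !== 200) throw new Error(`import returned ${status}`);
  }, intervals, timeoutMs);
}
```

A single probe of makeSlowPage(2, true) reports nothing visible, which is the situation the old isVisible() check raced against, while expectEitherVisible keeps probing until one of the two states appears. The same helper lets importWithRetry ride out a backend that is locked for its first few calls.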
Results
- Stability: The "Empty State OR Table" flake in certificates is resolved.
- Reliability: CrowdSec import tests now handle transient backend states gracefully.
- Performance: CI jobs now complete within the allocated time budget with balanced shards.