Charon/docs/plans/current_spec.md.backup
GitHub Actions f19632cdf8 fix(tests): enhance system settings tests with feature flag propagation and retry logic
- Added initial feature flag state verification before tests to ensure a stable starting point.
- Implemented retry logic with exponential backoff for toggling feature flags, improving resilience against transient failures.
- Introduced `waitForFeatureFlagPropagation` utility to replace hard-coded waits with condition-based verification for feature flag states.
- Added advanced test scenarios for handling concurrent toggle operations and retrying on network failures.
- Updated existing tests to utilize the new retry and propagation utilities for better reliability and maintainability.
2026-02-02 01:14:46 +00:00


# Playwright E2E Test Timeout Fix - Feature Flags Endpoint
## 1. Introduction
### Overview
This plan addresses timeout failures in the Playwright E2E tests for the feature flags endpoint (`/feature-flags`), which occur consistently in CI environments. The tests in `tests/settings/system-settings.spec.ts` time out while waiting for API responses during feature toggle operations.
### Problem Statement
Four tests are timing out in CI:
1. `should toggle Cerberus security feature`
2. `should toggle CrowdSec console enrollment`
3. `should toggle uptime monitoring`
4. `should persist feature toggle changes`
All tests follow the same pattern:
- Click toggle → Wait for PUT `/feature-flags` (currently 15s timeout)
- Wait for subsequent GET `/feature-flags` (currently 10s timeout)
- Both operations frequently exceed their timeouts in CI
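
The two-step wait above is effectively a race between each response and a timer. The following is a minimal sketch of that race, not the code in `system-settings.spec.ts`: `withTimeout`, `fakeResponse`, and `toggleAndWait` are hypothetical names, and the simulated latencies stand in for real network responses.

```typescript
// Hypothetical helper: rejects if `work` has not settled within `timeoutMs`.
function withTimeout<T>(work: Promise<T>, timeoutMs: number, label: string): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${timeoutMs}ms`)),
      timeoutMs
    );
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (error) => { clearTimeout(timer); reject(error); }
    );
  });
}

// Stand-in for a slow API call; the real tests wait on network responses instead.
function fakeResponse(status: number, latencyMs: number): Promise<number> {
  return new Promise<number>((resolve) => setTimeout(() => resolve(status), latencyMs));
}

// Mirrors the pattern: wait for the PUT first, then for the follow-up GET.
// Returns "ok" or the message of whichever wait timed out.
async function toggleAndWait(
  putLatencyMs: number,
  getLatencyMs: number,
  putTimeoutMs: number,
  getTimeoutMs: number
): Promise<string> {
  try {
    await withTimeout(fakeResponse(200, putLatencyMs), putTimeoutMs, "PUT /feature-flags");
    await withTimeout(fakeResponse(200, getLatencyMs), getTimeoutMs, "GET /feature-flags");
    return "ok";
  } catch (error) {
    return (error as Error).message;
  }
}
```

Because the waits run back to back, a slow PUT or a slow GET is enough on its own to fail the test, even when the other step finishes quickly.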
### Root Cause Analysis
The timeout failures are caused by three compounding factors:
1. **Backend N+1 Query Pattern** (PRIMARY)
   - `GetFlags()` makes 3 separate SQLite queries (one per feature flag)
   - `UpdateFlags()` makes additional individual queries per flag
   - Each toggle operation requires: 3 queries (PUT) + 3 queries (GET) = 6 DB operations minimum
2. **CI Environment Characteristics**
   - Slower disk I/O compared to local development
   - SQLite on CI runners lacks shared memory optimizations
   - No database query caching layer
   - Sequential query execution compounds latency
3. **Test Pattern Amplification**
   - Tests explicitly set lower timeouts (15s, 10s) than helper defaults (30s)
   - Immediate GET after PUT doesn't allow for state propagation
   - No retry logic for transient failures
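
To make the query-count arithmetic in item 1 concrete, here is a small in-memory illustration. It is a sketch under stated assumptions, not the backend code: `CountingStore`, the flag key names, and both lookup functions are hypothetical, and the store only counts calls rather than talking to SQLite.

```typescript
type FlagRows = Record<string, boolean>;

// In-memory stand-in for the settings table that counts every query it serves.
class CountingStore {
  queryCount = 0;
  private rows: FlagRows;

  constructor(rows: FlagRows) {
    this.rows = rows;
  }

  // One query per key: the per-flag pattern described for GetFlags().
  getOne(key: string): boolean | undefined {
    this.queryCount += 1;
    return this.rows[key];
  }

  // One query for all keys, comparable to a single `WHERE key IN (...)` lookup.
  getMany(keys: string[]): FlagRows {
    this.queryCount += 1;
    const result: FlagRows = {};
    for (const key of keys) {
      if (this.rows[key] !== undefined) result[key] = this.rows[key];
    }
    return result;
  }
}

// Hypothetical key names, chosen only for illustration.
const FLAG_KEYS = ["cerberus", "crowdsec_console", "uptime"];

// Issues one query per flag, so the cost grows with the number of flags.
function getFlagsPerKey(store: CountingStore): FlagRows {
  const flags: FlagRows = {};
  for (const key of FLAG_KEYS) flags[key] = store.getOne(key) ?? false;
  return flags;
}

// Issues a single query regardless of how many flags exist.
function getFlagsBatched(store: CountingStore): FlagRows {
  const found = store.getMany(FLAG_KEYS);
  const flags: FlagRows = {};
  for (const key of FLAG_KEYS) flags[key] = found[key] ?? false;
  return flags;
}
```

With three flags, `getFlagsPerKey` costs three queries per read while `getFlagsBatched` costs one. Batching the update path the same way would bring a toggle from the six operations counted above down to two.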
### Objectives
1. **Immediate**: Increase timeouts and add strategic waits to fix CI failures
2. **Short-term**: Improve test reliability with better wait strategies
3. **Long-term**: Document backend performance optimization opportunities
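
The "better wait strategies" in objective 2 can be sketched as two small helpers: one that retries a failing action with exponential backoff, and one that polls until a condition holds instead of sleeping for a fixed time. This is an illustrative sketch only. `retryWithBackoff`, `pollUntil`, and their default values are assumptions rather than the project's actual utilities, and the injectable `sleep` parameter exists only to keep the helpers easy to test.

```typescript
type Sleep = (ms: number) => Promise<void>;
const realSleep: Sleep = (ms) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Retries `action` up to `maxAttempts` times, doubling the delay after each failure.
// Rethrows the last error once all attempts are used up.
async function retryWithBackoff<T>(
  action: () => Promise<T>,
  maxAttempts: number = 3,
  baseDelayMs: number = 500,
  sleep: Sleep = realSleep
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await action();
    } catch (error) {
      lastError = error;
      // Delays grow as base, 2*base, 4*base, ...; no delay after the final attempt.
      if (attempt < maxAttempts - 1) await sleep(baseDelayMs * Math.pow(2, attempt));
    }
  }
  throw lastError;
}

// Polls `read` until `isDone` accepts its result, or fails once another
// interval would overrun `timeoutMs`.
async function pollUntil<T>(
  read: () => Promise<T>,
  isDone: (value: T) => boolean,
  timeoutMs: number = 30000,
  intervalMs: number = 500,
  sleep: Sleep = realSleep
): Promise<T> {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    const value = await read();
    if (isDone(value)) return value;
    if (Date.now() + intervalMs > deadline) {
      throw new Error(`condition not met within ${timeoutMs}ms`);
    }
    await sleep(intervalMs);
  }
}
```

In a test, the toggle click and PUT wait would go inside `retryWithBackoff`, and a follow-up read of `/feature-flags` would go inside `pollUntil` with a predicate that checks for the expected flag value. That way the test proceeds as soon as the state has propagated and fails only after the full timeout.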