fix(tests): enhance system settings tests with feature flag propagation and retry logic
- Added initial feature flag state verification before tests to ensure a stable starting point.
- Implemented retry logic with exponential backoff for toggling feature flags, improving resilience against transient failures.
- Introduced `waitForFeatureFlagPropagation` utility to replace hard-coded waits with condition-based verification for feature flag states.
- Added advanced test scenarios for handling concurrent toggle operations and retrying on network failures.
- Updated existing tests to utilize the new retry and propagation utilities for better reliability and maintainability.
42
docs/plans/current_spec.md.backup
Normal file
@@ -0,0 +1,42 @@
# Playwright E2E Test Timeout Fix - Feature Flags Endpoint

## 1. Introduction

### Overview
This plan addresses systematic timeout failures in Playwright E2E tests for the feature flags endpoint (`/feature-flags`) occurring consistently in CI environments. The tests in `tests/settings/system-settings.spec.ts` are failing due to timeouts when waiting for API responses during feature toggle operations.

### Problem Statement
Four tests are timing out in CI:
1. `should toggle Cerberus security feature`
2. `should toggle CrowdSec console enrollment`
3. `should toggle uptime monitoring`
4. `should persist feature toggle changes`

All tests follow the same pattern:
- Click toggle → Wait for PUT `/feature-flags` (currently 15s timeout)
- Wait for subsequent GET `/feature-flags` (currently 10s timeout)
- Both operations frequently exceed their timeouts in CI
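
The failure mode in the pattern above can be reproduced in miniature. The sketch below is not the test code from `system-settings.spec.ts`; `withTimeout`, `fakeResponse`, and `toggleAndVerify` are hypothetical names, the API responses are simulated with timers, and the 15s/10s limits are scaled down to 150 ms/100 ms so it runs quickly.

```typescript
// Hypothetical helper: rejects if `op` has not settled within `ms` milliseconds.
function withTimeout<T>(op: Promise<T>, ms: number, label: string): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`${label} exceeded ${ms}ms`)),
      ms,
    );
    op.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (err) => {
        clearTimeout(timer);
        reject(err);
      },
    );
  });
}

// Stand-in for an API response; `latencyMs` simulates backend latency.
function fakeResponse(name: string, latencyMs: number): Promise<string> {
  return new Promise((resolve) => setTimeout(() => resolve(name), latencyMs));
}

// Mirrors the pattern: wait for the PUT, then wait for the subsequent GET.
// Limits are scaled down (15 s -> 150 ms, 10 s -> 100 ms) for illustration.
async function toggleAndVerify(
  putLatencyMs: number,
  getLatencyMs: number,
): Promise<string> {
  await withTimeout(fakeResponse("PUT", putLatencyMs), 150, "PUT /feature-flags");
  await withTimeout(fakeResponse("GET", getLatencyMs), 100, "GET /feature-flags");
  return "ok";
}
```

With both simulated latencies under their limits the call resolves. Once either response is slower than its fixed limit the whole sequence rejects, even though the backend would eventually have answered.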

### Root Cause Analysis
Based on comprehensive research, the timeout failures are caused by:

1. **Backend N+1 Query Pattern** (PRIMARY)
- `GetFlags()` makes 3 separate SQLite queries (one per feature flag)
- `UpdateFlags()` makes additional individual queries per flag
- Each toggle operation requires: 3 queries (PUT) + 3 queries (GET) = 6 DB operations minimum

2. **CI Environment Characteristics**
- Slower disk I/O compared to local development
- SQLite on CI runners lacks shared memory optimizations
- No database query caching layer
- Sequential query execution compounds latency

3. **Test Pattern Amplification**
- Tests explicitly set lower timeouts (15s, 10s) than helper defaults (30s)
- Immediate GET after PUT doesn't allow for state propagation
- No retry logic for transient failures
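
To make the primary root cause concrete, the following sketch counts queries for a per-flag read versus a single batched read. It is an illustration only, written in TypeScript to match the test suite: the real `GetFlags()` and its SQL are not shown in this plan, and `CountingStore`, `FLAG_KEYS`, and the flag key names are assumptions.

```typescript
type Flags = Record<string, boolean>;

// In-memory stand-in for the settings table; counts every query it serves.
class CountingStore {
  queries = 0;
  private rows: Flags;

  constructor(rows: Flags) {
    this.rows = rows;
  }

  // One query per key: the shape the plan attributes to GetFlags().
  selectOne(key: string): boolean {
    this.queries += 1;
    return this.rows[key] ?? false;
  }

  // One query for all keys, comparable to a single `WHERE key IN (...)`.
  selectMany(keys: string[]): Flags {
    this.queries += 1;
    const out: Flags = {};
    for (const key of keys) out[key] = this.rows[key] ?? false;
    return out;
  }
}

// Hypothetical flag keys, named after the three toggles in the failing tests.
const FLAG_KEYS = ["cerberus", "crowdsec_console", "uptime"];

// Reads each flag with its own query (3 queries for 3 flags).
function getFlagsPerKey(store: CountingStore): Flags {
  const out: Flags = {};
  for (const key of FLAG_KEYS) out[key] = store.selectOne(key);
  return out;
}

// Reads all flags with one query.
function getFlagsBatched(store: CountingStore): Flags {
  return store.selectMany(FLAG_KEYS);
}
```

Both functions return the same flag values; only the query count differs (three versus one per read). On a runner with slow disk I/O, that per-query cost is what the sequential pattern multiplies.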

### Objectives
1. **Immediate**: Increase timeouts and add strategic waits to fix CI failures
2. **Short-term**: Improve test reliability with better wait strategies
3. **Long-term**: Document backend performance optimization opportunities
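
As a rough sketch of what the "better wait strategies" could look like, the helpers below retry an action with exponential backoff and poll until the flags reach an expected state. They are not the repository's `waitForFeatureFlagPropagation` utility, whose signature is not shown here; `retryWithBackoff`, `waitForFlagState`, and all default values are assumptions, and the flag reader is passed in as a plain function so the sketch does not depend on Playwright.

```typescript
type FlagState = Record<string, boolean>;

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Retries `action` up to `maxAttempts` times, doubling the delay after each
// failure (baseDelayMs, 2x, 4x, ...). Rethrows the last error if all fail.
async function retryWithBackoff<T>(
  action: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 100,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await action();
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts - 1) {
        await sleep(baseDelayMs * 2 ** attempt);
      }
    }
  }
  throw lastError;
}

// Polls `readFlags` until every key in `expected` matches, instead of
// sleeping for a fixed time. Throws once another poll would pass the deadline.
async function waitForFlagState(
  readFlags: () => Promise<FlagState>,
  expected: FlagState,
  timeoutMs = 30000,
  intervalMs = 500,
): Promise<FlagState> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const current = await readFlags();
    const matches = Object.keys(expected).every(
      (key) => current[key] === expected[key],
    );
    if (matches) return current;
    if (Date.now() + intervalMs > deadline) {
      throw new Error(`flags did not reach expected state within ${timeoutMs}ms`);
    }
    await sleep(intervalMs);
  }
}
```

In a test, the toggle click and its PUT wait would go inside the `action` passed to `retryWithBackoff`, and `readFlags` would wrap a GET to `/feature-flags`. The condition-based wait returns as soon as the state matches, so a fast local run is not slowed down by limits sized for CI.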