chore: Enhance documentation for E2E testing:
- Added clarity and structure to README files, including recent updates and getting started sections.
- Improved manual verification documentation for CrowdSec authentication, emphasizing expected outputs and success criteria.
- Updated debugging guide with detailed output examples and automatic trace capture information.
- Refined best practices for E2E tests, focusing on efficient polling, locator strategies, and state management.
- Documented triage report for DNS Provider feature tests, highlighting issues fixed and test results before and after improvements.
- Revised E2E test writing guide to include when to use specific helper functions and patterns for better test reliability.
- Enhanced troubleshooting documentation with clear resolutions for common issues, including timeout and token configuration problems.
- Updated tests README to provide quick links and best practices for writing robust tests.
@@ -11,11 +11,13 @@
**File**: `tests/settings/system-settings.spec.ts`

**Changes Made**:

1. **Removed** `waitForFeatureFlagPropagation()` call from `beforeEach` hook (lines 35-46)
   - This was causing 10s × 31 tests = 310s of polling overhead per shard
   - Commented out with clear explanation linking to remediation plan

2. **Added** `test.afterEach()` hook with direct API state restoration:

```typescript
test.afterEach(async ({ page }) => {
  await test.step('Restore default feature flag state', async () => {
@@ -34,12 +36,14 @@
```

**Rationale**:

- Tests already verify feature flag state individually after toggle actions
- Initial state verification in `beforeEach` was redundant
- Explicit cleanup in `afterEach` ensures test isolation without polling overhead
- Direct API mutation for state restoration is faster than polling

**Expected Impact**:

- 310s saved per shard (10s × 31 tests)
- Elimination of inter-test dependencies
- No state leakage between tests
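
The direct-restoration idea above can be sketched roughly as follows. This is a minimal illustration, not the hook from the spec file: the flag names, their default values, and the `FlagClient` interface are assumptions made so the example is self-contained. Only the `PUT /api/v1/feature-flags` endpoint comes from this report.

```typescript
// Hypothetical sketch: flag names, defaults, and the FlagClient shape are
// assumptions for illustration, not the project's actual values.
interface FlagResponse {
  ok(): boolean;
  status(): number;
}

interface FlagClient {
  put(url: string, options: { data: Record<string, boolean> }): Promise<FlagResponse>;
}

const ASSUMED_DEFAULT_FLAGS: Record<string, boolean> = {
  'feature.example_a': false,
  'feature.example_b': true,
};

async function restoreDefaultFlags(client: FlagClient): Promise<void> {
  // One direct write and no polling loop afterwards.
  const response = await client.put('/api/v1/feature-flags', {
    data: ASSUMED_DEFAULT_FLAGS,
  });

  // Fail loudly so a broken restore surfaces here rather than in the next test.
  if (!response.ok()) {
    throw new Error(`Flag restore failed with status ${response.status()}`);
  }
}
```

Inside a Playwright `test.afterEach`, `page.request` exposes a `put` method and a response with `ok()` and `status()`, so a call such as `await restoreDefaultFlags(page.request)` should fit this shape.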
@@ -51,12 +55,14 @@
**Changes Made**:

1. **Added module-level cache** for in-flight requests:

```typescript
// Cache for in-flight requests (per-worker isolation)
const inflightRequests = new Map<string, Promise<Record<string, boolean>>>();
```

2. **Implemented cache key generation** with sorted keys and worker isolation:

```typescript
function generateCacheKey(
  expectedFlags: Record<string, boolean>,
@@ -81,6 +87,7 @@
   - Removes promise from cache after completion (success or failure)

4. **Added cleanup function**:

```typescript
export function clearFeatureFlagCache(): void {
  inflightRequests.clear();
@@ -89,16 +96,19 @@
```
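
The in-flight cache described in the steps above can be sketched like this. It is an illustration under assumptions, not the helper from this commit: `coalesce`, `pendingRequests`, and the `fetchFlags` callback are hypothetical names. It shows only the two behaviours the report states, namely that concurrent callers share one promise and that the entry is removed after completion, whether the request succeeds or fails.

```typescript
// Hypothetical sketch of in-flight request coalescing. `coalesce`,
// `pendingRequests`, and `fetchFlags` are illustrative names only.
type FlagSnapshot = Record<string, boolean>;

const pendingRequests = new Map<string, Promise<FlagSnapshot>>();

function coalesce(
  key: string,
  fetchFlags: () => Promise<FlagSnapshot>,
): Promise<FlagSnapshot> {
  const existing = pendingRequests.get(key);
  if (existing) {
    // A request for this key is already running, so share its promise.
    return existing;
  }

  // Start the fetch on a microtask so the cleanup below always runs after
  // the entry has been stored, even if fetchFlags throws synchronously.
  const request = Promise.resolve()
    .then(fetchFlags)
    .finally(() => {
      // Remove the entry on success or failure so later calls refetch.
      pendingRequests.delete(key);
    });

  pendingRequests.set(key, request);
  return request;
}
```

Because the entry is deleted as soon as the request settles, this is de-duplication of concurrent calls rather than a result cache: a later call with the same key triggers a fresh request.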

**Why Sorted Keys?**

- `{a:true, b:false}` and `{b:false, a:true}` are semantically identical
- Without sorting, they generate different cache keys → cache misses
- Sorting ensures a consistent key regardless of property order

**Why Worker Isolation?**

- Playwright workers are separate processes that run in parallel, each with its own browser contexts
- Each worker needs its own cache to avoid state conflicts
- Worker index provides a unique namespace per parallel process
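
A minimal sketch of how sorted keys and a worker namespace can combine into one cache key. The full `generateCacheKey` signature is not visible in this diff, so the function name `sketchCacheKey`, the `workerIndex` parameter, and the key format below are assumptions for illustration.

```typescript
// Hypothetical sketch: the name, the workerIndex parameter, and the key
// format are assumptions, not the project's generateCacheKey.
function sketchCacheKey(
  expectedFlags: Record<string, boolean>,
  workerIndex: number,
): string {
  // Sort property names so insertion order cannot change the key.
  const normalized = Object.keys(expectedFlags)
    .sort()
    .map((name) => `${name}=${expectedFlags[name]}`)
    .join('&');

  // Prefix with the worker index so each parallel worker has its own namespace.
  return `worker-${workerIndex}:${normalized}`;
}

const keyA = sketchCacheKey({ a: true, b: false }, 0);
const keyB = sketchCacheKey({ b: false, a: true }, 0);
const keyOtherWorker = sketchCacheKey({ a: true, b: false }, 1);
// keyA and keyB are both "worker-0:a=true&b=false";
// keyOtherWorker is "worker-1:a=true&b=false".
```

In a Playwright test the index would typically come from `test.info().workerIndex`.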

**Expected Impact**:

- 30-40% reduction in duplicate API calls (revised from original 70-80% estimate)
- Cache hit rate should be >30% based on similar flag state checks
- Reduced API server load during parallel test execution
@@ -108,21 +118,26 @@
**Status**: Partially Investigated

**Issue**:

- Test: `tests/dns-provider-types.spec.ts` (line 260)
- Symptom: Label locator `/script.*path/i` passes in Chromium, fails in Firefox/WebKit
- Test code:

```typescript
const scriptField = page.getByLabel(/script.*path/i);
await expect(scriptField).toBeVisible({ timeout: 10000 });
```

**Investigation Steps Completed**:

1. ✅ Confirmed E2E environment is running and healthy
2. ✅ Attempted to run DNS provider type tests in Chromium
3. ⏸️ Further investigation deferred due to test execution issues

**Investigation Steps Remaining** (per spec):

1. Run with Playwright Inspector to compare accessibility trees:

```bash
npx playwright test tests/dns-provider-types.spec.ts --project=chromium --headed --debug
npx playwright test tests/dns-provider-types.spec.ts --project=firefox --headed --debug
@@ -137,6 +152,7 @@
5. If not fixable: Use the helper function approach from Phase 2

**Recommendation**:

- Complete investigation in separate session with headed browser mode
- DO NOT add `.or()` chains unless investigation proves it's necessary
- Create formal Decision Record once root cause is identified
@@ -144,31 +160,37 @@
## Validation Checkpoints

### Checkpoint 1: Execution Time

**Status**: ⏸️ In Progress

**Target**: <15 minutes (900s) for full test suite

**Command**:

```bash
time npx playwright test tests/settings/system-settings.spec.ts --project=chromium
```

**Results**:

- Test execution interrupted during validation
- Observed: Tests were picking up multiple spec files from the `security/` folder
- Need to investigate test file patterns or run with more specific filtering

**Action Required**:

- Re-run with corrected test file path or filtering
- Ensure only system-settings tests are executed
- Measure execution time and compare to baseline

### Checkpoint 2: Test Isolation

**Status**: ⏳ Pending

**Target**: All tests pass with `--repeat-each=5 --workers=4`

**Command**:

```bash
npx playwright test tests/settings/system-settings.spec.ts --project=chromium --repeat-each=5 --workers=4
```
@@ -176,11 +198,13 @@ npx playwright test tests/settings/system-settings.spec.ts --project=chromium --
**Status**: Not executed yet

### Checkpoint 3: Cross-browser

**Status**: ⏳ Pending

**Target**: Firefox/WebKit pass rate >85%

**Command**:

```bash
npx playwright test tests/settings/system-settings.spec.ts --project=firefox --project=webkit
```
@@ -188,11 +212,13 @@ npx playwright test tests/settings/system-settings.spec.ts --project=firefox --p
**Status**: Not executed yet

### Checkpoint 4: DNS provider tests (secondary issue)

**Status**: ⏳ Pending

**Target**: Firefox tests pass or investigation complete

**Command**:

```bash
npx playwright test tests/dns-provider-types.spec.ts --project=firefox
```
@@ -204,11 +230,13 @@ npx playwright test tests/dns-provider-types.spec.ts --project=firefox
### Decision: Use Direct API Mutation for State Restoration

**Context**:

- Tests need to restore default feature flag state after modifications
- Original approach used polling-based verification in `beforeEach`
- Alternative approaches: polling in `afterEach` vs direct API mutation

**Options Evaluated**:

1. **Polling in afterEach** - Verify state propagated after mutation
   - Pros: Confirms state is actually restored
   - Cons: Adds 500ms-2s per test (polling overhead)
@@ -219,12 +247,14 @@ npx playwright test tests/dns-provider-types.spec.ts --project=firefox
   - Why chosen: Feature flag updates are synchronous in backend

**Rationale**:

- Feature flag updates via `PUT /api/v1/feature-flags` are processed synchronously
- Database write is immediate (SQLite WAL mode)
- No async propagation delay in single-process test environment
- Subsequent tests will verify state on first read, catching any issues

**Impact**:

- Test runtime reduced by roughly 15-62s per test file (31 tests × 500ms-2s polling)
- Risk: If state restoration fails, next test will fail loudly (detectable)
- Acceptable trade-off for 10-20% execution time improvement
@@ -234,15 +264,18 @@ npx playwright test tests/dns-provider-types.spec.ts --project=firefox
### Decision: Cache Key Sorting for Semantic Equality

**Context**:

- Multiple tests may check the same feature flag state but with different property order
- Without normalization, `{a:true, b:false}` and `{b:false, a:true}` generate different keys

**Rationale**:

- JavaScript objects preserve insertion order for string keys, but semantically these are identical states
- Sorting keys ensures cache hits for semantically identical flag states
- Minimal performance cost (~1ms for sorting 3-5 keys)

**Impact**:

- Estimated 10-15% cache hit rate improvement
- No downside - pure optimization