Restructure 7 modal components to use 3-layer architecture preventing
native select dropdown menus from being blocked by modal overlays.
Components fixed:
- ProxyHostForm: ACL selector and Security Headers dropdowns
- User management: Role and permission mode selection
- Uptime monitors: Monitor type selection (HTTP/TCP)
- Remote servers: Provider selection dropdown
- CrowdSec: IP ban duration selection
The fix separates modal background overlay (z-40) from form container
(z-50) and enables pointer events only on form content, allowing
native dropdown menus to render above all modal layers.
Resolves user inability to select security policies, user roles,
monitor types, and other critical configuration options through
the UI interface.
Investigation Phase:
Problem:
- Tests hang AFTER global setup completes
- No test execution begins (hung before first test)
- Step timeout (15min) doesn't trigger properly
- Job timeout (45min) eventually kills process after 44min
Changes:
1. Added DEBUG=pw:api to all browser jobs
- Will show exact Playwright API calls
- Pinpoint where execution hangs (auth setup vs browser launch vs test init)
2. Reduced job timeout: 45min → 20min
- Fail faster when tests hang
- Reduces wasted CI resources
- Still allows normal test execution (local: 1.2min)
Expected Outcome:
- Verbose logs reveal hang location
- Faster feedback loop (20min vs 44min)
- Can identify if issue is:
* auth.setup.ts hanging
* Browser process not launching
* Connection issues to application
Next Steps Based on Logs:
- If browser launch hangs: Add dumb-init (Phase 3)
- If auth setup hangs: Investigate cookie/storage state
- If network hangs: Add localhost loopback routing
Phase: 2.5 of 3 (Diagnostic Logging)
See: docs/plans/ci_hang_remediation.md
Resource Constraint Management:
Problem:
- Tests hanging indefinitely during execution in CI
- 2-core runners resource-constrained vs local dev machines
- No timeout enforcement allows tests to run forever
Changes:
1. playwright.config.js:
- Reduced per-test timeout: 90s → 60s (CI only)
- Comment clarifies CI resource constraints
- Local dev keeps 90s for debugging
2. .github/workflows/e2e-tests-split.yml:
- Added timeout-minutes: 15 to all test steps
- Ensures CI fails explicitly after 15 minutes
- Prevents workflow hanging until 6-hour GitHub limit
Expected Outcome:
- Tests fail fast with timeout error instead of hanging
- Clearer debugging: timeout vs hang vs test failure
- CI resources freed up faster for other jobs
Phase: 2 of 3 (Resource Constraints)
See: docs/plans/ci_hang_remediation.md
- Add curl retry mechanism (3 attempts) for GeoIP database download
- Add 30-second timeout to prevent hanging on network issues
- Create placeholder file if download fails or checksum mismatches
- Allows Docker build to complete even when external database unavailable
- GeoIP feature remains optional - users can provide own database at runtime
Fixes security-weekly-rebuild workflow failures