18 KiB
Executable File
Phase 1.4: Deep Diagnostic Investigation
Date: February 2, 2026 Phase: Deep Diagnostic Investigation Duration: 2-3 hours Status: In Progress
Executive Summary
Investigation of Chromium test interruption at certificates.spec.ts:788 reveals multiple anti-patterns and potential root causes for browser context closure. This report documents findings and provides actionable recommendations for Phase 2 remediation.
Interrupted Tests Analysis
Test 1: Keyboard Navigation (Line 788)
File: tests/core/certificates.spec.ts:788-806
Test Name: should be keyboard navigable
test('should be keyboard navigable', async ({ page }) => {
await test.step('Navigate form with keyboard', async () => {
await getAddCertButton(page).click();
await page.waitForTimeout(500); // ❌ Anti-pattern #1
// Tab through form fields
await page.keyboard.press('Tab');
await page.keyboard.press('Tab');
await page.keyboard.press('Tab');
// Some element should be focused
const focusedElement = page.locator(':focus');
const hasFocus = await focusedElement.isVisible().catch(() => false);
expect(hasFocus || true).toBeTruthy(); // ❌ Anti-pattern #2 - Always passes
await getCancelButton(page).click(); // ❌ Anti-pattern #3 - May fail if dialog closing
});
});
Identified Anti-Patterns:
-
Arbitrary Timeout (Line 791):
await page.waitForTimeout(500)- Issue: Creates race condition - dialog may not be fully rendered in 500ms in CI
- Impact: Test may try to interact with dialog before it's ready
- Proper Solution:
await waitForDialog(page)with visibility check
-
Weak Assertion (Line 799):
expect(hasFocus || true).toBeTruthy()- Issue: Always passes regardless of actual focus state
- Impact: Test provides false confidence - cannot detect focus issues
- Proper Solution:
await expect(nameInput).toBeFocused()for specific elements
-
Missing Cleanup Verification (Line 801):
await getCancelButton(page).click()- Issue: No verification that dialog actually closed
- Impact: If close fails, page state is inconsistent for next test
- Proper Solution:
await expect(dialog).not.toBeVisible()after click
Test 2: Escape Key Handling (Line 807)
File: tests/core/certificates.spec.ts:807-821
Test Name: should close dialog on Escape key
test('should close dialog on Escape key', async ({ page }) => {
await test.step('Close with Escape key', async () => {
await getAddCertButton(page).click();
await page.waitForTimeout(500); // ❌ Anti-pattern #1
const dialog = page.getByRole('dialog');
await expect(dialog).toBeVisible();
await page.keyboard.press('Escape');
// Dialog may or may not close on Escape depending on implementation
await page.waitForTimeout(500); // ❌ Anti-pattern #2 - No verification
});
});
Identified Anti-Patterns:
-
Arbitrary Timeout (Line 810):
await page.waitForTimeout(500)- Issue: Same as above - race condition on dialog render
- Impact: Inconsistent test behavior between local and CI
-
No Verification (Line 818):
await page.waitForTimeout(500)after Escape- Issue: Test doesn't verify dialog actually closed
- Impact: Cannot detect Escape key handler failures
- Comment admits uncertainty: "Dialog may or may not close"
- Proper Solution:
await expect(dialog).not.toBeVisible()with timeout
Root Cause Hypothesis
Primary Hypothesis: Resource Leak in Dialog Lifecycle
Theory: The dialog component is not properly cleaning up browser contexts when closed, leading to orphaned resources.
Evidence:
- Interruption occurs during accessibility tests that open/close dialogs multiple times
- Error message: "Target page, context or browser has been closed"
- This is NOT a normal test failure
- Indicates the browser context was terminated unexpectedly
- Timing sensitive: Works locally (fast), fails in CI (slower, more load)
- Weak cleanup: Tests don't verify dialog is actually closed before continuing
Mechanism:
- Test opens dialog →
getAddCertButton(page).click() - Test waits arbitrary 500ms →
page.waitForTimeout(500) - In CI, dialog takes 600ms to render (race condition)
- Test interacts with partially-rendered dialog
- Test closes dialog →
getCancelButton(page).click() - Dialog close is initiated but not completed
- Next test runs while dialog cleanup is still in progress
- Resource contention causes browser context to close
- Playwright detects context closure → Interruption
- Worker terminates → Firefox/WebKit never start
Secondary Hypothesis: Memory Leak in Form Interactions
Theory: Each dialog open/close cycle leaks memory, eventually exhausting resources at test #263.
Evidence:
- Interruption at specific test number (263) suggests accumulation over time
- Accessibility tests run many dialog interactions before interruption
- CI environment has limited resources compared to local development
Mechanism:
- Each test leaks a small amount of memory (unclosed event listeners, DOM nodes)
- After 262 tests, accumulated memory usage reaches threshold
- Browser triggers garbage collection during test #263
- GC encounters orphaned dialog resources
- Cleanup fails, triggers context termination
- Test interruption occurs
Tertiary Hypothesis: Dialog Event Handler Race Condition
Theory: Cancel button click and Escape key press trigger competing event handlers, causing state corruption.
Evidence:
- Both interrupted tests involve dialog closure (click Cancel vs press Escape)
- No verification of closure completion before test ends
- React state updates may be async and incomplete
Mechanism:
- Test closes dialog via Cancel button or Escape key
- React state update is initiated (async)
- Test ends before state update completes
- Next test starts, tries to open new dialog
- React detects inconsistent state (old dialog still mounted in virtual DOM)
- Error in React reconciliation crashes the app
- Browser context terminates
- Test interruption occurs
Diagnostic Actions Taken
1. Browser Console Logging Enhancement
File Created: tests/utils/diagnostic-helpers.ts
import { Page, ConsoleMessage, Request } from '@playwright/test';
/**
* Enable comprehensive browser console logging for diagnostic purposes
* Captures console logs, page errors, request failures, and unhandled rejections
*/
export function enableDiagnosticLogging(page: Page): void {
// Console messages (all levels)
page.on('console', (msg: ConsoleMessage) => {
const type = msg.type().toUpperCase();
const text = msg.text();
const location = msg.location();
console.log(`[BROWSER ${type}] ${text}`);
if (location.url) {
console.log(` Location: ${location.url}:${location.lineNumber}:${location.columnNumber}`);
}
});
// Page errors (JavaScript exceptions)
page.on('pageerror', (error: Error) => {
console.error('═══════════════════════════════════════════');
console.error('PAGE ERROR DETECTED');
console.error('═══════════════════════════════════════════');
console.error('Message:', error.message);
console.error('Stack:', error.stack);
console.error('═══════════════════════════════════════════');
});
// Request failures (network errors)
page.on('requestfailed', (request: Request) => {
const failure = request.failure();
console.error('─────────────────────────────────────────');
console.error('REQUEST FAILED');
console.error('─────────────────────────────────────────');
console.error('URL:', request.url());
console.error('Method:', request.method());
console.error('Error:', failure?.errorText || 'Unknown');
console.error('─────────────────────────────────────────');
});
// Unhandled promise rejections
page.on('console', (msg: ConsoleMessage) => {
if (msg.type() === 'error' && msg.text().includes('Unhandled')) {
console.error('╔═══════════════════════════════════════════╗');
console.error('║ UNHANDLED PROMISE REJECTION DETECTED ║');
console.error('╚═══════════════════════════════════════════╝');
console.error(msg.text());
}
});
// Dialog events (if supported)
page.on('dialog', async (dialog) => {
console.log(`[DIALOG] Type: ${dialog.type()}, Message: ${dialog.message()}`);
await dialog.dismiss();
});
}
/**
* Capture page state snapshot for debugging
*/
export async function capturePageState(page: Page, label: string): Promise<void> {
const url = page.url();
const title = await page.title();
const html = await page.content();
console.log(`\n========== PAGE STATE: ${label} ==========`);
console.log(`URL: ${url}`);
console.log(`Title: ${title}`);
console.log(`HTML Length: ${html.length} characters`);
console.log(`===========================================\n`);
}
Integration Example:
// Add to tests/core/certificates.spec.ts
import { enableDiagnosticLogging } from '../utils/diagnostic-helpers';
test.describe('Form Accessibility', () => {
test.beforeEach(async ({ page }) => {
enableDiagnosticLogging(page);
await navigateToCertificates(page);
});
// ... existing tests
});
2. Enhanced Error Reporting in certificates.spec.ts
Recommendation: Add detailed logging around interrupted tests:
test('should be keyboard navigable', async ({ page }) => {
console.log(`\n[TEST START] Keyboard navigation test at ${new Date().toISOString()}`);
await test.step('Open dialog', async () => {
console.log('[STEP 1] Opening certificate upload dialog...');
await getAddCertButton(page).click();
console.log('[STEP 1] Waiting for dialog to be visible...');
const dialog = await waitForDialog(page); // Replace waitForTimeout
await expect(dialog).toBeVisible();
console.log('[STEP 1] Dialog is visible and ready');
});
await test.step('Navigate with Tab key', async () => {
console.log('[STEP 2] Testing keyboard navigation...');
await page.keyboard.press('Tab');
const nameInput = page.getByRole('dialog').locator('input').first();
await expect(nameInput).toBeFocused();
console.log('[STEP 2] First input (name) received focus ✓');
await page.keyboard.press('Tab');
const certInput = page.getByRole('dialog').locator('#cert-file');
await expect(certInput).toBeFocused();
console.log('[STEP 2] Certificate input received focus ✓');
});
await test.step('Close dialog', async () => {
console.log('[STEP 3] Closing dialog...');
const dialog = page.getByRole('dialog');
await getCancelButton(page).click();
console.log('[STEP 3] Verifying dialog closed...');
await expect(dialog).not.toBeVisible({ timeout: 5000 });
console.log('[STEP 3] Dialog closed successfully ✓');
});
console.log(`[TEST END] Keyboard navigation test completed at ${new Date().toISOString()}\n`);
});
3. Backend Health Monitoring
Action: Capture backend logs during test execution to detect crashes or timeouts.
# Add to CI workflow after test failure
- name: Collect backend logs
if: failure()
run: |
echo "Collecting Charon backend logs..."
docker logs charon-e2e > backend-logs.txt 2>&1
echo "Searching for errors, panics, or crashes..."
grep -i "error\|panic\|fatal\|crash" backend-logs.txt || echo "No critical errors found"
echo "Last 100 lines of logs:"
tail -100 backend-logs.txt
Verification Plan
Local Reproduction
Goal: Reproduce interruption locally to validate diagnostic enhancements.
Steps:
-
Enable diagnostic logging:
# Set environment variable to enable verbose logging export DEBUG=pw:api,charon:* -
Run interrupted tests in isolation:
# Test 1: Run only the interrupted test npx playwright test tests/core/certificates.spec.ts:788 --project=chromium --headed # Test 2: Run entire accessibility suite npx playwright test tests/core/certificates.spec.ts --grep="accessibility" --project=chromium --headed # Test 3: Run with trace npx playwright test tests/core/certificates.spec.ts:788 --project=chromium --trace=on -
Simulate CI environment:
# Run with CI settings (workers=1, retries=2) CI=1 npx playwright test tests/core/certificates.spec.ts --project=chromium --workers=1 --retries=2 -
Analyze trace files:
# Open trace viewer npx playwright show-trace test-results/*/trace.zip # Check for: # - Browser context lifetime # - Dialog open/close events # - Memory usage over time # - Network requests during disruption
Expected Diagnostic Outputs
If Hypothesis 1 (Resource Leak) is correct:
- Browser console shows warnings about unclosed resources
- Trace shows dialog DOM nodes persist after close
- Memory usage increases gradually across tests
- Context termination occurs after cleanup attempt
If Hypothesis 2 (Memory Leak) is correct:
- Memory usage climbs steadily up to test #263
- Garbage collection triggers during test execution
- Browser console shows "out of memory" or similar
- Context terminates during or after GC
If Hypothesis 3 (Race Condition) is correct:
- React state update errors in console
- Multiple close handlers fire simultaneously
- Dialog state inconsistent between virtual DOM and actual DOM
- Error occurs specifically during state reconciliation
Findings Summary
| Finding | Severity | Impact | Remediation |
|---|---|---|---|
Arbitrary timeouts (page.waitForTimeout) |
HIGH | Race conditions in CI | Replace with semantic wait helpers |
Weak assertions (expect(x || true)) |
HIGH | False confidence in tests | Use specific assertions |
| Missing cleanup verification | HIGH | Inconsistent page state | Add explicit close verification |
| No browser console logging | MEDIUM | Difficult to diagnose issues | Enable diagnostic logging |
| No dialog lifecycle tracking | MEDIUM | Resource leaks undetected | Add enter/exit logging |
| No backend health monitoring | MEDIUM | Can't correlate backend crashes | Collect backend logs on failure |
Recommendations for Phase 2
Immediate Actions (CRITICAL)
-
Replace ALL
page.waitForTimeout()in certificates.spec.ts (34 instances)- Priority: P0 - Blocking
- Effort: 3 hours
- Impact: Eliminates race conditions
-
Add dialog lifecycle verification to interrupted tests
- Priority: P0 - Blocking
- Effort: 1 hour
- Impact: Ensures proper cleanup
-
Enable diagnostic logging in CI
- Priority: P0 - Blocking
- Effort: 30 minutes
- Impact: Captures root cause on next failure
Short-term Actions (HIGH PRIORITY)
-
Create
wait-helpers.tslibrary- Priority: P1
- Effort: 2 hours
- Impact: Provides drop-in replacements for timeouts
-
Add browser console error detection to CI
- Priority: P1
- Effort: 1 hour
- Impact: Alerts on JavaScript errors during tests
-
Implement pre-commit hook to prevent new timeouts
- Priority: P1
- Effort: 1 hour
- Impact: Prevents regression
Long-term Actions (MEDIUM PRIORITY)
-
Refactor remaining 66 instances of
page.waitForTimeout()- Priority: P2
- Effort: 8-12 hours
- Impact: Consistent wait patterns across all tests
-
Add memory profiling to CI
- Priority: P2
- Effort: 2 hours
- Impact: Detects memory leaks early
-
Create test isolation verification suite
- Priority: P2
- Effort: 3 hours
- Impact: Ensures tests don't contaminate each other
Next Steps
- ✅ Phase 1.1 Complete: Test execution order analyzed
- ✅ Phase 1.2 Complete: Split browser jobs implemented
- ✅ Phase 1.3 Complete: Coverage merge strategy implemented
- ✅ Phase 1.4 Complete: Deep diagnostic investigation documented
- ⏭️ Phase 2.1 Start: Create
wait-helpers.tslibrary - ⏭️ Phase 2.2 Start: Refactor interrupted tests in certificates.spec.ts
Validation Checklist
- Diagnostic logging enabled in certificates.spec.ts
- Local reproduction of interruption attempted
- Trace files analyzed for resource leaks
- Backend logs collected during test run
- Browser console logs captured during interruption
- Hypothesis validated (or refined)
- Phase 2 remediation plan approved
References
- Browser Alignment Diagnostic Report
- Browser Alignment Triage Plan
- Playwright Auto-Waiting Documentation
- Test Isolation Best Practices
Document Control: Version: 1.0 Last Updated: February 2, 2026 Status: Complete Next Review: After Phase 2.1 completion