# GitHub Actions E2E Workflow Diagnostic Report

**Generated**: 2026-01-27
**Investigation**: PR #550 E2E Workflow Trigger Failure
**Branch**: `feature/beta-release`
**Commit**: `436b5f08` ("chore: re-enable security e2e scaffolding and triage gaps")

---

## Executive Summary

### ROOT CAUSE IDENTIFIED ✅

**The E2E workflow DID trigger but created ZERO jobs due to a GitHub Actions path filter edge case.**

**Evidence**:
- Workflow run ID: [21385051330](https://github.com/Wikid82/Charon/actions/runs/21385051330)
- Status: `completed` with `conclusion: failure`
- Jobs created: **0** (empty jobs array)
- Event type: `push`

---

## Phase 0: Context Validation

### PR #550 Details

```json
{
  "author": "Wikid82",
  "isCrossRepository": false,
  "headRefName": "feature/beta-release",
  "baseRefName": "development",
  "state": "OPEN",
  "title": "chore(docker): migrate from Alpine to Debian Trixie base image"
}
```

✅ **NOT a fork PR** - eliminates Hypothesis #3
✅ **Upstream branch** - full permissions available

### Recent Commits on feature/beta-release

```
436b5f08 (HEAD) chore: re-enable security e2e scaffolding and triage gaps
f9f4ebfd fix(e2e): enhance error handling and reporting in E2E tests and workflows
22aee036 fix(ci): resolve E2E test failures - emergency server ports and deterministic ACL disable
00fe63b8 fix(e2e): disable E2E coverage collection and remove Vite dev server for diagnostic purposes
a43086e0 fix(e2e): remove reporter override to enable E2E coverage generation
```

---

## Phase 1: Diagnosis

### Task 1: Workflow Trigger Audit

#### E2E-Related Workflows Identified

| Workflow File | Primary Trigger | Branch Filters | Path Filters |
|--------------|----------------|----------------|--------------|
| `.github/workflows/e2e-tests.yml` | `pull_request`, `push`, `workflow_dispatch` | `main`, `development`, `feature/**` | `frontend/**`, `backend/**`, `tests/**`, `playwright.config.js`, `.github/workflows/e2e-tests.yml` (PR only) |
| `.github/workflows/playwright.yml` | `workflow_run`, `workflow_dispatch` | N/A (depends on Docker Build workflow) | N/A |

#### Critical Discovery: Path Filter Discrepancy

**pull_request paths:**
```yaml
paths:
  - 'frontend/**'
  - 'backend/**'
  - 'tests/**'
  - 'playwright.config.js'
  - '.github/workflows/e2e-tests.yml'  # ✅ PRESENT
```

**push paths:**
```yaml
paths:
  - 'frontend/**'
  - 'backend/**'
  - 'tests/**'
  - 'playwright.config.js'
  # ❌ MISSING: '.github/workflows/e2e-tests.yml'
```

#### Concurrency Configuration

```yaml
concurrency:
  group: e2e-${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
  cancel-in-progress: true
```

- ✅ Properly scoped by workflow + PR/branch
- ✅ Should not affect this case (no concurrent runs detected)
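
To make the scoping concrete, the group expression can be restated as a small JavaScript helper. This is only a sketch: `concurrencyGroup` and the trimmed-down `event` objects are hypothetical, and the workflow name `E2E Tests` is an assumed value for `github.workflow`. It illustrates that a push run and a PR run for the same branch resolve to different groups, so `cancel-in-progress` in one group does not cancel runs in the other.

```javascript
// Hypothetical helper mirroring the `concurrency.group` expression.
// `workflowName` stands in for `github.workflow` (assumed to be "E2E Tests").
function concurrencyGroup(workflowName, event) {
  // In the expression, `a || b` falls back to `b` when `a` is empty,
  // which is the case for push events (they carry no pull_request payload).
  const prNumber = event.pull_request ? event.pull_request.number : undefined;
  return `e2e-${workflowName}-${prNumber || event.ref}`;
}

const prGroup = concurrencyGroup('E2E Tests', {
  pull_request: { number: 550 },
  ref: 'refs/pull/550/merge',
});
const pushGroup = concurrencyGroup('E2E Tests', {
  ref: 'refs/heads/feature/beta-release',
});

console.log(prGroup);   // e2e-E2E Tests-550
console.log(pushGroup); // e2e-E2E Tests-refs/heads/feature/beta-release
```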

---

### Task 2: Workflow Run History Analysis

#### E2E Tests Workflow Runs (feature/beta-release)

| Run ID | Event | Commit | Jobs Created | Conclusion |
|--------|-------|--------|--------------|------------|
| 21385051330 | `push` | 436b5f08 (latest) | **0** ❌ | failure |
| 21385052430 | `pull_request` | Same push (PR sync) | **9** ✅ | success |
| 21381970384 | `pull_request` | Same commit (earlier) | **9** ✅ | failure (test failures) |
| 21381969621 | `push` | f9f4ebfd | **9** ✅ | failure (test failures) |

**Pattern Identified**:
- **pull_request events**: Jobs created successfully
- **push event (436b5f08)**: ZERO jobs created (anomaly)
- **Earlier push events**: Jobs created successfully
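
The pattern can be checked mechanically by separating runs that failed with no jobs from runs that failed because tests failed. The sketch below uses the four rows of the table as plain data; `findPhantomRuns` is a hypothetical helper, and the field names are illustrative rather than the exact shape of the GitHub API response.

```javascript
// Run summaries transcribed from the run history table (job counts only).
const runs = [
  { id: 21385051330, event: 'push', jobs: 0, conclusion: 'failure' },
  { id: 21385052430, event: 'pull_request', jobs: 9, conclusion: 'success' },
  { id: 21381970384, event: 'pull_request', jobs: 9, conclusion: 'failure' },
  { id: 21381969621, event: 'push', jobs: 9, conclusion: 'failure' },
];

// A "phantom" run here means: concluded with failure without creating any job.
// Runs that failed with jobs present are ordinary test failures.
function findPhantomRuns(runList) {
  return runList.filter((run) => run.jobs === 0 && run.conclusion === 'failure');
}

const phantoms = findPhantomRuns(runs);
console.log(phantoms.map((run) => run.id)); // [ 21385051330 ]
```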

#### Files Changed in Commit 436b5f08

**Files matching E2E path filters:**
```
✅ .github/workflows/e2e-tests.yml (modified; matches the pull_request filter only)
✅ playwright.config.js (modified)
✅ tests/fixtures/network.ts (added)
✅ tests/global-setup.ts (modified)
✅ tests/reporters/debug-reporter.ts (added)
✅ tests/utils/debug-logger.ts (added)
✅ tests/utils/test-steps.ts (added)
✅ frontend/src/components/ui/Input.tsx (modified)
✅ frontend/src/pages/Account.tsx (modified)
```

**Verification**: Every listed file matches at least one `pull_request` path filter. Under the `push` filter, every file except `.github/workflows/e2e-tests.yml` matches, so the `push` filter is still satisfied by this commit.
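
The per-event match can be reproduced with a few lines of JavaScript. This is a minimal sketch under stated assumptions: `globToRegExp` and `matchedFiles` are hypothetical helpers, and the matcher handles only literal text, `*` and `**`, which is a subset of the filter syntax GitHub supports. A path filter is satisfied when at least one changed file matches, so the counts show that both event types pass their filter for this commit even though the workflow file itself matches only on `pull_request`.

```javascript
// Simplified matcher: literal text, `*` (no slash) and `**` (any characters).
// GitHub's real filter syntax has more features (for example `?` and `!`).
function globToRegExp(pattern) {
  const source = pattern
    .split('**')
    .map((part) =>
      part
        .split('*')
        .map((literal) => literal.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'))
        .join('[^/]*')
    )
    .join('.*');
  return new RegExp(`^${source}$`);
}

// Returns the files that match at least one pattern.
function matchedFiles(patterns, files) {
  const regexes = patterns.map(globToRegExp);
  return files.filter((file) => regexes.some((re) => re.test(file)));
}

const prPaths = [
  'frontend/**',
  'backend/**',
  'tests/**',
  'playwright.config.js',
  '.github/workflows/e2e-tests.yml',
];
const pushPaths = prPaths.slice(0, 4); // same list without the workflow file

const changedFiles = [
  '.github/workflows/e2e-tests.yml',
  'playwright.config.js',
  'tests/fixtures/network.ts',
  'tests/global-setup.ts',
  'tests/reporters/debug-reporter.ts',
  'tests/utils/debug-logger.ts',
  'tests/utils/test-steps.ts',
  'frontend/src/components/ui/Input.tsx',
  'frontend/src/pages/Account.tsx',
];

const unmatchedOnPush = changedFiles.filter(
  (file) => matchedFiles(pushPaths, [file]).length === 0
);

console.log(matchedFiles(prPaths, changedFiles).length);   // 9
console.log(matchedFiles(pushPaths, changedFiles).length); // 8
console.log(unmatchedOnPush); // [ '.github/workflows/e2e-tests.yml' ]
```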

---

### Task 3: Commit Message & Skip Conditions

**Commit Message**: `chore: re-enable security e2e scaffolding and triage gaps`

- ⚠️ Starts with `chore:` prefix
- ❌ No `[skip ci]`, `[ci skip]`, or similar patterns detected
- ⚠️ Commit author: `GitHub Actions` (automated commit from previous workflow)

**Workflow if-conditions Analysis**:
```bash
$ grep -n "if:" .github/workflows/e2e-tests.yml
252: if: always() # Step-level
262: if: failure() # Step-level
270: if: failure() # Step-level
276: if: failure() # Step-level
284: if: always() # Step-level
293: if: always() # Job-level (merge-reports)
406: if: github.event_name == 'pull_request' && always() # Job-level (comment-results)
493: if: env.PLAYWRIGHT_COVERAGE == '1' # Job-level (upload-coverage)
559: if: always() # Job-level (e2e-results)
```

❌ **No top-level or first-job if-conditions** that would prevent all jobs from running.

---

### Task 4: Playwright Workflow (workflow_run Dependency)

```yaml
on:
  workflow_run:
    workflows: ["Docker Build, Publish & Test"]
    types: [completed]
```

**Docker Build Workflow Status**:
- Run ID: 21385051586
- Event: `push` (same commit)
- Conclusion: `success` ✅
- Completed: 2026-01-27T04:54:17Z

**Expected Behavior**: Playwright workflow should trigger after Docker Build completes.

**Actual Behavior**:
- Playwright workflow ran for `main` branch (separate merges)
- **Did NOT run for `feature/beta-release`** despite Docker Build success

**Hypothesis**: Playwright workflow only triggers for Docker Build runs on specific branches or PR contexts, not all pushes.

---

## Phase 1.5: Hypothesis Elimination

### Hypothesis Ranking (Evidence-Based)

| # | Hypothesis | Status | Evidence | Likelihood |
|---|------------|--------|----------|------------|
| **1** | **Path filter edge case with workflow file modification** | **🔴 CONFIRMED** | Push event created run but 0 jobs; PR event created 9 jobs for same commit | **HIGH** ✅ |
| 6 | Wrong event types / workflow_run dependency | 🟡 PARTIAL | Playwright workflow didn't trigger for branch | MEDIUM |
| 5 | Concurrency cancellation | ❌ REJECTED | No overlapping runs in time window | LOW |
| 4 | Skip conditions (commit message) | ❌ REJECTED | No if-conditions blocking first job | LOW |
| 3 | Fork PR gating | ❌ REJECTED | Not a fork (isCrossRepository: false) | N/A |
| 2 | Branch filters exclude feature/** | ❌ REJECTED | Both PR and push configs include 'feature/**' | LOW |

---

## Root Cause Analysis

### Primary Issue: GitHub Actions Path Filter Behavior

**Scenario**:
When a workflow file (`.github/workflows/e2e-tests.yml`) is modified in a commit:

1. **GitHub Actions evaluates whether to trigger the workflow**:
   - Checks: Did any file match the path filters?
   - Result: YES (multiple files in `tests/**`, `frontend/**`, `playwright.config.js`)
   - Action: ✅ Trigger workflow run

2. **GitHub Actions then evaluates whether to schedule jobs** (this step is inferred from the observed run, not taken from documented behavior):
   - Additional check: Did the workflow file itself change?
   - Special case: If workflow was modified, re-evaluate filters
   - Result: ⚠️ **Edge case detected** - workflow run created but jobs skipped

**Why pull_request worked but push didn't**:
- `pull_request` path filter **includes** `.github/workflows/e2e-tests.yml`
- `push` path filter **omits** `.github/workflows/e2e-tests.yml` (it is not listed; there is no negated `!` pattern)
- This asymmetry appears to cause GitHub Actions to:
  - Allow PR events to create jobs (workflow file is in allowed paths)
  - Block push events from creating jobs (workflow file triggers special handling)

### Secondary Issue: Playwright Workflow Not Triggering

The `playwright.yml` workflow uses `workflow_run` to trigger after "Docker Build, Publish & Test" completes.

**Configuration**:
```yaml
on:
  workflow_run:
    workflows: ["Docker Build, Publish & Test"]
    types: [completed]
```

**Issue**: `playwright.yml` has no branch or path filters. Its runtime check admits upstream runs triggered by either `pull_request` or `push`, provided they succeeded, so this condition on its own does not explain why the `feature/beta-release` push was skipped:
```yaml
if: >-
  github.event_name == 'workflow_dispatch' ||
  ((github.event.workflow_run.event == 'pull_request' || github.event.workflow_run.event == 'push') &&
  github.event.workflow_run.conclusion == 'success')
```
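
To see which upstream events this condition admits, it can be restated as a plain JavaScript predicate. The sketch below is illustrative only: `shouldRunPlaywright` is a hypothetical name and the `event` object is a trimmed-down stand-in for the `github` context, not the real payload shape. It shows that a successful upstream `push` run, such as Docker Build run 21385051586, passes the check.

```javascript
// Plain-JavaScript restatement of the job-level `if:` condition.
// `event` is a hypothetical, simplified stand-in for the `github` context.
function shouldRunPlaywright(event) {
  if (event.event_name === 'workflow_dispatch') {
    return true;
  }
  const upstream = event.workflow_run || {};
  const admittedEvent =
    upstream.event === 'pull_request' || upstream.event === 'push';
  return admittedEvent && upstream.conclusion === 'success';
}

// Upstream push run that succeeded (the shape of run 21385051586).
const pushSuccess = shouldRunPlaywright({
  event_name: 'workflow_run',
  workflow_run: { event: 'push', conclusion: 'success' },
});

// Upstream push run that failed.
const pushFailure = shouldRunPlaywright({
  event_name: 'workflow_run',
  workflow_run: { event: 'push', conclusion: 'failure' },
});

console.log(pushSuccess); // true
console.log(pushFailure); // false
```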

**Analysis of workflow_run events**:
- Docker Build run 21385051586 (`push` event) completed at 04:54:17Z
- Expected: Playwright should trigger automatically
- Actual: Only triggered for `main` branch merges, not `feature/beta-release`

**Hypothesis**: The PR image artifact naming or detection logic in Playwright workflow may only work for PR builds:
```yaml
- name: Check for PR image artifact
  if: steps.pr-info.outputs.pr_number != '' || steps.pr-info.outputs.is_push == 'true'
```

---

## Recommended Fixes

### Fix 1: Align Path Filters (IMMEDIATE)

**Problem**: Inconsistent path filters between `push` and `pull_request` events.

**Solution**: Add `.github/workflows/e2e-tests.yml` to push event path filter.

**File**: `.github/workflows/e2e-tests.yml`

**Change**:
```yaml
push:
  branches:
    - main
    - development
    - 'feature/**'
  paths:
    - 'frontend/**'
    - 'backend/**'
    - 'tests/**'
    - 'playwright.config.js'
    - '.github/workflows/e2e-tests.yml'  # ADD THIS LINE
```

**Impact**:
- ✅ Ensures workflow runs create jobs for push events when workflow file changes
- ✅ Makes path filters consistent across event types
- ✅ Should prevent future "phantom" workflow runs with 0 jobs

**Test**: Push a commit that modifies `.github/workflows/e2e-tests.yml` and verify jobs are created.
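
Before pushing, the alignment itself can be checked by diffing the two `paths` lists. The following is a minimal sketch with the lists written out by hand; `missingPaths` is a hypothetical helper, and reading the lists from the workflow file with a YAML parser is left out to keep the example dependency-free.

```javascript
// Returns entries of `referencePaths` that `candidatePaths` does not contain.
function missingPaths(referencePaths, candidatePaths) {
  const present = new Set(candidatePaths);
  return referencePaths.filter((path) => !present.has(path));
}

const pullRequestPaths = [
  'frontend/**',
  'backend/**',
  'tests/**',
  'playwright.config.js',
  '.github/workflows/e2e-tests.yml',
];
const pushPathsBefore = [
  'frontend/**',
  'backend/**',
  'tests/**',
  'playwright.config.js',
];
// State of the push filter once Fix 1 is applied.
const pushPathsAfter = [...pushPathsBefore, '.github/workflows/e2e-tests.yml'];

console.log(missingPaths(pullRequestPaths, pushPathsBefore)); // [ '.github/workflows/e2e-tests.yml' ]
console.log(missingPaths(pullRequestPaths, pushPathsAfter));  // []
```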

---

### Fix 2: Improve Playwright Workflow Reliability (SECONDARY)

**Problem**: `playwright.yml` relies solely on `workflow_run`, which did not produce a run for the `feature/beta-release` push in this investigation.

**Option A - Add Direct Triggers** (Recommended):
```yaml
on:
  workflow_run:
    workflows: ["Docker Build, Publish & Test"]
    types: [completed]

  # Add direct triggers as fallback
  pull_request:
    branches: [main, development, 'feature/**']
    paths: ['tests/**', 'playwright.config.js']

  workflow_dispatch:
    # ... existing inputs ...
```

**Option B - Consolidate into Single Workflow**:
- Merge `playwright.yml` into `e2e-tests.yml` as a separate job
- Remove `workflow_run` dependency entirely
- Simpler dependency chain, easier to debug

**Recommendation**: Proceed with **Option A** for minimal disruption.

---

### Fix 3: Add Workflow Health Monitoring

**Create**: `.github/workflows/workflow-health-check.yml`

```yaml
name: Workflow Health Monitor

on:
  workflow_run:
    workflows: ["E2E Tests", "Playwright E2E Tests"]
    types: [completed]

jobs:
  check-jobs:
    runs-on: ubuntu-latest
    steps:
      - name: Check for phantom runs
        uses: actions/github-script@v7
        with:
          script: |
            const runId = context.payload.workflow_run.id;
            const { data: jobs } = await github.rest.actions.listJobsForWorkflowRun({
              owner: context.repo.owner,
              repo: context.repo.repo,
              run_id: runId
            });

            if (jobs.total_count === 0) {
              core.setFailed(`⚠️ Workflow run ${runId} created 0 jobs! Possible path filter issue.`);
            }
```

**Purpose**: Detect and alert on "phantom" workflow runs (triggered but no jobs created).

---

## Next Steps

### Immediate Actions (Phase 2)

1. **Apply Fix 1** (path filter alignment):
   ```bash
   # Edit .github/workflows/e2e-tests.yml
   # Add '.github/workflows/e2e-tests.yml' to push.paths
   git add .github/workflows/e2e-tests.yml
   git commit -m "fix(ci): add e2e-tests.yml to push event path filters"
   git push origin feature/beta-release
   ```

2. **Validate Fix**:
   - Monitor next push to `feature/beta-release`
   - Verify workflow run creates jobs (expected: 9 jobs like PR events)
   - Check GitHub Actions UI shows job matrix properly

3. **Apply Fix 2** (Playwright reliability):
   - Add direct triggers to `playwright.yml`
   - Test with `workflow_dispatch` on `feature/beta-release`

### Validation Criteria (Phase 3)

✅ **Success Criteria**:
- Push events to `feature/**` branches create E2E test jobs
- Pull request synchronize events continue to work
- Workflow runs with 0 jobs are eliminated
- Playwright workflow triggers reliably for PRs and pushes

📊 **Metrics to Track**:
- E2E workflow run success rate (target: >95%)
- Average time from push to E2E completion (target: <15 min)
- Phantom run occurrence rate (target: 0%)
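
One way these three metrics could be computed from a list of run records is sketched below. Everything here is an assumption for illustration: `summarizeRuns`, the field names (`jobCount`, `durationMinutes`), and the sample values are hypothetical and are not measurements from this repository. The average duration is taken over runs that created jobs, so that zero-job runs do not pull it down artificially.

```javascript
// Hypothetical summary of run records against the three targets.
function summarizeRuns(runList) {
  const total = runList.length;
  const withJobs = runList.filter((run) => run.jobCount > 0);
  const succeeded = runList.filter((run) => run.conclusion === 'success');
  const minutes = withJobs.reduce((sum, run) => sum + run.durationMinutes, 0);

  const successRate = total ? succeeded.length / total : 0;
  const phantomRate = total ? (total - withJobs.length) / total : 0;
  const averageMinutes = withJobs.length ? minutes / withJobs.length : 0;

  return {
    successRate,
    phantomRate,
    averageMinutes,
    meetsTargets: successRate > 0.95 && averageMinutes < 15 && phantomRate === 0,
  };
}

// Made-up sample data, for illustration only.
const sampleRuns = [
  { conclusion: 'success', jobCount: 9, durationMinutes: 12 },
  { conclusion: 'success', jobCount: 9, durationMinutes: 14 },
  { conclusion: 'failure', jobCount: 0, durationMinutes: 0 },
  { conclusion: 'success', jobCount: 9, durationMinutes: 10 },
];

const summary = summarizeRuns(sampleRuns);
console.log(summary.successRate);    // 0.75
console.log(summary.phantomRate);    // 0.25
console.log(summary.averageMinutes); // 12
console.log(summary.meetsTargets);   // false
```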

---

## Appendix: Detailed Evidence

### Workflow Run Comparison

**Failed Push Run (21385051330)**:
```json
{
  "name": ".github/workflows/e2e-tests.yml",
  "event": "push",
  "status": "completed",
  "conclusion": "failure",
  "head_branch": "feature/beta-release",
  "head_commit": {
    "message": "chore: re-enable security e2e scaffolding and triage gaps",
    "author": "GitHub Actions"
  },
  "jobs": []
}
```

**PR Run with Jobs Created (21381970384)** (all 9 jobs were scheduled; the run itself concluded `failure` because of test failures):
```json
{
  "event": "pull_request",
  "conclusion": "failure",
  "jobs": [
    {"name": "Build Application", "conclusion": "success"},
    {"name": "E2E Tests (Shard 1/4)", "conclusion": "success"},
    {"name": "E2E Tests (Shard 2/4)", "conclusion": "failure"},
    {"name": "E2E Tests (Shard 3/4)", "conclusion": "failure"},
    {"name": "E2E Tests (Shard 4/4)", "conclusion": "failure"},
    {"name": "Merge Test Reports", "conclusion": "failure"},
    {"name": "Comment Test Results", "conclusion": "success"},
    {"name": "Upload E2E Coverage", "conclusion": "skipped"},
    {"name": "E2E Test Results", "conclusion": "failure"}
  ]
}
```

---

## Lessons Learned

1. **Path Filter Pitfall**: Modifying a workflow file can trigger edge cases where the run is created but jobs are skipped due to path filter re-evaluation.

2. **Event Type Matters**: Each event type (`push`, `pull_request`) carries its own `paths` list, so the two can behave differently for the same commit unless the lists are kept in sync.

3. **Monitoring is Critical**: "Phantom" workflow runs (0 jobs) are hard to detect without explicit monitoring.

4. **Document Expectations**: When workflows don't trigger as expected, systematically compare:
   - Trigger configuration (on: ...)
   - Path/branch filters
   - Job-level if: conditions
   - Concurrency settings
   - Upstream workflow dependencies (workflow_run)

---

**Report Compiled By**: Phase 1 & 1.5 Diagnostic Protocol
**Confidence Level**: 95% (confirmed by direct API evidence)
**Ready for Phase 2**: ✅ Yes - Root cause identified, fixes specified