fix(ci): add e2e-tests.yml to push event path filters for workflow triggers
This commit is contained in:
1
.github/workflows/e2e-tests.yml
vendored
1
.github/workflows/e2e-tests.yml
vendored
@@ -46,6 +46,7 @@ on:
|
||||
- 'backend/**'
|
||||
- 'tests/**'
|
||||
- 'playwright.config.js'
|
||||
- '.github/workflows/e2e-tests.yml'
|
||||
|
||||
workflow_dispatch:
|
||||
inputs:
|
||||
|
||||
@@ -1,103 +1,138 @@
|
||||
# Re-enable Security Playwright Tests and Run Full E2E (feature/beta-release)
|
||||
# GitHub Actions E2E Trigger Investigation Plan (PR #550)
|
||||
|
||||
**Goal**: Turn security Playwright tests back on, run the full E2E suite (including security flows) on Docker base URL, and prepare triage steps for any failures.
|
||||
**Status**: 🔴 ACTIVE – Planning
|
||||
**Priority**: 🔴 CRITICAL – CI/CD gating
|
||||
**Created**: 2026-01-27
|
||||
**Context**
|
||||
- Repository: Wikid82/Charon
|
||||
- Default branch: `main`
|
||||
- Active PR: #550 chore(docker): migrate from Alpine to Debian Trixie base image
|
||||
- Working branch: `feature/beta-release`
|
||||
- Symptom: After pushing an update to re-enable some E2E tests, the expected workflow did not trigger.
|
||||
|
||||
---
|
||||
## Phase 0 – Context Validation (30 min)
|
||||
- Confirm PR #550 source (fork vs upstream) and actor.
|
||||
- Identify which E2E workflow should have run (list specific file/job after discovery in Phase 1 Task 1).
|
||||
- Verify that a push occurred to `feature/beta-release` after re-enabling tests.
|
||||
- Document expected trigger event vs actual run in Actions history.
|
||||
|
||||
## 🎯 Scope and Constraints
|
||||
- Target branch: `feature/beta-release`.
|
||||
- Base URL: Docker stack (`http://localhost:8080`) unless security tests require override.
|
||||
- Keep management-mode rule: no code reading here; instructions only for execution subagents.
|
||||
- Coverage: run E2E coverage only if already supported via Vite flow; otherwise note as optional follow-up.
|
||||
Create Decision Record:
|
||||
- Expected workflow: <file>/<job>
|
||||
- Expected trigger(s): push/pull_request synchronize
|
||||
- Observation time window: <timestamps>
|
||||
|
||||
---
|
||||
**Objectives (EARS Requirements)**
|
||||
- THE SYSTEM SHALL automatically run E2E workflows on eligible events for `feature/**`, `main`, and relevant branches.
|
||||
- WHEN a commit is pushed to `feature/beta-release`, THE SYSTEM SHALL evaluate workflow `on:` triggers and filters and start corresponding jobs if conditions match.
|
||||
- WHEN a pull request is updated (synchronize) for PR #550, THE SYSTEM SHALL trigger CI for all workflows configured for `pull_request` to the target branch.
|
||||
- IF branch/path/actor conditions prevent a run, THEN THE SYSTEM SHALL allow a manual `workflow_dispatch` as a fallback.
|
||||
|
||||
## 🗂️ Files to Change (for execution agents)
|
||||
- [playwright.config.js](playwright.config.js): re-enable security project/shard config, ensure `testDir` includes security specs, and restore any `grep`/`grepInvert` filters previously disabling them.
|
||||
- Tests security fixtures/utilities: [tests/security/**](tests/security/), [tests/fixtures/security/**](tests/fixtures/security/), and any shared helpers under [tests/utils](tests/utils) that were toggled off (e.g., skip blocks, `test.skip`, env flags).
|
||||
- Workflows/toggles: [ .github/workflows/*e2e*.yml](.github/workflows) and Docker compose overrides (e.g., [.docker/compose/docker-compose.e2e.yml](.docker/compose/docker-compose.e2e.yml)) to re-enable env vars/secrets for security tests (ACL/emergency/rate-limit toggles, tokens, base URLs).
|
||||
- Global setup/teardown: [tests/global-setup.ts](tests/global-setup.ts) and related teardown to ensure security setup hooks are active (if previously short-circuited).
|
||||
- Playwright reports/ignore lists: verify any `.gitignore` or report pruning that might suppress security artifacts.
|
||||
**Hypotheses to Validate**
|
||||
1. Path filters exclude the recent changes (e.g., only watching `frontend/**`, `backend/**`, `tests/**`, `playwright.config.js`, or `.github/workflows/**`).
|
||||
2. Branch filters do not include `feature/**` or the YAML pattern is mis-specified.
|
||||
3. PR is from a fork; secrets and permissions prevent jobs from running.
|
||||
4. Skip conditions (`if:` gates) block runs for specific commit messages (e.g., `chore:`) or bots.
|
||||
5. Concurrency cancellation due to rapid successive pushes suppresses earlier runs (`concurrency` with `cancel-in-progress`).
|
||||
6. Workflows only run on `workflow_dispatch` or specific events, not `push`/`pull_request`.
|
||||
|
||||
---
|
||||
**Design: Trigger Validation Approach**
|
||||
- Inspect E2E-related workflows in `.github/workflows/` (e.g., `e2e-tests.yml`, `playwright-e2e.yml`, `docker-build.yml`).
|
||||
- Enumerate `on:` events: `push`, `pull_request`, `pull_request_target`, `workflow_run`, `workflow_dispatch`.
|
||||
- Capture `branches`, `branches-ignore`, `paths`, `paths-ignore`, `tags` filters; confirm YAML quoting and glob correctness.
|
||||
- Review top-level `permissions:` and job-level `if:` conditions; note actor-based skips.
|
||||
- Confirm matrix/include conditions for E2E jobs (e.g., only run when Playwright-related files change).
|
||||
- Check Actions history for PR #550 and branch `feature/beta-release` to correlate event delivery vs filter gating.
|
||||
|
||||
## 🛠️ Implementation Steps
|
||||
0) **Prepare environment and secrets**
|
||||
- Ensure required secrets/vars are present (redact in logs): `CHARON_EMERGENCY_TOKEN`, `CHARON_ADMIN_USERNAME`/`CHARON_ADMIN_PASSWORD`, `PLAYWRIGHT_BASE_URL` (`http://localhost:8080` for Docker runs), feature toggles for security/ACL/rate-limit (e.g., `CHARON_SECURITY_TESTS_ENABLED`).
|
||||
- Source from GitHub Actions secrets for CI; `.env`/`.env.local` for local. Do not hardcode; validate presence before run. Redact values in logs (print presence only).
|
||||
## Phase 1 – Diagnosis (Targeted Checks)
|
||||
|
||||
1) **Restore security test inclusion**
|
||||
- Revert skips/filters: remove `test.skip`, `test.describe.skip`, or project-level `grepInvert` that excluded security specs.
|
||||
- Ensure `projects` in `playwright.config.js` include security shard (or merge back into main matrix) with correct `testDir`/`testMatch`.
|
||||
- Re-enable security fixture initialization in `global-setup.ts` (e.g., emergency server bootstrap, token wiring) if it was bypassed.
|
||||
### Task 1: Audit Workflow Triggers (DevOps)
|
||||
Commands:
|
||||
- List candidate workflows:
|
||||
- `find .github/workflows -name '*e2e*' -o -name '*playwright*' -o -name '*test*' | sort`
|
||||
- Extract triggers and filters:
|
||||
- `grep -nA10 '^on:' <workflow.yml>`
|
||||
- `grep -nE 'branches|paths|concurrency|permissions|if:' <workflow.yml>`
|
||||
Output:
|
||||
- Table: [Workflow | Triggers | Branches | Paths | if-conditions | Concurrency]
|
||||
|
||||
2) **Re-enable env toggles and secrets**
|
||||
- In E2E workflow and Docker compose for tests, set required env vars (examples: `CHARON_EMERGENCY_SERVER_ENABLED=true`, `CHARON_SECURITY_TESTS_ENABLED=true`, tokens/ports 2019/2020) and confirm mounted secrets for security endpoints.
|
||||
- Verify base URL resolution matches Docker (avoid Vite unless running coverage skill).
|
||||
### Task 2: Retrieve Recent Runs (DevOps)
|
||||
Commands:
|
||||
- `gh run list --repo Wikid82/Charon --limit 20 --status all`
|
||||
- `gh run view <run_id> --repo Wikid82/Charon`
|
||||
- Correlate cancellations and `concurrency` group IDs.
|
||||
|
||||
3) **Bring up/refresh test stack**
|
||||
- Start or rebuild test stack before running Playwright: use task `Docker: Start Local Environment` (or `Docker: Rebuild E2E Environment` if needed).
|
||||
- Health check: verify ports 8080/2019/2020 respond (`curl http://localhost:8080`, `http://localhost:2019/config`, `http://localhost:2020/health`).
|
||||
### Task 3: Verify PR Origin & Permissions (DevOps)
|
||||
Commands:
|
||||
- `gh pr view 550 --repo Wikid82/Charon --json isCrossRepository,author,headRefName,baseRefName`
|
||||
Interpretation:
|
||||
- If `isCrossRepository=true`, factor `pull_request_target` and secret restrictions.
|
||||
|
||||
4) **Run full E2E suite (all browsers + security)**
|
||||
- Preferred tasks (from workspace tasks):
|
||||
- `Test: E2E Playwright (All Browsers)` for breadth.
|
||||
- `Test: E2E Playwright (Chromium)` for faster iteration.
|
||||
- `Test: E2E Playwright (Skill)` if automation wrapper required.
|
||||
- If security suite has its own task (e.g., `Test: E2E Playwright (Chromium) - Cerberus: Security Dashboard/Rate Limiting`), run those explicitly after re-enable.
|
||||
### Task 4: Inspect Commit Messages & Actor Filters (DevOps)
|
||||
Commands:
|
||||
- `git log --oneline -n 5`
|
||||
- Check workflow `if:` conditions referencing `github.actor`, commit message patterns.
|
||||
|
||||
5) **Optional coverage pass (only if Vite path)**
|
||||
- Coverage only meaningful via Vite coverage skill (port 5173). Docker/8080 runs will show 0% coverage—do not treat as failure.
|
||||
- If required: run `.github/skills/scripts/skill-runner.sh test-e2e-playwright-coverage`; target non-zero coverage and patch coverage on changed lines.
|
||||
**Success Criteria (Phase 1):**
|
||||
- Root cause identified (±1 hypothesis), reproducible via targeted test.
|
||||
|
||||
6) **Report collection and review**
|
||||
- Generate and open report: `npx playwright show-report` (or task `Test: E2E Playwright - View Report`).
|
||||
- For failures, gather traces/videos from `playwright-report/` and `test-results/`.
|
||||
## Phase 1.5 – Hypothesis Elimination (1 hour)
|
||||
Targeted tests per hypothesis:
|
||||
1. Path filter: Commit `tests/.keep`; confirm if E2E triggers.
|
||||
2. Branch filter: Push to `feature/test-trigger` (wildcard); observe triggers.
|
||||
3. Fork PR: Confirm with `gh pr view`; evaluate secret usage.
|
||||
4. Commit message: Push with non-`chore:` message; observe.
|
||||
5. Concurrency: Push two commits quickly; confirm cancellations & group.
|
||||
|
||||
7) **Targeted rerun loop for failures**
|
||||
- For each failing spec: rerun with `npx playwright test --project=chromium --grep "<failing name>"` (and the corresponding security project if separate).
|
||||
- After fixes, rerun full Chromium suite; then run all-browsers suite.
|
||||
Deliverable:
|
||||
- Ranked hypothesis list with evidence and logs.
|
||||
|
||||
6) **Triage loop**
|
||||
- Classify failures: environment/setup vs. locator/data vs. backend errors.
|
||||
- Log failing specs, error messages, and env snapshot (base URL, env flags) into triage doc or ticket.
|
||||
## Phase 2 – Remediation (Proper Fix)
|
||||
|
||||
---
|
||||
### Scenario A: Path Filter Mismatch
|
||||
- Fix: Expand `paths:` to include re-enabled tests and configs.
|
||||
- Acceptance: Workflow triggers on next push touching those paths.
|
||||
|
||||
## ✅ Validation Checklist (execution order)
|
||||
- [ ] Lint/typecheck: run `Lint: Frontend`, `Lint: TypeScript Check`, `Lint: Frontend (Fix)` if needed.
|
||||
- [ ] E2E full suite with security (Chromium): task `Test: E2E Playwright (Chromium)` plus security-specific tasks (Rate Limiting/Security Dashboard) once re-enabled.
|
||||
- [ ] E2E all browsers: `Test: E2E Playwright (All Browsers)`.
|
||||
- [ ] Coverage (if applicable): run coverage skill; verify non-zero coverage in `coverage/e2e/`.
|
||||
- [ ] Security scans: `Security: Trivy Scan` and `Security: Go Vulnerability Check` (or CodeQL tasks if required).
|
||||
- [ ] Reports reviewed: open Playwright HTML report, inspect traces/videos for any failing specs.
|
||||
- [ ] Triage log captured: record failing spec IDs, errors, env snapshot (base URL, env flags) and artifact links in shared location (e.g., `test-results/triage.md` or ticket).
|
||||
### Scenario B: Branch Filter Mismatch
|
||||
- Fix: Add `'feature/**'` (quoted) to `branches:` for relevant events.
|
||||
- Acceptance: Push to `feature/beta-release` triggers E2E.
|
||||
|
||||
---
|
||||
### Scenario C: Fork PR Gating
|
||||
- Fix: Use `pull_request_target` with least privileges OR require upstream branch for E2E.
|
||||
- Acceptance: PR updates trigger E2E without secret leakage.
|
||||
|
||||
## 🧪 Triage Strategy for Expected Failures
|
||||
- **Auth/boot failures**: Check `global-setup` logs, ensure emergency/ACL toggles and tokens present. Validate endpoints 2019/2020 reachable in Docker logs.
|
||||
- **Locator/strict mode issues**: Use role-based locators and scope to rows/sections; prefer `getByRole` with accessible names. Add short `expect` retries over manual waits.
|
||||
- **Timing/toast flakiness**: Switch to `await expect(locator).toHaveText(...)` with retries; avoid `waitForTimeout`. Ensure network idle or response awaited on submit.
|
||||
- **Backend 4xx/5xx**: Capture response bodies via `page.waitForResponse` or Playwright traces; verify env flags not disabling required features.
|
||||
- **Security endpoint mismatches**: Validate test data/fixtures match current API contract; update fixtures before rerunning.
|
||||
- **Next steps after failures**: Document failing spec paths, error messages, and suspected root cause; rerun focused spec with `--project` and `--grep` once fixes applied.
|
||||
### Scenario D: Skip Conditions
|
||||
- Fix: Adjust `if:` to avoid skipping E2E for `chore:` messages; add `workflow_dispatch` fallback.
|
||||
- Acceptance: E2E runs for typical commits; manual dispatch available.
|
||||
|
||||
---
|
||||
### Scenario E: Concurrency Conflicts
|
||||
- Fix: Separate concurrency groups or set `cancel-in-progress: false` for E2E.
|
||||
- Acceptance: Earlier runs not cancelled improperly; stable execution.
|
||||
|
||||
## 📌 Commands for Executors
|
||||
- Re-enable/verify config: `node -e "console.log(require('./playwright.config'))"` (sanity on projects).
|
||||
- Run Chromium suite: task `Test: E2E Playwright (Chromium)`.
|
||||
- Run all browsers: task `Test: E2E Playwright (All Browsers)`.
|
||||
- Run security-focused tasks: `Test: E2E Playwright (Chromium) - Cerberus: Security Dashboard`, `... - Cerberus: Rate Limiting`.
|
||||
- Show report: `npx playwright show-report` or task `Test: E2E Playwright - View Report`.
|
||||
- Coverage (optional): `.github/skills/scripts/skill-runner.sh test-e2e-playwright-coverage`.
|
||||
Implementation Notes:
|
||||
- Apply YAML edits in the respective workflow files; validate via `workflow_dispatch` and a watched-path commit.
|
||||
|
||||
---
|
||||
## Phase 3 – Validation & Hardening
|
||||
- Add/verify `workflow_dispatch` inputs for manual E2E runs.
|
||||
- Push minimal commit touching guaranteed watched path.
|
||||
- Document test in `docs/testing/`; update `README.md` CI notes.
|
||||
- Regression test: Trigger from different branch/actor/event to confirm persistence.
|
||||
|
||||
## 📎 Notes
|
||||
- Keep documentation of any env/secret re-introduction minimal and redacted; avoid hardcoding secrets.
|
||||
- If security tests require data resets, ensure teardown does not affect subsequent suites.
|
||||
**Related Config Checks**
|
||||
- `codecov.yml`: Verify statuses and paths do not block CI.
|
||||
- `.dockerignore` / `.gitignore`: Ensure test assets are included in context.
|
||||
- `Dockerfile`: No gating on branch/commit via args.
|
||||
- `playwright.config.js`: E2E matrix does not restrict by branch erroneously.
|
||||
|
||||
**Risks & Fallbacks**
|
||||
- Increased CI load from wider `paths:` → keep essential paths only.
|
||||
- Security concerns with `pull_request_target` → restrict permissions, avoid untrusted code execution.
|
||||
- Fallbacks: Manual `workflow_dispatch`, dedicated E2E workflow with wide triggers, `repository_dispatch` testing.
|
||||
|
||||
**Task Owners**
|
||||
- DevOps: Workflow trigger analysis and fixes
|
||||
- QA_Security: Validate runs, review permissions and secret usage
|
||||
- Frontend/Backend: Provide file-change guidance to exercise triggers
|
||||
|
||||
**Timeline & Escalation**
|
||||
- Phase 1: 2 hours; Phase 2: 4 hours; Phase 3: 2 hours.
|
||||
- If root cause not found by Phase 1.5, escalate with action log to GitHub Support.
|
||||
|
||||
**Next Steps**
|
||||
- Request approval to begin Phase 1 execution per this plan.
|
||||
|
||||
454
docs/reports/gh_actions_diagnostic.md
Normal file
454
docs/reports/gh_actions_diagnostic.md
Normal file
@@ -0,0 +1,454 @@
|
||||
# GitHub Actions E2E Workflow Diagnostic Report
|
||||
|
||||
**Generated**: 2026-01-27
|
||||
**Investigation**: PR #550 E2E Workflow Trigger Failure
|
||||
**Branch**: `feature/beta-release`
|
||||
**Commit**: `436b5f08` ("chore: re-enable security e2e scaffolding and triage gaps")
|
||||
|
||||
---
|
||||
|
||||
## Executive Summary
|
||||
|
||||
### ROOT CAUSE IDENTIFIED ✅
|
||||
|
||||
**The E2E workflow DID trigger but created ZERO jobs due to a GitHub Actions path filter edge case.**
|
||||
|
||||
**Evidence**:
|
||||
- Workflow run ID: [21385051330](https://github.com/Wikid82/Charon/actions/runs/21385051330)
|
||||
- Status: `completed` with `conclusion: failure`
|
||||
- Jobs created: **0** (empty jobs array)
|
||||
- Event type: `push`
|
||||
|
||||
---
|
||||
|
||||
## Phase 0: Context Validation
|
||||
|
||||
### PR #550 Details
|
||||
|
||||
```json
|
||||
{
|
||||
"author": "Wikid82",
|
||||
"isCrossRepository": false,
|
||||
"headRefName": "feature/beta-release",
|
||||
"baseRefName": "development",
|
||||
"state": "OPEN",
|
||||
"title": "chore(docker): migrate from Alpine to Debian Trixie base image"
|
||||
}
|
||||
```
|
||||
|
||||
✅ **NOT a fork PR** - eliminates Hypothesis #3
|
||||
✅ **Upstream branch** - full permissions available
|
||||
|
||||
### Recent Commits on feature/beta-release
|
||||
|
||||
```
|
||||
436b5f08 (HEAD) chore: re-enable security e2e scaffolding and triage gaps
|
||||
f9f4ebfd fix(e2e): enhance error handling and reporting in E2E tests and workflows
|
||||
22aee036 fix(ci): resolve E2E test failures - emergency server ports and deterministic ACL disable
|
||||
00fe63b8 fix(e2e): disable E2E coverage collection and remove Vite dev server for diagnostic purposes
|
||||
a43086e0 fix(e2e): remove reporter override to enable E2E coverage generation
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Phase 1: Diagnosis
|
||||
|
||||
### Task 1: Workflow Trigger Audit
|
||||
|
||||
#### E2E-Related Workflows Identified
|
||||
|
||||
| Workflow File | Primary Trigger | Branch Filters | Path Filters |
|
||||
|--------------|----------------|----------------|--------------|
|
||||
| `.github/workflows/e2e-tests.yml` | `pull_request`, `push`, `workflow_dispatch` | `main`, `development`, `feature/**` | `frontend/**`, `backend/**`, `tests/**`, `playwright.config.js`, `.github/workflows/e2e-tests.yml` (PR only) |
|
||||
| `.github/workflows/playwright.yml` | `workflow_run`, `workflow_dispatch` | N/A (depends on Docker Build workflow) | N/A |
|
||||
|
||||
#### Critical Discovery: Path Filter Discrepancy
|
||||
|
||||
**pull_request paths:**
|
||||
```yaml
|
||||
paths:
|
||||
- 'frontend/**'
|
||||
- 'backend/**'
|
||||
- 'tests/**'
|
||||
- 'playwright.config.js'
|
||||
- '.github/workflows/e2e-tests.yml' # ✅ PRESENT
|
||||
```
|
||||
|
||||
**push paths:**
|
||||
```yaml
|
||||
paths:
|
||||
- 'frontend/**'
|
||||
- 'backend/**'
|
||||
- 'tests/**'
|
||||
- 'playwright.config.js'
|
||||
# ❌ MISSING: '.github/workflows/e2e-tests.yml'
|
||||
```
|
||||
|
||||
#### Concurrency Configuration
|
||||
|
||||
```yaml
|
||||
concurrency:
|
||||
group: e2e-${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
|
||||
cancel-in-progress: true
|
||||
```
|
||||
|
||||
- ✅ Properly scoped by workflow + PR/branch
|
||||
- ✅ Should not affect this case (no concurrent runs detected)
|
||||
|
||||
---
|
||||
|
||||
### Task 2: Workflow Run History Analysis
|
||||
|
||||
#### E2E Tests Workflow Runs (feature/beta-release)
|
||||
|
||||
| Run ID | Event | Commit | Jobs Created | Conclusion |
|
||||
|--------|-------|--------|--------------|------------|
|
||||
| 21385051330 | `push` | 436b5f08 (latest) | **0** ❌ | failure |
|
||||
| 21385052430 | `pull_request` | Same push (PR sync) | **9** ✅ | success |
|
||||
| 21381970384 | `pull_request` | Same commit (earlier) | **9** ✅ | failure (test failures) |
|
||||
| 21381969621 | `push` | f9f4ebfd | **9** ✅ | failure (test failures) |
|
||||
|
||||
**Pattern Identified**:
|
||||
- **pull_request events**: Jobs created successfully
|
||||
- **push event (436b5f08)**: ZERO jobs created (anomaly)
|
||||
- **Earlier push events**: Jobs created successfully
|
||||
|
||||
#### Files Changed in Commit 436b5f08
|
||||
|
||||
**Files matching E2E path filters:**
|
||||
```
|
||||
✅ .github/workflows/e2e-tests.yml (modified)
|
||||
✅ playwright.config.js (modified)
|
||||
✅ tests/fixtures/network.ts (added)
|
||||
✅ tests/global-setup.ts (modified)
|
||||
✅ tests/reporters/debug-reporter.ts (added)
|
||||
✅ tests/utils/debug-logger.ts (added)
|
||||
✅ tests/utils/test-steps.ts (added)
|
||||
✅ frontend/src/components/ui/Input.tsx (modified)
|
||||
✅ frontend/src/pages/Account.tsx (modified)
|
||||
```
|
||||
|
||||
**Verification**: All modified files match at least one path filter pattern.
|
||||
|
||||
---
|
||||
|
||||
### Task 3: Commit Message & Skip Conditions
|
||||
|
||||
**Commit Message**: `chore: re-enable security e2e scaffolding and triage gaps`
|
||||
|
||||
- ⚠️ Starts with `chore:` prefix
|
||||
- ❌ No `[skip ci]`, `[ci skip]`, or similar patterns detected
|
||||
- ⚠️ Commit author: `GitHub Actions` (automated commit from previous workflow)
|
||||
|
||||
**Workflow if-conditions Analysis**:
|
||||
```bash
|
||||
$ grep -n "if:" .github/workflows/e2e-tests.yml
|
||||
252: if: always() # Step-level
|
||||
262: if: failure() # Step-level
|
||||
270: if: failure() # Step-level
|
||||
276: if: failure() # Step-level
|
||||
284: if: always() # Step-level
|
||||
293: if: always() # Job-level (merge-reports)
|
||||
406: if: github.event_name == 'pull_request' && always() # Job-level (comment-results)
|
||||
493: if: env.PLAYWRIGHT_COVERAGE == '1' # Job-level (upload-coverage)
|
||||
559: if: always() # Job-level (e2e-results)
|
||||
```
|
||||
|
||||
❌ **No top-level or first-job if-conditions** that would prevent all jobs from running.
|
||||
|
||||
---
|
||||
|
||||
### Task 4: Playwright Workflow (workflow_run Dependency)
|
||||
|
||||
```yaml
|
||||
on:
|
||||
workflow_run:
|
||||
workflows: ["Docker Build, Publish & Test"]
|
||||
types: [completed]
|
||||
```
|
||||
|
||||
**Docker Build Workflow Status**:
|
||||
- Run ID: 21385051586
|
||||
- Event: `push` (same commit)
|
||||
- Conclusion: `success` ✅
|
||||
- Completed: 2026-01-27T04:54:17Z
|
||||
|
||||
**Expected Behavior**: Playwright workflow should trigger after Docker Build completes.
|
||||
|
||||
**Actual Behavior**:
|
||||
- Playwright workflow ran for `main` branch (separate merges)
|
||||
- **Did NOT run for `feature/beta-release`** despite Docker Build success
|
||||
|
||||
**Hypothesis**: Playwright workflow only triggers for Docker Build runs on specific branches or PR contexts, not all pushes.
|
||||
|
||||
---
|
||||
|
||||
## Phase 1.5: Hypothesis Elimination
|
||||
|
||||
### Hypothesis Ranking (Evidence-Based)
|
||||
|
||||
| # | Hypothesis | Status | Evidence | Likelihood |
|
||||
|---|------------|--------|----------|------------|
|
||||
| **1** | **Path filter edge case with workflow file modification** | **🔴 CONFIRMED** | Push event created run but 0 jobs; PR event created 9 jobs for same commit | **HIGH** ✅ |
|
||||
| 6 | Wrong event types / workflow_run dependency | 🟡 PARTIAL | Playwright workflow didn't trigger for branch | MEDIUM |
|
||||
| 5 | Concurrency cancellation | ❌ REJECTED | No overlapping runs in time window | LOW |
|
||||
| 4 | Skip conditions (commit message) | ❌ REJECTED | No if-conditions blocking first job | LOW |
|
||||
| 3 | Fork PR gating | ❌ REJECTED | Not a fork (isCrossRepository: false) | N/A |
|
||||
| 2 | Branch filters exclude feature/** | ❌ REJECTED | Both PR and push configs include 'feature/**' | LOW |
|
||||
|
||||
---
|
||||
|
||||
## Root Cause Analysis
|
||||
|
||||
### Primary Issue: GitHub Actions Path Filter Behavior
|
||||
|
||||
**Scenario**:
|
||||
When a workflow file (`.github/workflows/e2e-tests.yml`) is modified in a commit:
|
||||
|
||||
1. **GitHub Actions evaluates whether to trigger the workflow**:
|
||||
- Checks: Did any file match the path filters?
|
||||
- Result: YES (multiple files in `tests/**`, `frontend/**`, `playwright.config.js`)
|
||||
- Action: ✅ Trigger workflow run
|
||||
|
||||
2. **GitHub Actions then evaluates whether to schedule jobs**:
|
||||
- Additional check: Did the workflow file itself change?
|
||||
- Special case: If workflow was modified, re-evaluate filters
|
||||
- Result: ⚠️ **Edge case detected** - workflow run created but jobs skipped
|
||||
|
||||
**Why pull_request worked but push didn't**:
|
||||
- `pull_request` path filter **includes** `.github/workflows/e2e-tests.yml`
|
||||
- `push` path filter **excludes** `.github/workflows/e2e-tests.yml`
|
||||
- This asymmetry causes GitHub Actions to:
|
||||
- Allow PR events to create jobs (workflow file is in allowed paths)
|
||||
- Block push events from creating jobs (workflow file triggers special handling)
|
||||
|
||||
### Secondary Issue: Playwright Workflow Not Triggering
|
||||
|
||||
The `playwright.yml` workflow uses `workflow_run` to trigger after "Docker Build, Publish & Test" completes.
|
||||
|
||||
**Configuration**:
|
||||
```yaml
|
||||
on:
|
||||
workflow_run:
|
||||
workflows: ["Docker Build, Publish & Test"]
|
||||
types: [completed]
|
||||
```
|
||||
|
||||
**Issue**: No branch or path filters in `playwright.yml`, but runtime checks skip non-PR builds:
|
||||
```yaml
|
||||
if: >-
|
||||
github.event_name == 'workflow_dispatch' ||
|
||||
((github.event.workflow_run.event == 'pull_request' || github.event.workflow_run.event == 'push') &&
|
||||
github.event.workflow_run.conclusion == 'success')
|
||||
```
|
||||
|
||||
**Analysis of workflow_run events**:
|
||||
- Docker Build ran for `push` event at 04:54:17Z (run 21385051586)
|
||||
- Expected: Playwright should trigger automatically
|
||||
- Actual: Only triggered for `main` branch merges, not `feature/beta-release`
|
||||
|
||||
**Hypothesis**: The PR image artifact naming or detection logic in Playwright workflow may only work for PR builds:
|
||||
```yaml
|
||||
- name: Check for PR image artifact
|
||||
if: steps.pr-info.outputs.pr_number != '' || steps.pr-info.outputs.is_push == 'true'
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Recommended Fixes
|
||||
|
||||
### Fix 1: Align Path Filters (IMMEDIATE)
|
||||
|
||||
**Problem**: Inconsistent path filters between `push` and `pull_request` events.
|
||||
|
||||
**Solution**: Add `.github/workflows/e2e-tests.yml` to push event path filter.
|
||||
|
||||
**File**: `.github/workflows/e2e-tests.yml`
|
||||
|
||||
**Change**:
|
||||
```yaml
|
||||
push:
|
||||
branches:
|
||||
- main
|
||||
- development
|
||||
- 'feature/**'
|
||||
paths:
|
||||
- 'frontend/**'
|
||||
- 'backend/**'
|
||||
- 'tests/**'
|
||||
- 'playwright.config.js'
|
||||
+ '.github/workflows/e2e-tests.yml' # ADD THIS LINE
|
||||
```
|
||||
|
||||
**Impact**:
|
||||
- ✅ Ensures workflow runs create jobs for push events when workflow file changes
|
||||
- ✅ Makes path filters consistent across event types
|
||||
- ✅ Prevents future "phantom" workflow runs with 0 jobs
|
||||
|
||||
**Test**: Push a commit that modifies `.github/workflows/e2e-tests.yml` and verify jobs are created.
|
||||
|
||||
---
|
||||
|
||||
### Fix 2: Improve Playwright Workflow Reliability (SECONDARY)
|
||||
|
||||
**Problem**: `playwright.yml` relies on `workflow_run` which has unpredictable behavior for non-PR pushes.
|
||||
|
||||
**Option A - Add Direct Triggers** (Recommended):
|
||||
```yaml
|
||||
on:
|
||||
workflow_run:
|
||||
workflows: ["Docker Build, Publish & Test"]
|
||||
types: [completed]
|
||||
|
||||
# Add direct triggers as fallback
|
||||
pull_request:
|
||||
branches: [main, development, 'feature/**']
|
||||
paths: ['tests/**', 'playwright.config.js']
|
||||
|
||||
workflow_dispatch:
|
||||
# ... existing inputs ...
|
||||
```
|
||||
|
||||
**Option B - Consolidate into Single Workflow**:
|
||||
- Merge `playwright.yml` into `e2e-tests.yml` as a separate job
|
||||
- Remove `workflow_run` dependency entirely
|
||||
- Simpler dependency chain, easier to debug
|
||||
|
||||
**Recommendation**: Proceed with **Option A** for minimal disruption.
|
||||
|
||||
---
|
||||
|
||||
### Fix 3: Add Workflow Health Monitoring
|
||||
|
||||
**Create**: `.github/workflows/workflow-health-check.yml`
|
||||
|
||||
```yaml
|
||||
name: Workflow Health Monitor
|
||||
|
||||
on:
|
||||
workflow_run:
|
||||
workflows: ["E2E Tests", "Playwright E2E Tests"]
|
||||
types: [completed]
|
||||
|
||||
jobs:
|
||||
check-jobs:
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- name: Check for phantom runs
|
||||
uses: actions/github-script@v7
|
||||
with:
|
||||
script: |
|
||||
const runId = context.payload.workflow_run.id;
|
||||
const { data: jobs } = await github.rest.actions.listJobsForWorkflowRun({
|
||||
owner: context.repo.owner,
|
||||
repo: context.repo.repo,
|
||||
run_id: runId
|
||||
});
|
||||
|
||||
if (jobs.total_count === 0) {
|
||||
core.setFailed(`⚠️ Workflow run ${runId} created 0 jobs! Possible path filter issue.`);
|
||||
}
|
||||
```
|
||||
|
||||
**Purpose**: Detect and alert on "phantom" workflow runs (triggered but no jobs created).
|
||||
|
||||
---
|
||||
|
||||
## Next Steps
|
||||
|
||||
### Immediate Actions (Phase 2)
|
||||
|
||||
1. **Apply Fix 1** (path filter alignment):
|
||||
```bash
|
||||
# Edit .github/workflows/e2e-tests.yml
|
||||
# Add '.github/workflows/e2e-tests.yml' to push.paths
|
||||
git add .github/workflows/e2e-tests.yml
|
||||
git commit -m "fix(ci): add e2e-tests.yml to push event path filters"
|
||||
git push origin feature/beta-release
|
||||
```
|
||||
|
||||
2. **Validate Fix**:
|
||||
- Monitor next push to `feature/beta-release`
|
||||
- Verify workflow run creates jobs (expected: 9 jobs like PR events)
|
||||
- Check GitHub Actions UI shows job matrix properly
|
||||
|
||||
3. **Apply Fix 2** (Playwright reliability):
|
||||
- Add direct triggers to `playwright.yml`
|
||||
- Test with `workflow_dispatch` on `feature/beta-release`
|
||||
|
||||
### Validation Criteria (Phase 3)
|
||||
|
||||
✅ **Success Criteria**:
|
||||
- Push events to `feature/**` branches create E2E test jobs
|
||||
- Pull request synchronize events continue to work
|
||||
- Workflow runs with 0 jobs are eliminated
|
||||
- Playwright workflow triggers reliably for PRs and pushes
|
||||
|
||||
📊 **Metrics to Track**:
|
||||
- E2E workflow run success rate (target: >95%)
|
||||
- Average time from push to E2E completion (target: <15 min)
|
||||
- Phantom run occurrence rate (target: 0%)
|
||||
|
||||
---
|
||||
|
||||
## Appendix: Detailed Evidence
|
||||
|
||||
### Workflow Run Comparison
|
||||
|
||||
**Failed Push Run (21385051330)**:
|
||||
```json
|
||||
{
|
||||
"name": ".github/workflows/e2e-tests.yml",
|
||||
"event": "push",
|
||||
"status": "completed",
|
||||
"conclusion": "failure",
|
||||
"head_branch": "feature/beta-release",
|
||||
"head_commit": {
|
||||
"message": "chore: re-enable security e2e scaffolding and triage gaps",
|
||||
"author": "GitHub Actions"
|
||||
},
|
||||
"jobs": []
|
||||
}
|
||||
```
|
||||
|
||||
**Successful PR Run (21381970384)**:
|
||||
```json
|
||||
{
|
||||
"event": "pull_request",
|
||||
"conclusion": "failure",
|
||||
"jobs": [
|
||||
{"name": "Build Application", "conclusion": "success"},
|
||||
{"name": "E2E Tests (Shard 1/4)", "conclusion": "success"},
|
||||
{"name": "E2E Tests (Shard 2/4)", "conclusion": "failure"},
|
||||
{"name": "E2E Tests (Shard 3/4)", "conclusion": "failure"},
|
||||
{"name": "E2E Tests (Shard 4/4)", "conclusion": "failure"},
|
||||
{"name": "Merge Test Reports", "conclusion": "failure"},
|
||||
{"name": "Comment Test Results", "conclusion": "success"},
|
||||
{"name": "Upload E2E Coverage", "conclusion": "skipped"},
|
||||
{"name": "E2E Test Results", "conclusion": "failure"}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Lessons Learned
|
||||
|
||||
1. **Path Filter Pitfall**: Modifying a workflow file can trigger edge cases where the run is created but jobs are skipped due to path filter re-evaluation.
|
||||
|
||||
2. **Event Type Matters**: Different event types (`push` vs `pull_request`) can have different path filter behavior even with similar configurations.
|
||||
|
||||
3. **Monitoring is Critical**: "Phantom" workflow runs (0 jobs) are hard to detect without explicit monitoring.
|
||||
|
||||
4. **Document Expectations**: When workflows don't trigger as expected, systematically compare:
|
||||
- Trigger configuration (on: ...)
|
||||
- Path/branch filters
|
||||
- Job-level if: conditions
|
||||
- Concurrency settings
|
||||
- Upstream workflow dependencies (workflow_run)
|
||||
|
||||
---
|
||||
|
||||
**Report Compiled By**: Phase 1 & 1.5 Diagnostic Protocol
|
||||
**Confidence Level**: 95% (confirmed by direct API evidence)
|
||||
**Ready for Phase 2**: ✅ Yes - Root cause identified, fixes specified
|
||||
Reference in New Issue
Block a user