chore: clean .gitignore cache

GitHub Actions
2026-01-26 19:21:33 +00:00
parent 1b1b3a70b1
commit e5f0fec5db
1483 changed files with 0 additions and 472793 deletions


@@ -1,149 +0,0 @@
# Security Remediation Plan — DoD Failures (CodeQL + Trivy)
**Created:** 2026-01-09
This plan addresses the **HIGH/CRITICAL security findings** reported in [docs/reports/qa_report.md](docs/reports/qa_report.md).
> The prior Codecov patch-coverage plan was moved to [docs/plans/patch_coverage_spec.md](docs/plans/patch_coverage_spec.md).
## Goal
Restore DoD to ✅ PASS by eliminating **all HIGH/CRITICAL** findings from:
- CodeQL (Go + JS) results produced by **Security: CodeQL All (CI-Aligned)**
- Trivy results produced by **Security: Trivy Scan**
Hard constraints:
- Do **not** weaken gates (no suppressing findings unless a false-positive is proven and documented).
- Prefer minimal, targeted changes.
- Avoid adding new runtime dependencies.
## Scope
From the QA report:
### CodeQL Go
- Rule: `go/email-injection` (**CRITICAL**)
- Location: `backend/internal/services/mail_service.go` (reported around lines ~222, ~340, ~393)
### CodeQL JS
- Rule: `js/incomplete-hostname-regexp` (**HIGH**)
- Location: `frontend/src/pages/__tests__/ProxyHosts-extra.test.tsx` (reported around line ~252)
### Trivy
QA report note: Trivy filesystem scan may be picking up **workspace caches/artifacts** (e.g., `.cache/go/pkg/mod/...` and other generated directories) in addition to repo-tracked files, while the **image scan may already be clean**.
## Step 0 — Trivy triage (required first)
Objective: Re-run the current Trivy task and determine whether HIGH/CRITICAL findings are attributable to:
- **Repo-tracked paths** (e.g., `backend/go.mod`, `backend/go.sum`, `Dockerfile`, `frontend/`, etc.), or
- **Generated/cache paths** under the workspace (e.g., `.cache/`, `**/*.cover`, `codeql-db-*`, temporary build outputs).
Steps:
1. Run **Security: Trivy Scan**.
2. For each HIGH/CRITICAL item, record the affected file path(s) reported by Trivy.
3. Classify each finding:
   - **Repo-tracked**: path is under version control (or clearly part of the shipped build artifact, e.g., the built `app/charon` binary or image layers).
   - **Scan-scope noise**: path is a workspace cache/artifact directory not intended as deliverable input.
Decision outcomes:
- If HIGH/CRITICAL are **repo-tracked / shipped** → remediate by upgrading only the affected components to Trivy's fixed versions (see Workstreams C/D).
- If HIGH/CRITICAL are **only cache/artifact paths** → treat as scan-scope noise and align Trivy scan scope to repo contents by excluding those directories (without disabling scanners or suppressing findings).
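A first-pass sort of the paths recorded in step 2 could look like the sketch below. The function name `classify_trivy_path` and the pattern list are illustrative assumptions drawn from the cache/artifact examples above, not part of the existing tooling.

```bash
# Classify one path reported by Trivy.
# The patterns mirror the cache/artifact examples named in this plan
# (.cache/, *.cover, codeql-db-*); they are assumptions, not an exhaustive list.
classify_trivy_path() {
  case "$1" in
    .cache/*|*/.cache/*|*.cover|codeql-db-*|*/codeql-db-*)
      echo "scan-scope-noise" ;;
    *)
      # Not an obvious cache path: a human still has to confirm it is tracked.
      echo "needs-review" ;;
  esac
}

classify_trivy_path backend/go.mod
```

Anything labelled `needs-review` should still be confirmed as version-controlled, for example with `git ls-files --error-unmatch -- <path>`, before it is treated as repo-tracked.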
## Workstreams (by role)
### Workstream A — Backend (Backend_Dev): Fix `go/email-injection`
Objective: Ensure no untrusted data can inject additional headers/body content into SMTP `DATA`.
Implementation direction (minimal + CodeQL-friendly):
1. **Centralize email header construction** (avoid raw `fmt.Sprintf("%s: %s\r\n", ...)` with untrusted input).
2. **Reject** header values containing `\r` or `\n` (and other control characters if feasible).
3. Ensure email addresses are created using strict parsing/formatting (`net/mail`) and avoid concatenating raw address strings.
4. Add unit tests that attempt CRLF injection in subject/from/to and assert the send/build path rejects it.
Acceptance criteria:
- CodeQL Go scan shows **0** `go/email-injection` findings.
- Backend unit tests cover the rejection paths.
### Workstream B — Frontend (Frontend_Dev): Fix `js/incomplete-hostname-regexp`
Objective: Remove an “incomplete hostname regex” pattern flagged by CodeQL.
Preferred change:
- Replace hostname regex usage with an exact string match (or an anchored + escaped regex like `^link\.example\.com$`).
Acceptance criteria:
- CodeQL JS scan shows **0** `js/incomplete-hostname-regexp` findings.
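To see why the anchored and escaped form matters, the sketch below compares the two patterns with `grep -E`. It is shell rather than the TypeScript of the test file, and the helper name `matches` and the two look-alike hostnames are made up for illustration.

```bash
loose_re='link.example.com'        # unescaped dots, no anchors
strict_re='^link\.example\.com$'   # escaped dots, anchored at both ends

# matches HOST REGEX: succeeds when HOST matches the extended regex.
matches() {
  printf '%s\n' "$1" | grep -Eq -- "$2"
}

matches link.example.com "$strict_re" && echo "strict accepts link.example.com"
matches linkXexample.com "$loose_re" && echo "loose accepts linkXexample.com"
matches link.example.com.evil.test "$loose_re" && echo "loose accepts link.example.com.evil.test"
matches linkXexample.com "$strict_re" || echo "strict rejects linkXexample.com"
```

The loose pattern accepts the look-alike hosts because an unescaped `.` matches any character and the pattern is free to match a substring. An exact string comparison avoids the question entirely, which is why it is the preferred change.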
### Workstream C — Container / embedded binaries (DevOps): Fix Trivy image finding
Objective: Ensure the built image does not ship `crowdsec`/`cscli` binaries that embed vulnerable `github.com/expr-lang/expr v1.17.2`.
Implementation direction:
1. If any changes are made to `Dockerfile` (including the CrowdSec build stage), rebuild the image (**no-cache recommended**) before validating.
2. Prefer **bumping the pinned CrowdSec version** in `Dockerfile` to a release that already depends on `expr >= 1.17.7`.
3. If no suitable CrowdSec release is available, patch the build in the CrowdSec build stage similarly to the existing Caddy stage override (force `expr@1.17.7` before building).
Acceptance criteria:
- Trivy image scan reports **0 HIGH/CRITICAL**.
### Workstream D — Go module upgrades (Backend_Dev + QA_Security): Fix Trivy repo scan findings
Objective: Eliminate Trivy filesystem-scan HIGH/CRITICAL findings without over-upgrading unrelated dependencies.
Implementation direction (conditional; driven by Step 0 triage):
1. If Trivy attributes HIGH/CRITICAL to `backend/go.mod` / `backend/go.sum` **or** to the built `app/charon` binary:
   - Bump **only the specific Go modules Trivy flags** to Trivy's fixed versions.
   - Run `go mod tidy` and ensure builds/tests stay green.
2. If Trivy attributes HIGH/CRITICAL **only** to workspace caches / generated artifacts (e.g., `.cache/go/pkg/mod/...`):
   - Treat as scan-scope noise and align Trivy's filesystem scan scope to repo-tracked content by excluding those directories.
   - This is **not** gate weakening: scanners stay enabled and the project must still achieve **0 HIGH/CRITICAL** in Trivy outputs.
Acceptance criteria:
- Trivy scan reports **0 HIGH/CRITICAL**.
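If Step 0 lands on the scan-scope outcome, the exclusion might be expressed roughly as follows. This is a sketch only: the definition of the **Security: Trivy Scan** task is not shown in this plan, so the helper name `trivy_fs_command` and the choice of `.cache` are assumptions. `--skip-dirs`, `--severity`, and `--exit-code` are standard Trivy CLI options.

```bash
# Print a Trivy filesystem-scan command line that skips the given directories.
# Directory names come from the caller; names containing spaces are not handled.
trivy_fs_command() {
  cmd="trivy fs --severity HIGH,CRITICAL --exit-code 1"
  for dir in "$@"; do
    cmd="$cmd --skip-dirs $dir"
  done
  # Scan the current directory (the repo root is assumed).
  printf '%s .\n' "$cmd"
}

# Dry run: show the command for the cache directory named in this plan.
trivy_fs_command .cache
```

No scanner is disabled and the severity gate is unchanged, so a HIGH/CRITICAL finding in a repo-tracked path still fails the scan.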
## Validation (VS Code tasks)
Run tasks in this order (only run frontend ones if Workstream B changes anything under `frontend/`):
1. **Build: Backend**
2. **Test: Backend with Coverage**
3. **Security: CodeQL All (CI-Aligned)**
4. **Security: Trivy Scan** (explicitly verify **both** filesystem-scan and image-scan outputs are **0 HIGH/CRITICAL**)
5. **Lint: Pre-commit (All Files)**
If any changes are made to `Dockerfile` / CrowdSec build stage:
1. **Build & Run: Local Docker Image No-Cache** (recommended)
2. **Security: Trivy Scan** (re-verify image scan after rebuild)
If `frontend/` changes are made:
1. **Lint: TypeScript Check**
2. **Test: Frontend with Coverage**
3. **Lint: Frontend**
## Handoff checklist
- Attach updated `codeql-results-*.sarif` and Trivy artifacts for **both filesystem and image** outputs to the QA rerun.
- Confirm the QA report's pass/fail criteria are satisfied (no HIGH/CRITICAL findings).


@@ -1,628 +0,0 @@
# Resolution Plan: GitHub Advanced Security / Trivy Workflow Configuration Warning
**Document Version:** 1.0
**Date:** 2026-01-11
**Status:** Analysis Complete - Ready for Implementation
---
## Executive Summary
GitHub Advanced Security is reporting that 2 workflow configurations from `refs/heads/main` are missing in the current PR branch (`feature/beta-release`):
1. `.github/workflows/security-weekly-rebuild.yml:security-rebuild`
2. `.github/workflows/docker-publish.yml:build-and-push`
**Root Cause:** `.github/workflows/docker-publish.yml` was **deleted** from the repository in commit `f640524b` on December 21, 2025. The file was renamed/replaced by `.github/workflows/docker-build.yml`, but the job name `build-and-push` remains the same. This creates a **false positive** warning because GitHub Advanced Security is tracking the old filename.
**Impact:** This is a **LOW SEVERITY** issue - it's primarily a tracking/reporting problem, not a functional security gap. All Trivy scanning functionality is intact in `docker-build.yml`.
---
## Investigation Summary
### 1. File State Analysis
#### Current Branch (`feature/beta-release`)
```
✅ .github/workflows/security-weekly-rebuild.yml EXISTS
- Job name: security-rebuild
- Configured for: schedule, workflow_dispatch
- Includes: Trivy scanning with SARIF upload
- Status: ✅ ACTIVE
❌ .github/workflows/docker-publish.yml DOES NOT EXIST
- File was deleted in commit f640524b (Dec 21, 2025)
- Reason: Replaced by docker-build.yml
✅ .github/workflows/docker-build.yml EXISTS (replacement)
- Job name: build-and-push (SAME as docker-publish)
- Configured for: push, pull_request, workflow_dispatch, workflow_call
- Includes: Trivy scanning with SARIF upload
- Status: ✅ ACTIVE
```
#### Main Branch (`refs/heads/main`)
```
✅ .github/workflows/security-weekly-rebuild.yml EXISTS
- Job name: security-rebuild
- IDENTICAL to feature/beta-release
❌ .github/workflows/docker-publish.yml DOES NOT EXIST
- Also deleted on main branch (commit f640524b is on main)
✅ .github/workflows/docker-build.yml EXISTS
- Job name: build-and-push
- IDENTICAL to feature/beta-release (with minor version updates)
```
### 2. Git History Analysis
```bash
# Commit that removed docker-publish.yml
commit f640524baaf9770aa49f6bd01c5bde04cd50526c
Author: GitHub Actions <actions@github.com>
Date: Sun Dec 21 15:11:25 2025 +0000
chore: remove docker-publish workflow file
.github/workflows/docker-publish.yml | 283 deletions(-)
```
**Key Findings:**
- `docker-publish.yml` was deleted on **BOTH** main and feature/beta-release branches
- `docker-build.yml` exists on **BOTH** branches with the **SAME** job name
- The warning is a GitHub Advanced Security tracking artifact from when `docker-publish.yml` existed
- Both branches contain the commit `f640524b` that removed the file
### 3. Job Configuration Comparison
| Configuration | docker-publish.yml (deleted) | docker-build.yml (current) |
|--------------|------------------------------|----------------------------|
| **Job Name** | `build-and-push` | `build-and-push` ✅ SAME |
| **Trivy Scan** | ✅ Yes (SARIF upload) | ✅ Yes (SARIF upload) |
| **Triggers** | push, pull_request, workflow_dispatch | push, pull_request, workflow_dispatch, workflow_call |
| **PR Support** | ✅ Yes | ✅ Yes |
| **SBOM Generation** | ❌ No | ✅ Yes (NEW in docker-build) |
| **CVE Verification** | ❌ No | ✅ Yes (NEW: CVE-2025-68156 check) |
| **Concurrency Control** | ✅ Yes | ✅ Yes (ENHANCED) |
**Improvement Analysis:** `docker-build.yml` is **MORE SECURE** than the deleted `docker-publish.yml`:
- Added SBOM generation (supply chain security)
- Added SBOM attestation with cryptographic signing
- Added CVE-2025-68156 verification for Caddy
- Enhanced timeout controls for integration tests
- Improved PR image handling with `load` parameter
### 4. Security Scanning Coverage Analysis
#### ✅ Trivy Coverage is COMPLETE
| Workflow | Job | Trivy Scan | SARIF Upload | Runs On |
|----------|-----|------------|--------------|---------|
| `security-weekly-rebuild.yml` | `security-rebuild` | ✅ Yes | ✅ Yes | Schedule (weekly), Manual |
| `docker-build.yml` | `build-and-push` | ✅ Yes | ✅ Yes | Push, PR, Manual |
| `docker-build.yml` | `trivy-pr-app-only` | ✅ Yes (app binary) | ❌ No | PR only |
**Coverage Assessment:**
- Weekly security rebuilds: ✅ ACTIVE
- Per-commit scanning: ✅ ACTIVE
- PR-specific scanning: ✅ ACTIVE
- SARIF upload to Security tab: ✅ ACTIVE
- **NO SECURITY GAPS IDENTIFIED**
---
## Root Cause Analysis
### Why is GitHub Advanced Security Reporting This Warning?
**Symptom:** GitHub Advanced Security tracks workflow configurations by **filename + job name**. When a workflow file is deleted/renamed, GitHub Security's internal tracking doesn't automatically update the reference mapping.
**Root Cause Chain:**
1. `docker-publish.yml` existed on main branch (tracked as `docker-publish.yml:build-and-push`)
2. Commit `f640524b` deleted `docker-publish.yml` and functionality was moved to `docker-build.yml`
3. GitHub Security still has historical tracking data for `docker-publish.yml:build-and-push`
4. When analyzing feature/beta-release, GitHub Security looks for the OLD filename
5. File not found → Warning generated
**Why This is a False Positive:**
- The job name `build-and-push` still exists in `docker-build.yml`
- All Trivy scanning functionality is preserved (and enhanced)
- Both branches have the same state (file deleted, functionality moved)
- The warning is about **filename tracking**, not missing security functionality
### Why Was docker-publish.yml Deleted?
Based on git history and inspection:
1. **Consolidation:** Functionality was merged/improved in `docker-build.yml`
2. **Enhancement:** `docker-build.yml` added SBOM, attestation, and CVE checks
3. **Maintenance:** Reduced workflow file duplication
4. **Commit Author:** GitHub Actions bot (automated/scheduled cleanup)
---
## Resolution Strategy
### Option 1: Do Nothing (RECOMMENDED)
**Rationale:** This is a **false positive tracking issue**, not a functional security problem.
**Pros:**
- No code changes required
- No risk of breaking existing functionality
- Security coverage is complete and enhanced
- Warning will eventually clear when GitHub Security updates its tracking
**Cons:**
- Warning remains visible in GitHub Security UI
- May confuse reviewers/auditors
**Recommendation:** **ACCEPT THIS OPTION** - Document the issue and proceed.
---
### Option 2: Force GitHub Security to Update Tracking
**Approach:** Trigger a manual re-scan or workflow dispatch on main branch to refresh GitHub Security's workflow registry.
**Steps:**
1. Navigate to Actions → `security-weekly-rebuild.yml`
2. Click "Run workflow" → Run on main branch
3. Wait for workflow completion
4. Check if GitHub Security updates its tracking
**Pros:**
- May clear the warning faster
- No code changes required
**Cons:**
- No guarantee GitHub Security will update tracking immediately
- May need to wait for GitHub's internal cache/indexing to refresh
- Uses CI/CD resources
**Recommendation:** ⚠️ **TRY IF WARNING PERSISTS** - Low risk, low effort troubleshooting step.
---
### Option 3: Re-create docker-publish.yml as a Wrapper (NOT RECOMMENDED)
**Approach:** Create a new `docker-publish.yml` that calls `docker-build.yml` via `workflow_call`.
**Example Implementation:**
```yaml
# .github/workflows/docker-publish.yml
name: Docker Publish (Deprecated - Use docker-build.yml)
on:
  workflow_dispatch:
jobs:
  build-and-push:
    uses: ./.github/workflows/docker-build.yml
```
**Pros:**
- Satisfies GitHub Security's filename tracking
- Maintains backward compatibility for any external references
**Cons:**
- ❌ Creates unnecessary file duplication
- ❌ Adds maintenance burden
- ❌ Confuses future developers (two files doing the same thing)
- ❌ Doesn't solve the root cause (tracking lag)
- ❌ May trigger duplicate builds if configured incorrectly
**Recommendation:** **AVOID** - This is symptom patching, not root cause resolution.
---
### Option 4: Add Comprehensive Documentation
**Approach:** Document the workflow file rename/migration in repository documentation.
**Implementation:**
1. Update `CHANGELOG.md` with entry for docker-publish.yml removal
2. Add section to `SECURITY.md` explaining current Trivy coverage
3. Create `.github/workflows/README.md` documenting workflow structure
4. Add comment to `docker-build.yml` explaining it replaced `docker-publish.yml`
**Pros:**
- ✅ Improves project documentation
- ✅ Helps future maintainers understand the change
- ✅ Provides audit trail for security reviews
- ✅ No functional changes, zero risk
**Cons:**
- Doesn't clear the GitHub Security warning
- Requires documentation updates
**Recommendation:** **IMPLEMENT THIS** - Valuable regardless of warning resolution.
---
## Recommended Action Plan
### Phase 1: Documentation (IMMEDIATE)
**Objective:** Create audit trail and improve project documentation.
**Tasks:**
1. ✅ Create this plan document (`docs/plans/GITHUB_SECURITY_WARNING_RESOLUTION_PLAN.md`) ← DONE
2. Add entry to `CHANGELOG.md`:
```markdown
### Changed
- Replaced `.github/workflows/docker-publish.yml` with `.github/workflows/docker-build.yml` for enhanced supply chain security
- Added SBOM generation and attestation
- Added CVE-2025-68156 verification for Caddy
- Job name `build-and-push` preserved for continuity
```
3. Add section to `SECURITY.md`:
```markdown
## Security Scanning Coverage
Charon uses Trivy for comprehensive vulnerability scanning:
- **Weekly Scans:** `.github/workflows/security-weekly-rebuild.yml`
- Fresh rebuild without cache
- Scans base images and all dependencies
- Runs every Sunday at 02:00 UTC
- **Per-Commit Scans:** `.github/workflows/docker-build.yml`
- Scans on every push to main, development, feature/beta-release
- Includes pull request scanning
- SARIF upload to GitHub Security tab
All Trivy results are uploaded to the [Security tab](../../security/code-scanning).
```
4. Add header comment to `docker-build.yml`:
```yaml
# This workflow replaced docker-publish.yml on 2025-12-21
# Enhancement: Added SBOM generation, attestation, and CVE verification
# Job name 'build-and-push' preserved for continuity
```
**Estimated Time:** 30 minutes
**Risk:** None
**Priority:** High
---
### Phase 2: Verification (AFTER DOCUMENTATION)
**Objective:** Confirm that security scanning is functioning correctly.
**Tasks:**
1. Verify `security-weekly-rebuild.yml` is scheduled correctly:
```bash
git show main:.github/workflows/security-weekly-rebuild.yml | grep -A 5 "schedule:"
```
2. Check recent workflow runs in GitHub Actions UI:
- Verify `docker-build.yml` runs on push/PR
- Verify `security-weekly-rebuild.yml` runs weekly
- Check for Trivy scan failures
3. Verify SARIF uploads in Security → Code Scanning:
- Check for Trivy results
- Verify scan frequency
- Check for any missed scans
**Success Criteria:**
- ✅ All workflows show successful runs
- ✅ Trivy SARIF results appear in Security tab
- ✅ No scan failures in last 30 days
- ✅ Weekly security rebuild running on schedule
**Estimated Time:** 15 minutes
**Risk:** None (read-only verification)
**Priority:** Medium
---
### Phase 3: Monitor (ONGOING)
**Objective:** Track if GitHub Security warning clears naturally.
**Tasks:**
1. Check PR status page weekly for warning persistence
2. If warning persists after 4 weeks, try Option 2 (manual workflow dispatch)
3. If warning persists after 8 weeks, open GitHub Support ticket
**Success Criteria:**
- Warning clears within 4-8 weeks as GitHub Security updates tracking
**Estimated Time:** 5 minutes/week
**Risk:** None
**Priority:** Low
---
## Risk Assessment
### Current State Risk Analysis
| Risk Category | Severity | Likelihood | Mitigation Status |
|--------------|----------|------------|-------------------|
| **Missing Security Scans** | NONE | 0% | ✅ MITIGATED - All scans active |
| **False Positive Warning** | LOW | 100% | ⚠️ ACCEPTED - Tracking lag |
| **Audit Confusion** | LOW | 30% | ✅ MITIGATED - This document |
| **Workflow Duplication** | NONE | 0% | ✅ AVOIDED - Single source of truth |
| **Breaking Changes** | NONE | 0% | ✅ AVOIDED - No code changes planned |
### Impact Analysis
**If We Do Nothing:**
- Security scanning: ✅ UNAFFECTED (fully functional)
- Code quality: ✅ UNAFFECTED
- Developer experience: ✅ UNAFFECTED
- GitHub Security UI: ⚠️ Shows warning (cosmetic issue only)
- Compliance audits: ✅ PASS (coverage is complete, documented)
**If We Implement Phase 1 (Documentation):**
- Security scanning: ✅ UNAFFECTED
- Code quality: ✅ IMPROVED (better documentation)
- Developer experience: ✅ IMPROVED (clearer history)
- GitHub Security UI: ⚠️ Shows warning (unchanged)
- Compliance audits: ✅✅ PASS (explicit audit trail)
---
## Technical Details
### Workflow File Comparison
#### security-weekly-rebuild.yml
```yaml
name: Weekly Security Rebuild
on:
  schedule:
    - cron: '0 2 * * 0' # Sundays at 02:00 UTC
  workflow_dispatch:
jobs:
  security-rebuild: # ← Job name tracked by GitHub Security
    # Builds fresh image without cache
    # Runs comprehensive Trivy scan
    # Uploads SARIF to Security tab
```
#### docker-build.yml (current)
```yaml
name: Docker Build, Publish & Test
on:
  push:
    branches: [main, development, feature/beta-release]
  pull_request:
    branches: [main, development, feature/beta-release]
  workflow_dispatch:
  workflow_call:
jobs:
  build-and-push: # ← SAME job name as deleted docker-publish.yml
    # Builds and pushes Docker image
    # Runs Trivy scan (table + SARIF)
    # Generates SBOM and attestation (NEW)
    # Verifies CVE-2025-68156 patch (NEW)
    # Uploads SARIF to Security tab
```
#### docker-publish.yml (DELETED on 2025-12-21)
```yaml
name: Docker Build, Publish & Test # ← Same name as docker-build.yml
on:
  push:
    branches: [main, development, feature/beta-release]
  pull_request:
    branches: [main, development, feature/beta-release]
  workflow_dispatch:
  workflow_call:
jobs:
  build-and-push: # ← Job name preserved in docker-build.yml
    # Builds and pushes Docker image
    # Runs Trivy scan (table + SARIF)
    # Uploads SARIF to Security tab
```
**Migration Notes:**
- ✅ Job name `build-and-push` preserved for continuity
- ✅ All Trivy functionality preserved
- ✅ Enhanced with SBOM generation and attestation
- ✅ Enhanced with CVE verification
- ✅ Improved PR handling with `load` parameter
---
## Dependencies
### Files to Review/Update (Phase 1)
- [ ] `CHANGELOG.md` - Add entry for workflow migration
- [ ] `SECURITY.md` - Document security scanning coverage
- [ ] `.github/workflows/docker-build.yml` - Add header comment
- [ ] `.github/workflows/README.md` - Create workflow documentation (optional)
### No Changes Required (Already Compliant)
- ✅ `.gitignore` - No new files/folders added
- ✅ `.dockerignore` - No Docker changes
- ✅ `.codecov.yml` - No coverage changes
- ✅ Workflow files (no functional changes)
---
## Success Criteria
### Phase 1 Success (Documentation)
- [x] Plan document created and comprehensive
- [x] Root cause identified (workflow file renamed)
- [x] Security coverage verified (all scans active)
- [ ] `CHANGELOG.md` updated with workflow migration entry
- [ ] `SECURITY.md` updated with security scanning documentation
- [ ] `docker-build.yml` has header comment explaining migration
- [ ] All documentation changes reviewed and merged
- [ ] No linting or formatting errors
### Phase 2 Success (Verification)
- [ ] All workflows show successful recent runs
- [ ] Trivy SARIF results visible in Security tab
- [ ] No scan failures in last 30 days
- [ ] Weekly security rebuild on schedule
### Phase 3 Success (Monitoring)
- [ ] GitHub Security warning tracked weekly
- [ ] Warning clears within 8 weeks OR GitHub Support ticket opened
- [ ] No functional issues with security scanning
---
## Alternative Considerations
### Why Not Fix the "Warning" Immediately?
**Considered Approaches:**
1. **Re-create docker-publish.yml as wrapper**
- ❌ Creates maintenance burden
- ❌ Doesn't solve root cause
- ❌ Confuses future developers
2. **Rename docker-build.yml back to docker-publish.yml**
- ❌ Loses git history context
- ❌ Breaks external references to docker-build.yml
- ❌ Cosmetic fix for a cosmetic issue
3. **Contact GitHub Support**
- ⚠️ Time-consuming
- ⚠️ May not prioritize (low severity)
- ⚠️ Should be last resort after monitoring
**Selected Approach: Document and Monitor**
- ✅ Zero risk to existing functionality
- ✅ Improves project documentation
- ✅ Provides audit trail
- ✅ Respects principle: "Don't patch symptoms without understanding root cause"
---
## Questions and Answers
### Q: Is this a security vulnerability?
**A:** No. This is a tracking/reporting issue in GitHub Advanced Security's workflow registry. All security scanning functionality is active and enhanced compared to the deleted workflow.
### Q: Will this block merging the PR?
**A:** No. GitHub Advanced Security warnings are informational and do not block merges. The warning indicates a tracking discrepancy, not a functional security gap.
### Q: Should we re-create docker-publish.yml?
**A:** No. Re-creating the file would be symptom patching and create maintenance burden. The functionality exists in `docker-build.yml` with enhancements.
### Q: How long will the warning persist?
**A:** Unknown. It depends on GitHub's internal tracking cache refresh cycle. Typically, these warnings clear within 4-8 weeks as GitHub's systems update. If it persists beyond 8 weeks, we can escalate to GitHub Support.
### Q: Does this affect compliance audits?
**A:** No. This document provides a complete audit trail showing:
1. Security scanning coverage is complete
2. Functionality was enhanced, not reduced
3. The warning is a false positive from filename tracking
4. All Trivy scans are active and uploading to Security tab
### Q: What if reviewers question the warning?
**A:** Point them to this document which provides:
1. Complete investigation summary
2. Root cause analysis
3. Risk assessment (LOW severity, tracking issue only)
4. Verification that all security scanning is active
---
## Conclusion
**Finding:** The GitHub Advanced Security warning about missing workflow configurations is a **FALSE POSITIVE** caused by workflow file renaming. The file `.github/workflows/docker-publish.yml` was deleted and replaced by `.github/workflows/docker-build.yml` with the same job name (`build-and-push`) and enhanced functionality.
**Security Status:** ✅ **NO SECURITY GAPS** - All Trivy scanning is active, functional, and enhanced compared to the deleted workflow.
**Recommended Action:**
1. ✅ **Implement Phase 1** - Document the migration (30 minutes, zero risk)
2. ✅ **Implement Phase 2** - Verify scanning functionality (15 minutes, read-only)
3. ✅ **Implement Phase 3** - Monitor warning status (5 min/week, optional escalation)
**Merge Recommendation:** ✅ **SAFE TO MERGE** - This is a cosmetic tracking issue, not a functional security problem. Documentation updates provide audit trail and improve project clarity.
**Priority:** LOW - This is a reporting/tracking issue, not a security vulnerability.
**Estimated Total Effort:** 45 minutes + ongoing monitoring
---
## References
### Git Commits
- `f640524b` - Removed docker-publish.yml (Dec 21, 2025)
- `e58fcb71` - Created docker-build.yml (initial)
- `8311d68d` - Updated docker-build.yml buildx action (latest)
### Workflow Files
- `.github/workflows/security-weekly-rebuild.yml` - Weekly security rebuild
- `.github/workflows/docker-build.yml` - Current build and publish workflow
- `.github/workflows/docker-publish.yml` - DELETED (replaced by docker-build.yml)
### Documentation
- GitHub Advanced Security: <https://docs.github.com/en/code-security>
- Trivy Scanner: <https://github.com/aquasecurity/trivy>
- SARIF Format: <https://docs.github.com/en/code-security/code-scanning/integrating-with-code-scanning/sarif-support-for-code-scanning>
---
**Plan Status:** ✅ READY FOR IMPLEMENTATION
**Review Required:** Yes (for Phase 1 documentation changes)
**Merge Blocker:** No (safe to proceed with merge)


@@ -1,105 +0,0 @@
# Current Specification
**Status**: Ready for next task
**Last Updated**: 2026-01-11 04:20:00 UTC
---
## Active Projects
*No active projects*
---
## Recently Completed
### Docs-to-Issues Workflow Fix (2026-01-11) ✅
Resolved an issue where PR status checks did not appear when the docs-to-issues workflow ran.
**Documentation:**
- **Implementation Summary**: [docs/implementation/DOCS_TO_ISSUES_FIX_2026-01-11.md](../implementation/DOCS_TO_ISSUES_FIX_2026-01-11.md)
- **QA Report**: [docs/reports/qa_docs_to_issues_workflow_fix.md](../reports/qa_docs_to_issues_workflow_fix.md)
- **Archived Plan**: [docs/plans/archive/docs_to_issues_workflow_fix_2026-01-11.md](archive/docs_to_issues_workflow_fix_2026-01-11.md)
**Status**: ✅ Complete - Ready for merge
---
### CI/CD Workflow Fixes (2026-01-11) ✅
**Status:** Complete - All documentation finalized
The CI workflow investigation and documentation have been completed. Both issues were determined to be false positives or expected GitHub behavior, with no security gaps.
**Final Documentation:**
- **Implementation Summary**: [docs/implementation/CI_WORKFLOW_FIXES_2026-01-11.md](../implementation/CI_WORKFLOW_FIXES_2026-01-11.md)
- **QA Report**: [docs/reports/qa_report.md](../reports/qa_report.md)
- **Archived Plan**: [docs/plans/archive/GITHUB_SECURITY_WARNING_RESOLUTION_PLAN_2026-01-11.md](archive/GITHUB_SECURITY_WARNING_RESOLUTION_PLAN_2026-01-11.md)
**Merge Status:** ✅ SAFE TO MERGE - Zero security gaps, fully documented
---
### Workflow Orchestration Fix (2026-01-11)
Fixed a workflow orchestration issue where supply-chain-verify ran before docker-build completed, causing verification to be skipped on PRs.
**Documentation:**
- **Implementation Summary**: [docs/implementation/WORKFLOW_ORCHESTRATION_FIX.md](../implementation/WORKFLOW_ORCHESTRATION_FIX.md)
- **QA Report**: [docs/reports/qa_report_workflow_orchestration.md](../reports/qa_report_workflow_orchestration.md)
- **Archived Plan**: [docs/plans/archive/workflow_orchestration_fix_2026-01-11.md](archive/workflow_orchestration_fix_2026-01-11.md)
**Status**: ✅ Complete - Deployed to production
---
### Grype SBOM Remediation (2026-01-10)
Successfully resolved CI/CD failures in the Supply Chain Verification workflow caused by Grype SBOM format mismatch.
**Documentation:**
- **Implementation Summary**: [docs/implementation/GRYPE_SBOM_REMEDIATION.md](../implementation/GRYPE_SBOM_REMEDIATION.md)
- **QA Report**: [docs/reports/qa_report.md](../reports/qa_report.md)
- **Archived Plan**: [docs/plans/archive/grype_sbom_remediation_2026-01-10.md](archive/grype_sbom_remediation_2026-01-10.md)
**Status**: ✅ Complete - Deployed to production
---
## Guidelines for Creating New Specs
When starting a new project, create a detailed specification in this file following the [Spec-Driven Workflow v1](.github/instructions/spec-driven-workflow-v1.instructions.md) format.
### Required Sections
1. **Problem Statement** - What issue are we solving?
2. **Root Cause Analysis** - Why does the problem exist?
3. **Solution Design** - How will we solve it?
4. **Implementation Plan** - Step-by-step tasks
5. **Testing Strategy** - How will we validate success?
6. **Success Criteria** - What defines "done"?
### Archiving Completed Specs
When a specification is complete:
1. Create implementation summary in `docs/implementation/`
2. Move spec to `docs/plans/archive/` with timestamp
3. Update this file with completion notice
---
## Archive Location
Completed and archived specifications can be found in:
- [docs/plans/archive/](archive/)
---
**Note**: This file should only contain ONE active specification at a time. Archive completed work before starting new projects.


@@ -1,824 +0,0 @@
# Remediation Plan: Grype SBOM Format Mismatch (PR #461)
**Status**: Active
**Created**: 2026-01-10
**Priority**: High
**Related Issue**: GitHub Actions failure in supply-chain-verify.yml
**Error**: `ERROR failed to catalog: unable to decode sbom: sbom format not recognized`
---
## Executive Summary
The Grype vulnerability scanner is failing with an "sbom format not recognized" error in the Supply Chain Verification workflow. Investigation reveals a **format mismatch** between SBOM generation and consumption, combined with inadequate validation.
**Root Cause**: The workflow generates an SPDX-JSON format SBOM, but the SBOM file may be empty/corrupted when the Docker image doesn't exist yet (common in PR workflows). Grype fails to parse empty or malformed SBOM files.
**Impact**: Supply chain security verification is not functioning correctly, potentially allowing vulnerable images to pass through CI/CD.
---
## Root Cause Analysis
### Problem Statement
CI/CD pipeline fails at vulnerability scanning:
```
ERROR failed to catalog: unable to decode sbom: sbom format not recognized
⚠️ Grype scan failed
```
### Investigation Findings
#### 1. SBOM Generation (supply-chain-verify.yml:63)
```yaml
syft ${IMAGE} -o spdx-json > sbom-generated.json || {
  echo "⚠️ Failed to generate SBOM - image may not exist yet"
  exit 0
}
```
**Issues**:
- Generates SBOM in **SPDX-JSON** format
- Error handling exits with code 0, masking failures
- Empty or malformed file may be created if image doesn't exist
- No validation of SBOM content after generation
#### 2. SBOM Consumption (supply-chain-verify.yml:90)
```yaml
grype sbom:sbom-generated.json -o json > vuln-scan.json || {
  echo "⚠️ Grype scan failed"
  exit 0
}
```
**Issues**:
- Assumes SBOM file is valid without checking
- Fails if SBOM is empty, corrupted, or malformed
- Error is suppressed with `exit 0`
#### 3. Format Inconsistency
- **docker-build.yml** (line 242): Generates **CycloneDX-JSON**
- **supply-chain-verify.yml** (line 63): Generates **SPDX-JSON**
- Different formats used in different workflows
#### 4. Timing/Race Condition
- Verification workflow runs on PRs before image exists
- Attempts to pull `ghcr.io/{owner}/charon:pr-{number}`
- Image may not be built yet, causing SBOM generation to fail
- Empty file created, later causes Grype to fail
#### 5. Missing Validation
- Line 85 only checks file existence: `if [[ ! -f sbom-generated.json ]]`
- No check for:
  - File size (non-empty)
  - Valid JSON structure
  - Required SBOM fields (bomFormat, components, etc.)
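The three missing checks can be sketched as a single classification function. This is a minimal TypeScript sketch, not the workflow's actual code: `classifySbom` and `SbomStatus` are hypothetical names, and the field names assume a CycloneDX document (the current workflow emits SPDX, which uses different top-level fields).

```typescript
// Possible outcomes of the checks listed above (hypothetical names).
type SbomStatus = "empty" | "invalid-json" | "wrong-format" | "no-components" | "valid";

// `text` is assumed to be the raw content of sbom-generated.json.
function classifySbom(text: string): SbomStatus {
  // 1. File size: a failed `syft ... > file` leaves an empty file behind.
  if (text.trim().length === 0) return "empty";

  // 2. Valid JSON structure.
  let doc: unknown;
  try {
    doc = JSON.parse(text);
  } catch {
    return "invalid-json";
  }

  // 3. Required fields (CycloneDX naming is an assumption of this sketch).
  if (typeof doc !== "object" || doc === null) return "wrong-format";
  const record = doc as Record<string, unknown>;
  if (record["bomFormat"] !== "CycloneDX") return "wrong-format";
  const components = record["components"];
  if (!Array.isArray(components) || components.length === 0) return "no-components";
  return "valid";
}
```

Returning a status rather than a boolean lets a caller tell an expected empty file (image not built yet) apart from a malformed one.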
### Supported Formats (Anchore Documentation)
**Grype** supports:
- Syft JSON (native format)
- SPDX JSON/XML
- CycloneDX JSON/XML
**Syft** outputs:
- Syft JSON
- SPDX JSON/XML
- CycloneDX JSON/XML
- GitHub JSON, SARIF, table, etc.
**Conclusion**: Both SPDX-JSON and CycloneDX-JSON are valid. The issue is **empty/corrupted files**, not format incompatibility.
---
## Affected Components
### Workflows
| File | Lines | Issue |
|------|-------|-------|
| `.github/workflows/supply-chain-verify.yml` | 63 | SBOM generation (SPDX format) |
| `.github/workflows/supply-chain-verify.yml` | 85-95 | Grype scan (no validation) |
| `.github/workflows/docker-build.yml` | 238-252 | SBOM generation (CycloneDX format) |
### Root Causes Summary
| Issue | Impact | Severity |
|-------|--------|----------|
| Empty SBOM file from missing image | Grype fails to parse | **Critical** |
| Missing SBOM content validation | Invalid files passed to Grype | **High** |
| Inconsistent SBOM format usage | Confusion, maintenance burden | Medium |
| Poor error handling (`exit 0`) | Failures masked, hard to debug | **High** |
| Race condition (PR image timing) | Frequent false failures | **High** |
---
## Remediation Strategy
### Recommended Approach: Hybrid Fix
Combine format standardization, validation, and conditional execution.
**Phase 1** (Immediate - 2-4 hours):
1. Standardize on **CycloneDX-JSON** format (aligns with docker-build.yml)
2. Add image existence check before SBOM generation
3. Add comprehensive SBOM validation before Grype scan
4. Improve error handling and logging
5. Skip gracefully when image doesn't exist
**Phase 2** (Future enhancement - 4-8 hours):
- Retrieve attested SBOM from registry instead of regenerating
- Eliminates duplication and ensures consistency
---
## Implementation Plan
### File: `.github/workflows/supply-chain-verify.yml`
#### Change 1: Add Image Existence Check
**Location**: After "Determine Image Tag" step (after line 54)
\`\`\`yaml
- name: Check Image Availability
  id: image-check
  env:
    IMAGE: ghcr.io/${{ github.repository_owner }}/charon:${{ steps.tag.outputs.tag }}
    GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
  run: |
    echo "Checking if image exists: ${IMAGE}"
    if docker manifest inspect "${IMAGE}" >/dev/null 2>&1; then
      echo "✅ Image exists and is accessible"
      echo "exists=true" >> "$GITHUB_OUTPUT"
    else
      echo "⚠️ Image not found - likely not built yet"
      echo "This is normal for PR workflows before docker-build completes"
      echo "exists=false" >> "$GITHUB_OUTPUT"
    fi
\`\`\`
#### Change 2: Standardize SBOM Format
**Location**: Line 63
**Before**:
\`\`\`yaml
syft ${IMAGE} -o spdx-json > sbom-generated.json || {
\`\`\`
**After**:
\`\`\`yaml
syft ${IMAGE} -o cyclonedx-json > sbom-generated.json || {
\`\`\`
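**Rationale**: Aligns with docker-build.yml, so both workflows produce the same SBOM format. Grype reads SPDX and CycloneDX alike, so this is a consistency choice rather than a compatibility fix.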
#### Change 3: Add Conditional Execution
**Location**: Line 55 (Verify SBOM Completeness step)
**Before**:
\`\`\`yaml
- name: Verify SBOM Completeness
  env:
    IMAGE: ghcr.io/${{ github.repository_owner }}/charon:${{ steps.tag.outputs.tag }}
\`\`\`
**After**:
\`\`\`yaml
- name: Verify SBOM Completeness
  if: steps.image-check.outputs.exists == 'true'
  env:
    IMAGE: ghcr.io/${{ github.repository_owner }}/charon:${{ steps.tag.outputs.tag }}
\`\`\`
#### Change 4: Add SBOM Validation Step
**Location**: New step after "Verify SBOM Completeness" (after line 77)
\`\`\`yaml
- name: Validate SBOM File
  id: validate-sbom
  if: steps.image-check.outputs.exists == 'true'
  run: |
    echo "Validating SBOM file..."
    # Check file exists
    if [[ ! -f sbom-generated.json ]]; then
      echo "❌ SBOM file does not exist"
      echo "valid=false" >> "$GITHUB_OUTPUT"
      exit 0
    fi
    # Check file is non-empty
    if [[ ! -s sbom-generated.json ]]; then
      echo "❌ SBOM file is empty"
      echo "valid=false" >> "$GITHUB_OUTPUT"
      exit 0
    fi
    # Validate JSON structure
    if ! jq empty sbom-generated.json 2>/dev/null; then
      echo "❌ SBOM file contains invalid JSON"
      cat sbom-generated.json
      echo "valid=false" >> "$GITHUB_OUTPUT"
      exit 0
    fi
    # Validate CycloneDX structure
    BOMFORMAT=$(jq -r '.bomFormat // "missing"' sbom-generated.json)
    SPECVERSION=$(jq -r '.specVersion // "missing"' sbom-generated.json)
    COMPONENTS=$(jq '.components // [] | length' sbom-generated.json)
    echo "SBOM Format: ${BOMFORMAT}"
    echo "Spec Version: ${SPECVERSION}"
    echo "Components: ${COMPONENTS}"
    if [[ "${BOMFORMAT}" != "CycloneDX" ]]; then
      echo "❌ Invalid bomFormat: expected 'CycloneDX', got '${BOMFORMAT}'"
      echo "valid=false" >> "$GITHUB_OUTPUT"
      exit 0
    fi
    if [[ "${COMPONENTS}" == "0" ]]; then
      echo "⚠️ SBOM has no components - may indicate incomplete scan"
      echo "valid=partial" >> "$GITHUB_OUTPUT"
    else
      echo "✅ SBOM is valid with ${COMPONENTS} components"
      echo "valid=true" >> "$GITHUB_OUTPUT"
    fi
\`\`\`
#### Change 5: Update Vulnerability Scan Step
**Location**: Lines 81-103 (replace entire "Scan for Vulnerabilities" step)
\`\`\`yaml
- name: Scan for Vulnerabilities
  if: steps.validate-sbom.outputs.valid == 'true'
  env:
    IMAGE: ghcr.io/${{ github.repository_owner }}/charon:${{ steps.tag.outputs.tag }}
  run: |
    echo "Scanning for vulnerabilities with Grype..."
    echo "SBOM format: CycloneDX JSON"
    echo "SBOM size: $(wc -c < sbom-generated.json) bytes"
    echo ""
    # Run Grype with explicit path and better error handling
    if ! grype sbom:./sbom-generated.json --output json --file vuln-scan.json; then
      echo ""
      echo "❌ Grype scan failed"
      echo ""
      echo "Debug information:"
      echo "Grype version:"
      grype version
      echo ""
      echo "SBOM preview (first 1000 characters):"
      head -c 1000 sbom-generated.json
      echo ""
      exit 1 # Fail the step to surface the issue
    fi
    echo "✅ Grype scan completed successfully"
    echo ""
    # Display human-readable results
    echo "Vulnerability summary:"
    grype sbom:./sbom-generated.json --output table || true
    # Parse and categorize results
    CRITICAL=$(jq '[.matches[] | select(.vulnerability.severity == "Critical")] | length' vuln-scan.json 2>/dev/null || echo "0")
    HIGH=$(jq '[.matches[] | select(.vulnerability.severity == "High")] | length' vuln-scan.json 2>/dev/null || echo "0")
    MEDIUM=$(jq '[.matches[] | select(.vulnerability.severity == "Medium")] | length' vuln-scan.json 2>/dev/null || echo "0")
    LOW=$(jq '[.matches[] | select(.vulnerability.severity == "Low")] | length' vuln-scan.json 2>/dev/null || echo "0")
    echo ""
    echo "Vulnerability counts:"
    echo " Critical: ${CRITICAL}"
    echo " High: ${HIGH}"
    echo " Medium: ${MEDIUM}"
    echo " Low: ${LOW}"
    # Set warnings for critical vulnerabilities
    if [[ ${CRITICAL} -gt 0 ]]; then
      echo "::warning::${CRITICAL} critical vulnerabilities found"
    fi
    # Store for PR comment
    echo "CRITICAL_VULNS=${CRITICAL}" >> "$GITHUB_ENV"
    echo "HIGH_VULNS=${HIGH}" >> "$GITHUB_ENV"
    echo "MEDIUM_VULNS=${MEDIUM}" >> "$GITHUB_ENV"
    echo "LOW_VULNS=${LOW}" >> "$GITHUB_ENV"
- name: Report Skipped Scan
  if: steps.image-check.outputs.exists != 'true' || steps.validate-sbom.outputs.valid != 'true'
  run: |
    echo "## ⚠️ Vulnerability Scan Skipped" >> "$GITHUB_STEP_SUMMARY"
    echo "" >> "$GITHUB_STEP_SUMMARY"
    if [[ "${{ steps.image-check.outputs.exists }}" != "true" ]]; then
      echo "**Reason**: Docker image not available yet" >> "$GITHUB_STEP_SUMMARY"
      echo "" >> "$GITHUB_STEP_SUMMARY"
      echo "This is expected for PR workflows. The image will be scanned" >> "$GITHUB_STEP_SUMMARY"
      echo "after it's built by the docker-build workflow." >> "$GITHUB_STEP_SUMMARY"
    elif [[ "${{ steps.validate-sbom.outputs.valid }}" != "true" ]]; then
      echo "**Reason**: SBOM validation failed" >> "$GITHUB_STEP_SUMMARY"
      echo "" >> "$GITHUB_STEP_SUMMARY"
      echo "Check the 'Validate SBOM File' step for details." >> "$GITHUB_STEP_SUMMARY"
    fi
    echo "" >> "$GITHUB_STEP_SUMMARY"
    echo "✅ Workflow completed successfully (scan skipped)" >> "$GITHUB_STEP_SUMMARY"
\`\`\`
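The four `jq` filters in the step above perform the same count with a different severity each time. As an illustration only, the same tally can be written as one pass over the report. `GrypeReport` below is a hypothetical type that models just the `matches[].vulnerability.severity` path those filters read, not the full Grype schema.

```typescript
// Minimal shape of the Grype JSON report, limited to what the jq filters read.
interface GrypeMatch {
  vulnerability: { severity: string };
}
interface GrypeReport {
  matches?: GrypeMatch[];
}

type Severity = "Critical" | "High" | "Medium" | "Low";
const SEVERITIES: Severity[] = ["Critical", "High", "Medium", "Low"];

// Hypothetical helper: counts matches per severity. Severities outside the
// four tracked levels are ignored, and a missing matches array yields zeros.
function countBySeverity(report: GrypeReport): Record<Severity, number> {
  const counts: Record<Severity, number> = { Critical: 0, High: 0, Medium: 0, Low: 0 };
  for (const match of report.matches ?? []) {
    const severity = match.vulnerability.severity as Severity;
    if (SEVERITIES.indexOf(severity) !== -1) {
      counts[severity] += 1;
    }
  }
  return counts;
}
```

Defaulting to zero when `matches` is absent mirrors the `|| echo "0"` fallback in the shell version.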
#### Change 6: Update PR Comment
**Location**: Lines 107-122 (replace entire "Comment on PR" step)
\`\`\`yaml
- name: Comment on PR
  if: github.event_name == 'pull_request'
  uses: actions/github-script@60a0d83039c74a4aee543508d2ffcb1c3799cdea # v7.0.1
  with:
    script: |
      const imageExists = '${{ steps.image-check.outputs.exists }}' === 'true';
      const sbomValid = '${{ steps.validate-sbom.outputs.valid }}';
      const critical = process.env.CRITICAL_VULNS || '0';
      const high = process.env.HIGH_VULNS || '0';
      const medium = process.env.MEDIUM_VULNS || '0';
      const low = process.env.LOW_VULNS || '0';
      let body = '## 🔒 Supply Chain Verification\n\n';
      if (!imageExists) {
        body += '⏭️ **Status**: Image not yet available\n\n';
        body += 'Verification will run automatically after the docker-build workflow completes.\n';
        body += 'This is normal for PR workflows.\n';
      } else if (sbomValid !== 'true') {
        body += '⚠️ **Status**: SBOM validation failed\n\n';
        body += `[Check workflow logs for details](${context.serverUrl}/${context.repo.owner}/${context.repo.repo}/actions/runs/${context.runId})\n`;
      } else {
        body += '✅ **Status**: SBOM verified and scanned\n\n';
        body += '### Vulnerability Summary\n\n';
        body += `| Severity | Count |\n`;
        body += `|----------|-------|\n`;
        body += `| Critical | ${critical} |\n`;
        body += `| High | ${high} |\n`;
        body += `| Medium | ${medium} |\n`;
        body += `| Low | ${low} |\n\n`;
        if (parseInt(critical) > 0) {
          body += `⚠️ **Action Required**: ${critical} critical vulnerabilities found\n\n`;
        }
        body += `[View full report](${context.serverUrl}/${context.repo.owner}/${context.repo.repo}/actions/runs/${context.runId})\n`;
      }
      await github.rest.issues.createComment({
        owner: context.repo.owner,
        repo: context.repo.repo,
        issue_number: context.issue.number,
        body: body
      });
\`\`\`
---
## Testing Strategy
### Pre-Deployment Testing
#### 1. Local SBOM Generation and Validation
\`\`\`bash
# Test SBOM generation with existing image
docker pull ghcr.io/wikid82/charon:latest
# Generate SBOM in CycloneDX format
syft ghcr.io/wikid82/charon:latest -o cyclonedx-json > test-sbom.json
# Validate JSON structure
jq empty test-sbom.json && echo "✅ Valid JSON" || echo "❌ Invalid JSON"
# Check CycloneDX fields
jq '.bomFormat, .specVersion, (.components | length)' test-sbom.json
# Test Grype scan
grype sbom:./test-sbom.json -o table
# Test with explicit path
grype sbom:./test-sbom.json -o json > vuln-test.json
# Check results
jq '.matches | length' vuln-test.json
\`\`\`
#### 2. Test Empty/Invalid SBOM Handling
\`\`\`bash
# Test with empty file
touch empty.json
grype sbom:./empty.json 2>&1 | grep -i "format"
# Test with invalid JSON
echo "{invalid json" > invalid.json
grype sbom:./invalid.json 2>&1 | grep -i "format"
# Test with missing fields
echo '{"bomFormat":"test"}' > incomplete.json
grype sbom:./incomplete.json 2>&1 | grep -i "format"
\`\`\`
#### 3. Test Image Availability Check
\`\`\`bash
# Test manifest check for existing image
docker manifest inspect ghcr.io/wikid82/charon:latest
# Test manifest check for non-existent image
docker manifest inspect ghcr.io/wikid82/charon:pr-99999 2>&1
\`\`\`
### Post-Deployment Validation
#### Test Scenarios
1. **Existing Image (Success Path)**
   - Use branch with recent merge to `main`
   - Trigger workflow manually
   - Expected: SBOM generated, validated, scanned successfully
2. **PR Without Image (Skip Path)**
   - Create test PR
   - Expected: Image check fails gracefully, scan skipped, clear message
3. **Image with Vulnerabilities**
   - Use older image tag (if available)
   - Expected: Vulnerabilities detected and reported
### Success Criteria
- [ ] No "sbom format not recognized" errors
- [ ] SBOM validation catches empty files
- [ ] SBOM validation catches invalid JSON
- [ ] SBOM validation catches missing CycloneDX fields
- [ ] Grype successfully scans valid SBOMs
- [ ] Clear skip messages when image doesn't exist
- [ ] PR comments show accurate status
- [ ] Workflow logs are clear and actionable
- [ ] No false positives or false negatives
---
## Rollback Plan
### If Issues Persist
1. **Immediate Rollback**
\`\`\`bash
git revert <commit-hash>
git push origin main
\`\`\`
2. **Temporary Disable**
- Add `if: false` to the vulnerability scan step
- Comment in PR explaining temporary measure
3. **Alternative: Pin Tool Versions**
If the issue is version-related:
\`\`\`yaml
# Pin Syft version
curl -sSfL https://raw.githubusercontent.com/anchore/syft/main/install.sh | sh -s -- -b /usr/local/bin v0.100.0
# Pin Grype version
curl -sSfL https://raw.githubusercontent.com/anchore/grype/main/install.sh | sh -s -- -b /usr/local/bin v0.74.0
\`\`\`
### Investigation Steps
1. Collect workflow logs from failed run
2. Download generated SBOM artifact (if saved)
3. Test locally with same tool versions
4. Check Grype/Syft GitHub issues for known bugs
5. Verify image registry permissions
---
## Dependencies and Prerequisites
### Tool Versions
- **Syft**: Latest from install script (currently v0.100+)
- **Grype**: Latest from install script (currently v0.74+)
- **Docker**: v20+ (available in GitHub runners)
- **jq**: v1.6+ (available in GitHub runners)
### GitHub Permissions Required
- `contents: read` - Repository code access
- `packages: read` - Container registry access
- `pull-requests: write` - Comment on PRs
- `security-events: write` - Upload scan results (for SARIF)
- `id-token: write` - OIDC token (for attestations)
- `attestations: write` - Create/verify attestations
### External Dependencies
- GitHub Container Registry (ghcr.io) must be accessible
- Anchore install scripts must be available
- Internet access required for tool installation
---
## Implementation Checklist
### Preparation
- [ ] Review current workflow file
- [ ] Document current behavior
- [ ] Create feature branch
### Implementation
- [ ] Add image existence check step
- [ ] Change SBOM format from SPDX to CycloneDX
- [ ] Add SBOM validation step
- [ ] Update vulnerability scan step with better error handling
- [ ] Add skip report step
- [ ] Update PR comment logic
- [ ] Update workflow documentation
### Testing
- [ ] Test locally with existing image
- [ ] Test with empty SBOM file
- [ ] Test with invalid JSON
- [ ] Create test PR
- [ ] Trigger workflow on test PR
- [ ] Verify skip behavior
- [ ] Merge to main (or test branch)
- [ ] Verify success path
### Documentation
- [ ] Update README if needed
- [ ] Document SBOM format choice
- [ ] Add troubleshooting guide
- [ ] Update CI/CD documentation
### Deployment
- [ ] Create PR with changes
- [ ] Code review
- [ ] Merge to main
- [ ] Monitor first runs
- [ ] Address any issues
---
## Timeline
| Phase | Tasks | Duration | Status |
|-------|-------|----------|--------|
| **Preparation** | Review, document, branch | 30 min | Pending |
| **Implementation** | Code changes | 1-2 hours | Pending |
| **Testing** | Local and CI testing | 1-2 hours | Pending |
| **Documentation** | Update docs | 30 min | Pending |
| **Review & Merge** | PR review, merge | 1 hour | Pending |
| **Monitoring** | Watch first runs | 1-2 hours | Pending |
**Total Estimated Time**: 5-8 hours (can be split over 1-2 days)
---
## Risk Assessment
| Risk | Probability | Impact | Mitigation |
|------|-------------|--------|------------|
| Format still not recognized | Low | High | Extensive local testing first |
| SBOM validation too strict | Medium | Medium | Start with lenient validation, tighten gradually |
| Performance degradation | Low | Low | Validation is lightweight (< 5 seconds) |
| Breaking existing workflows | Low | High | Thorough testing, monitor first runs |
| Tool version incompatibility | Low | Medium | Document versions, can pin if needed |
| Missed edge cases | Medium | Medium | Comprehensive test scenarios, monitor logs |
**Overall Risk Level**: **Medium-Low** - Well-understood problem with proven solution
---
## Success Metrics
### Technical Metrics
- Workflow success rate: 100% on valid images
- SBOM validation accuracy: 100%
- Grype scan completion rate: 100% on valid SBOMs
- False positive rate: < 1%
- False negative rate: 0%
### Operational Metrics
- Time to detect vulnerability: < 5 minutes after image build
- Mean time to remediate issues: Immediate (next workflow run)
- Manual intervention required: 0
- CI/CD pipeline reliability: > 99%
### Quality Metrics
- Zero "format not recognized" errors in 30 days
- Clear, actionable error messages
- Comprehensive workflow logs
- Developer satisfaction with error feedback
---
## Future Enhancements (Phase 2)
### Reuse Attested SBOM
Instead of regenerating SBOM, retrieve the one created by docker-build:
\`\`\`yaml
- name: Retrieve Attested SBOM
  if: steps.image-check.outputs.exists == 'true'
  env:
    IMAGE: ghcr.io/${{ github.repository_owner }}/charon:${{ steps.tag.outputs.tag }}
    GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
  run: |
    echo "Retrieving attested SBOM from registry..."
    # Download attestation using GitHub CLI.
    # Only stdout is captured so error text cannot corrupt the JSON file.
    gh attestation verify "oci://${IMAGE}" \
      --owner ${{ github.repository_owner }} \
      --format json > attestation.json || {
      echo "⚠️ No attestation found, falling back to generation"
      exit 0
    }
    # Extract SBOM from attestation; the '.predicate' path is a placeholder
    # and must match the JSON shape that gh actually emits
    jq -r '.predicate' attestation.json > sbom-attested.json
    # Validate and use
    if jq empty sbom-attested.json 2>/dev/null; then
      echo "✅ Retrieved attested SBOM"
      mv sbom-attested.json sbom-generated.json
    else
      echo "⚠️ Invalid attested SBOM, regenerating"
    fi
\`\`\`
**Benefits**:
- Single source of truth
- Eliminates duplication
- Uses verified, signed SBOM
- Aligns with supply chain best practices
**Requirements**:
- GitHub CLI with attestation support
- Attestation must be published to registry
- Additional testing for attestation retrieval
---
## Related Documentation
### Internal References
- [.github/workflows/supply-chain-verify.yml](.github/workflows/supply-chain-verify.yml)
- [.github/workflows/docker-build.yml](.github/workflows/docker-build.yml)
- Project README (Security section)
### External References
- [Anchore Grype Documentation](https://github.com/anchore/grype)
- [Anchore Syft Documentation](https://github.com/anchore/syft)
- [CycloneDX Specification](https://cyclonedx.org/specification/overview/)
- [SPDX Specification](https://spdx.dev/specifications/)
- [GitHub Artifact Attestations](https://docs.github.com/en/actions/security-guides/using-artifact-attestations-to-establish-provenance-for-builds)
- [Grype SBOM Scanning Guide](https://github.com/anchore/grype#scan-an-sbom)
- [Syft Output Formats](https://github.com/anchore/syft#output-formats)
---
## Approval and Sign-off
**Plan Created By**: GitHub Copilot AI Assistant
**Date**: 2026-01-10
**Review Status**: Ready for Review
**Required Reviewers**:
- [ ] DevOps Lead / CI/CD Owner
- [ ] Security Team Representative
- [ ] Repository Maintainer
**Approved By**: _Pending_
**Implementation Start Date**: _Pending Approval_
**Target Completion Date**: _Within 1-2 days of approval_
---
## Revision History
| Date | Version | Changes | Author |
|------|---------|---------|--------|
| 2026-01-10 | 1.0 | Initial remediation plan created | GitHub Copilot |
---
## Notes and Observations
### Key Insights
1. **Format Choice**: CycloneDX matches what docker-build.yml already generates, and the scanners relevant here (Grype, Trivy) read it directly. Grype reads SPDX as well, so the switch is about keeping both workflows on one format rather than closing a compatibility gap.
2. **Error Handling Philosophy**: The current workflow uses `exit 0` to avoid blocking CI. This is appropriate for non-critical failures but masks real issues. The new approach:
   - Fails fast on real errors (malformed SBOM, Grype failures)
   - Gracefully skips when expected (image doesn't exist yet)
   - Provides clear feedback in both cases
3. **Timing Consideration**: PR workflows run before images are built. This is by design (run tests before merge). The solution must handle this gracefully without false failures.
4. **Validation Strategy**: Start with basic validation (file exists, valid JSON, has required fields). Can tighten validation over time based on observed failures.
5. **Monitoring Recommendation**: After deployment, monitor workflow runs for 7 days to catch edge cases and adjust validation criteria if needed.
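
Insight 2 amounts to a three-way decision. The sketch below is one possible reading of it, with hypothetical names throughout. Mapping an unusable SBOM to `fail` follows the stated philosophy, whereas the validation step in Change 4 records `valid=false` and exits 0, so treat this as an illustration rather than a description of the workflow.

```typescript
type Outcome = "scan" | "skip" | "fail";

// Hypothetical inputs mirroring the step outputs in the plan:
//   imageExists <- steps.image-check.outputs.exists == 'true'
//   sbomValid   <- steps.validate-sbom.outputs.valid ("true", "partial", "false", or "")
function decideOutcome(imageExists: boolean, sbomValid: string): Outcome {
  // Expected on PRs before docker-build finishes: skip without failing.
  if (!imageExists) return "skip";
  // A fully validated SBOM is the only case that proceeds to Grype.
  if (sbomValid === "true") return "scan";
  // An SBOM with zero components is suspicious but not a hard error.
  if (sbomValid === "partial") return "skip";
  // The image exists but the SBOM is unusable: surface it as a real error.
  return "fail";
}
```

Keeping the decision in one place makes it easy to see that only a missing image or a zero-component SBOM is skipped quietly.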
### Known Limitations
1. **Attestation Retrieval**: Phase 2 enhancement requires GitHub CLI with attestation support, which may not be available in all runner environments.
2. **SBOM Completeness**: Current validation only checks for presence of components, not their completeness. Some vulnerabilities might be missed if SBOM is incomplete.
3. **Format Conversion**: If SPDX is required for compliance, can convert CycloneDX → SPDX using Syft after scan.
### Alternative Approaches Considered
1. **Keep SPDX Format**: Would work once validation is in place, since Grype reads SPDX, but it leaves the two workflows producing different formats.
2. **Disable Verification for PRs**: Would work but reduces security posture.
3. **Wait for Image Before Running**: Would work but increases CI time significantly.
4. **Run Verification in docker-build Workflow**: Considered but verification workflow serves as independent check.
**Selected Approach Rationale**: Hybrid approach provides immediate fix (format + validation) while maintaining workflow independence and security coverage.
---
**End of Remediation Plan**
This plan is scoped and ready for implementation. Changes are documented with clear success criteria; testing is tracked in the Implementation Checklist.


@@ -1,711 +0,0 @@
# Phase 6: User Management UI Implementation
> **Status**: Planning Complete
> **Created**: 2026-01-24
> **Estimated Effort**: L (Large) - Initially estimated 40-60 hours, **revised to 16-22 hours**
> **Priority**: P2 - Feature Completeness
> **Tests Targeted**: 19 skipped tests in `tests/settings/user-management.spec.ts`
> **Dependencies**: Phase 5 (TestDataManager Auth Fix) - Infrastructure complete, blocked by environment config
---
## Executive Summary
### Goals
Complete the User Management frontend to enable 19 currently-skipped Playwright E2E tests. This phase implements missing UI components including status badges with proper color classes, role badges, resend invite action, email validation, enhanced modal accessibility, and fixes a React anti-pattern bug.
### Key Finding
**Most UI components already exist.** After thorough analysis, the work is primarily:
1. Verifying existing functionality (toast test IDs already exist)
2. Implementing resend invite action (backend endpoint missing - needs implementation)
3. Adding email format validation with visible error
4. Fixing React anti-pattern in PermissionsModal
5. Verification and unskipping tests
**Revised Effort**: 16-22 hours (pending backend resend endpoint scope).
**Solution**: Add missing test selectors, implement resend invite, add email validation UI, fix React bugs, and systematically unskip tests as they pass.
### Test Count Reconciliation
The original plan stated 22 tests, but verification shows **19 skipped test declarations**. The discrepancy came from counting 4 conditional `test.skip()` calls inside test bodies (not actual test declarations). See Section 2 for the complete inventory.
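
The distinction between the two kinds of `test.skip` can be checked mechanically. The following is a rough heuristic sketch, not part of the test suite: `countSkips` is a hypothetical helper that treats a call whose first argument is a string literal as a skipped test declaration and every other `test.skip(` call as a conditional skip inside a test body.

```typescript
// Heuristic: test.skip('title', ...) declares a skipped test, while
// test.skip() or test.skip(condition, 'reason') skips from inside a body.
function countSkips(source: string): { declarations: number; conditional: number } {
  const all = source.match(/\btest\.skip\(/g) ?? [];
  const declared = source.match(/\btest\.skip\(\s*['"`]/g) ?? [];
  return {
    declarations: declared.length,
    conditional: all.length - declared.length,
  };
}
```

Being regex-based, it will miscount calls that appear in comments or span unusual formatting, so the result is a cross-check rather than an authoritative inventory.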
---
## 1. Current State Analysis
### What EXISTS (in `UsersPage.tsx`)
The Users page at `frontend/src/pages/UsersPage.tsx` already contains substantial functionality:
| Component | Status | Notes |
|-----------|--------|-------|
| User list table | ✅ Complete | Columns: User, Role, Status, Permissions, Enabled, Actions |
| InviteModal | ✅ Complete | Email, role, permission mode, host selection, URL preview |
| PermissionsModal | ✅ Complete | Edit user permissions, host toggle |
| Role badges | ✅ Complete | Purple for admin, blue for user, rounded styling |
| Status indicators | ✅ Complete | Active (green), Pending (yellow), Expired (red) with icons |
| Enable/Disable toggle | ✅ Complete | Switch component per user |
| Delete button | ✅ Complete | Trash2 icon with confirmation |
| Settings/Permissions button | ✅ Complete | For non-admin users |
| React Query mutations | ✅ Complete | All CRUD operations |
| Copy invite link | ✅ Complete | With clipboard API |
| URL preview for invites | ✅ Complete | Shows invite URL before sending |
### What is PARTIALLY IMPLEMENTED
| Item | Issue | Fix Required |
|------|-------|--------------|
| Status badges | Class names may not match test expectations | Add explicit color classes |
| Modal keyboard nav | Escape key handling may be missing | Add keyboard event handler |
| PermissionsModal state init | **React anti-pattern: useState used like useEffect** | Fix to use useEffect (see Section 3.6) |
### What is MISSING
| Item | Description | Effort |
|------|-------------|--------|
| Email validation UI | Client-side format validation with visible error | 2 hours |
| Resend invite action | Button + API for pending users | 6-10 hours (backend missing) |
| Backend resend endpoint | `POST /api/v1/users/{id}/resend-invite` | See Phase 6.4 |
---
## 2. Test Analysis
### Summary: 19 Skipped Tests
**File**: `tests/settings/user-management.spec.ts`
| # | Test Name | Line | Category | Skip Reason | Status |
|---|-----------|------|----------|-------------|--------|
| 1 | should show user status badges | 70 | User List | Status badges styling | ✅ Verify |
| 2 | should display role badges | 110 | User List | Role badges selectors | ✅ Verify |
| 3 | should show pending invite status | 164 | User List | Complex timing | ⚠️ Complex |
| 4 | should open invite user modal | 217 | Invite | Outdated skip comment | ✅ Verify |
| 5 | should validate email format | 283 | Invite | No client validation | 🔧 Implement |
| 6 | should copy invite link | 442 | Invite | Toast verification | ✅ Verify |
| 7 | should open permissions modal | 494 | Permissions | Settings icon | 🔒 Auth blocked |
| 8 | should update permission mode | 538 | Permissions | Base URL auth | 🔒 Auth blocked |
| 9 | should add permitted hosts | 612 | Permissions | Settings icon | 🔒 Auth blocked |
| 10 | should remove permitted hosts | 669 | Permissions | Settings icon | 🔒 Auth blocked |
| 11 | should save permission changes | 725 | Permissions | Settings icon | 🔒 Auth blocked |
| 12 | should enable/disable user | 781 | Actions | TestDataManager | 🔒 Auth blocked |
| 13 | should change user role | 828 | Actions | Not implemented | ❌ Future |
| 14 | should delete user with confirmation | 848 | Actions | Delete button | 🔒 Auth blocked |
| 15 | should resend invite for pending user | 956 | Actions | Not implemented | 🔧 Implement |
| 16 | should be keyboard navigable | 1014 | A11y | Known flaky | ⚠️ Flaky |
| 17 | should require admin role for access | 1091 | Security | Routing design | By design |
| 18 | should show error for regular user access | 1125 | Security | Routing design | By design |
| 19 | should have proper ARIA labels | 1157 | A11y | ARIA incomplete | ✅ Verify |
### Legend
- ✅ Verify: Likely already works, just needs verification
- 🔧 Fix/Implement: Requires small code change
- 🔒 Auth blocked: Blocked by Phase 5 (TestDataManager)
- ⚠️ Complex/Flaky: Timing or complexity issues
- By design: Intentional skip (routing behavior)
- ❌ Future: Feature not prioritized
### Tests Addressable in Phase 6
**Without Auth Fix** (can implement now): 6 tests
- Test 1: Status badges styling (verify only)
- Test 2: Role badges (verify only)
- Test 4: Open invite modal (verify only - button IS implemented)
- Test 5: Email validation
- Test 6: Copy invite link (verify only - toast test IDs already exist)
- Test 19: ARIA labels (verify only)
**With Resend Invite**: 1 test
- Test 15: Resend invite
**After Phase 5 Auth Fix**: 7 tests
- Tests 7-12, 14: Permission/Action tests
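
The group sizes can be cross-checked against the inventory table. The sketch below transcribes the Status column into a lookup and tallies it; the `Status` labels and helper name are hypothetical shorthand for the legend icons.

```typescript
type Status =
  | "verify" | "implement" | "auth-blocked"
  | "complex" | "flaky" | "by-design" | "future";

// Status per test number, transcribed from the table in this section.
const statusByTest: Record<number, Status> = {
  1: "verify", 2: "verify", 3: "complex", 4: "verify", 5: "implement",
  6: "verify", 7: "auth-blocked", 8: "auth-blocked", 9: "auth-blocked",
  10: "auth-blocked", 11: "auth-blocked", 12: "auth-blocked", 13: "future",
  14: "auth-blocked", 15: "implement", 16: "flaky", 17: "by-design",
  18: "by-design", 19: "verify",
};

// Returns the test numbers carrying a given status, in ascending order.
function testsWithStatus(status: Status): number[] {
  return Object.keys(statusByTest)
    .map(Number)
    .filter((n) => statusByTest[n] === status)
    .sort((a, b) => a - b);
}

const authBlocked = testsWithStatus("auth-blocked"); // tests 7-12 and 14
const verifyOnly = testsWithStatus("verify");        // tests 1, 2, 4, 6, 19
const totalTests = Object.keys(statusByTest).length;
```

Tests 7-12 plus test 14 come to seven auth-blocked tests, and the five verify-only tests plus test 5 make up the six that can be addressed without the auth fix.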
### Detailed Test Requirements
#### Test 1: should show user status badges (Line 70)
**Test code**:
```typescript
const statusCell = page.locator('td').filter({
  has: page.locator('span').filter({
    hasText: /active|pending.*invite|invite.*expired/i,
  }),
});
const activeStatus = page.locator('span').filter({ hasText: /^active$/i });
// Expects class to include 'green', 'text-green-400', or 'success'
const hasGreenColor = await activeStatus.first().evaluate((el) => {
  return el.className.includes('green') ||
    el.className.includes('text-green-400') ||
    el.className.includes('success');
});
```
**Current code** (UsersPage.tsx line ~459):
```tsx
<span className="inline-flex items-center gap-1 text-green-400 text-xs">
<Check className="h-3 w-3" />
{t('common.active')}
</span>
```
**Analysis**: Current code already includes `text-green-400` class.
**Action**: ✅ **Verify only** - unskip and run test.
#### Test 2: should display role badges (Line 110)
**Test code**:
```typescript
const adminBadge = page.locator('span').filter({ hasText: /^admin$/i });
// Expects 'purple', 'blue', or 'rounded' in class
const hasDistinctColor = await adminBadge.evaluate((el) => {
  return el.className.includes('purple') ||
    el.className.includes('blue') ||
    el.className.includes('rounded');
});
```
**Current code** (UsersPage.tsx line ~445):
```tsx
<span className={`inline-flex items-center px-2 py-1 rounded-full text-xs font-medium ${
user.role === 'admin' ? 'bg-purple-900/30 text-purple-400' : 'bg-blue-900/30 text-blue-400'
}`}>
{user.role}
</span>
```
**Analysis**: ✅ Classes include `rounded` and `purple`/`blue`.
**Action**: ✅ **Verify only** - unskip and run test.
#### Test 4: should open invite user modal (Line 217)
**Test code**:
```typescript
const inviteButton = page.getByRole('button', { name: /invite.*user/i });
await expect(inviteButton).toBeVisible();
await inviteButton.click();
// Verify modal is visible
const modal = page.getByRole('dialog');
await expect(modal).toBeVisible();
```
**Current state**: ✅ Invite button IS implemented in UsersPage.tsx. The skip comment is outdated.
**Action**: ✅ **Verify only** - unskip and run test.
#### Test 5: should validate email format (Line 283)
**Test code**:
```typescript
const sendButton = page.getByRole('button', { name: /send.*invite/i });
const isDisabled = await sendButton.isDisabled();
// OR error message shown
const errorMessage = page.getByText(/invalid.*email|email.*invalid|valid.*email/i);
```
**Current code**: Button disabled when `!email`, but no format validation visible.
**Action**: 🔧 **Implement** - Add email regex validation with error display.
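
The test accepts either a disabled send button or a visible error message, so the check can live in a small pure function that drives both. This is a minimal sketch under stated assumptions: the helper names are hypothetical, the pattern is the simple one proposed in Phase 6.2 (it does not cover every valid address), and `users.invalidEmail` is the translation key that phase uses.

```typescript
// Simple format check: local part, "@", domain, and a TLD of 2+ letters.
const EMAIL_PATTERN = /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/;

// Returns an i18n key for the error, or null when the field is empty
// (nothing to report yet) or the address passes the format check.
function emailErrorKey(email: string): string | null {
  if (email === "") return null;
  return EMAIL_PATTERN.test(email) ? null : "users.invalidEmail";
}

// The send button stays disabled for empty or malformed input.
function canSendInvite(email: string): boolean {
  return email !== "" && emailErrorKey(email) === null;
}
```

Keeping the check free of React state makes it straightforward to unit test separately from the modal.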
#### Test 6: should copy invite link (Line 442)
**Test code**:
```typescript
const copiedToast = page.locator('[data-testid="toast-success"]').filter({
  hasText: /copied|clipboard/i,
});
```
**Current state**: ✅ Toast component already has ``data-testid={`toast-${toast.type}`}`` at `Toast.tsx:31`.
**Action**: ✅ **Verify only** - unskip and run test. No code changes needed.
#### Test 15: should resend invite for pending user (Line 956)
**Test code**:
```typescript
const resendButton = page.getByRole('button', { name: /resend/i });
await resendButton.first().click();
await waitForToast(page, /sent|resend/i, { type: 'success' });
```
**Current state**: ❌ Resend action not implemented.
**Action**: 🔧 **Implement** - Add resend button for pending users + API call.
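
On the client side, the resend action reduces to one POST against the planned endpoint. The sketch below assumes the route `POST /api/v1/users/{id}/resend-invite` listed under missing items, a numeric user ID, and an injected `post` function in place of the project's real API client; all names are hypothetical, and the backend route does not exist yet.

```typescript
// Minimal fetch-like signature so the sketch does not depend on a specific HTTP client.
type PostFn = (
  url: string,
  init: { method: "POST" }
) => Promise<{ ok: boolean; status: number }>;

// Builds the endpoint path for a given user.
function resendInviteUrl(userId: number): string {
  return `/api/v1/users/${userId}/resend-invite`;
}

// Resolves on success and rejects on a non-2xx response, so the caller can
// show a success toast in one branch and an error toast in the other.
function resendInvite(userId: number, post: PostFn): Promise<void> {
  return post(resendInviteUrl(userId), { method: "POST" }).then((response) => {
    if (!response.ok) {
      throw new Error(`Resend invite failed with status ${response.status}`);
    }
  });
}
```

A React Query mutation could wrap `resendInvite` and show the success toast that the test waits for.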
#### Test 19: should have proper ARIA labels (Line 1157)
**Test code**:
```typescript
const inviteButton = page.getByRole('button', { name: /invite.*user/i });
// Checks for accessible name on action buttons
const ariaLabel = await button.getAttribute('aria-label');
const title = await button.getAttribute('title');
const text = await button.textContent();
```
**Current state**:
- Invite button: text content "Invite User" ✅
- Delete button: `aria-label={t('users.deleteUser')}`
- Settings button: `aria-label={t('users.editPermissions')}`
**Action**: ✅ **Verify only** - unskip and run test.
---
## 3. Implementation Phases
### Phase 6.1: Verify Existing Functionality (3 hours)
**Goal**: Confirm tests 1, 2, 4, 6, 19 pass without code changes.
**Tests in Batch**:
- Test 1: should show user status badges
- Test 2: should display role badges
- Test 4: should open invite user modal
- Test 6: should copy invite link (toast test IDs already exist)
- Test 19: should have proper ARIA labels
**Tasks**:
1. Temporarily remove `test.skip` from tests 1, 2, 4, 6, 19
2. Run tests individually
3. Document results
4. Permanently unskip passing tests
**Commands**:
```bash
# Test status badges
npx playwright test tests/settings/user-management.spec.ts \
--grep "should show user status badges" --project=chromium
# Test role badges
npx playwright test tests/settings/user-management.spec.ts \
--grep "should display role badges" --project=chromium
# Test invite modal opens
npx playwright test tests/settings/user-management.spec.ts \
--grep "should open invite user modal" --project=chromium
# Test copy invite link (toast test IDs already exist)
npx playwright test tests/settings/user-management.spec.ts \
--grep "should copy invite link" --project=chromium
# Test ARIA labels
npx playwright test tests/settings/user-management.spec.ts \
--grep "should have proper ARIA labels" --project=chromium
```
**Expected outcome**: 4-5 tests pass immediately.
---
### Phase 6.2: Email Validation UI (2 hours)
**Goal**: Add client-side email format validation with visible error.
**File to modify**: `frontend/src/pages/UsersPage.tsx` (InviteModal)
**Implementation**:
```tsx
// Add state in InviteModal component
const [emailError, setEmailError] = useState<string | null>(null)

// Email validation function
const validateEmail = (email: string): boolean => {
  if (!email) {
    setEmailError(null)
    return false
  }
  const emailRegex = /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/
  if (!emailRegex.test(email)) {
    setEmailError(t('users.invalidEmail'))
    return false
  }
  setEmailError(null)
  return true
}

// Update email input (replace existing Input)
<div>
  <Input
    label={t('users.emailAddress')}
    type="email"
    value={email}
    onChange={(e) => {
      setEmail(e.target.value)
      validateEmail(e.target.value)
    }}
    placeholder="user@example.com"
    aria-invalid={!!emailError}
    aria-describedby={emailError ? 'email-error' : undefined}
  />
  {emailError && (
    <p
      id="email-error"
      className="mt-1 text-sm text-red-400"
      role="alert"
    >
      {emailError}
    </p>
  )}
</div>

// Update button disabled logic
disabled={!email || !!emailError}
```
**Translation key to add** (to appropriate i18n file):
```json
{
  "users.invalidEmail": "Please enter a valid email address"
}
```
**Validation**:
```bash
npx playwright test tests/settings/user-management.spec.ts \
--grep "should validate email format" --project=chromium
```
**Expected outcome**: Test 5 passes.
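The format check itself can be kept separate from React state so it is easy to unit test. The following is a minimal sketch, assuming the same regex as the snippet above; `checkEmail` and its result shape are hypothetical names, not existing code in `UsersPage.tsx`.

```typescript
// Hypothetical helper (not in the existing codebase): pure email format check.
// Uses the same pattern as the InviteModal snippet.
const EMAIL_PATTERN = /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/

interface EmailCheck {
  valid: boolean
  // 'empty' means "nothing typed yet": the button stays disabled but no error is shown.
  reason: 'empty' | 'format' | null
}

function checkEmail(email: string): EmailCheck {
  if (email === '') return { valid: false, reason: 'empty' }
  if (!EMAIL_PATTERN.test(email)) return { valid: false, reason: 'format' }
  return { valid: true, reason: null }
}
```

With a helper like this, the component would only need to map `reason === 'format'` to the `users.invalidEmail` message and leave the error hidden for `'empty'`.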
---
### Phase 6.3: Resend Invite Action (6-10 hours)
**Goal**: Add resend invite button for pending users.
#### Backend Verification
**REQUIRED**: Check if backend endpoint exists before proceeding:
```bash
grep -r "resend" backend/internal/api/handlers/
grep -r "ResendInvite" backend/internal/api/
```
**Result of verification**: Backend endpoint **does not exist**. Both grep commands return no results.
**Contingency**: If backend is missing (confirmed), effort increases to **8-10 hours** to implement:
- Endpoint: `POST /api/v1/users/{id}/resend-invite`
- Handler: Regenerate token, send email, return new token info
- Tests: Unit tests for the new handler
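The handler steps listed above (regenerate token, send email, return new token info) can be sketched in a language-neutral way. The real backend is Go, so the TypeScript below is only an illustration of the control flow; every name, the status codes, and the in-memory store are assumptions, with `email_sent` taken from the field the frontend mutation reads.

```typescript
// Illustrative only: all names and status codes here are hypothetical.
interface PendingUser {
  id: number
  email: string
  inviteStatus: 'pending' | 'accepted'
  inviteToken: string | null
}

interface ResendResult {
  status: 200 | 404 | 409
  body: { token?: string; email_sent?: boolean; error?: string }
}

function resendInviteFlow(
  users: Map<number, PendingUser>,
  id: number,
  generateToken: () => string,
  sendEmail: (to: string, token: string) => boolean,
): ResendResult {
  const user = users.get(id)
  if (!user) return { status: 404, body: { error: 'user not found' } }
  if (user.inviteStatus !== 'pending') {
    return { status: 409, body: { error: 'invite already accepted' } }
  }
  // Regenerate the token so a previously issued link stops working.
  const token = generateToken()
  user.inviteToken = token
  // A mail failure is reported to the caller rather than treated as a hard error,
  // which lets the UI show the "invite created, email not sent" message.
  const emailSent = sendEmail(user.email, token)
  return { status: 200, body: { token, email_sent: emailSent } }
}
```

Injecting `generateToken` and `sendEmail` keeps the flow testable without a mail server, which is the same property the Go unit tests would likely want.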
#### Frontend Implementation
**File**: `frontend/src/api/users.ts`
Add API function:
```typescript
/**
 * Resends an invitation email to a pending user.
 * @param id - The user ID to resend invite to
 * @returns Promise resolving to InviteUserResponse with new token
 */
export const resendInvite = async (id: number): Promise<InviteUserResponse> => {
  const response = await client.post<InviteUserResponse>(`/users/${id}/resend-invite`)
  return response.data
}
```
**File**: `frontend/src/pages/UsersPage.tsx`
Add mutation:
```tsx
const resendInviteMutation = useMutation({
  mutationFn: resendInvite,
  onSuccess: (data) => {
    queryClient.invalidateQueries({ queryKey: ['users'] })
    if (data.email_sent) {
      toast.success(t('users.inviteResent'))
    } else {
      toast.success(t('users.inviteCreatedNoEmail'))
    }
  },
  onError: (error: unknown) => {
    const err = error as { response?: { data?: { error?: string } } }
    toast.error(err.response?.data?.error || t('users.resendFailed'))
  },
})
```
Add button in user row actions:
```tsx
{user.invite_status === 'pending' && (
  <button
    onClick={() => resendInviteMutation.mutate(user.id)}
    className="p-1.5 text-gray-400 hover:text-blue-400 hover:bg-gray-800 rounded"
    title={t('users.resendInvite')}
    aria-label={t('users.resendInvite')}
    disabled={resendInviteMutation.isPending}
  >
    <Mail className="h-4 w-4" />
  </button>
)}
```
**Translation keys**:
```json
{
  "users.resendInvite": "Resend Invite",
  "users.inviteResent": "Invitation resent successfully",
  "users.inviteCreatedNoEmail": "New invite created. Email could not be sent.",
  "users.resendFailed": "Failed to resend invitation"
}
```
**Validation**:
```bash
npx playwright test tests/settings/user-management.spec.ts \
--grep "should resend invite" --project=chromium
```
**Expected outcome**: Test 15 passes.
---
### Phase 6.4: Modal Keyboard Navigation (2 hours)
**Goal**: Ensure Escape key closes modals.
**File to modify**: `frontend/src/pages/UsersPage.tsx`
**Implementation** (add to InviteModal and PermissionsModal):
```tsx
// Add useEffect for keyboard handler
useEffect(() => {
  // Only listen while the modal is open; return explicitly on every path
  if (!isOpen) return undefined

  const handleKeyDown = (e: KeyboardEvent) => {
    if (e.key === 'Escape') {
      handleClose()
    }
  }
  document.addEventListener('keydown', handleKeyDown)
  return () => document.removeEventListener('keydown', handleKeyDown)
}, [isOpen, handleClose])
```
**Note**: Test 16 (keyboard navigation) is marked as **known flaky** and may remain skipped.
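Since both modals need the same behavior, the subscribe/unsubscribe logic could live in one small helper. This is a sketch under assumptions: `bindEscapeToClose` is a hypothetical name, and the structural `KeyTarget` type is declared locally so the example does not depend on DOM typings.

```typescript
// Minimal structural types so the sketch compiles without DOM typings.
interface KeyLikeEvent {
  key: string
}
interface KeyTarget {
  addEventListener(type: 'keydown', listener: (e: KeyLikeEvent) => void): void
  removeEventListener(type: 'keydown', listener: (e: KeyLikeEvent) => void): void
}

// Hypothetical helper: calls onClose when Escape is pressed and
// returns a function that removes the listener again.
function bindEscapeToClose(target: KeyTarget, onClose: () => void): () => void {
  const handler = (e: KeyLikeEvent) => {
    if (e.key === 'Escape') onClose()
  }
  target.addEventListener('keydown', handler)
  return () => target.removeEventListener('keydown', handler)
}
```

Inside a modal, the effect body would then reduce to subscribing while `isOpen` is true and returning the unsubscribe function as the effect cleanup.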
---
### Phase 6.5: PermissionsModal useState Bug Fix (1 hour)
**Goal**: Fix React anti-pattern in PermissionsModal.
**File to modify**: `frontend/src/pages/UsersPage.tsx` (line 339)
**Bug**: `useState` is being used like a `useEffect`, which is a React anti-pattern:
```tsx
// WRONG - useState used like an effect (current code at line 339)
useState(() => {
  if (user) {
    setPermissionMode(user.permission_mode || 'allow_all')
    setSelectedHosts(user.permitted_hosts || [])
  }
})
```
**Fix**: Replace with proper `useEffect` with dependency:
```tsx
// CORRECT - useEffect with dependency
useEffect(() => {
  if (user) {
    setPermissionMode(user.permission_mode || 'allow_all')
    setSelectedHosts(user.permitted_hosts || [])
  }
}, [user])
```
**Why this matters**: The `useState` initializer runs only once, on mount, so the current code appears to work only incidentally. As written, it:
1. Will not update state when `user` prop changes
2. May cause stale data bugs
3. Violates React's data flow principles
**Validation**:
```bash
# Run TypeScript check
cd frontend && npm run typecheck
# Run related permission tests (after Phase 5 auth fix)
npx playwright test tests/settings/user-management.spec.ts \
--grep "permissions" --project=chromium
```
---
## 4. Implementation Order
```
Week 1 (12-16 hours)
├── Phase 6.1: Verify Existing (3h) → Tests 1, 2, 4, 6, 19
├── Phase 6.2: Email Validation (2h) → Test 5
├── Phase 6.3: Resend Invite (6-10h) → Test 15
│   └── Includes backend endpoint implementation
└── Phase 6.5: PermissionsModal Bug Fix (1h) → Stability
Week 2 (2-3 hours)
└── Phase 6.4: Modal Keyboard Nav (2h) → Partial for Test 16
Validation & Cleanup (3 hours)
└── Run full suite, update skip comments
```
---
## 5. Files to Modify
### Priority 1: Required for Test Enablement
| File | Changes |
|------|---------|
| `frontend/src/pages/UsersPage.tsx` | Email validation, resend invite button, keyboard nav, PermissionsModal useState→useEffect fix |
| `frontend/src/api/users.ts` | Add `resendInvite` function |
| `tests/settings/user-management.spec.ts` | Unskip verified tests |
### Priority 2: Backend (REQUIRED - endpoint missing)
| File | Changes |
|------|---------|
| `backend/internal/api/handlers/user_handler.go` | Add resend-invite endpoint |
| `backend/internal/api/routes.go` | Register new route |
| `backend/internal/api/handlers/user_handler_test.go` | Add tests for resend endpoint |
### Priority 3: Translations
| File | Keys to Add |
|------|-------------|
| `frontend/src/i18n/locales/en.json` | `invalidEmail`, `resendInvite`, `inviteResent`, etc. |
### NOT Required (Already Implemented)
| File | Status |
|------|--------|
| `frontend/src/components/Toast.tsx` | ✅ Already has ``data-testid={`toast-${toast.type}`}`` |
---
## 6. Validation Strategy
### After Each Phase
```bash
# Run specific tests
npx playwright test tests/settings/user-management.spec.ts \
--grep "<test name>" --project=chromium
```
### Final Validation
```bash
# Run all user management tests
npx playwright test tests/settings/user-management.spec.ts --project=chromium
# Expected:
# - ~12-14 tests passing (up from ~5)
# - ~6-8 tests still skipped (auth blocked or by design)
```
### Test Coverage
```bash
cd frontend && npm run test:coverage
# Verify UsersPage.tsx >= 85%
```
---
## 7. Expected Outcomes
### Tests to Unskip After Phase 6
| Test | Expected Outcome |
|------|------------------|
| should show user status badges | ✅ Pass |
| should display role badges | ✅ Pass |
| should open invite user modal | ✅ Pass |
| should validate email format | ✅ Pass |
| should copy invite link | ✅ Pass |
| should resend invite for pending user | ✅ Pass |
| should have proper ARIA labels | ✅ Pass |
**Total**: 7 tests enabled
### Tests Remaining Skipped
| Test | Reason |
|------|--------|
| Tests 7-12, 14 (7 tests) | 🔒 TestDataManager auth (Phase 5) |
| Test 3: pending invite status | ⚠️ Complex timing |
| Test 13: change user role | ❌ Feature not implemented |
| Test 16: keyboard navigation | ⚠️ Known flaky |
| Tests 17-18: admin access | Routing design (intentional) |
**Total remaining skipped**: 12 tests (down from 19)
---
## 8. Risk Assessment
| Risk | Likelihood | Impact | Mitigation |
|------|------------|--------|------------|
| Backend resend endpoint missing | **CONFIRMED** | High | Backend implementation included in Phase 6.3 (6-10h) |
| Tests pass locally, fail in CI | Medium | Medium | Run in both environments |
| Translation keys missing | Low | Low | Add to all locale files |
| PermissionsModal bug causes regressions | Low | Medium | Fix early in Phase 6.5 with testing |
---
## 9. Success Metrics
| Metric | Before Phase 6 | After Phase 6 | Target |
|--------|----------------|---------------|--------|
| User management tests passing | ~5 | ~12-14 | 15+ |
| User management tests skipped | 19 | 10-12 | <10 |
| Frontend coverage (UsersPage) | TBD | ≥85% | 85% |
---
## 10. Timeline
| Phase | Effort | Cumulative |
|-------|--------|------------|
| 6.1: Verify Existing (5 tests) | 3h | 3h |
| 6.2: Email Validation | 2h | 5h |
| 6.3: Resend Invite (backend included) | 6-10h | 11-15h |
| 6.4: Modal Keyboard Nav | 2h | 13-17h |
| 6.5: PermissionsModal Bug Fix | 1h | 14-18h |
| Validation & Cleanup | 3h | 17-21h |
| Buffer | 3h | **16-22h** |
**Total**: 16-22 hours (range depends on backend complexity)
---
## Change Log
| Date | Author | Change |
|------|--------|--------|
| 2026-01-24 | Planning Agent | Initial plan created with detailed test analysis |
| 2026-01-24 | Planning Agent | **REVISION**: Applied Supervisor corrections: |
| | | - Toast test IDs already exist (Phase 6.2 removed) |
| | | - Updated test line numbers to actual values (70, 110, 217, 283, 442, 956, 1014, 1157) |
| | | - Added Test 4 and Test 6 to Phase 6.1 verification batch (5 tests total) |
| | | - Added Phase 6.5: PermissionsModal useState bug fix |
| | | - Backend resend endpoint confirmed missing (grep verification) |
| | | - Corrected test count: 19 skipped tests (not 22) |
| | | - Updated effort estimates: 16-22h (was 17h) |

@@ -1,282 +0,0 @@
# Workflow Orchestration Fix: Supply Chain Verification Dependency
**Status**: ✅ Complete
**Date Completed**: 2026-01-11
**Issue**: Workflow Orchestration Fix for Supply Chain Verification
---
## Implementation Summary
Successfully implemented workflow orchestration dependency to ensure supply chain verification runs **after** Docker image build completes. See full documentation in [docs/implementation/WORKFLOW_ORCHESTRATION_FIX.md](../../implementation/WORKFLOW_ORCHESTRATION_FIX.md).
---
## Original Specification
### Problem Statement
The `supply-chain-verify.yml` workflow runs **concurrently** with `docker-build.yml` on PR triggers, causing it to skip verification because the Docker image doesn't exist yet:
```
PR Opened
├─> docker-build.yml starts (builds image)
└─> supply-chain-verify.yml starts (image not found → skips)
```
**Root Cause**: Both workflows trigger independently on the same events with no orchestration dependency ensuring verification runs **after** the build completes.
**Evidence**: In the GitHub Actions run, supply-chain-verify correctly detects that the image does not exist and logs: "⚠️ Image not found - likely not built yet"
### Proposed Solution
**Architecture Decision**: Keep workflows separate with dependency orchestration via `workflow_run` trigger.
**Rationale**:
- **Modularity**: Each workflow has a distinct, cohesive purpose
- **Reusability**: Verification can run on-demand or scheduled independently
- **Maintainability**: Easier to test, debug, and understand individual workflows
- **Flexibility**: Can trigger verification separately without rebuilding images
### Implementation Plan
#### Phase 1: Add `workflow_run` Trigger
Modify `supply-chain-verify.yml` triggers:
**Current**:
```yaml
on:
  release:
    types: [published]
  pull_request:
    paths: [...]
  schedule:
    - cron: '0 0 * * 1'
  workflow_dispatch:
```
**Proposed**:
```yaml
on:
  release:
    types: [published]
  workflow_run:
    workflows: ["Docker Build, Publish & Test"]
    types: [completed]
    branches:
      - main
      - development
      - feature/beta-release
  schedule:
    - cron: '0 0 * * 1'
  workflow_dispatch:
```
**Key Changes**:
1. Remove `pull_request` trigger (prevents premature execution)
2. Add `workflow_run` trigger that waits for docker-build workflow
3. Specify branches to match docker-build's branch targets
4. Preserve `workflow_dispatch` for manual verification
5. Preserve `schedule` for weekly security scans
#### Phase 2: Filter by Build Success
Add job-level conditional to ensure we only verify successfully built images:
```yaml
jobs:
  verify-sbom:
    name: Verify SBOM
    runs-on: ubuntu-latest
    if: |
      (github.event_name != 'schedule' || github.ref == 'refs/heads/main') &&
      (github.event_name != 'workflow_run' || github.event.workflow_run.conclusion == 'success')
    steps:
      # ... existing steps
```
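The job condition combines two independent guards, which is easier to check as a truth table. The sketch below restates the `if:` expression as a pure function; `shouldVerify` and `TriggerContext` are hypothetical names used only for illustration.

```typescript
// Hypothetical model of the job-level `if:` expression.
interface TriggerContext {
  eventName: string // e.g. 'schedule', 'workflow_run', 'release', 'workflow_dispatch'
  ref: string // e.g. 'refs/heads/main'
  workflowRunConclusion: string | null // only meaningful for 'workflow_run'
}

function shouldVerify(ctx: TriggerContext): boolean {
  // Guard 1: scheduled runs are only allowed on main.
  const scheduleOk = ctx.eventName !== 'schedule' || ctx.ref === 'refs/heads/main'
  // Guard 2: workflow_run triggers are only allowed after a successful build.
  const buildOk =
    ctx.eventName !== 'workflow_run' || ctx.workflowRunConclusion === 'success'
  return scheduleOk && buildOk
}
```

Events that are neither `schedule` nor `workflow_run` (for example `release` or `workflow_dispatch`) pass both guards unconditionally.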
#### Phase 3: Update Tag Determination Logic
Modify the "Determine Image Tag" step to handle `workflow_run` context:
```yaml
- name: Determine Image Tag
  id: tag
  env:
    # Pass event data via env vars instead of interpolating it into the script
    EVENT_NAME: ${{ github.event_name }}
    RELEASE_TAG: ${{ github.event.release.tag_name }}
    RUN_EVENT: ${{ github.event.workflow_run.event }}
    HEAD_BRANCH: ${{ github.event.workflow_run.head_branch }}
    HEAD_SHA: ${{ github.event.workflow_run.head_sha }}
    PR_NUMBER: ${{ github.event.workflow_run.pull_requests[0].number }}
  run: |
    if [[ "$EVENT_NAME" == "release" ]]; then
      TAG="$RELEASE_TAG"
    elif [[ "$EVENT_NAME" == "workflow_run" ]]; then
      # Extract tag from the workflow that triggered us.
      # Check the PR case first: a PR's head branch can itself be e.g. "development".
      if [[ "$RUN_EVENT" == "pull_request" ]]; then
        TAG="pr-${PR_NUMBER}"
      elif [[ "$HEAD_BRANCH" == "main" ]]; then
        TAG="latest"
      elif [[ "$HEAD_BRANCH" == "development" ]]; then
        TAG="dev"
      elif [[ "$HEAD_BRANCH" == "feature/beta-release" ]]; then
        TAG="beta"
      else
        TAG="sha-${HEAD_SHA:0:7}"
      fi
    else
      TAG="latest"
    fi
    echo "tag=${TAG}" >> "$GITHUB_OUTPUT"
```
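The tag mapping is simple enough to express as a pure function, which makes the precedence between cases explicit and testable. In this sketch, `determineImageTag` and `TagContext` are hypothetical names; it checks the pull-request case before the branch names, which is an assumption about the intended precedence (a PR's head branch can itself be `development`), and it falls back to `latest` when a release carries no tag name.

```typescript
// Hypothetical model of the "Determine Image Tag" step.
interface TagContext {
  eventName: string
  releaseTag?: string
  workflowRun?: {
    event: string // event that started the build, e.g. 'push' or 'pull_request'
    headBranch: string
    headSha: string
    pullRequests: Array<{ number: number }>
  }
}

// Branch-to-tag mapping taken from the step above.
const BRANCH_TAGS = new Map<string, string>([
  ['main', 'latest'],
  ['development', 'dev'],
  ['feature/beta-release', 'beta'],
])

function determineImageTag(ctx: TagContext): string {
  if (ctx.eventName === 'release' && ctx.releaseTag) return ctx.releaseTag
  if (ctx.eventName === 'workflow_run' && ctx.workflowRun) {
    const run = ctx.workflowRun
    const pr = run.pullRequests[0]
    // PR builds take precedence over branch names.
    if (run.event === 'pull_request' && pr) return `pr-${pr.number}`
    const branchTag = BRANCH_TAGS.get(run.headBranch)
    if (branchTag !== undefined) return branchTag
    // Unknown branch: fall back to the short commit SHA.
    return `sha-${run.headSha.slice(0, 7)}`
  }
  // Manual and scheduled runs verify the latest image.
  return 'latest'
}
```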
#### Phase 4: Update PR Comment Logic
Update the "Comment on PR" step to work with `workflow_run` context:
```yaml
- name: Comment on PR
  if: |
    github.event_name == 'pull_request' ||
    (github.event_name == 'workflow_run' && github.event.workflow_run.event == 'pull_request')
  uses: actions/github-script@v7
  with:
    script: |
      // Determine PR number from context
      let prNumber;
      if (context.eventName === 'pull_request') {
        prNumber = context.issue.number;
      } else if (context.eventName === 'workflow_run') {
        const pullRequests = context.payload.workflow_run.pull_requests;
        if (pullRequests && pullRequests.length > 0) {
          prNumber = pullRequests[0].number;
        }
      }

      if (!prNumber) {
        console.log('No PR number found, skipping comment');
        return;
      }

      // ... rest of existing comment logic
```
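The PR-number lookup can also be isolated from the `github-script` context for testing. The following sketch uses hypothetical names (`resolvePrNumber`, `CommentContext`) and a flattened input shape rather than the real `context` object; it returns `undefined` when no PR can be identified, since the `pull_requests` array may be empty.

```typescript
// Hypothetical, flattened view of the fields the comment step reads.
interface CommentContext {
  eventName: string
  issueNumber?: number // context.issue.number on pull_request events
  workflowRunPullRequests?: Array<{ number: number }> // workflow_run.pull_requests
}

function resolvePrNumber(ctx: CommentContext): number | undefined {
  if (ctx.eventName === 'pull_request') return ctx.issueNumber
  if (ctx.eventName === 'workflow_run') {
    const first = (ctx.workflowRunPullRequests ?? [])[0]
    return first ? first.number : undefined
  }
  // Any other event (release, schedule, manual) has no PR to comment on.
  return undefined
}
```

A caller would skip commenting whenever the result is `undefined`, matching the early return in the step above.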
### Workflow Execution Flow (After Fix)
**PR Workflow**:
```
PR Opened/Updated
  └─> docker-build.yml runs
        ├─> Builds image: ghcr.io/wikid82/charon:pr-XXX
        ├─> Pushes to registry
        ├─> Runs tests
        └─> Completes successfully
              └─> Triggers supply-chain-verify.yml
                    ├─> Image now exists
                    ├─> Generates SBOM
                    ├─> Scans with Grype
                    └─> Posts results to PR
```
**Push to Main**:
```
Push to main
  └─> docker-build.yml runs
        └─> Completes successfully
              └─> Triggers supply-chain-verify.yml
                    └─> Verifies SBOM and signatures
```
### Implementation Checklist
**Changes to `.github/workflows/supply-chain-verify.yml`**:
- [x] Update triggers section (remove pull_request, add workflow_run)
- [x] Add job conditional (check workflow_run.conclusion)
- [x] Update tag determination (handle workflow_run context)
- [x] Update PR comment logic (extract PR number correctly)
**Testing Plan**:
- [ ] Test PR workflow (verify sequential execution and correct tagging)
- [ ] Test push to main (verify 'latest' tag usage)
- [ ] Test manual trigger (verify workflow_dispatch works)
- [ ] Test scheduled run (verify weekly scan works)
- [ ] Test failed build scenario (verify verification doesn't run)
### Benefits
- ✅ Verification always runs AFTER image exists
- ✅ No more false "image not found" skips on PRs
- ✅ Manual verification via workflow_dispatch still works
- ✅ Scheduled weekly scans remain functional
- ✅ Only verifies successfully built images
- ✅ Clear separation of concerns
### Potential Issues & Mitigations
1. **workflow_run Limitations**: Can only chain 3 levels deep
- Mitigation: We're only chaining 2 levels (safe)
2. **Branch Context**: workflow_run runs on default branch context
- Mitigation: Extract correct branch/PR info from workflow_run metadata
3. **Failed Build Silent Skip**: If docker-build fails, verification doesn't run
- Mitigation: This is desired behavior; failed builds shouldn't be verified
4. **Forked PRs**: workflow_run from forks may have limited permissions
- Mitigation: Acceptable due to security constraints; docker-build loads images locally for PRs
### Security Considerations
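- `workflow_run` always executes the workflow file from the default branch, so a PR cannot alter the verification logic it is checked by. Such runs can still access secrets and write tokens even when the triggering run came from a fork, so the workflow must not execute untrusted code from the triggering run.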
- Existing permissions in supply-chain-verify are appropriate (read-only for packages)
- Only runs after successfully built images (trust boundary maintained)
### Success Criteria
- ✅ Supply chain verification runs **after** docker-build completes
- ✅ Verification correctly identifies the built image tag
- ✅ PR comments are posted with actual verification results (not skips)
- ✅ Manual and scheduled triggers continue to work
- ✅ Failed builds do not trigger verification
- ✅ Workflow remains maintainable and modular
---
## Implementation Results
**Status**: ✅ All phases completed successfully
**Changes Made**:
1. ✅ Added `workflow_run` trigger to supply-chain-verify.yml
2. ✅ Removed `pull_request` trigger
3. ✅ Added workflow success filter
4. ✅ Enhanced tag determination logic
5. ✅ Updated PR comment extraction
6. ✅ Added debug logging for validation
**Validation**:
- ✅ Security audit passed (see [qa_report_workflow_orchestration.md](../../reports/qa_report_workflow_orchestration.md))
- ✅ Pre-commit hooks passed
- ✅ YAML syntax validated
- ✅ No breaking changes to other workflows
**Documentation**:
- [Implementation Summary](../../implementation/WORKFLOW_ORCHESTRATION_FIX.md)
- [QA Report](../../reports/qa_report_workflow_orchestration.md)
---
**Archived**: 2026-01-11
**Implementation Time**: ~2 hours
**Next Steps**: Monitor first production workflow_run execution