fix: enhance nightly build workflow with SBOM generation and fallback mechanism

GitHub Actions
2026-02-27 10:16:06 +00:00
parent 7654acc710
commit 5b3e005f2b
3 changed files with 385 additions and 406 deletions


@@ -103,11 +103,12 @@ jobs:
const workflows = [
{ id: 'e2e-tests-split.yml' },
{ id: 'codecov-upload.yml', inputs: { run_backend: 'true', run_frontend: 'true' } },
{ id: 'security-pr.yml' },
{ id: 'supply-chain-verify.yml' },
{ id: 'codeql.yml' },
];
core.info('Skipping security-pr.yml: PR-only workflow intentionally excluded from nightly non-PR dispatch');
for (const workflow of workflows) {
const { data: workflowRuns } = await github.rest.actions.listWorkflowRuns({
owner,
@@ -220,11 +221,63 @@ jobs:
echo "- ${{ env.GHCR_REGISTRY }}/${{ env.IMAGE_NAME }}:nightly@${{ steps.build.outputs.digest }}" >> "$GITHUB_STEP_SUMMARY"
- name: Generate SBOM
id: sbom_primary
continue-on-error: true
uses: anchore/sbom-action@17ae1740179002c89186b61233e0f892c3118b11 # v0.23.0
with:
image: ${{ env.GHCR_REGISTRY }}/${{ env.IMAGE_NAME }}:nightly@${{ steps.build.outputs.digest }}
format: cyclonedx-json
output-file: sbom-nightly.json
syft-version: v1.42.1
- name: Generate SBOM fallback with pinned Syft
if: always()
run: |
set -euo pipefail
if [[ "${{ steps.sbom_primary.outcome }}" == "success" ]] && [[ -s sbom-nightly.json ]] && jq -e . sbom-nightly.json >/dev/null 2>&1; then
echo "Primary SBOM generation succeeded with valid JSON; skipping fallback"
exit 0
fi
echo "Primary SBOM generation failed or produced missing/invalid output; using deterministic Syft fallback"
SYFT_VERSION="v1.42.1"
OS="$(uname -s | tr '[:upper:]' '[:lower:]')"
ARCH="$(uname -m)"
case "$ARCH" in
x86_64) ARCH="amd64" ;;
aarch64|arm64) ARCH="arm64" ;;
*) echo "Unsupported architecture: $ARCH"; exit 1 ;;
esac
TARBALL="syft_${SYFT_VERSION#v}_${OS}_${ARCH}.tar.gz"
BASE_URL="https://github.com/anchore/syft/releases/download/${SYFT_VERSION}"
curl -fsSLo "$TARBALL" "${BASE_URL}/${TARBALL}"
curl -fsSLo checksums.txt "${BASE_URL}/syft_${SYFT_VERSION#v}_checksums.txt"
grep " ${TARBALL}$" checksums.txt > checksum_line.txt
sha256sum -c checksum_line.txt
tar -xzf "$TARBALL" syft
chmod +x syft
./syft "${{ env.GHCR_REGISTRY }}/${{ env.IMAGE_NAME }}:nightly@${{ steps.build.outputs.digest }}" -o cyclonedx-json=sbom-nightly.json
- name: Verify SBOM artifact
if: always()
run: |
set -euo pipefail
test -s sbom-nightly.json
jq -e . sbom-nightly.json >/dev/null
jq -e '
.bomFormat == "CycloneDX"
and (.specVersion | type == "string" and length > 0)
and has("version")
and has("metadata")
and (.components | type == "array")
' sbom-nightly.json >/dev/null
- name: Upload SBOM artifact
uses: actions/upload-artifact@bbbca2ddaa5d8feaa63e36b76fdaad77386f024f # v7.0.0


@@ -1,332 +1,308 @@
# Security Scan (PR) Deterministic Artifact Policy - Supervisor Remediation Plan
## 1. Introduction
### Overview
`Security Scan (PR)` failed because `.github/workflows/security-pr.yml` loaded
an artifact image tag (`pr-718-385081f`) and later attempted extraction with a
different synthesized tag (`pr-718`).
`Nightly Build & Package` currently has two active workflow failures that must
be fixed together in one minimal-scope PR:
Supervisor conflict resolution in this plan selects Option A:
`workflow_run` artifact handling is restricted to upstream
`pull_request` events only.
1. SBOM generation failure in `Generate SBOM` (Syft fetch/version resolution).
2. Dispatch failure from nightly workflow with `Missing required input
'pr_number' not provided`.
### Root-Cause Clarity (Preserved)
The failure was not a Docker load failure. It was a source-of-truth violation in
image selection:
1. Artifact load path succeeded.
2. Extraction path reconstructed an alternate reference.
3. Alternate reference did not exist, causing `docker create ... not found`.
This plan keeps scope strictly on `.github/workflows/security-pr.yml`.
This plan hard-locks runtime code changes to
`.github/workflows/nightly-build.yml` only.
### Objectives
1. Remove all ambiguous behavior for artifact absence on `workflow_run`.
2. Remove `workflow_run` support for upstream `push` events to align with PR
artifact naming contract (`pr-image-<pr_number>`).
3. Codify one deterministic `workflow_dispatch` policy in SHALL form.
4. Harden image selection so it is not brittle on `RepoTags[0]`.
5. Add CI security hardening requirements for permissions and trust boundary.
6. Expand validation matrix to include `pull_request` and negative paths.
---
1. Restore deterministic nightly SBOM generation.
2. Enforce strict default-deny dispatch behavior for non-PR nightly events
(`schedule`, `workflow_dispatch`).
3. Preserve GitHub Actions best practices: pinned SHAs, least privilege, and
deterministic behavior.
4. Keep both current failures in a single scope and do not pivot to unrelated fixes.
5. Remove `security-pr.yml` from nightly dispatch list unless a hard
requirement is proven.
## 2. Research Findings
### 2.1 Failure Evidence
### 2.1 Primary Workflow Scope
Source: `.github/logs/ci_failure.log`
File analyzed: `.github/workflows/nightly-build.yml`
Observed facts:
Relevant areas:
1. Artifact `pr-image-718` was found and downloaded from run `22164807859`.
2. `docker load` reported: `Loaded image: ghcr.io/wikid82/charon:pr-718-385081f`.
3. Extraction attempted: `docker create ghcr.io/wikid82/charon:pr-718`.
4. Docker reported: `... pr-718: not found`.
1. Job `build-and-push-nightly`, step `Generate SBOM` uses
`anchore/sbom-action@17ae1740179002c89186b61233e0f892c3118b11`.
2. Job `trigger-nightly-validation` dispatches downstream workflows using
`actions/github-script` and currently includes `security-pr.yml`.
### 2.2 Producer Contract
### 2.2 Root Cause: Missing `pr_number`
Source: `.github/workflows/docker-build.yml`
Directly related called workflow:
Producer emits immutable PR tags with SHA suffix (`pr-<num>-<sha>`). Consumer
must consume artifact metadata/load output, not reconstruct mutable tags.
1. `.github/workflows/security-pr.yml`
2. Trigger contract includes:
- `workflow_dispatch.inputs.pr_number.required: true`
### 2.3 Current Consumer Gaps
Impact:
Source: `.github/workflows/security-pr.yml`
1. Nightly dispatcher invokes `createWorkflowDispatch` for `security-pr.yml`
without `pr_number`.
2. For nightly non-PR contexts (scheduled/manual nightly), there is no natural
PR number, so dispatch fails by contract.
3. PR lookup by nightly head SHA is not a valid safety mechanism for nightly
non-PR trigger types and must not be relied on for `schedule` or
`workflow_dispatch`.
Current consumer contains ambiguous policy points:
### 2.3 Decision: Remove PR-Only Workflow from Nightly Dispatch List
1. `workflow_run` artifact absence behavior can be interpreted as skip or fail.
2. `workflow_dispatch` policy is not single-path deterministic.
3. Image identification relies on single `RepoTags[0]` assumption.
4. Trust boundary and permission minimization are not explicitly codified as
requirements.
Assessment result:
---
1. No hard requirement was found that requires nightly workflow to dispatch
`security-pr.yml`.
2. `security-pr.yml` is contractually PR/manual-oriented because it requires
`pr_number`.
3. Keeping it in nightly fan-out adds avoidable failure risk and encourages
invalid context synthesis.
## 3. Technical Specifications
Decision:
### 3.1 Deterministic EARS Requirements (Blocking)
1. Remove `security-pr.yml` from nightly dispatch list.
2. Keep strict default-deny guard logic to prevent accidental future dispatch
from non-PR events.
1. WHEN `security-pr.yml` is triggered by `workflow_run` with
`conclusion == success` and upstream event `pull_request`, THE SYSTEM SHALL
require the expected image artifact to exist and SHALL hard fail the job if
the artifact is missing.
Risk reduction from removal:
2. WHEN `security-pr.yml` is triggered by `workflow_run` and artifact lookup
fails, THEN THE SYSTEM SHALL exit non-zero with a diagnostic that includes:
upstream run id, expected artifact name, and reason category (`not found` or
`api/error`).
1. Eliminates `pr_number` contract mismatch in nightly non-PR events.
2. Removes a class of false failures from nightly reliability metrics.
3. Simplifies dispatcher logic and review surface.
3. WHEN `security-pr.yml` is triggered by `workflow_run` and upstream event is
not `pull_request`, THEN THE SYSTEM SHALL hard fail immediately with reason
category `unsupported_upstream_event` and SHALL NOT attempt artifact lookup,
image load, or extraction.
### 2.4 Root Cause: SBOM/Syft Fetch Failure
4. WHEN `security-pr.yml` is triggered by `workflow_dispatch`, THE SYSTEM SHALL
require `inputs.pr_number` and SHALL hard fail immediately if input is empty.
Observed behavior indicates that Syft retrieval or version resolution is
unstable during the SBOM step. The current workflow sets no explicit
`syft-version` in `nightly-build.yml`, so the Syft version is not pinned at the
workflow layer.
5. WHEN `security-pr.yml` is triggered by `workflow_dispatch` with valid
`inputs.pr_number`, THE SYSTEM SHALL resolve artifact `pr-image-<pr_number>`
from the latest successful `docker-build.yml` run for that PR and SHALL hard
fail if artifact resolution or download fails.
### 2.5 Constraints and Policy Alignment
6. WHEN artifact image is loaded, THE SYSTEM SHALL derive a canonical local
image alias (`charon:artifact`) from validated load result and SHALL use only
that alias for `docker create` in artifact-based paths.
1. Keep action SHAs pinned.
2. Keep permission scopes unchanged unless required.
3. Keep change minimal and limited to nightly workflow path only.
7. WHEN artifact metadata parsing is required, THE SYSTEM SHALL NOT depend only
on `RepoTags[0]`; it SHALL validate all available repo tags and SHALL support
fallback selection using docker load image ID when tags are absent/corrupt.
## 3. Technical Specification (EARS)
8. IF no valid tag and no valid load image ID can be resolved, THEN THE SYSTEM
SHALL hard fail before extraction.
1. WHEN nightly runs from `schedule` or `workflow_dispatch`, THE SYSTEM SHALL
enforce strict default-deny for PR-only dispatches.
9. WHEN event is `pull_request` or `push`, THE SYSTEM SHALL build and use
`charon:local` only and SHALL NOT execute artifact lookup/load logic.
2. WHEN nightly runs from `schedule` or `workflow_dispatch`, THE SYSTEM SHALL
NOT perform PR-number lookup from nightly head SHA.
### 3.2 Deterministic Policy Decisions
3. WHEN evaluating downstream nightly dispatches, THE SYSTEM SHALL exclude
`security-pr.yml` from nightly dispatch targets unless a hard requirement
is explicitly introduced and documented.
#### Policy A: `workflow_run` Missing Artifact
4. IF `security-pr.yml` is reintroduced in the future, THEN THE SYSTEM SHALL
dispatch it ONLY when a real PR context includes a concrete `pr_number`,
and SHALL deny by default in all other contexts.
Decision: hard fail only.
5. WHEN `Generate SBOM` runs in nightly, THE SYSTEM SHALL use a deterministic
two-stage strategy in the same PR scope:
- Primary path: `syft-version: v1.42.1` via `anchore/sbom-action`
- In-PR fallback path: explicit Syft CLI installation/generation
with pinned version/checksum and hard verification
No skip behavior is allowed for upstream-success `workflow_run`.
6. IF primary SBOM generation fails or does not produce a valid file, THEN THE
SYSTEM SHALL execute fallback generation and SHALL fail the job when fallback
also fails or output validation fails.
#### Policy A1: `workflow_run` Upstream Event Contract
7. THE SYSTEM SHALL keep GitHub Actions pinned to immutable SHAs and SHALL NOT
broaden token permissions for this fix.
Decision: upstream event MUST be `pull_request`.
## 4. Exact Implementation Edits
If upstream event is `push` or any non-PR event, fail immediately with
`unsupported_upstream_event`; no artifact path execution is allowed.
### 4.1 `.github/workflows/nightly-build.yml`
#### Policy B: `workflow_dispatch`
### Edit A: Harden downstream dispatch for non-PR triggers
Decision: artifact-only manual replay.
Location: job `trigger-nightly-validation`, step
`Dispatch Missing Nightly Validation Workflows`.
No local-build fallback is allowed for `workflow_dispatch`. Required input is
`pr_number`; missing input is immediate hard fail.
Exact change intent:
### 3.3 Image Selection Hardening Contract
1. Remove `security-pr.yml` from the nightly dispatch list.
2. Keep dispatch for `e2e-tests-split.yml`, `codecov-upload.yml`,
`supply-chain-verify.yml`, and `codeql.yml` unchanged.
3. Add explicit guard comments and logging stating non-PR nightly events are
default-deny for PR-only workflows.
4. Explicitly prohibit PR number synthesis and prohibit PR lookup from nightly
SHA for `schedule` and `workflow_dispatch`.
For step `Load Docker image` in `.github/workflows/security-pr.yml`:
Implementation shape (script-level):
1. Validate artifact file exists and is readable tar.
2. Parse `manifest.json` and iterate all candidate tags under `RepoTags[]`.
3. Run `docker load` and capture structured output.
4. Resolve source image by deterministic priority:
- First valid tag from `RepoTags[]` that exists locally after load.
- Else image ID extracted from `docker load` output (if present).
- Else fail.
5. Retag resolved source to `charon:artifact`.
6. Emit outputs:
- `image_ref=charon:artifact`
- `source_image_ref=<resolved tag or image id>`
- `source_resolution_mode=manifest_tag|load_image_id`
1. Keep workflow list explicit.
2. Keep a local denylist/set for PR-only workflows and ensure they are never
dispatched from nightly non-PR events.
3. No PR-number inputs are synthesized from nightly SHA or non-PR context.
4. No PR lookup calls are executed for nightly non-PR events.
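The default-deny shape described above can be sketched in shell. This is a minimal illustration only: the real dispatcher is a JavaScript `actions/github-script` step, and the names `PR_ONLY_WORKFLOWS`, `is_pr_only`, and `nightly_dispatch_targets` are hypothetical. Treating `pull_request` as the only event with a PR context is also an assumption of this sketch.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical denylist of workflows that require a real PR context.
PR_ONLY_WORKFLOWS=("security-pr.yml")

# Return 0 when the given workflow file is on the PR-only denylist.
is_pr_only() {
  local candidate="$1" entry
  for entry in "${PR_ONLY_WORKFLOWS[@]}"; do
    if [[ "$entry" == "$candidate" ]]; then
      return 0
    fi
  done
  return 1
}

# Print the workflows that may be dispatched for the given event.
# PR-only workflows are denied unless the event is `pull_request`
# (assumption: no other event carries a concrete PR number).
nightly_dispatch_targets() {
  local event="$1"; shift
  local wf
  for wf in "$@"; do
    if is_pr_only "$wf" && [[ "$event" != "pull_request" ]]; then
      echo "skip: $wf (PR-only, event=$event)" >&2
      continue
    fi
    echo "$wf"
  done
}
```

Calling `nightly_dispatch_targets schedule e2e-tests-split.yml security-pr.yml codeql.yml` prints the first and last entries on stdout and logs the skipped one on stderr, so no `pr_number` is ever synthesized for a nightly event.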
### 3.4 CI Security Hardening Requirements
### Edit B: Stabilize Syft source in `Generate SBOM`
For job `security-scan` in `.github/workflows/security-pr.yml`:
Location: job `build-and-push-nightly`, step `Generate SBOM`.
1. THE SYSTEM SHALL enforce least-privilege permissions by default:
- `contents: read`
- `actions: read`
- `security-events: write`
- No additional write scopes unless explicitly required.
Exact change intent:
2. THE SYSTEM SHALL restrict `pull-requests: write` usage to only steps that
require PR annotations/comments. If no such step exists, this permission
SHALL be removed.
1. Keep existing pinned `anchore/sbom-action` SHA unless evidence shows that SHA
itself is the failure source.
2. Add explicit `syft-version: v1.42.1` in `with:` block as the primary pin.
3. Set the primary SBOM step to `continue-on-error: true` to allow deterministic
in-PR fallback execution.
4. Add fallback step gated on primary step failure OR missing/invalid output:
- Install Syft CLI `v1.42.1` from official release with checksum validation.
- Generate `sbom-nightly.json` via CLI.
5. Add mandatory verification step (no `continue-on-error`) with explicit
pass/fail criteria:
- `sbom-nightly.json` exists.
- file size is greater than 0 bytes.
- JSON parses successfully (`jq empty`).
- expected top-level fields exist for selected format.
6. If verification fails, the job fails. The SBOM step cannot pass silently
without a generated artifact.
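For the fallback in item 4, the step has to derive the release asset name for the runner and select the matching line from the checksum file. The following is a minimal sketch, assuming the `syft_<version>_<os>_<arch>.tar.gz` asset naming used by the workflow change in this commit. The helper names `normalize_arch`, `syft_asset_name`, and `checksum_line_for` are hypothetical, and the exact-match `awk` lookup is an alternative to a `grep` pattern, chosen here so that dots in the file name are not treated as regex wildcards.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Map `uname -m` output to the architecture label used in release asset names.
normalize_arch() {
  case "$1" in
    x86_64) echo "amd64" ;;
    aarch64|arm64) echo "arm64" ;;
    *) echo "Unsupported architecture: $1" >&2; return 1 ;;
  esac
}

# Build the tarball name, e.g. syft_1.42.1_linux_amd64.tar.gz.
# Arguments: version tag (leading "v" allowed), lowercase OS, raw machine arch.
syft_asset_name() {
  local version="$1" os="$2" arch
  arch="$(normalize_arch "$3")" || return 1
  echo "syft_${version#v}_${os}_${arch}.tar.gz"
}

# Print the checksum line whose file-name field equals the asset name.
# Exits non-zero unless exactly one line matches.
checksum_line_for() {
  local checksums_file="$1" asset="$2"
  awk -v name="$asset" '
    $2 == name { print; found++ }
    END { exit (found == 1 ? 0 : 1) }
  ' "$checksums_file"
}
```

The selected line can then be written to a file and passed to `sha256sum -c`, which keeps the hard-verification requirement: an unknown architecture or a missing checksum entry stops the fallback before anything is extracted.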
3. THE SYSTEM SHALL enforce workflow_run trust boundary guards:
- Upstream workflow name must match expected producer.
- Upstream conclusion must be `success`.
- Upstream event must be `pull_request` only.
- Upstream head repository must equal `${{ github.repository }}` (same-repo
trust boundary), otherwise hard fail.
### 4.2 Scope Lock
4. THE SYSTEM SHALL NOT use untrusted `workflow_run` payload values to build
shell commands without validation and quoting.
1. No edits to `.github/workflows/security-pr.yml` in this plan.
2. Contract remains unchanged: `workflow_dispatch.inputs.pr_number.required: true`.
### 3.5 Step-Level Scope in `security-pr.yml`
## 5. Reconfirmation: Non-Target Files
Targeted steps:
No changes required:
1. `Extract PR number from workflow_run`
2. `Validate workflow_run upstream event contract`
3. `Check for PR image artifact`
4. `Skip if no artifact` (to be converted to deterministic fail paths for
`workflow_run` and `workflow_dispatch`)
5. `Load Docker image`
6. `Extract charon binary from container`
1. `.gitignore`
2. `codecov.yml`
3. `.dockerignore`
4. `Dockerfile`
### 3.6 Event Data Flow (Deterministic)
Rationale:
```text
pull_request/push
-> Build Docker image (Local)
-> image_ref=charon:local
-> Extract /app/charon
-> Trivy scan
1. Both failures are workflow orchestration issues, not source-ignore, coverage
policy, Docker context, or image build recipe issues.
workflow_run (upstream success only)
-> Assert upstream event == pull_request (hard fail if false)
-> Require artifact exists (hard fail if missing)
-> Load/validate image
-> image_ref=charon:artifact
-> Extract /app/charon
-> Trivy scan
workflow_dispatch
-> Require pr_number input (hard fail if missing)
-> Resolve pr-image-<pr_number> artifact (hard fail if missing)
-> Load/validate image
-> image_ref=charon:artifact
-> Extract /app/charon
-> Trivy scan
```
### 3.7 Error Handling Matrix
| Step | Condition | Required Behavior |
|---|---|---|
| Validate workflow_run upstream event contract | `workflow_run` upstream event is not `pull_request` | Hard fail with `unsupported_upstream_event`; stop before artifact lookup |
| Check for PR image artifact | `workflow_run` upstream success but artifact missing | Hard fail with run id + artifact name |
| Extract PR number from workflow_run | `workflow_dispatch` and empty `inputs.pr_number` | Hard fail with input requirement message |
| Load Docker image | Missing/corrupt `charon-pr-image.tar` | Hard fail before `docker load` |
| Load Docker image | Missing/corrupt `manifest.json` | Attempt load-image-id fallback; fail if unresolved |
| Load Docker image | No valid `RepoTags[]` and no load image id | Hard fail |
| Extract charon binary from container | Empty/invalid `image_ref` | Hard fail before `docker create` |
| Extract charon binary from container | `/app/charon` missing | Hard fail with chosen image reference |
### 3.8 API/DB Changes
No backend API, frontend, or database schema changes.
---
## 4. Implementation Plan
### Phase 1: Playwright Impact Check
1. Mark Playwright scope as N/A because this change is workflow-only.
2. Record N/A rationale in PR description.
### Phase 2: Deterministic Event Policies
File: `.github/workflows/security-pr.yml`
1. Convert ambiguous skip/fail logic to hard-fail policy for
`workflow_run` missing artifact after upstream success.
2. Enforce deterministic `workflow_dispatch` policy:
- Required `pr_number` input.
- Artifact-only replay path.
- No local fallback.
3. Enforce PR-only `workflow_run` event contract:
- Upstream event must be `pull_request`.
- Upstream `push` or any non-PR event hard fails with
`unsupported_upstream_event`.
### Phase 3: Image Selection Hardening
File: `.github/workflows/security-pr.yml`
1. Harden `Load Docker image` with manifest validation and multi-tag handling.
2. Add fallback resolution via docker load image ID.
3. Emit explicit outputs for traceability (`source_resolution_mode`).
4. Ensure extraction consumes only selected alias (`charon:artifact`).
### Phase 4: CI Security Hardening
File: `.github/workflows/security-pr.yml`
1. Reduce job permissions to least privilege.
2. Remove/conditionalize `pull-requests: write` if not required.
3. Add workflow_run trust-boundary guard conditions and explicit fail messages.
### Phase 5: Validation
1. `pre-commit run actionlint --files .github/workflows/security-pr.yml`
2. Simulate deterministic paths (or equivalent CI replay) for all matrix cases.
3. Verify logs show chosen `source_image_ref` and `source_resolution_mode`.
---
## 5. Validation Matrix
| ID | Trigger Path | Scenario | Expected Result |
|---|---|---|---|
| V1 | `workflow_run` | Upstream success + artifact present | Pass, uses `charon:artifact` |
| V2 | `workflow_run` | Upstream success + artifact missing | Hard fail (non-zero) |
| V3 | `workflow_run` | Upstream success + artifact manifest corrupted | Hard fail after validation/fallback attempt |
| V4 | `workflow_run` | Upstream success + upstream event `push` | Hard fail with `unsupported_upstream_event` |
| V5 | `pull_request` | Direct PR trigger | Pass, uses `charon:local`, no artifact lookup |
| V6 | `push` | Direct push trigger | Pass, uses `charon:local`, no artifact lookup |
| V7 | `workflow_dispatch` | Missing `pr_number` input | Hard fail immediately |
| V8 | `workflow_dispatch` | Valid `pr_number` + artifact exists | Pass, uses `charon:artifact` |
| V9 | `workflow_dispatch` | Valid `pr_number` + artifact missing | Hard fail |
| V10 | `workflow_run` | Upstream from untrusted repository context | Hard fail by trust-boundary guard |
---
## 6. Acceptance Criteria
1. Plan states unambiguous hard-fail behavior for missing artifact on
`workflow_run` after upstream `pull_request` success.
2. Plan states `workflow_run` event contract is PR-only and that upstream
`push` is a deterministic hard-fail contract violation.
3. Plan states one deterministic `workflow_dispatch` policy in SHALL terms:
required `pr_number`, artifact-only path, no local fallback.
4. Plan defines robust image resolution beyond `RepoTags[0]`, including
load-image-id fallback and deterministic aliasing.
5. Plan includes least-privilege permissions and explicit workflow_run trust
boundary constraints.
6. Plan includes validation coverage for `pull_request` and direct `push` local
paths plus negative paths: unsupported upstream event, missing dispatch
input, missing artifact, corrupted/missing manifest.
7. Root cause remains explicit: image-reference mismatch inside
`.github/workflows/security-pr.yml` after successful artifact load.
---
## 7. Risks and Mitigations
## 6. Risks and Mitigations
| Risk | Impact | Mitigation |
|---|---|---|
| Overly strict dispatch policy blocks ad-hoc scans | Medium | Document explicit manual replay contract in workflow description |
| PR-only workflow_run contract fails upstream push-triggered runs | Medium | Intentional contract enforcement; document `unsupported_upstream_event` and route push scans through direct push path |
| Manifest parsing edge cases | Medium | Multi-source resolver with load-image-id fallback |
| Permission tightening breaks optional PR annotations | Low | Make PR-write permission step-scoped only if needed |
| Trust-boundary guards reject valid internal events | Medium | Add clear diagnostics and test cases V1/V10 |
| `security-pr.yml` accidentally dispatched in non-PR mode | Low | Remove from nightly dispatch list and enforce default-deny comments/guards |
| Primary Syft acquisition fails (`v1.42.1`) | Medium | Execute deterministic in-PR fallback with pinned checksum and hard output verification |
| SBOM step appears green without real artifact | High | Mandatory verification step with explicit file/JSON checks and hard fail |
| Action SHA update introduces side effects | Medium | Limit SHA change to `Generate SBOM` step only and validate end-to-end nightly path |
| Over-dispatch/under-dispatch in validation job | Low | Preserve existing dispatch logic for all non-PR-dependent workflows |
---
## 7. Rollback Plan
## 8. PR Slicing Strategy
1. Revert runtime behavior changes in
`.github/workflows/nightly-build.yml`:
- `trigger-nightly-validation` dispatch logic
- `Generate SBOM` primary + fallback + verification sequence
2. Re-run nightly dispatch manually to verify previous baseline runtime
behavior.
Rollback scope: runtime workflow behavior only in
`.github/workflows/nightly-build.yml`. Documentation updates are not part of
runtime rollback.
## 8. Validation Plan
### 8.1 Static Validation
```bash
cd /projects/Charon
pre-commit run actionlint --files .github/workflows/nightly-build.yml
```
### 8.2 Behavioral Validation (Nightly non-PR)
```bash
gh workflow run nightly-build.yml --ref nightly -f reason="nightly dual-fix validation" -f skip_tests=true
gh run list --workflow "Nightly Build & Package" --branch nightly --limit 1
gh run view <run-id> --json databaseId,headSha,event,status,conclusion,createdAt
gh run view <run-id> --log
```
Expected outcomes:
1. `Generate SBOM` succeeds through primary path or deterministic fallback and
`sbom-nightly.json` is uploaded.
2. Dispatch step does not attempt `security-pr.yml` from nightly run.
3. No `Missing required input 'pr_number' not provided` error.
4. Both targeted nightly failures are resolved in the same run scope:
`pr_number` dispatch failure and Syft/SBOM failure.
### 8.3 Explicit Negative Dispatch Verification (Run-Scoped/Time-Scoped)
Verify `security-pr.yml` was not dispatched by this specific nightly run using
time scope and actor scope (not SHA-only):
```bash
RUN_JSON=$(gh run view <nightly-run-id> --json databaseId,createdAt,updatedAt,event,headBranch)
START=$(echo "$RUN_JSON" | jq -r '.createdAt')
END=$(echo "$RUN_JSON" | jq -r '.updatedAt')
gh api repos/<owner>/<repo>/actions/workflows/security-pr.yml/runs \
--paginate \
-f event=workflow_dispatch | \
jq --arg start "$START" --arg end "$END" '
[ .workflow_runs[]
| select(.created_at >= $start and .created_at <= $end)
| select(.head_branch == "nightly")
| select(.triggering_actor.login == "github-actions[bot]")
] | length'
```
Expected result: `0`
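The window filter above compares timestamps as plain strings. That is sound only when every value uses the same fixed-width UTC ISO 8601 form (for example `2026-02-27T10:16:06Z`), because such strings sort chronologically. A minimal shell sketch of the same check follows; `in_window` is a hypothetical helper and not part of the plan.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Return 0 when ts lies in the closed interval [start, end].
# Uses string comparison, so all three values must share one fixed-width
# UTC format such as 2026-02-27T10:16:06Z; mixed offsets or precisions
# would break the ordering.
in_window() {
  local ts="$1" start="$2" end="$3"
  local LC_ALL=C  # force byte-wise collation for the comparisons below
  [[ ! "$ts" < "$start" && ! "$ts" > "$end" ]]
}
```

A run created at `2026-02-27T10:20:00Z` would count as inside a window from `2026-02-27T10:16:06Z` to `2026-02-27T10:45:00Z`, while one created at `2026-02-27T11:00:00Z` would not.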
### 8.4 Positive Validation: Manual `security-pr.yml` Dispatch Still Works
Run a manual dispatch with a valid PR number and verify successful start:
```bash
gh workflow run security-pr.yml --ref <pr-branch> -f pr_number=<valid-pr-number>
gh run list --workflow "Security Scan (PR)" --limit 5 \
--json databaseId,event,status,conclusion,createdAt,headBranch
gh run view <security-pr-run-id> --log
```
Expected results:
1. Workflow is accepted (no missing-input validation errors).
2. Run event is `workflow_dispatch`.
3. Run completes according to existing workflow behavior.
### 8.5 Contract Validation (No Contract Change)
1. `security-pr.yml` contract remains PR/manual specific and unchanged.
2. Nightly non-PR paths do not consume or synthesize `pr_number`.
## 9. Acceptance Criteria
1. `Nightly Build & Package` no longer fails in `Generate SBOM` due to Syft
fetch/version resolution, with deterministic in-PR fallback.
2. Nightly validation dispatch no longer fails with missing required
`pr_number`.
3. For non-PR nightly triggers (`schedule`/`workflow_dispatch`), PR-only
dispatch of `security-pr.yml` is default-deny and not attempted from nightly
dispatch targets.
4. Workflow remains SHA-pinned and permissions are not broadened.
5. Validation evidence includes explicit run-scoped/time-scoped proof that
`security-pr.yml` was not dispatched by the tested nightly run.
6. No changes made to `.gitignore`, `codecov.yml`, `.dockerignore`, or
`Dockerfile`.
7. Manual dispatch of `security-pr.yml` with valid `pr_number` is validated to
still work.
8. SBOM step fails hard when neither primary nor fallback path produces a valid
SBOM artifact.
## 10. PR Slicing Strategy
### Decision
@@ -334,50 +310,47 @@ Single PR.
### Trigger Reasons
1. Change is isolated to one workflow (`security-pr.yml`).
2. Deterministic policy + hardening are tightly coupled and safest together.
3. Split PRs would create temporary policy inconsistency.
1. Changes are tightly coupled inside one workflow path.
2. Shared validation path (nightly run) verifies both fixes together.
3. Rollback safety is high with one-file revert.
### Ordered Slice
### Ordered Slices
#### PR-1: Deterministic Policy and Security Hardening for `security-pr.yml`
#### PR-1: Nightly Dual-Failure Workflow Fix
Scope:
1. Deterministic missing-artifact handling (`workflow_run` hard fail).
2. Deterministic `workflow_dispatch` artifact-only policy.
3. Hardened image resolution and aliasing.
4. Least-privilege + trust-boundary constraints.
5. Validation matrix execution evidence.
1. `.github/workflows/nightly-build.yml` only.
2. SBOM Syft stabilization with explicit tag pin + fallback rule.
3. Remove `security-pr.yml` from nightly dispatch list and enforce strict
default-deny semantics for non-PR nightly events.
Files:
1. `.github/workflows/security-pr.yml`
1. `.github/workflows/nightly-build.yml`
2. `docs/plans/current_spec.md`
Dependencies:
1. `.github/workflows/docker-build.yml` artifact naming contract unchanged.
1. `security-pr.yml` keeps required `workflow_dispatch` `pr_number` contract.
Validation Gates:
Validation gates:
1. actionlint passes.
2. Validation matrix V1-V10 results captured.
3. No regression to `ghcr.io/...:pr-<num> not found` pattern.
1. `actionlint` passes.
2. Nightly manual dispatch run passes both targeted failure points.
3. SBOM artifact upload succeeds through primary path or fallback path.
4. Explicit run-scoped/time-scoped negative check confirms zero
bot-triggered `security-pr.yml` dispatches during the nightly run window.
5. Positive manual dispatch check with valid `pr_number` succeeds.
Rollback / Contingency:
Rollback and contingency:
1. Revert PR-1 if trust-boundary guards block legitimate same-repo runs.
2. Keep hard-fail semantics; adjust guard predicate, not policy.
1. Revert PR-1.
2. If both primary and fallback Syft paths fail, treat as blocking regression
and do not merge until generation criteria pass.
---
## 11. Complexity Estimate
## 9. Handoff
After approval, implementation handoff to Supervisor SHALL include:
1. Exact step-level edits required in `.github/workflows/security-pr.yml`.
2. Proof logs for each failed/pass matrix case.
3. Confirmation that no files outside plan scope were required.
3. Require explicit evidence that artifact path no longer performs GHCR PR tag
reconstruction.
1. Implementation complexity: Low.
2. Validation complexity: Medium (requires workflow run completion).
3. Blast radius: Low (single workflow file, no runtime code changes).


@@ -1,132 +1,85 @@
# QA/Security Audit Report: `security-pr.yml` Workflow Fix
# QA Report: Nightly Workflow Fix Audit
- Date: 2026-02-27
- Auditor: QA Security mode
- Scope: `.github/workflows/security-pr.yml` behavior fix only
- Overall verdict: **PASS (scope-specific)** with one **out-of-scope repository security debt** noted
- Scope:
- `.github/workflows/nightly-build.yml`
1. `pr_number` failure avoidance in nightly dispatch path
2. Deterministic Syft SBOM generation with fallback
- `.github/workflows/security-pr.yml` contract check (`pr_number` required)
## Findings (Ordered by Severity)
### 🟡 IMPORTANT: Repository secret-scan debt exists (not introduced by scoped workflow change)
- Check: `pre-commit run --hook-stage manual gitleaks-tuned-scan --all-files`
- Result: **FAIL** (`135` findings)
- Scope impact: `touches_security_pr = 0` (no findings in `.github/workflows/security-pr.yml`)
- Evidence source: `test-results/security/gitleaks-tuned-precommit.json`
- Why this matters: Existing credential-like content raises background security risk even if unrelated to this workflow fix.
- Recommended remediation:
1. Triage findings by rule/file and classify true positives vs allowed test fixtures.
2. Add justified allowlist entries for confirmed false positives.
3. Remove or rotate any real secrets immediately.
4. Re-run `gitleaks-tuned-scan` until clean/accepted baseline is documented.
### ✅ No blocking findings in audited scope
### ✅ No blocking defects found in the implemented workflow fix
- Deterministic event handling: validated in workflow logic.
- Artifact/image resolution hardening: validated in workflow logic.
- Security hardening: validated in workflow logic and lint gates.
1. `actionlint` validation passed for modified workflow.
- Command: `actionlint .github/workflows/nightly-build.yml`
- Result: PASS (no diagnostics)
## Requested Validations
2. `pr_number` nightly dispatch failure path is avoided by excluding PR-only workflow from nightly fan-out.
- `security-pr.yml` removed from dispatch list in `.github/workflows/nightly-build.yml:103`
- Explicit log note added at `.github/workflows/nightly-build.yml:110`
### 1) `actionlint` on security workflow
- Command:
- `pre-commit run actionlint --files .github/workflows/security-pr.yml`
- Result: **PASS**
- Key output:
- `actionlint (GitHub Actions)..............................................Passed`
3. SBOM generation is now deterministic with explicit primary pin and verified fallback.
- Primary action pins Syft version at `.github/workflows/nightly-build.yml:231`
- Fallback installs pinned `v1.42.1` with checksum verification at `.github/workflows/nightly-build.yml:245`
- Mandatory artifact verification added at `.github/workflows/nightly-build.yml:268`
### 2) `pre-commit run --all-files`
- Command:
- `pre-commit run --all-files`
- Result: **PASS**
- Key output:
- YAML/shell/actionlint/dockerfile/go vet/golangci-lint/version/LFS/type-check/frontend lint hooks passed.
4. No permission broadening in modified sections.
- Dispatch job permissions remain `actions: write`, `contents: read` at `.github/workflows/nightly-build.yml:84`
- Build job permissions remain `contents: read`, `packages: write`, `id-token: write` at `.github/workflows/nightly-build.yml:145`
- Diff review confirms no `permissions` changes in the modified hunk.
### 3) Security scans/tasks relevant to workflow change (feasible locally)
- Executed:
1. `pre-commit run --hook-stage manual codeql-parity-check --all-files` -> **PASS**
  2. `pre-commit run --hook-stage manual codeql-check-findings --all-files` -> **PASS** (no blocking HIGH/CRITICAL findings)
3. `pre-commit run --hook-stage manual gitleaks-tuned-scan --all-files` -> **FAIL** (repo baseline debt; not in scoped file)
- Additional QA evidence:
- `bash scripts/local-patch-report.sh` -> artifacts generated:
- `test-results/local-patch-report.md`
- `test-results/local-patch-report.json`
5. Action pinning remains SHA-based in modified sections.
- `actions/github-script` pinned SHA at `.github/workflows/nightly-build.yml:89`
- `anchore/sbom-action` pinned SHA at `.github/workflows/nightly-build.yml:226`
- `actions/upload-artifact` pinned SHA at `.github/workflows/nightly-build.yml:283`
## Workflow Behavior Verification
6. `security-pr.yml` contract still requires `pr_number`.
- `workflow_dispatch.inputs.pr_number.required: true` at `.github/workflows/security-pr.yml:14`
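The checksum verification noted in finding 3 can be illustrated with a minimal sketch. This is not the workflow's actual step: the helper name `verify_tarball_checksum` is hypothetical, and the sketch assumes the release publishes a checksums file with lines of the form `<sha256>  <filename>`.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not taken from nightly-build.yml): verify a downloaded
# tarball against a checksums file before extracting it.
# Assumption: each line of the checksums file is "<sha256>  <filename>".
verify_tarball_checksum() {
  local tarball="$1" checksums="$2"
  local name expected actual

  name="$(basename "$tarball")"

  # Look up the expected digest for exactly this file name.
  expected="$(awk -v f="$name" '$2 == f { print $1 }' "$checksums")"
  if [[ -z "$expected" ]]; then
    echo "No checksum entry for $name" >&2
    return 1
  fi

  # Compute the actual digest and compare.
  actual="$(sha256sum "$tarball" | awk '{ print $1 }')"
  if [[ "$actual" != "$expected" ]]; then
    echo "Checksum mismatch for $name" >&2
    return 1
  fi

  echo "Checksum OK for $name"
}
```

A fallback step might call this right after the `curl` download and abort on a non-zero return, so an altered or truncated tarball is never extracted.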
## A) Deterministic event handling
Validated in `.github/workflows/security-pr.yml`:
- Manual dispatch input is required and validated as digits-only:
- `.github/workflows/security-pr.yml:10`
- `.github/workflows/security-pr.yml:14`
- `.github/workflows/security-pr.yml:71`
- `.github/workflows/security-pr.yml:78`
- `workflow_run` path constrained to successful upstream PR runs:
- `.github/workflows/security-pr.yml:31`
- `.github/workflows/security-pr.yml:36`
- `.github/workflows/security-pr.yml:38`
- Explicit trust-boundary contract checks for upstream workflow name/event/repository:
- `.github/workflows/security-pr.yml:127`
- `.github/workflows/security-pr.yml:130`
- `.github/workflows/security-pr.yml:136`
- `.github/workflows/security-pr.yml:143`
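The digits-only validation of the manual dispatch input listed above could look roughly like the sketch below. The function name `validate_pr_number` and the error messages are assumptions for illustration and are not copied from `security-pr.yml`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: accept a pr_number only if it is present and
# consists solely of decimal digits.
validate_pr_number() {
  local value="${1:-}"

  if [[ -z "$value" ]]; then
    echo "pr_number is required" >&2
    return 1
  fi

  # Reject anything except digits, so the value cannot carry shell
  # metacharacters or path fragments into later steps.
  if [[ ! "$value" =~ ^[0-9]+$ ]]; then
    echo "pr_number must contain digits only" >&2
    return 1
  fi

  printf '%s\n' "$value"
}
```

Validating the input once at the start of the job means later steps can treat it as a plain integer.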
## Pass/Fail Decision
Assessment: **PASS** for deterministic triggering and contract enforcement.
- QA Status: **PASS with caveats**
- Reason: All requested static validations pass and the scoped workflow logic changes satisfy the audit requirements.
## B) Artifact and image resolution hardening
Validated in `.github/workflows/security-pr.yml`:
- Artifact is mandatory in `workflow_run`/`workflow_dispatch` artifact path; failures are explicit (`api_error`/`not_found`):
- `.github/workflows/security-pr.yml:159`
- `.github/workflows/security-pr.yml:185`
- `.github/workflows/security-pr.yml:196`
- `.github/workflows/security-pr.yml:214`
- `.github/workflows/security-pr.yml:225`
- Docker image load hardened with:
- tar readability check
- `manifest.json` multi-tag parsing (`RepoTags[]`)
- fallback to `Loaded image ID`
- deterministic alias `charon:artifact`
- `.github/workflows/security-pr.yml:255`
- `.github/workflows/security-pr.yml:261`
- `.github/workflows/security-pr.yml:267`
- `.github/workflows/security-pr.yml:273`
- `.github/workflows/security-pr.yml:282`
- `.github/workflows/security-pr.yml:295`
- `.github/workflows/security-pr.yml:300`
- Extraction consumes resolved alias output rather than reconstructed tag:
- `.github/workflows/security-pr.yml:333`
- `.github/workflows/security-pr.yml:342`
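One part of the image-load hardening above, falling back to the `Loaded image ID` when no tag is reported, can be sketched as follows. This is an illustrative helper rather than the workflow's code: `resolve_loaded_ref` is a hypothetical name, and the sketch covers only parsing of `docker load` output, not the `manifest.json` `RepoTags[]` parsing.

```bash
#!/usr/bin/env bash
# Hypothetical helper: pick an image reference out of `docker load` output.
# Prefers the first "Loaded image: <repo:tag>" line and falls back to
# "Loaded image ID: <sha256:...>" when the archive carries no tag.
resolve_loaded_ref() {
  local load_output="$1" ref

  ref="$(printf '%s\n' "$load_output" | sed -n 's/^Loaded image: //p' | head -n 1)"
  if [[ -z "$ref" ]]; then
    ref="$(printf '%s\n' "$load_output" | sed -n 's/^Loaded image ID: //p' | head -n 1)"
  fi

  if [[ -z "$ref" ]]; then
    echo "Could not resolve an image reference from docker load output" >&2
    return 1
  fi

  printf '%s\n' "$ref"
}
```

The resolved reference could then be given the deterministic alias, for example with `docker tag "$ref" charon:artifact`, so later steps use the alias instead of reconstructing a tag.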
## Residual Risks
Assessment: **PASS** for deterministic artifact/image selection and prior mismatch risk mitigation.
1. Fallback integrity uses checksum file from the same release origin as the tarball.
- Impact: If release origin is compromised, checksum verification alone may not detect tampering.
- Suggested hardening: verify signed release metadata or verify Syft artifact signature (Cosign/GitHub attestations) in fallback path.
## C) Security hardening
Validated in `.github/workflows/security-pr.yml`:
- Least-privilege job permissions:
- `.github/workflows/security-pr.yml:40`
- `.github/workflows/security-pr.yml:41`
- `.github/workflows/security-pr.yml:42`
- `.github/workflows/security-pr.yml:43`
- Pinned action SHAs maintained for checkout/download/upload/CodeQL SARIF upload/Trivy action usage:
- `.github/workflows/security-pr.yml:48`
- `.github/workflows/security-pr.yml:243`
- `.github/workflows/security-pr.yml:365`
- `.github/workflows/security-pr.yml:388`
- `.github/workflows/security-pr.yml:397`
- `.github/workflows/security-pr.yml:408`
2. Runtime behavior is not fully proven by local static checks.
- Impact: Dispatch and SBOM behavior still require a real GitHub Actions run to prove end-to-end execution.
Assessment: **PASS** for workflow-level security hardening within scope.
## Remote Execution Limitation and Manual Verification
## DoD Mapping for Workflow-Only Change
I did not execute remote nightly runs for this exact local diff in this audit. Local `actionlint` and source inspection were performed. To validate end-to-end behavior on GitHub Actions, run:
Executed:
- `actionlint` scoped check: **Yes (PASS)**
- Full pre-commit: **Yes (PASS)**
- Workflow-relevant security manual checks (CodeQL parity/findings, gitleaks): **Yes (2 PASS, 1 FAIL out-of-scope debt)**
- Local patch report artifacts: **Yes (generated)**
```bash
cd /projects/Charon
N/A for this scope:
- Playwright E2E feature validation for app behavior: **N/A** (no app/runtime code changes)
- Backend/frontend unit coverage gates: **N/A** (no backend/frontend source modifications in audited fix)
- GORM check-mode gate: **N/A** (no model/database/GORM changes)
- Trivy app binary/image scan execution for changed runtime artifact: **N/A locally for this audit** (workflow logic audited; no image/runtime code delta in this fix)
# 1) Syntax/lint (already run locally)
actionlint .github/workflows/nightly-build.yml
## Conclusion
The implemented fix in `.github/workflows/security-pr.yml` meets the requested goals for deterministic event handling, robust artifact/image resolution, and workflow security hardening. Required validation commands were executed and passed (`actionlint`, `pre-commit --all-files`), and additional feasible security checks were run. One repository-wide gitleaks debt remains and should be remediated separately from this workflow fix.
# 2) Trigger nightly workflow (manual)
gh workflow run nightly-build.yml --ref nightly -f reason="qa-nightly-audit" -f skip_tests=true
# 3) Inspect latest nightly run
gh run list --workflow "Nightly Build & Package" --branch nightly --limit 1
gh run view <run-id> --log
# 4) Confirm no security-pr dispatch error in nightly logs
# Expectation: no "Missing required input 'pr_number' not provided"
# 5) Confirm security-pr contract still enforced
gh workflow run security-pr.yml --ref nightly
# Expectation: dispatch is rejected because the required input pr_number is missing
# 6) Positive contract check with explicit pr_number
gh workflow run security-pr.yml --ref nightly -f pr_number=<valid-pr-number>
```
Expected outcomes:
- Nightly run completes dispatch phase without `pr_number` input failure.
- SBOM generation succeeds via primary or fallback path and uploads `sbom-nightly.json`.
- `security-pr.yml` continues enforcing required `pr_number` for manual dispatch.