# Security Scan (PR) Deterministic Artifact Policy - Supervisor Remediation Plan

## 1. Introduction

### Overview

`Security Scan (PR)` failed because `.github/workflows/security-pr.yml` loaded an artifact image tag (`pr-718-385081f`) and later attempted extraction with a different synthesized tag (`pr-718`).

Supervisor conflict resolution in this plan selects Option A: `workflow_run` artifact handling is restricted to upstream `pull_request` events only.

### Root-Cause Clarity (Preserved)

The failure was not a Docker load failure. It was a source-of-truth violation in image selection:

1. Artifact load path succeeded.
2. Extraction path reconstructed an alternate reference.
3. Alternate reference did not exist, causing `docker create ... not found`.

This plan keeps scope strictly on `.github/workflows/security-pr.yml`.

### Objectives

1. Remove all ambiguous behavior for artifact absence on `workflow_run`.
2. Remove `workflow_run` support for upstream `push` events to align with PR artifact naming contract (`pr-image-`).
3. Codify one deterministic `workflow_dispatch` policy in SHALL form.
4. Harden image selection so it is not brittle on `RepoTags[0]`.
5. Add CI security hardening requirements for permissions and trust boundary.
6. Expand validation matrix to include `pull_request` and negative paths.

---

## 2. Research Findings

### 2.1 Failure Evidence

Source: `.github/logs/ci_failure.log`

Observed facts:

1. Artifact `pr-image-718` was found and downloaded from run `22164807859`.
2. `docker load` reported: `Loaded image: ghcr.io/wikid82/charon:pr-718-385081f`.
3. Extraction attempted: `docker create ghcr.io/wikid82/charon:pr-718`.
4. Docker reported: `... pr-718: not found`.

### 2.2 Producer Contract

Source: `.github/workflows/docker-build.yml`

Producer emits immutable PR tags with SHA suffix (`pr--`). Consumer must consume artifact metadata/load output, not reconstruct mutable tags.
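To illustrate what "consume load output" can mean in practice, the following is a minimal sketch of a helper that reads `docker load` output and returns the reference Docker actually reported, instead of rebuilding a tag from the PR number. The function name `resolve_loaded_ref` is hypothetical and does not exist in the current workflow. The sketch relies only on the two line formats `docker load` prints (`Loaded image: <ref>` for tagged images, `Loaded image ID: <id>` for untagged ones).

```shell
#!/bin/sh
# Hypothetical helper (assumption: not part of security-pr.yml today).
# Reads `docker load` output on stdin and prints the first loaded reference.
# Prefers a tagged reference; falls back to the image ID line if no tag is reported.
resolve_loaded_ref() {
  load_output=$(cat)

  # Tagged images are reported as "Loaded image: <ref>".
  ref=$(printf '%s\n' "$load_output" | sed -n 's/^Loaded image: //p' | head -n 1)

  # Untagged images are reported as "Loaded image ID: sha256:<digest>".
  if [ -z "$ref" ]; then
    ref=$(printf '%s\n' "$load_output" | sed -n 's/^Loaded image ID: //p' | head -n 1)
  fi

  # Nothing usable was reported: fail instead of guessing a tag.
  if [ -z "$ref" ]; then
    echo "error: no image reference found in docker load output" >&2
    return 1
  fi

  printf '%s\n' "$ref"
}
```

A step could pipe `docker load -i charon-pr-image.tar` into this function and retag the printed reference, so that extraction never depends on a synthesized name. This covers only the load-output side. Section 3.3 specifies a stricter resolution order that also validates `manifest.json`.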
### 2.3 Current Consumer Gaps

Source: `.github/workflows/security-pr.yml`

Current consumer contains ambiguous policy points:

1. `workflow_run` artifact absence behavior can be interpreted as skip or fail.
2. `workflow_dispatch` policy is not single-path deterministic.
3. Image identification relies on single `RepoTags[0]` assumption.
4. Trust boundary and permission minimization are not explicitly codified as requirements.

---

## 3. Technical Specifications

### 3.1 Deterministic EARS Requirements (Blocking)

1. WHEN `security-pr.yml` is triggered by `workflow_run` with `conclusion == success` and upstream event `pull_request`, THE SYSTEM SHALL require the expected image artifact to exist and SHALL hard fail the job if the artifact is missing.
2. WHEN `security-pr.yml` is triggered by `workflow_run` and artifact lookup fails, THEN THE SYSTEM SHALL exit non-zero with a diagnostic that includes: upstream run id, expected artifact name, and reason category (`not found` or `api/error`).
3. WHEN `security-pr.yml` is triggered by `workflow_run` and upstream event is not `pull_request`, THEN THE SYSTEM SHALL hard fail immediately with reason category `unsupported_upstream_event` and SHALL NOT attempt artifact lookup, image load, or extraction.
4. WHEN `security-pr.yml` is triggered by `workflow_dispatch`, THE SYSTEM SHALL require `inputs.pr_number` and SHALL hard fail immediately if input is empty.
5. WHEN `security-pr.yml` is triggered by `workflow_dispatch` with valid `inputs.pr_number`, THE SYSTEM SHALL resolve artifact `pr-image-` from the latest successful `docker-build.yml` run for that PR and SHALL hard fail if artifact resolution or download fails.
6. WHEN artifact image is loaded, THE SYSTEM SHALL derive a canonical local image alias (`charon:artifact`) from validated load result and SHALL use only that alias for `docker create` in artifact-based paths.
7. WHEN artifact metadata parsing is required, THE SYSTEM SHALL NOT depend only on `RepoTags[0]`; it SHALL validate all available repo tags and SHALL support fallback selection using docker load image ID when tags are absent/corrupt.
8. IF no valid tag and no valid load image ID can be resolved, THEN THE SYSTEM SHALL hard fail before extraction.
9. WHEN event is `pull_request` or `push`, THE SYSTEM SHALL build and use `charon:local` only and SHALL NOT execute artifact lookup/load logic.

### 3.2 Deterministic Policy Decisions

#### Policy A: `workflow_run` Missing Artifact

Decision: hard fail only.

No skip behavior is allowed for upstream-success `workflow_run`.

#### Policy A1: `workflow_run` Upstream Event Contract

Decision: upstream event MUST be `pull_request`.

If upstream event is `push` or any non-PR event, fail immediately with `unsupported_upstream_event`; no artifact path execution is allowed.

#### Policy B: `workflow_dispatch`

Decision: artifact-only manual replay.

No local-build fallback is allowed for `workflow_dispatch`. Required input is `pr_number`; missing input is immediate hard fail.

### 3.3 Image Selection Hardening Contract

For step `Load Docker image` in `.github/workflows/security-pr.yml`:

1. Validate artifact file exists and is readable tar.
2. Parse `manifest.json` and iterate all candidate tags under `RepoTags[]`.
3. Run `docker load` and capture structured output.
4. Resolve source image by deterministic priority:
   - First valid tag from `RepoTags[]` that exists locally after load.
   - Else image ID extracted from `docker load` output (if present).
   - Else fail.
5. Retag resolved source to `charon:artifact`.
6. Emit outputs:
   - `image_ref=charon:artifact`
   - `source_image_ref=`
   - `source_resolution_mode=manifest_tag|load_image_id`

### 3.4 CI Security Hardening Requirements

For job `security-scan` in `.github/workflows/security-pr.yml`:

1. THE SYSTEM SHALL enforce least-privilege permissions by default:
   - `contents: read`
   - `actions: read`
   - `security-events: write`
   - No additional write scopes unless explicitly required.
2. THE SYSTEM SHALL restrict `pull-requests: write` usage to only steps that require PR annotations/comments. If no such step exists, this permission SHALL be removed.
3. THE SYSTEM SHALL enforce workflow_run trust boundary guards:
   - Upstream workflow name must match expected producer.
   - Upstream conclusion must be `success`.
   - Upstream event must be `pull_request` only.
   - Upstream head repository must equal `${{ github.repository }}` (same-repo trust boundary), otherwise hard fail.
4. THE SYSTEM SHALL NOT use untrusted `workflow_run` payload values to build shell commands without validation and quoting.

### 3.5 Step-Level Scope in `security-pr.yml`

Targeted steps:

1. `Extract PR number from workflow_run`
2. `Validate workflow_run upstream event contract`
3. `Check for PR image artifact`
4. `Skip if no artifact` (to be converted to deterministic fail paths for `workflow_run` and `workflow_dispatch`)
5. `Load Docker image`
6. `Extract charon binary from container`

### 3.6 Event Data Flow (Deterministic)

```text
pull_request/push
  -> Build Docker image (Local)
  -> image_ref=charon:local
  -> Extract /app/charon
  -> Trivy scan

workflow_run (upstream success only)
  -> Assert upstream event == pull_request (hard fail if false)
  -> Require artifact exists (hard fail if missing)
  -> Load/validate image
  -> image_ref=charon:artifact
  -> Extract /app/charon
  -> Trivy scan

workflow_dispatch
  -> Require pr_number input (hard fail if missing)
  -> Resolve pr-image- artifact (hard fail if missing)
  -> Load/validate image
  -> image_ref=charon:artifact
  -> Extract /app/charon
  -> Trivy scan
```

### 3.7 Error Handling Matrix

| Step | Condition | Required Behavior |
|---|---|---|
| Validate workflow_run upstream event contract | `workflow_run` upstream event is not `pull_request` | Hard fail with `unsupported_upstream_event`; stop before artifact lookup |
| Check for PR image artifact | `workflow_run` upstream success but artifact missing | Hard fail with run id + artifact name |
| Extract PR number from workflow_run | `workflow_dispatch` and empty `inputs.pr_number` | Hard fail with input requirement message |
| Load Docker image | Missing/corrupt `charon-pr-image.tar` | Hard fail before `docker load` |
| Load Docker image | Missing/corrupt `manifest.json` | Attempt load-image-id fallback; fail if unresolved |
| Load Docker image | No valid `RepoTags[]` and no load image id | Hard fail |
| Extract charon binary from container | Empty/invalid `image_ref` | Hard fail before `docker create` |
| Extract charon binary from container | `/app/charon` missing | Hard fail with chosen image reference |

### 3.8 API/DB Changes

No backend API, frontend, or database schema changes.

---

## 4. Implementation Plan

### Phase 1: Playwright Impact Check

1. Mark Playwright scope as N/A because this change is workflow-only.
2. Record N/A rationale in PR description.
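As background for the workflow edits in this plan, the `workflow_run` guards from Section 3.1 (requirement 3) and Section 3.4 (requirement 3) can be sketched as a single shell check. This is an illustration under assumptions, not the workflow's actual step: the function name, argument order, and the reason strings `upstream_not_successful` and `untrusted_head_repository` are hypothetical. Only `unsupported_upstream_event` comes from this plan. The upstream workflow-name guard is omitted for brevity.

```shell
#!/bin/sh
# Hypothetical guard (assumption: names and argument order are illustrative).
# Arguments: upstream event, upstream conclusion, upstream head repository,
# current repository. Prints a reason category on stderr and returns non-zero
# on the first violated guard, so no artifact logic runs afterwards.
check_workflow_run_contract() {
  upstream_event=$1
  upstream_conclusion=$2
  head_repo=$3
  this_repo=$4

  # Upstream conclusion must be success.
  if [ "$upstream_conclusion" != "success" ]; then
    echo "upstream_not_successful" >&2
    return 1
  fi

  # Upstream event must be pull_request only (Policy A1).
  if [ "$upstream_event" != "pull_request" ]; then
    echo "unsupported_upstream_event" >&2
    return 1
  fi

  # Same-repo trust boundary: head repository must equal this repository.
  if [ "$head_repo" != "$this_repo" ]; then
    echo "untrusted_head_repository" >&2
    return 1
  fi

  return 0
}
```

In a workflow step, the four values would typically be passed through `env:` from `github.event.workflow_run.event`, `github.event.workflow_run.conclusion`, `github.event.workflow_run.head_repository.full_name`, and `github.repository`, then referenced as quoted shell variables. That keeps payload values out of the script text, in line with Section 3.4 requirement 4.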
### Phase 2: Deterministic Event Policies

File: `.github/workflows/security-pr.yml`

1. Convert ambiguous skip/fail logic to hard-fail policy for `workflow_run` missing artifact after upstream success.
2. Enforce deterministic `workflow_dispatch` policy:
   - Required `pr_number` input.
   - Artifact-only replay path.
   - No local fallback.
3. Enforce PR-only `workflow_run` event contract:
   - Upstream event must be `pull_request`.
   - Upstream `push` or any non-PR event hard fails with `unsupported_upstream_event`.

### Phase 3: Image Selection Hardening

File: `.github/workflows/security-pr.yml`

1. Harden `Load Docker image` with manifest validation and multi-tag handling.
2. Add fallback resolution via docker load image ID.
3. Emit explicit outputs for traceability (`source_resolution_mode`).
4. Ensure extraction consumes only selected alias (`charon:artifact`).

### Phase 4: CI Security Hardening

File: `.github/workflows/security-pr.yml`

1. Reduce job permissions to least privilege.
2. Remove/conditionalize `pull-requests: write` if not required.
3. Add workflow_run trust-boundary guard conditions and explicit fail messages.

### Phase 5: Validation

1. `pre-commit run actionlint --files .github/workflows/security-pr.yml`
2. Simulate deterministic paths (or equivalent CI replay) for all matrix cases.
3. Verify logs show chosen `source_image_ref` and `source_resolution_mode`.

---

## 5. Validation Matrix

| ID | Trigger Path | Scenario | Expected Result |
|---|---|---|---|
| V1 | `workflow_run` | Upstream success + artifact present | Pass, uses `charon:artifact` |
| V2 | `workflow_run` | Upstream success + artifact missing | Hard fail (non-zero) |
| V3 | `workflow_run` | Upstream success + artifact manifest corrupted | Hard fail after validation/fallback attempt |
| V4 | `workflow_run` | Upstream success + upstream event `push` | Hard fail with `unsupported_upstream_event` |
| V5 | `pull_request` | Direct PR trigger | Pass, uses `charon:local`, no artifact lookup |
| V6 | `push` | Direct push trigger | Pass, uses `charon:local`, no artifact lookup |
| V7 | `workflow_dispatch` | Missing `pr_number` input | Hard fail immediately |
| V8 | `workflow_dispatch` | Valid `pr_number` + artifact exists | Pass, uses `charon:artifact` |
| V9 | `workflow_dispatch` | Valid `pr_number` + artifact missing | Hard fail |
| V10 | `workflow_run` | Upstream from untrusted repository context | Hard fail by trust-boundary guard |

---

## 6. Acceptance Criteria

1. Plan states unambiguous hard-fail behavior for missing artifact on `workflow_run` after upstream `pull_request` success.
2. Plan states `workflow_run` event contract is PR-only and that upstream `push` is a deterministic hard-fail contract violation.
3. Plan states one deterministic `workflow_dispatch` policy in SHALL terms: required `pr_number`, artifact-only path, no local fallback.
4. Plan defines robust image resolution beyond `RepoTags[0]`, including load-image-id fallback and deterministic aliasing.
5. Plan includes least-privilege permissions and explicit workflow_run trust boundary constraints.
6. Plan includes validation coverage for `pull_request` and direct `push` local paths plus negative paths: unsupported upstream event, missing dispatch input, missing artifact, corrupted/missing manifest.
7. Root cause remains explicit: image-reference mismatch inside `.github/workflows/security-pr.yml` after successful artifact load.

---

## 7. Risks and Mitigations

| Risk | Impact | Mitigation |
|---|---|---|
| Overly strict dispatch policy blocks ad-hoc scans | Medium | Document explicit manual replay contract in workflow description |
| PR-only workflow_run contract fails upstream push-triggered runs | Medium | Intentional contract enforcement; document `unsupported_upstream_event` and route push scans through direct push path |
| Manifest parsing edge cases | Medium | Multi-source resolver with load-image-id fallback |
| Permission tightening breaks optional PR annotations | Low | Make PR-write permission step-scoped only if needed |
| Trust-boundary guards reject valid internal events | Medium | Add clear diagnostics and test cases V1/V10 |

---

## 8. PR Slicing Strategy

### Decision

Single PR.

### Trigger Reasons

1. Change is isolated to one workflow (`security-pr.yml`).
2. Deterministic policy + hardening are tightly coupled and safest together.
3. Split PRs would create temporary policy inconsistency.

### Ordered Slice

#### PR-1: Deterministic Policy and Security Hardening for `security-pr.yml`

Scope:

1. Deterministic missing-artifact handling (`workflow_run` hard fail).
2. Deterministic `workflow_dispatch` artifact-only policy.
3. Hardened image resolution and aliasing.
4. Least-privilege + trust-boundary constraints.
5. Validation matrix execution evidence.

Files:

1. `.github/workflows/security-pr.yml`
2. `docs/plans/current_spec.md`

Dependencies:

1. `.github/workflows/docker-build.yml` artifact naming contract unchanged.

Validation Gates:

1. actionlint passes.
2. Validation matrix V1-V10 results captured.
3. No regression to `ghcr.io/...:pr- not found` pattern.

Rollback / Contingency:

1. Revert PR-1 if trust-boundary guards block legitimate same-repo runs.
2. Keep hard-fail semantics; adjust guard predicate, not policy.

---

## 9. Handoff

After approval, implementation handoff to Supervisor SHALL include:

1. Exact step-level edits required in `.github/workflows/security-pr.yml`.
2. Proof logs for each pass/fail matrix case.
3. Confirmation that no files outside plan scope were required.
4. Explicit evidence that the artifact path no longer performs GHCR PR tag reconstruction.
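
One possible way to gather evidence that tag reconstruction is gone is a text scan of the workflow file. The sketch below is an assumption about how reconstruction would appear (a literal `ghcr.io/...` reference carrying a `pr-` tag); the function name is hypothetical, and a clean scan supplements the validation matrix logs rather than replacing them.

```shell
#!/bin/sh
# Hypothetical evidence check (assumption: not part of the plan's tooling).
# Prints, with line numbers, every line of the given file that spells out a
# GHCR reference with a "pr-" tag. Returns 1 if any is found, 0 if none,
# and 2 if the file cannot be read.
find_pr_tag_reconstruction() {
  workflow_file=$1

  if [ ! -r "$workflow_file" ]; then
    echo "error: cannot read $workflow_file" >&2
    return 2
  fi

  # grep exits 0 when at least one line matches, which is a finding here.
  if grep -n 'ghcr\.io/.*:pr-' "$workflow_file"; then
    return 1
  fi

  return 0
}
```

Running `find_pr_tag_reconstruction .github/workflows/security-pr.yml` and attaching its output and exit status to the handoff would give a reviewable record. The pattern is deliberately narrow and would miss a tag assembled from separate variables.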