14 KiB
Security Scan (PR) Deterministic Artifact Policy - Supervisor Remediation Plan
1. Introduction
Overview
Security Scan (PR) failed because .github/workflows/security-pr.yml loaded
an artifact image tag (pr-718-385081f) and later attempted extraction with a
different synthesized tag (pr-718).
Supervisor conflict resolution in this plan selects Option A:
workflow_run artifact handling is restricted to upstream
pull_request events only.
Root-Cause Clarity (Preserved)
The failure was not a Docker load failure. It was a source-of-truth violation in image selection:
- Artifact load path succeeded.
- Extraction path reconstructed an alternate reference.
- Alternate reference did not exist, causing
docker create ... not found.
This plan keeps scope strictly on .github/workflows/security-pr.yml.
Objectives
- Remove all ambiguous behavior for artifact absence on
workflow_run. - Remove
workflow_runsupport for upstreampushevents to align with PR artifact naming contract (pr-image-<pr_number>). - Codify one deterministic
workflow_dispatchpolicy in SHALL form. - Harden image selection so it is not brittle on
RepoTags[0]. - Add CI security hardening requirements for permissions and trust boundary.
- Expand validation matrix to include
pull_requestand negative paths.
2. Research Findings
2.1 Failure Evidence
Source: .github/logs/ci_failure.log
Observed facts:
- Artifact
pr-image-718was found and downloaded from run22164807859. docker loadreported:Loaded image: ghcr.io/wikid82/charon:pr-718-385081f.- Extraction attempted:
docker create ghcr.io/wikid82/charon:pr-718. - Docker reported:
... pr-718: not found.
2.2 Producer Contract
Source: .github/workflows/docker-build.yml
Producer emits immutable PR tags with SHA suffix (pr-<num>-<sha>). Consumer
must consume artifact metadata/load output, not reconstruct mutable tags.
2.3 Current Consumer Gaps
Source: .github/workflows/security-pr.yml
Current consumer contains ambiguous policy points:
workflow_runartifact absence behavior can be interpreted as skip or fail.workflow_dispatchpolicy is not single-path deterministic.- Image identification relies on single
RepoTags[0]assumption. - Trust boundary and permission minimization are not explicitly codified as requirements.
3. Technical Specifications
3.1 Deterministic EARS Requirements (Blocking)
-
WHEN
security-pr.ymlis triggered byworkflow_runwithconclusion == successand upstream eventpull_request, THE SYSTEM SHALL require the expected image artifact to exist and SHALL hard fail the job if the artifact is missing. -
WHEN
security-pr.ymlis triggered byworkflow_runand artifact lookup fails, THEN THE SYSTEM SHALL exit non-zero with a diagnostic that includes: upstream run id, expected artifact name, and reason category (not foundorapi/error). -
WHEN
security-pr.ymlis triggered byworkflow_runand upstream event is notpull_request, THEN THE SYSTEM SHALL hard fail immediately with reason categoryunsupported_upstream_eventand SHALL NOT attempt artifact lookup, image load, or extraction. -
WHEN
security-pr.ymlis triggered byworkflow_dispatch, THE SYSTEM SHALL requireinputs.pr_numberand SHALL hard fail immediately if input is empty. -
WHEN
security-pr.ymlis triggered byworkflow_dispatchwith validinputs.pr_number, THE SYSTEM SHALL resolve artifactpr-image-<pr_number>from the latest successfuldocker-build.ymlrun for that PR and SHALL hard fail if artifact resolution or download fails. -
WHEN artifact image is loaded, THE SYSTEM SHALL derive a canonical local image alias (
charon:artifact) from validated load result and SHALL use only that alias fordocker createin artifact-based paths. -
WHEN artifact metadata parsing is required, THE SYSTEM SHALL NOT depend only on
RepoTags[0]; it SHALL validate all available repo tags and SHALL support fallback selection using docker load image ID when tags are absent/corrupt. -
IF no valid tag and no valid load image ID can be resolved, THEN THE SYSTEM SHALL hard fail before extraction.
-
WHEN event is
pull_requestorpush, THE SYSTEM SHALL build and usecharon:localonly and SHALL NOT execute artifact lookup/load logic.
3.2 Deterministic Policy Decisions
Policy A: workflow_run Missing Artifact
Decision: hard fail only.
No skip behavior is allowed for upstream-success workflow_run.
Policy A1: workflow_run Upstream Event Contract
Decision: upstream event MUST be pull_request.
If upstream event is push or any non-PR event, fail immediately with
unsupported_upstream_event; no artifact path execution is allowed.
Policy B: workflow_dispatch
Decision: artifact-only manual replay.
No local-build fallback is allowed for workflow_dispatch. Required input is
pr_number; missing input is immediate hard fail.
3.3 Image Selection Hardening Contract
For step Load Docker image in .github/workflows/security-pr.yml:
- Validate artifact file exists and is readable tar.
- Parse
manifest.jsonand iterate all candidate tags underRepoTags[]. - Run
docker loadand capture structured output. - Resolve source image by deterministic priority:
- First valid tag from
RepoTags[]that exists locally after load. - Else image ID extracted from
docker loadoutput (if present). - Else fail.
- First valid tag from
- Retag resolved source to
charon:artifact. - Emit outputs:
image_ref=charon:artifactsource_image_ref=<resolved tag or image id>source_resolution_mode=manifest_tag|load_image_id
3.4 CI Security Hardening Requirements
For job security-scan in .github/workflows/security-pr.yml:
-
THE SYSTEM SHALL enforce least-privilege permissions by default:
contents: readactions: readsecurity-events: write- No additional write scopes unless explicitly required.
-
THE SYSTEM SHALL restrict
pull-requests: writeusage to only steps that require PR annotations/comments. If no such step exists, this permission SHALL be removed. -
THE SYSTEM SHALL enforce workflow_run trust boundary guards:
- Upstream workflow name must match expected producer.
- Upstream conclusion must be
success. - Upstream event must be
pull_requestonly. - Upstream head repository must equal
${{ github.repository }}(same-repo trust boundary), otherwise hard fail.
-
THE SYSTEM SHALL NOT use untrusted
workflow_runpayload values to build shell commands without validation and quoting.
3.5 Step-Level Scope in security-pr.yml
Targeted steps:
Extract PR number from workflow_runValidate workflow_run upstream event contractCheck for PR image artifactSkip if no artifact(to be converted to deterministic fail paths forworkflow_runandworkflow_dispatch)Load Docker imageExtract charon binary from container
3.6 Event Data Flow (Deterministic)
pull_request/push
-> Build Docker image (Local)
-> image_ref=charon:local
-> Extract /app/charon
-> Trivy scan
workflow_run (upstream success only)
-> Assert upstream event == pull_request (hard fail if false)
-> Require artifact exists (hard fail if missing)
-> Load/validate image
-> image_ref=charon:artifact
-> Extract /app/charon
-> Trivy scan
workflow_dispatch
-> Require pr_number input (hard fail if missing)
-> Resolve pr-image-<pr_number> artifact (hard fail if missing)
-> Load/validate image
-> image_ref=charon:artifact
-> Extract /app/charon
-> Trivy scan
3.7 Error Handling Matrix
| Step | Condition | Required Behavior |
|---|---|---|
| Validate workflow_run upstream event contract | workflow_run upstream event is not pull_request |
Hard fail with unsupported_upstream_event; stop before artifact lookup |
| Check for PR image artifact | workflow_run upstream success but artifact missing |
Hard fail with run id + artifact name |
| Extract PR number from workflow_run | workflow_dispatch and empty inputs.pr_number |
Hard fail with input requirement message |
| Load Docker image | Missing/corrupt charon-pr-image.tar |
Hard fail before docker load |
| Load Docker image | Missing/corrupt manifest.json |
Attempt load-image-id fallback; fail if unresolved |
| Load Docker image | No valid RepoTags[] and no load image id |
Hard fail |
| Extract charon binary from container | Empty/invalid image_ref |
Hard fail before docker create |
| Extract charon binary from container | /app/charon missing |
Hard fail with chosen image reference |
3.8 API/DB Changes
No backend API, frontend, or database schema changes.
4. Implementation Plan
Phase 1: Playwright Impact Check
- Mark Playwright scope as N/A because this change is workflow-only.
- Record N/A rationale in PR description.
Phase 2: Deterministic Event Policies
File: .github/workflows/security-pr.yml
- Convert ambiguous skip/fail logic to hard-fail policy for
workflow_runmissing artifact after upstream success. - Enforce deterministic
workflow_dispatchpolicy:- Required
pr_numberinput. - Artifact-only replay path.
- No local fallback.
- Required
- Enforce PR-only
workflow_runevent contract:- Upstream event must be
pull_request. - Upstream
pushor any non-PR event hard fails withunsupported_upstream_event.
- Upstream event must be
Phase 3: Image Selection Hardening
File: .github/workflows/security-pr.yml
- Harden
Load Docker imagewith manifest validation and multi-tag handling. - Add fallback resolution via docker load image ID.
- Emit explicit outputs for traceability (
source_resolution_mode). - Ensure extraction consumes only selected alias (
charon:artifact).
Phase 4: CI Security Hardening
File: .github/workflows/security-pr.yml
- Reduce job permissions to least privilege.
- Remove/conditionalize
pull-requests: writeif not required. - Add workflow_run trust-boundary guard conditions and explicit fail messages.
Phase 5: Validation
pre-commit run actionlint --files .github/workflows/security-pr.yml- Simulate deterministic paths (or equivalent CI replay) for all matrix cases.
- Verify logs show chosen
source_image_refandsource_resolution_mode.
5. Validation Matrix
| ID | Trigger Path | Scenario | Expected Result |
|---|---|---|---|
| V1 | workflow_run |
Upstream success + artifact present | Pass, uses charon:artifact |
| V2 | workflow_run |
Upstream success + artifact missing | Hard fail (non-zero) |
| V3 | workflow_run |
Upstream success + artifact manifest corrupted | Hard fail after validation/fallback attempt |
| V4 | workflow_run |
Upstream success + upstream event push |
Hard fail with unsupported_upstream_event |
| V5 | pull_request |
Direct PR trigger | Pass, uses charon:local, no artifact lookup |
| V6 | push |
Direct push trigger | Pass, uses charon:local, no artifact lookup |
| V7 | workflow_dispatch |
Missing pr_number input |
Hard fail immediately |
| V8 | workflow_dispatch |
Valid pr_number + artifact exists |
Pass, uses charon:artifact |
| V9 | workflow_dispatch |
Valid pr_number + artifact missing |
Hard fail |
| V10 | workflow_run |
Upstream from untrusted repository context | Hard fail by trust-boundary guard |
6. Acceptance Criteria
- Plan states unambiguous hard-fail behavior for missing artifact on
workflow_runafter upstreampull_requestsuccess. - Plan states
workflow_runevent contract is PR-only and that upstreampushis a deterministic hard-fail contract violation. - Plan states one deterministic
workflow_dispatchpolicy in SHALL terms: requiredpr_number, artifact-only path, no local fallback. - Plan defines robust image resolution beyond
RepoTags[0], including load-image-id fallback and deterministic aliasing. - Plan includes least-privilege permissions and explicit workflow_run trust boundary constraints.
- Plan includes validation coverage for
pull_requestand directpushlocal paths plus negative paths: unsupported upstream event, missing dispatch input, missing artifact, corrupted/missing manifest. - Root cause remains explicit: image-reference mismatch inside
.github/workflows/security-pr.ymlafter successful artifact load.
7. Risks and Mitigations
| Risk | Impact | Mitigation |
|---|---|---|
| Overly strict dispatch policy blocks ad-hoc scans | Medium | Document explicit manual replay contract in workflow description |
| PR-only workflow_run contract fails upstream push-triggered runs | Medium | Intentional contract enforcement; document unsupported_upstream_event and route push scans through direct push path |
| Manifest parsing edge cases | Medium | Multi-source resolver with load-image-id fallback |
| Permission tightening breaks optional PR annotations | Low | Make PR-write permission step-scoped only if needed |
| Trust-boundary guards reject valid internal events | Medium | Add clear diagnostics and test cases V1/V10 |
8. PR Slicing Strategy
Decision
Single PR.
Trigger Reasons
- Change is isolated to one workflow (
security-pr.yml). - Deterministic policy + hardening are tightly coupled and safest together.
- Split PRs would create temporary policy inconsistency.
Ordered Slice
PR-1: Deterministic Policy and Security Hardening for security-pr.yml
Scope:
- Deterministic missing-artifact handling (
workflow_runhard fail). - Deterministic
workflow_dispatchartifact-only policy. - Hardened image resolution and aliasing.
- Least-privilege + trust-boundary constraints.
- Validation matrix execution evidence.
Files:
.github/workflows/security-pr.ymldocs/plans/current_spec.md
Dependencies:
.github/workflows/docker-build.ymlartifact naming contract unchanged.
Validation Gates:
- actionlint passes.
- Validation matrix V1-V10 results captured.
- No regression to
ghcr.io/...:pr-<num> not foundpattern.
Rollback / Contingency:
- Revert PR-1 if trust-boundary guards block legitimate same-repo runs.
- Keep hard-fail semantics; adjust guard predicate, not policy.
9. Handoff
After approval, implementation handoff to Supervisor SHALL include:
- Exact step-level edits required in
.github/workflows/security-pr.yml. - Proof logs for each failed/pass matrix case.
- Confirmation that no files outside plan scope were required.
- Require explicit evidence that artifact path no longer performs GHCR PR tag reconstruction.