CI Docker Build Failure Analysis & Fix Plan
Issue: Docker Build workflow failing on PR builds during image artifact save
Workflow: .github/workflows/docker-build.yml
Error: Error response from daemon: reference does not exist
Date: 2026-01-12
Status: Analysis Complete - Ready for Implementation
Executive Summary
The docker-build.yml workflow is failing at the "Save Docker Image as Artifact" step (lines 135-142) for PR builds. The root cause is a mismatch between the image name/tag format used by docker/build-push-action with load: true and the image reference manually constructed in the docker save command.
Impact: All PR builds fail at the artifact save step, preventing the verify-supply-chain-pr job from running.
Fix Complexity: Low - Single step modification to use the exact tag from metadata output instead of manually constructing it.
Root Cause Analysis
1. The Failing Step (Lines 135-142)
Location: .github/workflows/docker-build.yml, lines 135-142
- name: Save Docker Image as Artifact
  if: github.event_name == 'pull_request'
  run: |
    IMAGE_NAME=$(echo "${{ github.repository_owner }}/charon" | tr '[:upper:]' '[:lower:]')
    docker save ghcr.io/${IMAGE_NAME}:pr-${{ github.event.pull_request.number }} -o /tmp/charon-pr-image.tar
    ls -lh /tmp/charon-pr-image.tar
What Happens:
- Line 140: Normalizes repository owner name to lowercase (e.g., Wikid82 → wikid82)
- Line 141: Constructs the image reference manually: ghcr.io/${IMAGE_NAME}:pr-${PR_NUMBER}
- Line 141: Attempts to save the image using this manually constructed reference
The Problem: The manually constructed image reference assumes the Docker image was loaded with the exact format ghcr.io/wikid82/charon:pr-123, but when docker/build-push-action uses load: true, the actual tag format applied to the local image may differ.
2. The Build Step (Lines 111-123)
Location: .github/workflows/docker-build.yml, lines 111-123
- name: Build and push Docker image
  if: steps.skip.outputs.skip_build != 'true'
  id: build-and-push
  uses: docker/build-push-action@263435318d21b8e681c14492fe198d362a7d2c83 # v6
  with:
    context: .
    platforms: ${{ github.event_name == 'pull_request' && 'linux/amd64' || 'linux/amd64,linux/arm64' }}
    push: ${{ github.event_name != 'pull_request' }}
    load: ${{ github.event_name == 'pull_request' }}
    tags: ${{ steps.meta.outputs.tags }}
    labels: ${{ steps.meta.outputs.labels }}
    no-cache: true
    pull: true
    build-args: |
      VERSION=${{ steps.meta.outputs.version }}
      BUILD_DATE=${{ fromJSON(steps.meta.outputs.json).labels['org.opencontainers.image.created'] }}
      VCS_REF=${{ github.sha }}
      CADDY_IMAGE=${{ steps.caddy.outputs.image }}
Key Parameters for PR Builds:
- Line 117: push: false → Image is not pushed to the registry
- Line 118: load: true → Image is loaded into the local Docker daemon
- Line 119: tags: ${{ steps.meta.outputs.tags }} → Uses tags generated by the metadata action
Behavior with load: true:
- The image is built and loaded into the local Docker daemon
- Tags from steps.meta.outputs.tags are applied to the image
- For PR builds, this generates one tag: ghcr.io/wikid82/charon:pr-123
3. The Metadata Step (Lines 105-113)
Location: .github/workflows/docker-build.yml, lines 105-113
- name: Extract metadata (tags, labels)
  if: steps.skip.outputs.skip_build != 'true'
  id: meta
  uses: docker/metadata-action@c299e40c65443455700f0fdfc63efafe5b349051 # v5.10.0
  with:
    images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
    tags: |
      type=raw,value=latest,enable={{is_default_branch}}
      type=raw,value=dev,enable=${{ github.ref == 'refs/heads/development' }}
      type=raw,value=beta,enable=${{ github.ref == 'refs/heads/feature/beta-release' }}
      type=raw,value=pr-${{ github.event.pull_request.number }},enable=${{ github.event_name == 'pull_request' }}
      type=sha,format=short,enable=${{ github.event_name != 'pull_request' }}
For PR builds, only line 111 is enabled:
type=raw,value=pr-${{ github.event.pull_request.number }},enable=${{ github.event_name == 'pull_request' }}
This generates a single tag: ghcr.io/wikid82/charon:pr-123
Note: The IMAGE_NAME is already normalized to lowercase at lines 56-57:
- name: Normalize image name
  run: |
    IMAGE_NAME=$(echo "${{ env.IMAGE_NAME }}" | tr '[:upper:]' '[:lower:]')
    echo "IMAGE_NAME=${IMAGE_NAME}" >> $GITHUB_ENV
So the metadata action receives ghcr.io/wikid82/charon (lowercase) as input.
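The normalization above can be tried outside CI. The following is a minimal sketch, assuming a POSIX shell; the helper name `normalize_image_name` and the PR number 123 are illustrative and not part of the workflow.

```shell
# Lowercase an "owner/repo" string with tr, as the workflow step does.
# normalize_image_name is a hypothetical helper used only for illustration.
normalize_image_name() {
  printf '%s' "$1" | tr '[:upper:]' '[:lower:]'
}

# Example input taken from the error log; the PR number is illustrative.
IMAGE_NAME=$(normalize_image_name "Wikid82/charon")
echo "ghcr.io/${IMAGE_NAME}:pr-123"
```

Running the snippet prints the lowercase reference that the metadata action is expected to receive as its image name.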
4. The Critical Issue: Tag Mismatch
When docker/build-push-action uses load: true, the behavior is:
- ✅ Expected: Image is loaded with tags from steps.meta.outputs.tags → ghcr.io/wikid82/charon:pr-123
- ❌ Reality: The tag actually applied depends on Docker Buildx's internal behavior, and the save step never checks which tag ended up in the local daemon
The docker save command at line 141 tries to save:
ghcr.io/${IMAGE_NAME}:pr-${{ github.event.pull_request.number }}
But this manually reconstructs the tag instead of using the actual tag applied by docker/build-push-action.
Why This Fails:
- The docker save command requires an exact match of the image reference as it exists in the local Docker daemon
- If the image is loaded with a slightly different tag format, docker save throws: Error response from daemon: reference does not exist
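To make the exact-match requirement concrete, here is a rough sketch that treats the lookup as a whole-line string comparison. It assumes the local references are available one per line, for example from `docker images --format '{{.Repository}}:{{.Tag}}'`; the helper name `has_exact_reference` and the sample data are hypothetical, and the real daemon lookup also normalizes references, which this sketch ignores.

```shell
# Succeed only if the exact reference appears as a whole line on stdin.
# has_exact_reference is a hypothetical helper, not a Docker command.
has_exact_reference() {
  grep -Fxq -- "$1"
}

# Illustrative list of local images, not taken from the failing run.
local_images='ghcr.io/wikid82/charon:pr-123
caddy:2'

if printf '%s\n' "$local_images" | has_exact_reference "ghcr.io/wikid82/charon:pr-123"; then
  echo "reference exists"
fi

# A reference that differs only in case is treated as missing.
if ! printf '%s\n' "$local_images" | has_exact_reference "ghcr.io/Wikid82/charon:pr-123"; then
  echo "reference does not exist"
fi
```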
Evidence from Error Log:
Run IMAGE_NAME=$(echo "Wikid82/charon" | tr '[:upper:]' '[:lower:]')
Error response from daemon: reference does not exist
Error: Process completed with exit code 1.
This confirms the docker save command cannot find the image reference constructed at line 141.
5. Job Dependencies Analysis
Complete Workflow Structure:
build-and-push (lines 34-234)
├── Outputs: skip_build, digest
├── Steps:
│ ├── Build image (load=true for PRs)
│ ├── Save image artifact (❌ FAILS HERE at line 141)
│ └── Upload artifact (never reached)
│
test-image (lines 354-463)
├── needs: build-and-push
├── if: ... && github.event_name != 'pull_request'
└── (Not relevant for PRs)
│
trivy-pr-app-only (lines 465-493)
├── if: github.event_name == 'pull_request'
└── (Independent - builds its own image)
│
verify-supply-chain-pr (lines 495-722)
├── needs: build-and-push
├── if: github.event_name == 'pull_request' && needs.build-and-push.result == 'success'
├── Steps:
│ ├── ❌ Download artifact (artifact doesn't exist)
│ ├── ❌ Load image (cannot load non-existent artifact)
│ └── ❌ Scan image (cannot scan non-loaded image)
└── Currently skipped due to build-and-push failure
│
verify-supply-chain-pr-skipped (lines 724-754)
├── needs: build-and-push
└── if: github.event_name == 'pull_request' && needs.build-and-push.outputs.skip_build == 'true'
Dependency Chain Impact:
- ❌ build-and-push fails at line 141 (docker save)
- ❌ Artifact is never uploaded (lines 144-150 are skipped)
- ❌ verify-supply-chain-pr is skipped because build-and-push did not succeed, so the artifact download (line 517) never runs
- ❌ Supply chain verification never runs for PRs
6. Verification: Why Similar Patterns Work
Line 376 (in test-image job):
- name: Normalize image name
  run: |
    raw="${{ github.repository_owner }}/${{ github.event.repository.name }}"
    IMAGE_NAME=$(echo "$raw" | tr '[:upper:]' '[:lower:]')
    echo "IMAGE_NAME=${IMAGE_NAME}" >> $GITHUB_ENV
This job pulls from the registry (line 395):
- name: Pull Docker image
  run: docker pull ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ steps.tag.outputs.tag }}
✅ Works because: It pulls a pushed image from the registry, not a locally loaded one.
Line 516 (in verify-supply-chain-pr job):
- name: Normalize image name
  run: |
    IMAGE_NAME=$(echo "${{ github.repository_owner }}/charon" | tr '[:upper:]' '[:lower:]')
    echo "IMAGE_NAME=${IMAGE_NAME}" >> $GITHUB_ENV
✅ Would work if: The artifact existed. This job loads the image from the tar file, which preserves the exact tags.
Key Difference: The failing step tries to save an image before we know its exact tag, while the working patterns either:
- Pull from registry with a known tag
- Load from artifact with preserved tags
The Solution
Option 1: Use Metadata Output Tag (RECOMMENDED ✅)
Strategy: Extract the exact tag from steps.meta.outputs.tags and use it directly in docker save.
Why This Works:
- The docker/metadata-action generates the tags that docker/build-push-action actually applies to the image
- For PR builds, this is: ghcr.io/<owner>/charon:pr-<number> (normalized, lowercase)
- This is the exact tag that exists in the local Docker daemon after load: true
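The tag extraction behind Option 1 can be sketched in isolation. This assumes the metadata output is a newline-separated list of tags; `first_tag` and `META_TAGS` are hypothetical names standing in for the workflow's `steps.meta.outputs.tags`, and the `docker save` command is only echoed, not executed.

```shell
# Print the first non-empty line of a newline-separated tag list.
# first_tag is a hypothetical helper used only for illustration.
first_tag() {
  printf '%s\n' "$1" | sed '/^[[:space:]]*$/d' | head -n 1
}

# Stand-in for steps.meta.outputs.tags on a PR build (illustrative value).
META_TAGS='ghcr.io/wikid82/charon:pr-123'

IMAGE_TAG=$(first_tag "$META_TAGS")
echo "would run: docker save \"$IMAGE_TAG\" -o /tmp/charon-pr-image.tar"
```

In a workflow, one possible design is to pass the tag list to the script through an `env:` entry rather than inlining the expression in the shell text, which keeps the script independent of how the value is quoted.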
Rationale:
- Avoids manual tag reconstruction
- Uses the authoritative source of truth for image tags
- Eliminates assumption-based errors
Risk Level: Low - Read-only operation on existing step outputs
Option 2: Inspect Local Images (ALTERNATIVE)
Strategy: Use docker images to discover the actual tag before saving.
Why Not Recommended:
- Adds complexity
- Requires pattern matching or parsing
- Less reliable than using metadata output
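For comparison, the pattern matching that Option 2 would need might look like the sketch below. It assumes the local references are fed in one per line (for example from `docker images --format '{{.Repository}}:{{.Tag}}'`); `find_pr_tag` and the sample list are hypothetical.

```shell
# Print the first "repository:tag" line on stdin whose tag is exactly pr-<number>.
# find_pr_tag is a hypothetical helper; it runs in a subshell so its variable
# does not leak into the caller.
find_pr_tag() (
  pr_number="$1"
  grep -E -- ":pr-${pr_number}\$" | head -n 1
)

# Illustrative list of local images.
refs='ghcr.io/wikid82/charon:pr-1234
ghcr.io/wikid82/charon:pr-123
caddy:2'

printf '%s\n' "$refs" | find_pr_tag 123
```

The anchored pattern is needed so that pr-123 does not also match pr-1234, which illustrates the extra parsing care this option requires.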
Option 3: Override Tag for PRs (FALLBACK)
Strategy: Modify the build step to apply a deterministic local tag for PR builds.
Why Not Recommended:
- Requires more changes (build step + save step)
- Breaks consistency with existing tag patterns
- Downstream jobs expect registry-style tags
Recommended Fix: Option 1
Implementation
File: .github/workflows/docker-build.yml
Location: Lines 135-142 (Save Docker Image as Artifact step)
Before (Current - BROKEN)
- name: Save Docker Image as Artifact
  if: github.event_name == 'pull_request'
  run: |
    IMAGE_NAME=$(echo "${{ github.repository_owner }}/charon" | tr '[:upper:]' '[:lower:]')
    docker save ghcr.io/${IMAGE_NAME}:pr-${{ github.event.pull_request.number }} -o /tmp/charon-pr-image.tar
    ls -lh /tmp/charon-pr-image.tar
Issue: Manually constructs the image reference, which may not match the actual tag applied by docker/build-push-action.
After (FIXED - Concise Version)
- name: Save Docker Image as Artifact
  if: github.event_name == 'pull_request'
  run: |
    # Extract the first tag from metadata action (PR tag)
    IMAGE_TAG=$(echo "${{ steps.meta.outputs.tags }}" | head -n 1)
    echo "🔍 Detected image tag: ${IMAGE_TAG}"
    # Verify the image exists locally
    echo "📋 Available local images:"
    docker images --filter "reference=*charon*"
    # Save the image using the exact tag from metadata
    echo "💾 Saving image: ${IMAGE_TAG}"
    docker save "${IMAGE_TAG}" -o /tmp/charon-pr-image.tar
    # Verify the artifact was created
    echo "✅ Artifact created:"
    ls -lh /tmp/charon-pr-image.tar
After (FIXED - Defensive Version for Production)
- name: Save Docker Image as Artifact
  if: github.event_name == 'pull_request'
  run: |
    # Extract the first tag from metadata action (PR tag)
    IMAGE_TAG=$(echo "${{ steps.meta.outputs.tags }}" | head -n 1)
    if [[ -z "${IMAGE_TAG}" ]]; then
      echo "❌ ERROR: No image tag found in metadata output"
      echo "Metadata tags output:"
      echo "${{ steps.meta.outputs.tags }}"
      exit 1
    fi
    echo "🔍 Detected image tag: ${IMAGE_TAG}"
    # Verify the image exists locally
    if ! docker image inspect "${IMAGE_TAG}" >/dev/null 2>&1; then
      echo "❌ ERROR: Image ${IMAGE_TAG} not found locally"
      echo "📋 Available images:"
      docker images
      exit 1
    fi
    # Save the image using the exact tag from metadata
    echo "💾 Saving image: ${IMAGE_TAG}"
    docker save "${IMAGE_TAG}" -o /tmp/charon-pr-image.tar
    # Verify the artifact was created
    echo "✅ Artifact created:"
    ls -lh /tmp/charon-pr-image.tar
Key Changes:
- Extract exact tag: IMAGE_TAG=$(echo "${{ steps.meta.outputs.tags }}" | head -n 1)
  - Uses the first (and only) tag from metadata output
  - For PR builds: ghcr.io/wikid82/charon:pr-123
- Add debugging: docker images --filter "reference=*charon*"
  - Shows available images for troubleshooting
  - Helps diagnose tag mismatches in logs
  - Caveat: the * wildcard in a reference filter is not expected to match /, so a pattern such as */*/charon may be needed to list registry-qualified names like ghcr.io/wikid82/charon
- Use extracted tag: docker save "${IMAGE_TAG}" -o /tmp/charon-pr-image.tar
  - No manual reconstruction
  - Uses the same tag string that was passed to docker/build-push-action
- Defensive checks (production version only):
  - Verify IMAGE_TAG is not empty
  - Verify image exists before attempting save
  - Fail fast with clear error messages
Why This Works:
- ✅ The docker/metadata-action output is the authoritative source of tags
- ✅ These are the exact tags applied by docker/build-push-action
- ✅ No assumptions or manual reconstruction
- ✅ Works for any repository owner name (uppercase, lowercase, mixed case)
- ✅ Consistent with downstream jobs that expect the same tag format
Null Safety:
- If steps.meta.outputs.tags is empty (not expected on a normal PR build, but possible if the metadata step did not run, for example when skip_build is 'true'), IMAGE_TAG will be empty
- The defensive version explicitly checks for this and fails with a clear message
- The concise version will fail at docker save with an error about the missing image reference
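The empty-tag check can also be exercised on its own. The sketch below assumes a POSIX shell; `require_tag` is a hypothetical helper that mirrors the intent of the defensive version rather than reproducing it.

```shell
# Echo the tag if it contains at least one non-whitespace character;
# otherwise report on stderr and return non-zero.
# require_tag is a hypothetical helper used only for illustration.
require_tag() {
  case "$1" in
    *[![:space:]]*) printf '%s\n' "$1" ;;
    *) echo "ERROR: no image tag found in metadata output" >&2
       return 1 ;;
  esac
}

# Illustrative values: a populated tag and an empty one.
if tag=$(require_tag "ghcr.io/wikid82/charon:pr-123"); then
  echo "tag ok: $tag"
fi
if ! require_tag "" 2>/dev/null; then
  echo "empty tag rejected"
fi
```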
Side Effects & Related Updates
No Changes Needed ✅
The following steps/jobs already handle the image correctly and require no modifications:
- Upload Image Artifact (lines 144-150)
  - ✅ Uses the saved tar file from the previous step
  - ✅ No dependency on image tag format
- verify-supply-chain-pr job (lines 495-722)
  - ✅ Downloads and loads the tar file
  - ✅ References image using the same normalization logic
  - ✅ Will work correctly once artifact exists
- Load Docker Image step (lines 524-529)
  - ✅ Loads from tar file (preserves original tags)
  - ✅ No changes needed
Why No Downstream Changes Are Needed
When you save a Docker image to a tar file using docker save, the tar file contains:
- The image layers
- The image configuration
- The exact tags that were applied to the image
When you load the image using docker load -i charon-pr-image.tar, Docker restores:
- All image layers
- The image configuration
- The exact same tags that were saved
Example:
# Save with tag: ghcr.io/wikid82/charon:pr-123
docker save ghcr.io/wikid82/charon:pr-123 -o image.tar
# Load restores the exact same tag
docker load -i image.tar
# Image is now available as: ghcr.io/wikid82/charon:pr-123
docker images ghcr.io/wikid82/charon:pr-123
The verify-supply-chain-pr job references:
IMAGE_REF="ghcr.io/${{ env.IMAGE_NAME }}:pr-${{ github.event.pull_request.number }}"
This will match perfectly because:
- IMAGE_NAME is normalized the same way (lines 516-518)
- The PR number is the same
- The loaded image has the exact tag we saved
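One way to confirm the match in a job log is to compare the reference reported by docker load with the expected one. This is a sketch under the assumption that docker load prints lines of the form "Loaded image: <reference>"; `loaded_refs`, the sample output, and the expected value are illustrative and not taken from a real run.

```shell
# Extract references from `docker load` output lines of the form
#   Loaded image: <reference>
# loaded_refs is a hypothetical helper used only for illustration.
loaded_refs() {
  sed -n 's/^Loaded image: //p'
}

expected="ghcr.io/wikid82/charon:pr-123"   # illustrative value

# Sample text standing in for the output of: docker load -i charon-pr-image.tar
sample_output="Loaded image: ghcr.io/wikid82/charon:pr-123"

if printf '%s\n' "$sample_output" | loaded_refs | grep -Fxq -- "$expected"; then
  echo "loaded tag matches expected reference"
else
  echo "loaded tag differs from expected reference" >&2
fi
```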
Testing Plan
Phase 1: Local Verification (Recommended)
Before pushing to CI, verify the fix locally:
# 1. Build a PR-style image locally
docker build -t ghcr.io/wikid82/charon:pr-test .
# 2. Verify the image exists
docker images ghcr.io/wikid82/charon:pr-test
# 3. Save the image
docker save ghcr.io/wikid82/charon:pr-test -o /tmp/test-image.tar
# 4. Verify the tar was created
ls -lh /tmp/test-image.tar
# 5. Load the image in a clean environment
docker rmi ghcr.io/wikid82/charon:pr-test # Remove original
docker load -i /tmp/test-image.tar # Reload from tar
docker images ghcr.io/wikid82/charon:pr-test # Verify it's back
Expected Result: All steps succeed without "reference does not exist" errors.
Phase 2: CI Testing
- Apply the fix to .github/workflows/docker-build.yml (lines 135-142)
- Create a test PR on the feature/beta-release branch
- Verify the workflow execution:
  - ✅ build-and-push job completes successfully
  - ✅ "Save Docker Image as Artifact" step shows detected tag in logs
  - ✅ "Upload Image Artifact" step uploads the tar file
  - ✅ verify-supply-chain-pr job runs and downloads the artifact
  - ✅ "Load Docker Image" step loads the image successfully
  - ✅ SBOM generation and vulnerability scanning complete
Phase 3: Edge Cases
Test the following scenarios:
- Different repository owners (uppercase, lowercase, mixed case):
  - Wikid82/charon → wikid82/charon
  - TestUser/charon → testuser/charon
  - UPPERCASE/charon → uppercase/charon
- Multiple rapid commits to the same PR:
  - Verify no artifact conflicts
  - Verify each commit gets its own workflow run
- Skipped builds (chore commits):
  - Verify verify-supply-chain-pr-skipped runs correctly
  - Verify feedback comment is posted
- Different PR numbers:
  - Single digit (PR #5)
  - Double digit (PR #42)
  - Triple digit (PR #123)
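The owner and PR-number cases listed above can be enumerated quickly before running them in CI. The following is a small sketch; `build_ref` is a hypothetical helper, and the ghcr.io host and charon image name are taken from the examples in this document.

```shell
# Build the expected PR image reference for an owner and PR number.
# build_ref is a hypothetical helper; it runs in a subshell so its variable
# does not leak into the caller.
build_ref() (
  owner_lc=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  printf 'ghcr.io/%s/charon:pr-%s\n' "$owner_lc" "$2"
)

# Print the expected reference for every owner / PR number combination.
for owner in Wikid82 TestUser UPPERCASE; do
  for pr in 5 42 123; do
    build_ref "$owner" "$pr"
  done
done
```

Each printed line is the reference that the detected tag in the "Save Docker Image as Artifact" log would be expected to equal for that combination.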
Phase 4: Rollback Plan
If the fix causes issues:
- Immediate rollback: Revert the commit that applied this fix
- Temporary workaround: Disable artifact save/upload steps:
  if: github.event_name == 'pull_request' && false # Temporarily disabled
- Investigation: Check GitHub Actions logs for actual image tags:
  # Add this step before the save step
  - name: Debug Image Tags
    if: github.event_name == 'pull_request'
    run: |
      echo "Metadata tags:"
      echo "${{ steps.meta.outputs.tags }}"
      echo ""
      echo "Local images:"
      docker images
Success Criteria
Functional
- ✅ build-and-push job completes successfully for all PR builds
- ✅ Docker image artifact is saved and uploaded for all PR builds
- ✅ verify-supply-chain-pr job runs and downloads the artifact
- ✅ No "reference does not exist" errors in any step
- ✅ Supply chain verification completes for all PR builds
Observable Metrics
- 📊 Job Success Rate: 100% for build-and-push job on PRs
- 📦 Artifact Upload Rate: 100% for PR builds
- 🔒 Supply Chain Verification Rate: 100% for PR builds (excluding skipped)
- ⏱️ Build Time: No significant increase (<30 seconds for artifact save)
Quality
- 🔍 Clear logging of detected image tags
- 🛡️ Defensive error handling (fails fast with clear messages)
- 📝 Consistent with existing patterns in the workflow
Implementation Checklist
Pre-Implementation
- Analyze the root cause (line 141 in docker-build.yml)
- Identify the exact failing step and command
- Review job dependencies and downstream impacts
- Design the fix with before/after comparison
- Document testing plan and success criteria
Implementation
- Apply the fix to .github/workflows/docker-build.yml (lines 135-142)
- Choose between concise or defensive version (recommend defensive for production)
- Commit with message: fix(ci): use metadata tag for docker save in PR builds
- Push to feature/beta-release branch
Testing
- Create a test PR and verify workflow runs successfully
- Check GitHub Actions logs for "🔍 Detected image tag" output
- Verify artifact is uploaded (check Actions artifacts tab)
- Verify verify-supply-chain-pr job completes successfully
- Test edge cases (uppercase owner, different PR numbers)
- Monitor 2-3 additional PR builds for stability
Post-Implementation
- Update CHANGELOG.md with the fix
- Close any related GitHub issues
- Document lessons learned (if applicable)
- Monitor for regressions over next week
Appendix A: Error Analysis Summary
Error Signature
Run IMAGE_NAME=$(echo "Wikid82/charon" | tr '[:upper:]' '[:lower:]')
Error response from daemon: reference does not exist
Error: Process completed with exit code 1.
Error Details
- File: .github/workflows/docker-build.yml
- Job: build-and-push
- Step: "Save Docker Image as Artifact"
- Lines: 135-142
- Failing Command: Line 141 → docker save ghcr.io/${IMAGE_NAME}:pr-${PR_NUMBER} -o /tmp/charon-pr-image.tar
Error Type
Docker Daemon Error: The Docker daemon cannot find the image reference specified in the docker save command.
Root Cause Categories
| Category | Likelihood | Evidence |
|---|---|---|
| Tag Mismatch | ✅ Most Likely | Manual reconstruction doesn't match actual tag |
| Image Not Loaded | ❌ Unlikely | Build step succeeds |
| Timing Issue | ❌ Unlikely | Steps are sequential |
| Permissions Issue | ❌ Unlikely | Other Docker commands work |
Conclusion: Tag Mismatch is the root cause.
Evidence Supporting Root Cause
- ✅ Build step succeeds (no reported build failures)
- ✅ Error occurs at docker save (after successful build)
- ✅ Manual tag reconstruction (lines 140-141)
- ✅ Inconsistent with docker/build-push-action behavior when load: true
- ✅ Similar patterns work because they either:
  - Pull from registry (test-image job)
  - Load from artifact (verify-supply-chain-pr job)
Fix Summary
What Changed: Use exact tag from steps.meta.outputs.tags instead of manually constructing it
Why It Works: The metadata action output is the authoritative source of tags applied by docker/build-push-action
Risk Level: Low - Read-only operation on existing step outputs
Appendix B: Relevant Documentation
- Docker Build-Push-Action - Load Option
- Docker Metadata-Action - Outputs
- Docker CLI - save command
- GitHub Actions - Artifacts
- Docker Buildx - Multi-platform builds
END OF ANALYSIS & FIX PLAN