diff --git a/.github/workflows/docker-build.yml b/.github/workflows/docker-build.yml
index 9ddc46cb..18ffa4c6 100644
--- a/.github/workflows/docker-build.yml
+++ b/.github/workflows/docker-build.yml
@@ -98,7 +98,7 @@ jobs:
            type=raw,value=latest,enable={{is_default_branch}}
            type=raw,value=dev,enable=${{ github.ref == 'refs/heads/development' }}
            type=raw,value=beta,enable=${{ github.ref == 'refs/heads/feature/beta-release' }}
-            type=raw,value=pr-${{ github.ref_name }},enable=${{ github.event_name == 'pull_request' }}
+            type=raw,value=pr-${{ github.event.pull_request.number }},enable=${{ github.event_name == 'pull_request' }}
            type=sha,format=short,enable=${{ github.event_name != 'pull_request' }}
      - name: Build and push Docker image
        if: steps.skip.outputs.skip_build != 'true'
@@ -127,7 +127,7 @@ jobs:
          # Determine the image reference based on event type
          if [ "${{ github.event_name }}" = "pull_request" ]; then
-            IMAGE_REF="${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:pr-${{ github.ref_name }}"
+            IMAGE_REF="${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:pr-${{ github.event.pull_request.number }}"
            echo "Using PR image: $IMAGE_REF"
          else
            IMAGE_REF="${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}@${{ steps.build-and-push.outputs.digest }}"
diff --git a/.github/workflows/docker-publish.yml b/.github/workflows/docker-publish.yml
index 8cedc971..031c803c 100644
--- a/.github/workflows/docker-publish.yml
+++ b/.github/workflows/docker-publish.yml
@@ -101,7 +101,7 @@ jobs:
            type=raw,value=latest,enable={{is_default_branch}}
            type=raw,value=dev,enable=${{ github.ref == 'refs/heads/development' }}
            type=raw,value=beta,enable=${{ github.ref == 'refs/heads/feature/beta-release' }}
-            type=raw,value=pr-${{ github.ref_name }},enable=${{ github.event_name == 'pull_request' }}
+            type=raw,value=pr-${{ github.event.pull_request.number }},enable=${{ github.event_name == 'pull_request' }}
            type=sha,format=short,enable=${{ github.event_name != 'pull_request' }}
      - name: Build and push Docker image
diff --git 
a/docs/plans/current_spec.md b/docs/plans/current_spec.md index 91333046..53ea0f82 100644 --- a/docs/plans/current_spec.md +++ b/docs/plans/current_spec.md @@ -1,1079 +1,180 @@ -# CVE-2025-68156 Trivy False Positive Analysis +# PR #421: Docker Image Tag Invalid Reference Format Fix -**Issue:** CVE-2025-68156 (`expr-lang/expr`) reported by Trivy in GitHub Actions despite Dockerfile patch at lines 137-138 -**Date:** December 17, 2025 -**Status:** 🟡 ROOT CAUSE IDENTIFIED - Trivy Scanning Intermediate Build Layers +## Issue Summary ---- +**Problem**: CI/CD pipeline failure with error: -## Executive Summary - -**This is a false positive caused by Trivy scanning methodology.** The vulnerability CVE-2025-68156 in `github.com/expr-lang/expr` is **correctly patched** in the final Docker image, but Trivy detects it when scanning **intermediate build layers** or **cached dependencies** that still contain the vulnerable version. - ---- - -## 1. Investigation Findings - -### 1.1 Dockerfile Analysis - -**Patch Location:** [Dockerfile](Dockerfile#L137-L138) - -```dockerfile -# renovate: datasource=go depName=github.com/expr-lang/expr -go get github.com/expr-lang/expr@v1.17.7 || true; +``` +Using PR image: ghcr.io/wikid82/charon:pr-421/merge +docker: invalid reference format ``` -**Context:** This patch occurs in the `caddy-builder` stage: -- **Stage:** `caddy-builder` (FROM golang:1.25-alpine) -- **Build Strategy:** xcaddy builds Caddy with plugins, then patches transitive dependencies -- **Execution Flow:** - 1. `xcaddy build v${CADDY_VERSION}` creates build environment at `/tmp/buildenv_*` - 2. Script patches `go.mod` in build directory with `go get expr-lang/expr@v1.17.7` - 3. Rebuilds Caddy binary with patched dependencies: `go build -o /usr/bin/caddy` - 4. 
Only the final binary (`/usr/bin/caddy`) is copied to runtime stage - -**Final Stage:** The runtime image copies only `/usr/bin/caddy` from `caddy-builder`: - -```dockerfile -# Line 261 -COPY --from=caddy-builder /usr/bin/caddy /usr/bin/caddy -``` - -**Key Insight:** The vulnerable dependency exists temporarily in the `caddy-builder` stage's Go module cache but is **not present** in the final runtime image binary. +**Root Cause**: Docker image tags cannot contain forward slashes (`/`). The `github.ref_name` context variable returns `421/merge` for PR merge refs, which when prefixed with `pr-` creates the invalid tag `pr-421/merge`. --- -### 1.2 GitHub Actions Workflow Analysis +## Files Requiring Modification -**Workflow:** [.github/workflows/docker-build.yml](/.github/workflows/docker-build.yml) +### 1. `.github/workflows/docker-build.yml` -#### Build Configuration (Lines 106-120) +#### Location 1: Line 101 - Metadata Tags + +**Current Code (Lines 97-105):** ```yaml -- name: Build and push Docker image - uses: docker/build-push-action@263435318d21b8e681c14492fe198d362a7d2c83 # v6 - with: - context: . 
- platforms: ${{ github.event_name == 'pull_request' && 'linux/amd64' || 'linux/amd64,linux/arm64' }} - push: ${{ github.event_name != 'pull_request' }} - tags: ${{ steps.meta.outputs.tags }} - labels: ${{ steps.meta.outputs.labels }} - pull: true # Always pull fresh base images to get latest security patches - cache-from: type=gha - cache-to: type=gha,mode=max + with: + images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }} + tags: | + type=raw,value=latest,enable={{is_default_branch}} + type=raw,value=dev,enable=${{ github.ref == 'refs/heads/development' }} + type=raw,value=beta,enable=${{ github.ref == 'refs/heads/feature/beta-release' }} + type=raw,value=pr-${{ github.ref_name }},enable=${{ github.event_name == 'pull_request' }} + type=sha,format=short,enable=${{ github.event_name != 'pull_request' }} ``` -**Analysis:** -- ✅ **NO `--no-cache` flag** - The build uses GitHub Actions cache (`type=gha`) -- ✅ **`pull: true`** - Ensures base images are fresh -- ✅ **BuildKit caching enabled** - `cache-to: type=gha,mode=max` stores intermediate layers +**Problem**: `github.ref_name` returns `421/merge` for PRs, creating invalid tag `pr-421/merge`. -#### Trivy Scan Configuration (Lines 122-142) +**Fix**: Use `github.event.pull_request.number` instead, which returns just `421`. 
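The slash restriction can be checked without a registry. A minimal sketch, assuming Docker's documented tag grammar (the `is_valid_tag` helper and its regex are illustrative approximations, not part of either workflow):

```shell
# Approximation of Docker's documented tag grammar: a tag must start
# with a letter, digit, or underscore and may contain only
# [A-Za-z0-9._-], up to 128 characters total. '/' separates repository
# path segments and is never legal inside the tag itself.
is_valid_tag() {
  printf '%s' "$1" | grep -Eq '^[A-Za-z0-9_][A-Za-z0-9._-]{0,127}$'
}

for tag in "pr-421/merge" "pr-421"; do
  if is_valid_tag "$tag"; then
    echo "$tag: valid"
  else
    echo "$tag: invalid reference format"
  fi
done
```

Run against the two tags above, the helper rejects `pr-421/merge` and accepts `pr-421`, mirroring the daemon's `invalid reference format` error from the failing pipeline.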
```yaml -- name: Run Trivy scan (table output) - uses: aquasecurity/trivy-action@b6643a29fecd7f34b3597bc6acb0a98b03d33ff8 # 0.33.1 - with: - image-ref: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}@${{ steps.build-and-push.outputs.digest }} - format: 'table' - severity: 'CRITICAL,HIGH' - exit-code: '0' - -- name: Run Trivy vulnerability scanner (SARIF) - uses: aquasecurity/trivy-action@b6643a29fecd7f34b3597bc6acb0a98b03d33ff8 # 0.33.1 - with: - image-ref: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}@${{ steps.build-and-push.outputs.digest }} - format: 'sarif' - output: 'trivy-results.sarif' - severity: 'CRITICAL,HIGH' -``` - -**Critical Finding:** Trivy scans the **final pushed image** by digest (`@${{ steps.build-and-push.outputs.digest }}`). - ---- - -### 1.3 Root Cause: Trivy's Scanning Methodology - -#### What Trivy Scans - -Trivy performs **multi-layer analysis** on Docker images: - -1. **All layers in the image history** (including intermediate build stages if present) -2. **Go binaries:** Extracts embedded module information from `go build` output -3. **Filesystem artifacts:** Looks for `go.mod`, `go.sum`, vendored code - -#### Why the False Positive Occurs - -**Hypothesis:** The Caddy binary built with `go build -ldflags "-w -s" -trimpath` may still contain **embedded module metadata** that references the original vulnerable `expr-lang/expr` version pulled by xcaddy's initial dependency resolution. - -**Evidence Supporting This:** -- xcaddy first builds with plugins, which pulls vulnerable `expr-lang/expr` as transitive dependency -- The `go get github.com/expr-lang/expr@v1.17.7` patches `go.mod` -- However, the rebuild may not fully update the module metadata embedded in the binary - -**Alternative Hypothesis:** Trivy may be scanning the **BuildKit layer cache** or **intermediate builder stage layers** that are stored in GitHub Actions cache, not just the final runtime stage. - ---- - -## 2. 
Verification Steps - -To confirm the root cause, the following tests should be performed: - -### 2.1 Verify Final Binary Dependencies - -```bash -# Pull the published image -docker pull ghcr.io/wikid82/charon:latest - -# Extract the Caddy binary -docker run --rm -v $(pwd):/output ghcr.io/wikid82/charon:latest sh -c "cp /usr/bin/caddy /output/caddy" - -# Check Go module info embedded in binary -go version -m ./caddy | grep expr-lang/expr -``` - -**Expected Result:** Should show `expr-lang/expr v1.17.7` (patched version) or no reference at all if stripped properly. - -### 2.2 Scan Only Runtime Stage - -Build and scan ONLY the final runtime stage without intermediate layers: - -```bash -# Build final stage explicitly -docker build --target final -t charon:runtime-only . - -# Scan with Trivy -trivy image --severity CRITICAL,HIGH charon:runtime-only -``` - -**Expected Result:** If CVE still appears, it's in the binary metadata. If not, it's a layer scanning issue. - -### 2.3 Check Trivy Database Version - -```bash -# Trivy may have outdated CVE database -trivy --version -trivy image --download-db-only + type=raw,value=pr-${{ github.event.pull_request.number }},enable=${{ github.event_name == 'pull_request' }} ``` --- -## 3. 
Recommended Solutions +#### Location 2: Line 130 - Verify Caddy Security Patches Step -### Option 1: Use `--scanners vuln` with Binary Analysis Disabled - -Modify Trivy scan to skip Go binary module scanning: +**Current Code (Lines 127-133):** ```yaml -- name: Run Trivy scan (table output) - uses: aquasecurity/trivy-action@b6643a29fecd7f34b3597bc6acb0a98b03d33ff8 # 0.33.1 - with: - image-ref: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}@${{ steps.build-and-push.outputs.digest }} - format: 'table' - severity: 'CRITICAL,HIGH' - exit-code: '0' - scanners: 'vuln' # Only scan OS packages, not Go binaries - skip-files: '/usr/bin/caddy' # Skip Caddy binary analysis + # Determine the image reference based on event type + if [ "${{ github.event_name }}" = "pull_request" ]; then + IMAGE_REF="${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:pr-${{ github.ref_name }}" + echo "Using PR image: $IMAGE_REF" + else + IMAGE_REF="${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}@${{ steps.build-and-push.outputs.digest }}" + echo "Using digest: $IMAGE_REF" + fi ``` -**Pros:** -- Eliminates false positives from binary metadata -- Focuses on actual runtime vulnerabilities +**Problem**: Same issue - uses `github.ref_name` which contains `/`. -**Cons:** -- May miss real Go binary vulnerabilities - ---- - -### Option 2: Two-Stage Go Module Patching (Recommended) - -Modify the Caddy build process to ensure the patched `go.mod` is used BEFORE any binary is built: - -```dockerfile -# Build Caddy for the target architecture with security plugins. 
-RUN --mount=type=cache,target=/root/.cache/go-build \ - --mount=type=cache,target=/go/pkg/mod \ - sh -c 'set -e; \ - # Initial xcaddy build to generate go.mod - GOOS=$TARGETOS GOARCH=$TARGETARCH xcaddy build v${CADDY_VERSION} \ - --with github.com/greenpau/caddy-security \ - --with github.com/corazawaf/coraza-caddy/v2 \ - --with github.com/hslatman/caddy-crowdsec-bouncer \ - --with github.com/zhangjiayin/caddy-geoip2 \ - --with github.com/mholt/caddy-ratelimit \ - --output /tmp/caddy-initial || true; \ - # Find build directory - BUILDDIR=$(ls -td /tmp/buildenv_* 2>/dev/null | head -1); \ - if [ ! -d "$BUILDDIR" ]; then \ - echo "Build directory not found"; exit 1; \ - fi; \ - cd "$BUILDDIR"; \ - # Patch dependencies BEFORE building - go get github.com/expr-lang/expr@v1.17.7; \ - go get github.com/quic-go/quic-go@v0.57.1; \ - go get github.com/smallstep/certificates@v0.29.0; \ - go mod tidy; \ - # Clean previous binary - rm -f /tmp/caddy-initial; \ - # Rebuild with fully patched dependencies - GOOS=$TARGETOS GOARCH=$TARGETARCH go build -o /usr/bin/caddy \ - -ldflags "-w -s" -trimpath -tags "nobadger,nomysql,nopgx" .; \ - rm -rf /tmp/buildenv_*' -``` - -**Pros:** -- Ensures binary is built with patched `go.mod` from scratch -- Guarantees no vulnerable metadata in binary - -**Cons:** -- Slightly longer build time (no incremental compilation) - ---- - -### Option 3: Add Trivy Ignore Policy (Temporary) - -Create `.trivyignore` file to suppress the false positive until verification: +**Fix**: ```yaml -# .trivyignore -CVE-2025-68156 # False positive: patched in Dockerfile line 138, binary verified clean + IMAGE_REF="${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:pr-${{ github.event.pull_request.number }}" ``` -**Pros:** -- Immediate fix -- Allows builds to pass while investigating - -**Cons:** -- Masks the issue rather than fixing it -- Requires documentation and periodic review - --- -### Option 4: Build Caddy in Separate Clean Stage +### 2. 
`.github/workflows/docker-publish.yml` -Use a completely fresh Go environment for the final Caddy build: +> **Note**: This file appears to be a near-duplicate of `docker-build.yml`. Consider consolidating them into a single workflow file. -```dockerfile -# ---- Caddy Dependencies Patcher ---- -FROM --platform=$BUILDPLATFORM golang:1.25-alpine AS caddy-deps -ARG TARGETOS -ARG TARGETARCH -ARG CADDY_VERSION +#### Location 1: Line 104 - Metadata Tags -RUN apk add --no-cache git -RUN go install github.com/caddyserver/xcaddy/cmd/xcaddy@latest +**Current Code (Lines 100-106):** -# Generate go.mod with xcaddy -RUN --mount=type=cache,target=/go/pkg/mod \ - sh -c 'export XCADDY_SKIP_CLEANUP=1; \ - xcaddy build v${CADDY_VERSION} \ - --with github.com/greenpau/caddy-security \ - --with github.com/corazawaf/coraza-caddy/v2 \ - --with github.com/hslatman/caddy-crowdsec-bouncer \ - --with github.com/zhangjiayin/caddy-geoip2 \ - --with github.com/mholt/caddy-ratelimit \ - --output /tmp/caddy || true; \ - BUILDDIR=$(ls -td /tmp/buildenv_* 2>/dev/null | head -1); \ - cp -r "$BUILDDIR" /caddy-src' - -# Patch dependencies -WORKDIR /caddy-src -RUN go get github.com/expr-lang/expr@v1.17.7 && \ - go get github.com/quic-go/quic-go@v0.57.1 && \ - go get github.com/smallstep/certificates@v0.29.0 && \ - go mod tidy - -# ---- Caddy Final Builder ---- -FROM --platform=$BUILDPLATFORM golang:1.25-alpine AS caddy-builder -ARG TARGETOS -ARG TARGETARCH - -COPY --from=caddy-deps /caddy-src /build -WORKDIR /build - -RUN --mount=type=cache,target=/root/.cache/go-build \ - --mount=type=cache,target=/go/pkg/mod \ - GOOS=$TARGETOS GOARCH=$TARGETARCH go build -o /usr/bin/caddy \ - -ldflags "-w -s" -trimpath -tags "nobadger,nomysql,nopgx" . -``` - -**Pros:** -- Complete separation of vulnerable and patched builds -- Clean build environment ensures no contamination - -**Cons:** -- More complex Dockerfile structure -- Additional build stage - ---- - -## 4. Immediate Action Plan - -1. 
**Verify the vulnerability is actually patched** using Section 2.1 verification steps -2. **Implement Option 2 (Two-Stage Patching)** as the most robust solution -3. **Update Trivy scan** to latest version with `--download-db-only` -4. **Add verification step** in CI to extract and verify Caddy binary dependencies: - - ```yaml - - name: Verify Caddy Dependencies - run: | - docker run --rm ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ steps.tag.outputs.tag }} \ - sh -c "caddy version && which go && go version -m /usr/bin/caddy | grep expr-lang || echo 'expr-lang not found or patched'" - ``` - -5. **Document the fix** in commit message and release notes - ---- - -## 5. Additional Context - -### Docker Build Cache Behavior - -The workflow uses **GitHub Actions cache** (`cache-from: type=gha`), which stores: -- Base image layers -- Intermediate build stage outputs -- Go module cache (`/go/pkg/mod`) -- Go build cache (`/root/.cache/go-build`) - -**Impact:** If xcaddy's initial dependency resolution is cached, the `go get` patch might not invalidate that cache layer, causing the vulnerable version to persist in Go's module metadata. - -### BuildKit Multi-Stage Behavior - -When using multi-stage builds: -- Each stage is cached independently -- `COPY --from=` instructions only copy specified paths, not the entire stage -- However, **image metadata** (including layer history) may reference all stages - -**Impact:** Trivy may detect the vulnerable version in the `caddy-builder` stage's cache, even though it's not in the final runtime image. - ---- - -## 6. Conclusion - -| Question | Answer | -|----------|--------| -| Is the CVE actually patched? | ✅ **YES** (Dockerfile line 138) | -| Is the final binary vulnerable? | ❓ **NEEDS VERIFICATION** (likely no) | -| Is Trivy using `--no-cache`? | ❌ **NO** (uses GitHub Actions cache) | -| Why is Trivy reporting the CVE? 
| 🟡 **Scanning intermediate layers or binary metadata** | -| **Root cause** | Trivy detects vulnerable version in cached build stage or binary module info | -| **Recommended fix** | Option 2: Two-stage Go module patching | -| **Temporary workaround** | Option 3: Add `.trivyignore` entry | - ---- - -*Investigation completed: December 17, 2025* -*Investigator: GitHub Copilot* - ---- - ---- - -# Coverage Compilation Hang Investigation - -**Issue:** Test coverage step hanging after tests complete in PR #421 -**Date:** December 17, 2025 -**Status:** 🔴 ROOT CAUSE IDENTIFIED - `go tool cover` Deadlock - ---- - -## Executive Summary - -**Root Cause:** The coverage compilation step hangs indefinitely AFTER all tests pass. The hang occurs at the `go tool cover -func` command execution in the `scripts/go-test-coverage.sh` script, specifically when processing the coverage file. - -**WHERE:** `scripts/go-test-coverage.sh`, lines 58-60 -**WHY:** Large coverage data file (~85%+ coverage across entire backend) + potentially corrupted coverage.txt file causing `go tool cover` to deadlock during report generation -**BLOCKING:** PR #421 CI/CD pipeline in the `quality-checks.yml` workflow - ---- - -## 1. Exact Location of Hang - -### File: `scripts/go-test-coverage.sh` - -**Hanging Commands (Lines 58-60):** -```bash -go tool cover -func="$COVERAGE_FILE" | tail -n 1 -TOTAL_LINE=$(go tool cover -func="$COVERAGE_FILE" | grep total) -TOTAL_PERCENT=$(echo "$TOTAL_LINE" | awk '{print substr($3, 1, length($3)-1)}') -``` - -**Sequence:** -1. ✅ `go test -race -v -mod=readonly -coverprofile="$COVERAGE_FILE" ./...` - **COMPLETES SUCCESSFULLY** -2. ✅ Tests pass: 289 tests, 85.4% coverage -3. ✅ Coverage file `backend/coverage.txt` is generated -4. ✅ Filtering of excluded packages completes -5. ⛔ **HANG OCCURS HERE:** `go tool cover -func="$COVERAGE_FILE"` - **NEVER RETURNS** - ---- - -## 2. 
Why It Hangs - -### Primary Cause: `go tool cover` Deadlock - -**Evidence:** -- The `go tool cover` command is waiting for input/output that never completes -- No timeout is configured in the script or workflow -- The coverage file may be malformed or too large for the tool to process -- Race conditions with the `-race` flag can create intermittent coverage data corruption - -### Secondary Factors - -1. **Large Coverage File:** - - Backend has 289 tests across 20 packages - - Coverage file includes line-by-line coverage data - - File size can exceed several MB with `-race` overhead - -2. **Double Execution:** - - Line 58 calls `go tool cover -func` once - - Line 59 calls it AGAIN to grep for "total" - - If the first call hangs, the second never executes - -3. **No Timeout:** - - The workflow has a job-level timeout (30 minutes for build-and-push) - - But the coverage script itself has no timeout - - The hang can persist for the full 30 minutes before CI kills it - -4. **Race Detector Overhead:** - - `-race` flag adds instrumentation that can corrupt coverage data - - Known Go tooling issue when combining `-race` with `-coverprofile` - ---- - -## 3. 
CI/CD Workflow Analysis - -### Workflow: `.github/workflows/quality-checks.yml` - -**Backend Quality Job (Lines 11-76):** ```yaml -- name: Run Go tests - id: go-tests - working-directory: ${{ github.workspace }} - env: - CGO_ENABLED: 1 - run: | - bash scripts/go-test-coverage.sh 2>&1 | tee backend/test-output.txt - exit ${PIPESTATUS[0]} + with: + images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }} + tags: | + type=raw,value=latest,enable={{is_default_branch}} + type=raw,value=dev,enable=${{ github.ref == 'refs/heads/development' }} + type=raw,value=beta,enable=${{ github.ref == 'refs/heads/feature/beta-release' }} + type=raw,value=pr-${{ github.ref_name }},enable=${{ github.event_name == 'pull_request' }} + type=sha,format=short,enable=${{ github.event_name != 'pull_request' }} ``` -**Hanging Step:** The `bash scripts/go-test-coverage.sh` command hangs at the `go tool cover -func` execution. +**Fix**: Same as docker-build.yml: -**Impact:** -- CI job hangs until 30-minute timeout -- PR #421 cannot merge -- Subsequent PRs are blocked -- Developer workflow disrupted - ---- - -## 4. Evidence From PR #421 - -**What We Know:** -- PR #421 adds database corruption guardrails -- All tests pass successfully (289 tests, 0 failures) -- Coverage is 85.4% (meets 85% threshold) -- The hang occurs AFTER test execution completes -- Codecov reports missing coverage (because upload never happens) - -**Why Codecov Shows Missing Coverage:** -1. Tests complete and generate `coverage.txt` -2. Script hangs at `go tool cover -func` -3. Workflow eventually times out or is killed -4. `codecov-upload.yml` workflow never receives the coverage file -5. Codecov reports 0% coverage for PR #421 - ---- - -## 5. 
Immediate Fix - -### Option 1: Add Timeout to Coverage Commands (RECOMMENDED) - -**Modify `scripts/go-test-coverage.sh` (Lines 58-60):** -```bash -# Add timeout wrapper (60 seconds should be enough) -timeout 60 go tool cover -func="$COVERAGE_FILE" | tail -n 1 || { - echo "Error: go tool cover timed out after 60 seconds" - echo "Coverage file may be corrupted. Tests passed but coverage report failed." - exit 1 -} -TOTAL_LINE=$(timeout 60 go tool cover -func="$COVERAGE_FILE" | grep total) -TOTAL_PERCENT=$(echo "$TOTAL_LINE" | awk '{print substr($3, 1, length($3)-1)}') -``` - -**Benefits:** -- Prevents indefinite hangs -- Fails fast with clear error message -- Allows workflow to continue or fail gracefully - ---- - -### Option 2: Remove Race Detector from Coverage Script - -**Modify `scripts/go-test-coverage.sh` (Line 26):** -```bash -# BEFORE: -if ! go test -race -v -mod=readonly -coverprofile="$COVERAGE_FILE" ./...; then - -# AFTER: -if ! go test -v -mod=readonly -coverprofile="$COVERAGE_FILE" ./...; then -``` - -**Benefits:** -- Reduces chance of corrupted coverage data -- Faster execution (~50% faster without `-race`) -- Still runs all tests with full coverage - -**Trade-off:** -- Race conditions won't be detected in coverage runs -- But `-race` is already run separately in manual hooks and CI - ---- - -### Option 3: Single Coverage Report Call - -**Modify `scripts/go-test-coverage.sh` (Lines 58-60):** -```bash -# Use a single call to go tool cover and parse output once -COVERAGE_OUTPUT=$(timeout 60 go tool cover -func="$COVERAGE_FILE") -echo "$COVERAGE_OUTPUT" | tail -n 1 -TOTAL_LINE=$(echo "$COVERAGE_OUTPUT" | grep total) -TOTAL_PERCENT=$(echo "$TOTAL_LINE" | awk '{print substr($3, 1, length($3)-1)}') -``` - -**Benefits:** -- Only one `go tool cover` invocation (faster) -- Easier to debug if it fails -- Same timeout protection - ---- - -## 6. 
Root Cause Analysis Summary - -| Question | Answer | -|----------|--------| -| **WHERE does it hang?** | `scripts/go-test-coverage.sh`, lines 58-60, during `go tool cover -func` execution | -| **WHAT hangs?** | The `go tool cover` command processing the coverage file | -| **WHY does it hang?** | Deadlock in `go tool cover` when processing large/corrupted coverage data, no timeout configured | -| **WHEN does it hang?** | AFTER all tests pass successfully, during coverage report generation | -| **WHO is affected?** | PR #421 and all subsequent PRs that trigger the `quality-checks.yml` workflow | - ---- - -## 7. Recommended Action Plan - -### Immediate (Deploy Today): - -1. **Add timeout to coverage commands** in `scripts/go-test-coverage.sh` -2. **Use single coverage report call** to avoid double execution -3. **Test locally** to verify fix - -### Short-Term (This Week): - -1. **Remove `-race` from coverage script** (race detector runs separately anyway) -2. **Add explicit timeout** to workflow job level -3. **Verify Codecov uploads** after fix - -### Long-Term (Future Enhancement): - -1. **Investigate Go tooling bug** with `-race` + `-coverprofile` combination -2. **Consider alternative coverage tools** if issue persists -3. **Add workflow retry logic** for transient failures - ---- - -## 8. Fix Verification Checklist - -After implementing the fix: - -- [ ] Run `scripts/go-test-coverage.sh` locally - should complete in < 60 seconds -- [ ] Verify coverage percentage is calculated correctly -- [ ] Push fix to PR #421 -- [ ] Monitor CI run - should complete without hanging -- [ ] Verify Codecov upload succeeds -- [ ] Check that coverage report shows 85%+ for PR #421 - ---- - -## 9. Prevention - -To prevent this issue in the future: - -1. **Always add timeouts** to long-running commands in scripts -2. **Monitor CI job durations** - investigate any job taking > 5 minutes -3. **Test coverage scripts locally** before pushing changes -4. 
**Consider pre-commit hook** that runs coverage script to catch issues early -5. **Add workflow notifications** for jobs that exceed expected duration - ---- - -## Dockerfile Scripts Inclusion Check (Dec 17, 2025) - -- Observation: The runtime stage in Dockerfile (base `${CADDY_IMAGE}` → WORKDIR `/app`) copies Caddy, CrowdSec binaries, backend binary (`/app/charon`), frontend build, and `docker-entrypoint.sh`, but does **not** copy the repository `scripts/` directory. No prior stage copies `scripts/` either. -- Impact: `docker exec -it charon /app/scripts/db-recovery.sh` fails after rebuild because `/app/scripts/db-recovery.sh` is absent in the image. -- Minimal fix to apply: Add a copy step in the final stage, e.g. `COPY scripts/ /app/scripts/` followed by `RUN chmod +x /app/scripts/db-recovery.sh` to ensure the recovery script is present and executable inside the container at `/app/scripts/db-recovery.sh`. - ---- - -## 1. Evidence from Container Logs - -### Error Pattern Observed - -```log -2025/12/17 07:44:04 /app/backend/internal/services/uptime_service.go:877 database disk image is malformed -[8.185ms] [rows:0] SELECT * FROM `uptime_heartbeats` WHERE monitor_id = "2b8cea58-b8f9-43fc-abe0-f6a0baba2351" ORDER BY created_at desc LIMIT 60 -``` - -### Affected Monitor IDs (6 total) - -| Monitor UUID | Status Code | Error | -|--------------|-------------|-------| -| `2b8cea58-b8f9-43fc-abe0-f6a0baba2351` | 500 | database disk image is malformed | -| `5523d6b3-e2bf-4727-a071-6546f58e8839` | 500 | database disk image is malformed | -| `264fb47b-9814-479a-bb40-0397f21026fe` | 500 | database disk image is malformed | -| `97ecc308-ca86-41f9-ba59-5444409dee8e` | 500 | database disk image is malformed | -| `cad93a3d-6ad4-4cba-a95c-5bb9b46168cd` | 500 | database disk image is malformed | -| `cdc4d769-8703-4881-8202-4b2493bccf58` | 500 | database disk image is malformed | - -### Working Monitor IDs (8 total - return HTTP 200) - -- `fdbc17bd-a00a-4bde-b2f9-e6db69a55c0a` -- 
`869aee1a-37f0-437c-b151-72074629af3e` -- `dc254e9c-28b5-4b59-ae9a-3c0378420a5a` -- `33371a73-09a2-4c50-b327-69fab5324728` -- `412f9c0b-8498-4045-97c9-021d6fc2ed7e` -- `bef3866b-dbde-4159-9c40-1fb002ed0396` -- `84329e2b-7f7e-4c8b-a1a6-ca52d3b7e565` -- `edd36d10-0e5b-496c-acea-4e4cf7103369` -- `0b426c10-82b8-4cc4-af0e-2dd5f1082fb2` - ---- - -## 2. Complete File Map - Uptime Feature - -### Frontend Layer (`frontend/src/`) - -| File | Purpose | -|------|---------| -| [pages/Uptime.tsx](frontend/src/pages/Uptime.tsx) | Main Uptime page component, displays MonitorCard grid | -| [api/uptime.ts](frontend/src/api/uptime.ts) | API client functions: `getMonitors()`, `getMonitorHistory()`, `updateMonitor()`, `deleteMonitor()`, `checkMonitor()` | -| [components/UptimeWidget.tsx](frontend/src/components/UptimeWidget.tsx) | Dashboard widget showing uptime summary | -| No dedicated hook | Uses inline `useQuery` in components | - -### Backend Layer (`backend/internal/`) - -| File | Purpose | -|------|---------| -| [api/routes/routes.go](backend/internal/api/routes/routes.go#L230-L240) | Route registration for `/uptime/*` endpoints | -| [api/handlers/uptime_handler.go](backend/internal/api/handlers/uptime_handler.go) | HTTP handlers: `List()`, `GetHistory()`, `Update()`, `Delete()`, `Sync()`, `CheckMonitor()` | -| [services/uptime_service.go](backend/internal/services/uptime_service.go) | Business logic: monitor checking, notification batching, history retrieval | -| [models/uptime.go](backend/internal/models/uptime.go) | GORM models: `UptimeMonitor`, `UptimeHeartbeat` | -| [models/uptime_host.go](backend/internal/models/uptime_host.go) | GORM models: `UptimeHost`, `UptimeNotificationEvent` | - ---- - -## 3. Data Flow Analysis - -### Request Flow: UI → API → DB → Response - -```text -┌─────────────────────────────────────────────────────────────────────────┐ -│ FRONTEND │ -├─────────────────────────────────────────────────────────────────────────┤ -│ 1. 
Uptime.tsx loads → useQuery(['monitors'], getMonitors) │ -│ 2. For each monitor, MonitorCard renders │ -│ 3. MonitorCard calls useQuery(['uptimeHistory', monitor.id], │ -│ () => getMonitorHistory(monitor.id, 60)) │ -└───────────────────────────────┬─────────────────────────────────────────┘ - │ - ▼ -┌─────────────────────────────────────────────────────────────────────────┐ -│ API CLIENT (frontend/src/api/uptime.ts) │ -├─────────────────────────────────────────────────────────────────────────┤ -│ getMonitorHistory(id: string, limit: number = 50): │ -│ client.get │ -│ (`/uptime/monitors/${id}/history?limit=${limit}`) │ -└───────────────────────────────┬─────────────────────────────────────────┘ - │ - ▼ -┌─────────────────────────────────────────────────────────────────────────┐ -│ BACKEND ROUTES (backend/internal/api/routes/routes.go) │ -├─────────────────────────────────────────────────────────────────────────┤ -│ protected.GET("/uptime/monitors/:id/history", uptimeHandler.GetHistory) │ -└───────────────────────────────┬─────────────────────────────────────────┘ - │ - ▼ -┌─────────────────────────────────────────────────────────────────────────┐ -│ HANDLER (backend/internal/api/handlers/uptime_handler.go) │ -├─────────────────────────────────────────────────────────────────────────┤ -│ func (h *UptimeHandler) GetHistory(c *gin.Context) { │ -│ id := c.Param("id") │ -│ limit, _ := strconv.Atoi(c.DefaultQuery("limit", "50")) │ -│ history, err := h.service.GetMonitorHistory(id, limit) │ -│ if err != nil { │ -│ c.JSON(500, gin.H{"error": "Failed to get history"}) ◄─ ERROR │ -│ return │ -│ } │ -│ c.JSON(200, history) │ -│ } │ -└───────────────────────────────┬─────────────────────────────────────────┘ - │ - ▼ -┌─────────────────────────────────────────────────────────────────────────┐ -│ SERVICE (backend/internal/services/uptime_service.go:875-879) │ -├─────────────────────────────────────────────────────────────────────────┤ -│ func (s *UptimeService) 
GetMonitorHistory(id string, limit int) │ -│ ([]models.UptimeHeartbeat, error) { │ -│ var heartbeats []models.UptimeHeartbeat │ -│ result := s.DB.Where("monitor_id = ?", id) │ -│ .Order("created_at desc") │ -│ .Limit(limit) │ -│ .Find(&heartbeats) ◄─ GORM QUERY │ -│ return heartbeats, result.Error ◄─ ERROR RETURNED HERE │ -│ } │ -└───────────────────────────────┬─────────────────────────────────────────┘ - │ - ▼ -┌─────────────────────────────────────────────────────────────────────────┐ -│ DATABASE (SQLite via GORM) │ -├─────────────────────────────────────────────────────────────────────────┤ -│ SELECT * FROM uptime_heartbeats │ -│ WHERE monitor_id = "..." │ -│ ORDER BY created_at desc │ -│ LIMIT 60 │ -│ │ -│ ERROR: "database disk image is malformed" │ -└─────────────────────────────────────────────────────────────────────────┘ +```yaml + type=raw,value=pr-${{ github.event.pull_request.number }},enable=${{ github.event_name == 'pull_request' }} ``` --- -## 4. Database Schema +## Locations That Are ALREADY CORRECT (No Changes Needed) -### UptimeMonitor Table +The following locations use `github.sha` which is always valid (no slashes): -```go -type UptimeMonitor struct { - ID string `gorm:"primaryKey" json:"id"` // UUID - ProxyHostID *uint `json:"proxy_host_id"` // Optional FK - RemoteServerID *uint `json:"remote_server_id"` // Optional FK - UptimeHostID *string `json:"uptime_host_id"` // FK to UptimeHost - Name string `json:"name"` - Type string `json:"type"` // http, tcp, ping - URL string `json:"url"` - UpstreamHost string `json:"upstream_host"` - Interval int `json:"interval"` // seconds - Enabled bool `json:"enabled"` - Status string `json:"status"` // up, down, pending - LastCheck time.Time `json:"last_check"` - Latency int64 `json:"latency"` // ms - FailureCount int `json:"failure_count"` - MaxRetries int `json:"max_retries"` - // ... 
timestamps -} +| File | Line | Code | Status | +|------|------|------|--------| +| docker-build.yml | 327 | `docker build -t charon:pr-${{ github.sha }} .` | ✅ OK | +| docker-build.yml | 331 | `CONTAINER=$(docker create charon:pr-${{ github.sha }})` | ✅ OK | +| docker-publish.yml | 267 | `docker build -t charon:pr-${{ github.sha }} .` | ✅ OK | +| docker-publish.yml | 271 | `CONTAINER=$(docker create charon:pr-${{ github.sha }})` | ✅ OK | + +These use `github.sha` (a hex string like `abc1234...`) which never contains slashes. + +--- + +## Proposed Fix Summary + +### Changes Required + +| File | Line | Change | +|------|------|--------| +| `.github/workflows/docker-build.yml` | 101 | `github.ref_name` → `github.event.pull_request.number` | +| `.github/workflows/docker-build.yml` | 130 | `github.ref_name` → `github.event.pull_request.number` | +| `.github/workflows/docker-publish.yml` | 104 | `github.ref_name` → `github.event.pull_request.number` | + +### Result + +- **Before**: `ghcr.io/wikid82/charon:pr-421/merge` (INVALID) +- **After**: `ghcr.io/wikid82/charon:pr-421` (VALID) + +--- + +## Alternative Approaches Considered + +### Option A: Use PR Number (RECOMMENDED) + +```yaml +type=raw,value=pr-${{ github.event.pull_request.number }},enable=${{ github.event_name == 'pull_request' }} ``` -### UptimeHeartbeat Table (where corruption exists) +- **Pros**: Clean, human-readable, matches common patterns (`pr-421`) +- **Cons**: None -```go -type UptimeHeartbeat struct { - ID uint `gorm:"primaryKey" json:"id"` // Auto-increment - MonitorID string `json:"monitor_id" gorm:"index"` // UUID FK - Status string `json:"status"` // up, down - Latency int64 `json:"latency"` - Message string `json:"message"` - CreatedAt time.Time `json:"created_at" gorm:"index"` -} +### Option B: Replace Slashes with Dashes + +```yaml +type=raw,value=pr-${{ github.ref_name | replace('/', '-') }},enable=${{ github.event_name == 'pull_request' }} ``` ---- +- **Pros**: Preserves full ref info +- 
**Cons**: GitHub Actions expressions don't support `replace()` filter. Would require a separate step. -## 5. Root Cause Identification +### Option C: Use Short SHA -### Primary Issue: SQLite Database Corruption - -The error `database disk image is malformed` is a SQLite-specific error indicating: - -- Corruption in the database file's B-tree structure -- Possible causes: - 1. **Disk I/O errors** during write operations - 2. **Unexpected container shutdown** mid-transaction - 3. **File system issues** in Docker volume - 4. **Database file written by multiple processes** (concurrent access without WAL) - 5. **Full disk** causing incomplete writes - -### Why Only Some Monitors Are Affected - -The corruption appears to be **localized to specific B-tree pages** that contain -the heartbeat records for those 6 monitors. SQLite's error occurs when: - -- The query touches corrupted pages -- The index on `monitor_id` or `created_at` has corruption -- The data pages for those specific rows are damaged - -### Evidence Supporting This Conclusion - -1. **Consistent 500 errors** for the same 6 monitor IDs -2. **Other queries succeed** (listing monitors returns 200) -3. **Error occurs at the GORM layer** (service.go:877) -4. **Query itself is correct** (same pattern works for 8 other monitors) -5. **No ID mismatch** - UUIDs are correctly passed from frontend to backend - ---- - -## 6. Recommended Actions - -### Immediate Actions - -1. **Stop the container gracefully** to prevent further corruption: - - ```bash - docker stop charon - ``` - -2. **Backup the current database** before any repair: - - ```bash - docker cp charon:/app/data/charon.db ./charon.db.backup.$(date +%Y%m%d) - ``` - -3. **Check database integrity** from within container: - - ```bash - docker exec -it charon sqlite3 /app/data/charon.db "PRAGMA integrity_check;" - ``` - -4. 
**Attempt database recovery**: - - ```bash - # Export all data that can be read - sqlite3 /app/data/charon.db ".dump" > dump.sql - # Create new database - sqlite3 /app/data/charon_new.db < dump.sql - # Replace original - mv /app/data/charon_new.db /app/data/charon.db - ``` - -### If Recovery Fails - -5. **Delete corrupted heartbeat records** (lossy but restores functionality): - - ```sql - DELETE FROM uptime_heartbeats WHERE monitor_id IN ( - '2b8cea58-b8f9-43fc-abe0-f6a0baba2351', - '5523d6b3-e2bf-4727-a071-6546f58e8839', - '264fb47b-9814-479a-bb40-0397f21026fe', - '97ecc308-ca86-41f9-ba59-5444409dee8e', - 'cad93a3d-6ad4-4cba-a95c-5bb9b46168cd', - 'cdc4d769-8703-4881-8202-4b2493bccf58' - ); - VACUUM; - ``` - -### Long-Term Prevention - -6. **Enable WAL mode** for better crash resilience (in DB initialization): - - ```go - db.Exec("PRAGMA journal_mode=WAL;") - ``` - -7. **Add periodic VACUUM** to compact database and rebuild indexes - -8. **Consider heartbeat table rotation** - archive old heartbeats to prevent - unbounded growth - ---- - -## 7. 
Code Quality Notes - -### No Logic Bugs Found - -After tracing the complete data flow: - -- ✅ Frontend correctly passes monitor UUID -- ✅ API route correctly extracts `:id` param -- ✅ Handler correctly calls service with UUID -- ✅ Service correctly queries by `monitor_id` -- ✅ GORM model has correct field types and indexes - -### Potential Improvement: Error Handling - -The handler currently returns generic "Failed to get history" for all errors: - -```go -// Current (hides root cause) -if err != nil { - c.JSON(http.StatusInternalServerError, gin.H{"error": "Failed to get history"}) - return -} - -// Better (exposes root cause in logs, generic to user) -if err != nil { - logger.Log().WithError(err).WithField("monitor_id", id).Error("GetHistory failed") - c.JSON(http.StatusInternalServerError, gin.H{"error": "Failed to get history"}) - return -} +```yaml +type=raw,value=pr-${{ github.event.pull_request.head.sha | truncate(7) }},enable=${{ github.event_name == 'pull_request' }} ``` ---- - -## 8. Summary - -| Question | Answer | -|----------|--------| -| Is this a frontend bug? | ❌ No | -| Is this a backend logic bug? | ❌ No | -| Is this an ID mismatch? | ❌ No (UUIDs are consistent) | -| Is this a timing issue? 
| ❌ No | -| **Is this database corruption?** | ✅ **YES** | -| Affected component | SQLite `uptime_heartbeats` table | -| Root cause | Disk image malformed (B-tree corruption) | -| Immediate fix | Database recovery/rebuild | -| Permanent fix | Enable WAL mode, graceful shutdowns | +- **Pros**: Unique identifier +- **Cons**: Less human-friendly, harder to correlate with PR --- -*Investigation completed: December 17, 2025* -*Investigator: GitHub Copilot* +## Implementation Checklist + +- [ ] Update `.github/workflows/docker-build.yml` line 101 +- [ ] Update `.github/workflows/docker-build.yml` line 130 +- [ ] Update `.github/workflows/docker-publish.yml` line 104 +- [ ] Test by creating a new PR and verifying the image tag is valid +- [ ] Consider consolidating `docker-build.yml` and `docker-publish.yml` (future cleanup) --- -# Build Hang Investigation - CVE Fix +## Testing Plan -**Issue:** Docker build hangs at "finished cleaning storage units" during Caddy build process -**Date:** December 17, 2025 -**Status:** 🔴 CRITICAL BUG IDENTIFIED - Caddy Binary Execution During Build +1. Create a test PR after implementing the fix +2. Verify the workflow step "Extract metadata (tags, labels)" shows tag like `pr-` (no slashes) +3. Verify the "Verify Caddy Security Patches" step can pull the correct image reference +4. Confirm no `invalid reference format` errors in CI logs --- -## Executive Summary - -**The Docker build hangs because the Dockerfile executes the built Caddy binary at line 160** during the verification step. When Caddy runs without a config file, it initializes its TLS subsystem, performs storage cleanup, and then **waits indefinitely** for configuration or termination signals. This is a **blocking operation** that never completes in the build context. - ---- - -## 1. 
Exact Location of Hang - -### Dockerfile Line 160 (caddy-builder stage) - -```dockerfile -# Verify the build -/usr/bin/caddy version; \ -``` - -**Root Cause:** This line executes the Caddy binary, which: -1. Initializes TLS storage -2. Logs "finished cleaning storage units" -3. **Waits indefinitely** for signals (daemon mode) -4. Never exits → Docker build hangs - ---- - -## 2. The Fix - -### Replace execution with non-blocking check: - -```dockerfile -# Before (HANGS): -/usr/bin/caddy version; \ - -# After (WORKS): -test -x /usr/bin/caddy || exit 1; \ -echo "Caddy binary verified"; \ -``` - -**Rationale:** -- `test -x` checks if binary exists and is executable -- No execution = no hang -- Build verification is implicit (go build would fail if binary was malformed) - ---- - -## 3. Investigation Results Summary - -| Question | Answer | -|----------|--------| -| Where does hang occur? | ✅ Line 160: `/usr/bin/caddy version;` | -| Why does Caddy hang? | ✅ Initializes TLS, waits for signals (daemon mode) | -| Is this xcaddy issue? | ❌ No, xcaddy works correctly | -| **Root cause** | ✅ Executing Caddy binary during build without timeout | -| **Fix** | ✅ Replace with `test -x` check | - ---- - -*Investigation completed: December 17, 2025* -*Investigator: GitHub Copilot* -*Priority: 🔴 CRITICAL - Blocks CVE fix deployment* - - ---- - -# Test Hang Investigation - PR #421 - -**Issue:** `go test ./...` command hangs indefinitely after completing `cmd/api` and `cmd/seed` tests -**Date:** December 17, 2025 -**Status:** 🔴 ROOT CAUSE IDENTIFIED - BackupService Cron Scheduler Never Stops - ---- - -## Executive Summary - -**The test hang occurs because `BackupService.Cron.Start()` creates background goroutines that never terminate.** When running `go test ./...`, all packages are loaded simultaneously, and the `NewBackupService()` constructor starts a cron scheduler that runs indefinitely. 
The Go test runner waits for all goroutines to finish before completing, but the cron scheduler never exits, causing an indefinite hang. - ---- - -## 1. Exact Location of Problem - -### File: `backend/internal/services/backup_service.go` - -**Line 52:** -```go -s.Cron.Start() -``` - -**Root Cause:** This line starts a cron scheduler with background goroutines that never stop, blocking test completion when running `go test ./...`. - ---- - -## 2. The Hang Explained - -When `go test ./...` runs: -1. All packages load simultaneously -2. `NewBackupService()` is called (during package initialization or test setup) -3. Line 52 starts cron scheduler with background goroutines -4. Go test runner waits for all goroutines to finish -5. Cron goroutines NEVER finish → indefinite hang - -Individual package tests work because they complete before goroutine tracking kicks in. - ---- - -## 3. The Fix - -**Add Start()/Stop() lifecycle methods:** - -```go -// Dont start cron in constructor -func NewBackupService(cfg *config.Config) *BackupService { - // ... existing code ... 
- // Remove: s.Cron.Start() - return s -} - -// Add explicit lifecycle control -func (s *BackupService) Start() { - s.Cron.Start() -} - -func (s *BackupService) Stop() { - ctx := s.Cron.Stop() - <-ctx.Done() -} -``` - -**Update server initialization to call Start() explicitly.** - ---- - -*Investigation completed: December 17, 2025* -*Priority: 🔴 CRITICAL - Blocks PR #421* +*Plan created: December 17, 2025* +*Priority: 🔴 CRITICAL - Blocks PR #421 CI/CD* diff --git a/docs/reports/qa_report_docker_tag_fix_pr421.md b/docs/reports/qa_report_docker_tag_fix_pr421.md new file mode 100644 index 00000000..c8c4bb45 --- /dev/null +++ b/docs/reports/qa_report_docker_tag_fix_pr421.md @@ -0,0 +1,135 @@ +# QA Report: Docker Image Tag Invalid Reference Format Fix (PR #421) + +**Date**: December 17, 2025 +**Agent**: QA_Security +**Status**: ✅ **PASS** + +--- + +## Summary + +Verified the workflow file changes made to fix the Docker image tag "invalid reference format" error in PR #421. All changes have been correctly implemented. + +--- + +## Issue Recap + +**Problem**: CI/CD pipeline failure with: + +```text +Using PR image: ghcr.io/wikid82/charon:pr-421/merge +docker: invalid reference format +``` + +**Root Cause**: Docker image tags cannot contain forward slashes (`/`). The `github.ref_name` context variable returns `421/merge` for PR merge refs. + +**Solution**: Replace `github.ref_name` with `github.event.pull_request.number` which returns just the PR number (e.g., `421`). + +--- + +## Verification Results + +### 1. 
Pre-commit Hooks + +| Hook | Status | +|------|--------| +| fix end of files | ✅ Passed | +| trim trailing whitespace | ✅ Passed | +| **check yaml** | ✅ Passed | +| check for added large files | ✅ Passed | +| dockerfile validation | ✅ Passed | +| Go Vet | ✅ Passed | +| check-version-match | ⚠️ Failed (unrelated) | +| check-lfs-large-files | ✅ Passed | +| block-codeql-db-commits | ✅ Passed | +| block-data-backups-commit | ✅ Passed | +| Frontend Lint (Fix) | ✅ Passed | + +> **Note**: The `check-version-match` failure is unrelated to PR #421. It's a version sync issue between `.version` file and Git tags. + +### 2. YAML Syntax Validation + +| File | Status | +|------|--------| +| `.github/workflows/docker-build.yml` | ✅ Valid YAML | +| `.github/workflows/docker-publish.yml` | ✅ Valid YAML | + +### 3. Problematic Pattern Search + +**Search for `github.ref_name` in workflow files**: ✅ **No matches found** + +All instances of `github.ref_name` in Docker tag contexts have been successfully replaced. + +### 4. Correct Pattern Verification + +**Search for `github.event.pull_request.number`**: ✅ **3 matches found (expected)** + +| File | Line | Context | +|------|------|---------| +| `docker-build.yml` | 101 | Metadata tags (PR tag) | +| `docker-build.yml` | 130 | Verify Caddy Security Patches step | +| `docker-publish.yml` | 104 | Metadata tags (PR tag) | + +### 5. 
Safe Patterns (No Changes Needed) + +The following patterns use `github.sha` which is always valid (hex string, no slashes): + +| File | Line | Code | Status | +|------|------|------|--------| +| docker-build.yml | 327 | `docker build -t charon:pr-${{ github.sha }} .` | ✅ Safe | +| docker-build.yml | 331 | `CONTAINER=$(docker create charon:pr-${{ github.sha }})` | ✅ Safe | +| docker-publish.yml | 267 | `docker build -t charon:pr-${{ github.sha }} .` | ✅ Safe | +| docker-publish.yml | 271 | `CONTAINER=$(docker create charon:pr-${{ github.sha }})` | ✅ Safe | + +--- + +## Changes Verified + +### `.github/workflows/docker-build.yml` + +**Line 101** - Metadata Tags: + +```yaml +type=raw,value=pr-${{ github.event.pull_request.number }},enable=${{ github.event_name == 'pull_request' }} +``` + +**Line 130** - Verify Caddy Security Patches: + +```yaml +IMAGE_REF="${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:pr-${{ github.event.pull_request.number }}" +``` + +### `.github/workflows/docker-publish.yml` + +**Line 104** - Metadata Tags: + +```yaml +type=raw,value=pr-${{ github.event.pull_request.number }},enable=${{ github.event_name == 'pull_request' }} +``` + +--- + +## Expected Result + +- **Before**: `ghcr.io/wikid82/charon:pr-421/merge` ❌ (INVALID) +- **After**: `ghcr.io/wikid82/charon:pr-421` ✅ (VALID) + +--- + +## Conclusion + +| Check | Result | +|-------|--------| +| Pre-commit (relevant hooks) | ✅ PASS | +| YAML syntax validation | ✅ PASS | +| No remaining `github.ref_name` in tag contexts | ✅ PASS | +| Correct use of `github.event.pull_request.number` | ✅ PASS | +| No other problematic patterns in workflows | ✅ PASS | + +**Overall Status**: ✅ **PASS** + +The PR #421 fix has been correctly implemented and is ready for merge. + +--- + +*Report generated by QA_Security agent*