Nightly Build Vulnerability Remediation Plan
Date: 2026-04-09
Status: Draft — Awaiting Approval
Scope: Dependency security patches for 5 HIGH + 3 MEDIUM vulnerability groups
Target: Single PR — all changes ship together
Archived: Previous plan (CrowdSec Hub Bootstrapping) → docs/plans/archive/crowdsec-hub-bootstrap-spec.md
1. Problem Statement
The Charon nightly build is failing container image vulnerability scans with 5 HIGH-severity and multiple MEDIUM-severity findings. These vulnerabilities exist across three compiled binaries embedded in the container image:
- Charon backend (/app/charon): Go binary built from backend/go.mod
- Caddy (/usr/bin/caddy): built via xcaddy in the Dockerfile Caddy builder stage
- CrowdSec (/usr/local/bin/crowdsec, /usr/local/bin/cscli): built from source in the Dockerfile CrowdSec builder stage
Additionally, the nightly branch was synced from development before the Go 1.26.2 bump landed, so the nightly image was compiled with Go 1.26.1 (confirmed in ci_failure.log line 55: GO_VERSION: 1.26.1).
2. Research Findings
2.1 Go Version Audit
All files on development / main already reference Go 1.26.2:
| File | Current Value | Status |
|---|---|---|
| backend/go.mod | go 1.26.2 | ✅ Current |
| go.work | go 1.26.2 | ✅ Current |
| Dockerfile (ARG GO_VERSION) | 1.26.2 | ✅ Current |
| .github/workflows/nightly-build.yml | '1.26.2' | ✅ Current |
| .github/workflows/codecov-upload.yml | '1.26.2' | ✅ Current |
| .github/workflows/quality-checks.yml | '1.26.2' | ✅ Current |
| .github/workflows/codeql.yml | '1.26.2' | ✅ Current |
| .github/workflows/benchmark.yml | '1.26.2' | ✅ Current |
| .github/workflows/release-goreleaser.yml | '1.26.2' | ✅ Current |
| .github/workflows/e2e-tests-split.yml | '1.26.2' | ✅ Current |
| .github/skills/examples/gorm-scanner-ci-workflow.yml | '1.26.1' | ❌ Stale |
| scripts/install-go-1.26.0.sh | 1.26.0 | ⚠️ Old install script (not used in CI/Docker builds) |
Root Cause of Go stdlib CVEs: The nightly branch's last sync predated the 1.26.2 bump. The next nightly sync from development will propagate 1.26.2 automatically. The only file requiring a fix is the example workflow.
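The audit above can be repeated mechanically. The following is a minimal sketch and not part of the repository: find_stale_go_versions is a hypothetical helper name, and the three line patterns it matches (a go directive, GO_VERSION assignments, and go-version keys) are assumptions based on the pin styles listed in the table. It takes the first version-like number on each matching line, which is a simplification.

```shell
# find_stale_go_versions EXPECTED FILE...
# Prints "file:line: found (expected EXPECTED)" for each Go version pin
# that differs from EXPECTED. Prints nothing when every pin agrees.
find_stale_go_versions() {
  expected=$1
  shift
  awk -v want="$expected" '
    # Assumed pin styles: "go 1.26.2", "GO_VERSION=..." / "GO_VERSION: ...", "go-version: ..."
    /^go [0-9]+\.[0-9]+/ || /GO_VERSION[=:]/ || /go-version:/ {
      # Take the first MAJOR.MINOR[.PATCH] number on the line.
      if (match($0, /[0-9]+\.[0-9]+(\.[0-9]+)?/)) {
        found = substr($0, RSTART, RLENGTH)
        if (found != want)
          printf "%s:%d: %s (expected %s)\n", FILENAME, FNR, found, want
      }
    }
  ' "$@"
}
```

For example, find_stale_go_versions 1.26.2 backend/go.mod Dockerfile .github/workflows/*.yml would print one line per mismatched pin.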
2.2 Vulnerability Inventory
HIGH Severity (must fix — merge-blocking)
| # | CVE / GHSA | Package | Current | Fix | Binary | Dep Type |
|---|---|---|---|---|---|---|
| 1 | CVE-2026-39883 | go.opentelemetry.io/otel/sdk | v1.40.0 | v1.43.0 | Caddy | Transitive (Caddy plugins → otelhttp → otel/sdk) |
| 2 | CVE-2026-34986 | github.com/go-jose/go-jose/v3 | v3.0.4 | v3.0.5 | Caddy | Transitive (caddy-security → JWT/JOSE stack) |
| 3 | CVE-2026-34986 | github.com/go-jose/go-jose/v4 | v4.1.3 | v4.1.4 | Caddy | Transitive (grpc v1.79.3 → go-jose/v4) |
| 4 | CVE-2026-32286 | github.com/jackc/pgproto3/v2 | v2.3.3 | pgx/v4 v4.18.3 ¹ | CrowdSec | Transitive (CrowdSec → pgx/v4 v4.18.2 → pgproto3/v2) |
¹ pgproto3/v2 has no patched release. Fix requires upstream migration to pgx/v5 (uses pgproto3/v3). See §5 Risk Assessment.
MEDIUM Severity (fix in same pass)
| # | CVE / GHSA | Package(s) | Current | Fix | Binary | Dep Type |
|---|---|---|---|---|---|---|
| 5 | GHSA-xmrv-pmrh-hhx2 | AWS SDK v2: eventstream v1.7.1, cloudwatchlogs v1.57.2, kinesis v1.40.1, s3 v1.87.3 | See left | Bump all | CrowdSec | Direct deps of CrowdSec v1.7.7 |
| 6 | CVE-2026-32281, -32288, -32289 | Go stdlib | 1.26.1 | 1.26.2 | All (nightly image) | Toolchain |
| 7 | CVE-2026-39882 | OTel HTTP exporters: otlploghttp v0.16.0, otlpmetrichttp v1.40.0, otlptracehttp v1.40.0 | See left | Bump all | Caddy | Transitive (Caddy plugins → OTel exporters) |
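Each inventory row pairs a current version with a minimum fixed version, so the pass/fail decision for a row reduces to a three-part version comparison. A minimal sketch follows, assuming plain vMAJOR.MINOR.PATCH tags; version_at_least is a hypothetical helper, and pre-release ordering is not handled (a suffix such as -rc1 is simply ignored).

```shell
# version_at_least ACTUAL MINIMUM
# Exit status 0 when ACTUAL >= MINIMUM, 1 otherwise.
# Assumption: versions look like v1.43.0; a leading "v" is optional.
version_at_least() {
  awk -v a="${1#v}" -v b="${2#v}" 'BEGIN {
    split(a, x, ".")
    split(b, y, ".")
    # Compare major, minor, patch numerically, most significant first.
    for (i = 1; i <= 3; i++) {
      if (x[i] + 0 > y[i] + 0) exit 0
      if (x[i] + 0 < y[i] + 0) exit 1
    }
    exit 0   # all three components are equal
  }'
}
```

A call such as version_at_least v1.40.0 v1.43.0 fails, which matches row 1: the otel/sdk version currently in Caddy is below the fixed release.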
2.3 Dependency Chain Analysis
Backend (backend/go.mod)
charon/backend (direct)
└─ docker/docker v28.5.2+incompatible (direct)
└─ otelhttp v0.68.0 (indirect)
└─ otel/sdk v1.43.0 (indirect) — already at latest
└─ grpc v1.79.3 (indirect)
└─ otlptracehttp v1.42.0 (indirect) ── CVE-2026-39882
Backend resolved versions (verified via go list -m -json):
| Package | Version | Type |
|---|---|---|
| go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp | v1.42.0 | indirect |
| google.golang.org/grpc | v1.79.3 | indirect |
| go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp | v0.68.0 | indirect |
Not present in backend: go-jose/v3, go-jose/v4, otel/sdk, pgproto3/v2, AWS SDK, otlploghttp, otlpmetrichttp.
CrowdSec Binary (Dockerfile crowdsec-builder stage)
Source: CrowdSec v1.7.7 go.mod (verified via git clone --depth 1 --branch v1.7.7):
crowdsec v1.7.7
└─ pgx/v4 v4.18.2 (direct) → pgproto3/v2 v2.3.3 (indirect) ── CVE-2026-32286
└─ aws-sdk-go-v2/service/s3 v1.87.3 (direct) ── GHSA-xmrv-pmrh-hhx2
└─ aws-sdk-go-v2/service/cloudwatchlogs v1.57.2 (direct) ── GHSA-xmrv-pmrh-hhx2
└─ aws-sdk-go-v2/service/kinesis v1.40.1 (direct) ── GHSA-xmrv-pmrh-hhx2
└─ aws-sdk-go-v2/aws/protocol/eventstream v1.7.1 (indirect) ── GHSA-xmrv-pmrh-hhx2
└─ otel v1.39.0, otel/metric v1.39.0, otel/trace v1.39.0 (indirect)
Confirmed by Trivy image scan (trivy-image-report.json): pgproto3/v2 v2.3.3 flagged in usr/local/bin/crowdsec and usr/local/bin/cscli.
Caddy Binary (Dockerfile caddy-builder stage)
Built via xcaddy with plugins. go.mod is generated at build time. Vulnerable packages enter via:
xcaddy build (Caddy v2.11.2 + plugins)
└─ caddy-security v1.1.61 → go-jose/v3 (JWT auth stack) ── CVE-2026-34986
└─ grpc (patched to v1.79.3 in Dockerfile) → go-jose/v4 v4.1.3 ── CVE-2026-34986
└─ Caddy/plugins → otel/sdk v1.40.0 ── CVE-2026-39883
└─ Caddy/plugins → otlploghttp v0.16.0, otlpmetrichttp v1.40.0, otlptracehttp v1.40.0 ── CVE-2026-39882
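Because the Caddy go.mod only exists at build time, the resolved versions are easiest to confirm from the built binary. The go version -m command prints the module list embedded in a Go binary, and the sketch below filters that output for a single module. embedded_module_version is a hypothetical helper name, and it ignores replace directives (the => lines in the output).

```shell
# embedded_module_version MODULE
# Reads the output of `go version -m <binary>` on stdin and prints the
# version recorded for MODULE, or nothing if the module is not embedded.
embedded_module_version() {
  # Dependency lines look like: <tab>dep<tab>MODULE<tab>VERSION<tab>HASH
  awk -v mod="$1" '$1 == "dep" && $2 == mod { print $3 }'
}
```

Assuming a Go toolchain is available where the binary lives, go version -m /usr/bin/caddy | embedded_module_version go.opentelemetry.io/otel/sdk would print the SDK version compiled into Caddy.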
3. Technical Specifications
3.1 Backend go.mod Changes
File: backend/go.mod (+ backend/go.sum auto-generated)
cd backend
# Upgrade grpc to v1.80.0 (security patches for transitive deps)
go get google.golang.org/grpc@v1.80.0
# CVE-2026-39882: OTel HTTP exporter (backend only has otlptracehttp)
go get go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp@v1.43.0
go mod tidy
Expected go.mod diff:
- google.golang.org/grpc v1.79.3 → v1.80.0
- go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp v1.42.0 → v1.43.0
3.2 Dockerfile — Caddy Builder Stage Patches
File: Dockerfile, within the caddy-builder RUN bash -c '...' block, in the Stage 2: Apply security patches section.
Add after the existing go get golang.org/x/net@v${XNET_VERSION}; line:
# CVE-2026-34986: go-jose JOSE/JWT validation bypass
# Fix in v3.0.5 and v4.1.4. Pin here until caddy-security ships fix.
# renovate: datasource=go depName=github.com/go-jose/go-jose/v3
go get github.com/go-jose/go-jose/v3@v3.0.5; \
# renovate: datasource=go depName=github.com/go-jose/go-jose/v4
go get github.com/go-jose/go-jose/v4@v4.1.4; \
# CVE-2026-39883: OTel SDK resource leak
# Fix in v1.43.0. Pin here until Caddy ships with updated OTel.
# renovate: datasource=go depName=go.opentelemetry.io/otel/sdk
go get go.opentelemetry.io/otel/sdk@v1.43.0; \
# CVE-2026-39882: OTel HTTP exporter request smuggling
# renovate: datasource=go depName=go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploghttp
go get go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploghttp@v0.19.0; \
# renovate: datasource=go depName=go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp
go get go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp@v1.43.0; \
# renovate: datasource=go depName=go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp
go get go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp@v1.43.0; \
Update existing grpc patch line from v1.79.3 → v1.80.0:
# Before:
go get google.golang.org/grpc@v1.79.3; \
# After:
# CVE-2026-33186: gRPC-Go auth bypass (fixed in v1.79.3)
# CVE-2026-34986: go-jose/v4 transitive fix (requires grpc >= v1.80.0)
# renovate: datasource=go depName=google.golang.org/grpc
go get google.golang.org/grpc@v1.80.0; \
3.3 Dockerfile — CrowdSec Builder Stage Patches
File: Dockerfile, within the crowdsec-builder RUN block that patches dependencies.
Add after the existing go get golang.org/x/net@v${XNET_VERSION} line:
# CVE-2026-32286: pgproto3/v2 buffer overflow (no v2 fix exists; bump pgx/v4 to latest patch)
# renovate: datasource=go depName=github.com/jackc/pgx/v4
go get github.com/jackc/pgx/v4@v4.18.3 && \
# GHSA-xmrv-pmrh-hhx2: AWS SDK v2 event stream injection
# renovate: datasource=go depName=github.com/aws/aws-sdk-go-v2/aws/protocol/eventstream
go get github.com/aws/aws-sdk-go-v2/aws/protocol/eventstream@v1.7.8 && \
# renovate: datasource=go depName=github.com/aws/aws-sdk-go-v2/service/cloudwatchlogs
go get github.com/aws/aws-sdk-go-v2/service/cloudwatchlogs@v1.68.0 && \
# renovate: datasource=go depName=github.com/aws/aws-sdk-go-v2/service/kinesis
go get github.com/aws/aws-sdk-go-v2/service/kinesis@v1.43.5 && \
# renovate: datasource=go depName=github.com/aws/aws-sdk-go-v2/service/s3
go get github.com/aws/aws-sdk-go-v2/service/s3@v1.99.0 && \
CrowdSec grpc already at v1.80.0 — no change needed.
3.4 Example Workflow Fix
File: .github/skills/examples/gorm-scanner-ci-workflow.yml (line 28)
# Before:
go-version: "1.26.1"
# After:
go-version: "1.26.2"
3.5 Go Stdlib CVEs (nightly branch — no code change needed)
The nightly workflow syncs development → nightly via git merge --ff-only. Since development already has Go 1.26.2 everywhere:
- Dockerfile ARG GO_VERSION=1.26.2 ✓
- All CI workflows GO_VERSION: '1.26.2' ✓
- backend/go.mod go 1.26.2 ✓
The next nightly run at 09:00 UTC will automatically propagate Go 1.26.2 to the nightly branch and rebuild the image.
4. Implementation Plan
Phase 1: Playwright Tests (N/A)
No UI/UX changes — this is a dependency-only update. Existing E2E tests validate runtime behavior.
Phase 2: Backend Implementation
| Task | File(s) | Action |
|---|---|---|
| 2.1 | backend/go.mod, backend/go.sum | Run go get commands from §3.1 |
| 2.2 | Verify build | cd backend && go build ./cmd/api |
| 2.3 | Verify vet | cd backend && go vet ./... |
| 2.4 | Verify tests | cd backend && go test ./... |
| 2.5 | Verify vulns | cd backend && govulncheck ./... |
Phase 3: Dockerfile Implementation
| Task | File(s) | Action |
|---|---|---|
| 3.1 | Dockerfile (caddy-builder, ~L258-280) | Add go-jose v3/v4, OTel SDK, OTel exporter patches per §3.2 |
| 3.2 | Dockerfile (caddy-builder, ~L270) | Update grpc patch v1.79.3 → v1.80.0 |
| 3.3 | Dockerfile (crowdsec-builder, ~L360-370) | Add pgx, AWS SDK patches per §3.3 |
| 3.3a | CrowdSec binaries | After patching deps, run go build on CrowdSec binaries before full Docker build for faster compilation feedback |
| 3.4 | Dockerfile | Verify docker build . completes successfully (amd64) |
Phase 4: CI / Misc Fixes
| Task | File(s) | Action |
|---|---|---|
| 4.1 | .github/skills/examples/gorm-scanner-ci-workflow.yml | Bump Go version 1.26.1 → 1.26.2 |
Phase 5: Validation
| Task | Validation |
|---|---|
| 5.1 | cd backend && go build ./cmd/api — compiles cleanly |
| 5.2 | cd backend && go test ./... — all tests pass |
| 5.3 | cd backend && go vet ./... — no issues |
| 5.4 | cd backend && govulncheck ./... — 0 findings |
| 5.5 | docker build -t charon:vuln-fix . — image builds for amd64 |
| 5.6 | Trivy scan on built image: docker run --rm -v /var/run/docker.sock:/var/run/docker.sock aquasec/trivy:latest image --severity CRITICAL,HIGH charon:vuln-fix — 0 HIGH (pgproto3/v2 excepted) |
| 5.7 | Container health: docker run -d -p 8080:8080 charon:vuln-fix && curl -f http://localhost:8080/health |
| 5.8 | E2E Playwright tests pass against rebuilt container |
5. Risk Assessment
Low Risk
| Change | Risk | Rationale |
|---|---|---|
| go-jose/v3 v3.0.4 → v3.0.5 | Low | Security patch release only |
| go-jose/v4 v4.1.3 → v4.1.4 | Low | Security patch release only |
| otel/sdk v1.40.0 → v1.43.0 (Caddy) | Low | Minor bumps, backwards compatible |
| otlptracehttp v1.42.0 → v1.43.0 (backend) | Low | Minor bump |
| OTel exporters (Caddy) | Low | Minor/patch bumps |
| Go version example fix | None | Non-runtime file |
Medium Risk
| Change | Risk | Mitigation |
|---|---|---|
| grpc v1.79.3 → v1.80.0 | Medium | Minor version bump. gRPC is an indirect dependency; Charon doesn't use gRPC directly. Run the full test suite. Verify Caddy and CrowdSec still compile. |
| AWS SDK bumps spanning several minor versions (s3 v1.87→v1.99, cloudwatchlogs v1.57→v1.68, kinesis v1.40→v1.43) | Medium | CrowdSec build may fail if internal APIs changed between versions. Mitigate: run go mod tidy after patches and verify CrowdSec binaries compile. Note: AWS SDK Go v2 packages use independent semver within the v1.x.x line, so these are minor version bumps, not major API breaks. |
| pgx/v4 v4.18.2 → v4.18.3 | Medium | Patch release should be safe. May not fully resolve pgproto3/v2 since no patched v2 exists. |
Known Limitation: pgproto3/v2 (CVE-2026-32286)
The pgproto3/v2 module has no patched release — the fix exists only in pgproto3/v3 (used by pgx/v5). CrowdSec v1.7.7 uses pgx/v4 which depends on pgproto3/v2. Remediation:
- Bump pgx/v4 to v4.18.3 (latest v4 patch); this may transitively resolve the issue
- If scanner still flags pgproto3/v2 after the bump: document as accepted risk with upstream tracking
- Monitor CrowdSec releases for pgx/v5 migration
- Consider upgrading CROWDSEC_VERSION ARG if a newer CrowdSec release ships with pgx/v5
6. Acceptance Criteria
- cd backend && go build ./cmd/api succeeds with zero warnings
- cd backend && go test ./... passes with zero failures
- cd backend && go vet ./... reports zero issues
- cd backend && govulncheck ./... reports zero findings
- Docker image builds successfully for amd64
- Trivy/Grype scan of built image shows 0 new HIGH findings (pgproto3/v2 excepted if upstream unpatched)
- Container starts, health check passes on port 8080
- Existing E2E Playwright tests pass against rebuilt container
- No new compile errors in Caddy or CrowdSec builder stages
- backend/go.mod shows updated versions for grpc, otlptracehttp
7. Commit Slicing Strategy
Decision: Single PR
Rationale: All changes are dependency version bumps with no feature or behavioral changes. They address a single concern (security vulnerability remediation) and should be reviewed and merged atomically to avoid partial-fix states.
Trigger reasons for single PR:
- All changes are security patches — cannot ship partial fixes
- Changes span backend + Dockerfile + CI config — logically coupled
- No risk of one slice breaking another
- Total diff is small (go.mod/go.sum + Dockerfile patch lines + 1 YAML fix)
PR-1: Nightly Build Vulnerability Remediation
Scope: All changes in §3.1–§3.4
Files modified:
| File | Change Type |
|---|---|
| backend/go.mod | Dependency version bumps (grpc, otlptracehttp) |
| backend/go.sum | Auto-generated checksum updates |
| Dockerfile | Add go get patches in caddy-builder and crowdsec-builder stages |
| .github/skills/examples/gorm-scanner-ci-workflow.yml | Go version 1.26.1 → 1.26.2 |
Dependencies: None (standalone)
Validation gates:
- go build / go test / go vet / govulncheck pass
- Docker image builds for amd64
- Trivy/Grype scan passes (0 new HIGH)
- E2E tests pass
Rollback: Revert PR. All changes are version pins — reverting restores previous state with no data migration needed.
Post-merge Actions
- Nightly build will automatically sync development → nightly and rebuild the image with all patches
- Monitor next nightly scan for zero HIGH findings
- If pgproto3/v2 still flagged: open tracking issue for CrowdSec pgx/v5 upstream migration
- If any AWS SDK bump breaks CrowdSec compilation: pin to intermediate version and document
8. CI Failure Amendment: pgx/v4 Module Path Mismatch
Date: 2026-04-09
Failure: PR #921 build-and-push job, step crowdsec-builder 7/11
Error: go: github.com/jackc/pgx/v4@v5.9.1: invalid version: go.mod has non-.../v4 module path "github.com/jackc/pgx/v5" (and .../v4/go.mod does not exist) at revision v5.9.1
Root Cause
Dockerfile line 386 specifies go get github.com/jackc/pgx/v4@v5.9.1. This mixes the v4 module path with a v5 version tag. Go's semantic import versioning rejects this because tag v5.9.1 declares module path github.com/jackc/pgx/v5 in its go.mod.
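The rule that rejected this command can be checked before a build is attempted. A simplified sketch of the major-version consistency check follows. check_module_major is a hypothetical helper; it covers only the /vN path suffix rule and +incompatible tags, and it does not handle gopkg.in-style paths.

```shell
# check_module_major MODULE_PATH VERSION
# Exit status 0 when VERSION's major number is consistent with the
# /vN suffix of MODULE_PATH under semantic import versioning.
check_module_major() {
  mod_path=$1
  version=$2

  # Major number of the version tag, e.g. v5.9.1 -> 5.
  ver_major=${version#v}
  ver_major=${ver_major%%.*}

  # Major number implied by the last path element, e.g. ".../pgx/v4" -> 4.
  suffix=${mod_path##*/}
  path_major=""
  case "$suffix" in
    v[0-9]*)
      path_major=${suffix#v}
      case "$path_major" in
        *[!0-9]*) path_major="" ;;   # e.g. "v2ray" is not a major suffix
      esac
      ;;
  esac

  if [ -n "$path_major" ]; then
    [ "$ver_major" = "$path_major" ]
  else
    # Unsuffixed paths accept v0/v1, or v2+ tags marked +incompatible.
    case "$version" in
      *+incompatible) return 0 ;;
    esac
    [ "$ver_major" = "0" ] || [ "$ver_major" = "1" ]
  fi
}
```

With this sketch, check_module_major github.com/jackc/pgx/v4 v5.9.1 fails while the same path with v4.18.3 succeeds, mirroring the failing and corrected Dockerfile lines.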
Fix
Dockerfile line 386 — change:
go get github.com/jackc/pgx/v4@v5.9.1 && \
to:
go get github.com/jackc/pgx/v4@v4.18.3 && \
No changes needed to the Renovate annotation (line 385) or the CVE comment (line 384) — both are already correct.
Why v4.18.3
- CrowdSec v1.7.7 uses github.com/jackc/pgx/v4 v4.18.2 (direct dependency)
- v4.18.3 is the latest and likely final v4 release
- pgproto3/v2 is archived at v2.3.3 (July 2025) — no fix will be released in the v2 line
- The CVE (pgproto3/v2 buffer overflow) can only be fully resolved by CrowdSec migrating to pgx/v5 upstream
- Bumping pgx/v4 to v4.18.3 gets the latest v4 maintenance patch; the CVE remains an accepted risk per §5
Validation
The same docker build that previously failed at step 7/11 should now pass through the CrowdSec dependency patching stage and proceed to compilation (steps 8-11).
9. Commands Reference
# === Backend dependency upgrades ===
cd /projects/Charon/backend
go get google.golang.org/grpc@v1.80.0
go get go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp@v1.43.0
go mod tidy
# === Validate backend ===
go build ./cmd/api
go test ./...
go vet ./...
govulncheck ./...
# === Docker build (after Dockerfile edits) ===
cd /projects/Charon
docker build -t charon:vuln-fix .
# === Scan built image ===
docker run --rm \
-v /var/run/docker.sock:/var/run/docker.sock \
aquasec/trivy:latest image \
--severity CRITICAL,HIGH \
charon:vuln-fix
# === Quick container health check ===
docker run -d --name charon-vuln-test -p 8080:8080 charon:vuln-fix
sleep 10
curl -f http://localhost:8080/health
docker stop charon-vuln-test && docker rm charon-vuln-test
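The fixed sleep 10 in the health check can be replaced by polling, which returns sooner on a fast start and tolerates a slower one. A minimal sketch follows; retry is a hypothetical helper, and the attempt count and delay in the usage note are arbitrary assumptions rather than tuned values.

```shell
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARG...]
# Runs COMMAND until it succeeds or ATTEMPTS is exhausted.
# Exit status 0 on the first success, 1 if every attempt fails.
retry() {
  attempts=$1
  delay=$2
  shift 2
  n=1
  while [ "$n" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Sleep between attempts, but not after the last one.
    if [ "$n" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    n=$((n + 1))
  done
  return 1
}
```

For instance, retry 10 2 curl -fsS http://localhost:8080/health would probe the endpoint up to ten times, two seconds apart, in place of the sleep and single curl call.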