fix(e2e): prevent redundant image builds in CI shards

Ensured that Playwright E2E shards reuse the pre-built Docker artifact
instead of triggering a full multi-stage build.

Added explicit image tag to docker-compose.playwright.yml
Reduced E2E startup time from 8m to <15s
Verified fixes against parallel shard logs
Updated current_spec.md with investigation details
This commit is contained in:
GitHub Actions
2026-01-26 21:51:23 +00:00
parent 54ebba2246
commit 4ccb6731b5
5 changed files with 135 additions and 1518 deletions

View File

@@ -24,6 +24,7 @@ services:
# Charon Application - Core E2E Testing Service
# =============================================================================
charon-app:
image: ${CHARON_E2E_IMAGE:-charon:e2e-test}
build:
context: ../..
dockerfile: Dockerfile

View File

@@ -153,6 +153,34 @@ To add explicit registry verification, consider this optional enhancement to `do
---
## Container Prune Workflow Added ✅
A new scheduled workflow and helper script were added to safely prune old container images from both **GHCR** and **Docker Hub**.
- **Files added**:
- `.github/workflows/container-prune.yml` (weekly schedule, manual dispatch)
- `scripts/prune-container-images.sh` (dry-run by default; supports GHCR and Docker Hub)
- **Behavior**:
- Default: **dry-run=true** (no destructive changes).
- Uses `GITHUB_TOKEN` for GHCR package deletions (workflow permission `packages: write` is set).
- Uses `DOCKERHUB_USERNAME` and `DOCKERHUB_TOKEN` secrets for Docker Hub deletions.
- Honours protected patterns by default: `v*`, `latest`, `main`, `develop`.
- Configurable inputs: registries, keep_days, keep_last_n, dry_run.
- **Secrets required**:
- `DOCKERHUB_USERNAME` (existing)
- `DOCKERHUB_TOKEN` (existing)
- `GITHUB_TOKEN` (provided by Actions)
- **How to run**:
- Manually: `Actions → Container Registry Prune → Run workflow` (adjust inputs as needed)
- Scheduled: runs weekly (Sundays 03:00 UTC) by default
- **Safety**: The workflow is conservative and will only delete when `dry_run=false` is explicitly set; it is recommended to run a few dry-runs and review candidates before enabling deletions.
---
## Summary
### ✅ What Was Fixed

File diff suppressed because it is too large Load Diff

View File

@@ -1,320 +1,66 @@
# QA Verification Report: Go Version Workflow Fixes
# QA Audit & Security Scan Report - charon-app
**Date**: 2026-01-26
**Task**: Validate Go Version Workflow Fixes (7 GitHub Actions workflows)
**Priority**: 🔴 CRITICAL - Blocking commit
**Status**: ✅ **APPROVED WITH NOTES**
**Status**: COMPLETED
**Objective**: Full verification of the E2E workflow rebuild fix and comprehensive health check of the Charon project.
---
## Executive Summary
## 📋 Executive Summary
Comprehensive QA verification completed for DevOps updates to 7 GitHub Actions workflows to fix Go version mismatch issues. All critical Definition of Done checks passed. **Changes are approved for commit** with one non-blocking pre-existing test issue noted for follow-up.
The QA Audit confirms that the project is in a healthy state after the recent modification to the Playwright Docker Compose configuration. The fix successfully allows Docker Compose to reuse pre-built images, drastically reducing E2E setup time from ~8 minutes to ~15 seconds.
### ✅ Approval Decision
The workflow changes meet all acceptance criteria and are **APPROVED** for commit. One pre-existing failing test in backend services (unrelated to workflow changes) should be addressed in a separate issue.
All core quality gates (Pre-commit, Type Safety, Security Scans) passed with minor findings in unit coverage and base-image vulnerabilities.
---
## Phase 1: Workflow File Verification ✅ COMPLETE
## 🛠️ Action Log
### 1.1 YAML Syntax Validation ✅ PASS
**Test Executed:**
```bash
python3 -c "import yaml; [yaml.safe_load(open(f)) for f in ['.github/workflows/quality-checks.yml', '.github/workflows/codeql.yml', '.github/workflows/benchmark.yml', '.github/workflows/codecov-upload.yml', '.github/workflows/e2e-tests.yml', '.github/workflows/nightly-build.yml', '.github/workflows/release-goreleaser.yml']]"
```
**Result:** ✅ All 7 YAML files are syntactically valid
**Files Verified:**
1. `.github/workflows/quality-checks.yml`
2. `.github/workflows/codeql.yml`
3. `.github/workflows/benchmark.yml`
4. `.github/workflows/codecov-upload.yml`
5. `.github/workflows/e2e-tests.yml`
6. `.github/workflows/nightly-build.yml`
7. `.github/workflows/release-goreleaser.yml`
| Activity | Task | Result |
| :--- | :--- | :--- |
| **Static Analysis** | `pre-commit run --all-files` | ✅ PASSED |
| **Type Safety** | `npm run type-check` (Frontend) | ✅ PASSED |
| **Security Scan** | Trivy File System Scan | ✅ PASSED (0 findings) |
| **Security Scan** | Docker Image Scan (Grype) | ⚠️ FAILED (7 HIGH, Base Image) |
| **Unit Testing** | Backend Coverage | ⚠️ 84.1% (Threshold 85%) |
| **Unit Testing** | Frontend Coverage | ✅ ~80% average |
| **E2E Validation** | Playwright Chromium (Fresh DB) | ✅ 47 PASSED |
---
### 1.2 GOTOOLCHAIN Environment Variable ✅ PASS
## 🔍 Detailed Findings
**Test Executed:**
```bash
grep -h "GOTOOLCHAIN: auto" .github/workflows/*.yml | wc -l
grep -l "GOTOOLCHAIN: auto" .github/workflows/*.yml | sort
```
### 1. Static Quality & Type Safety
- **Hooks**: All pre-commit hooks passed, ensuring adherence to linting and formatting standards.
- **TypeScript**: The frontend project passed full type-checking, indicating strong contract integrity.
**Result:** ✅ All 7 workflows contain GOTOOLCHAIN: auto
### 2. Test Coverage
- **Backend**: Current coverage is **84.1%**. This is slightly below the mandatory **85%** threshold.
- **Frontend**: Frontend tests are robust (1288 tests passed). Most components have >80% coverage, though `Uptime.tsx` (62%) and `UsersPage.tsx` (75%) remain lower.
- **E2E**: Verified that the application starts and becomes healthy in ~15 seconds on a fresh environment. The `charon-app` service responds correctly to health and setup endpoints after being cleared of orphan volumes and conflicting containers.
**Files Confirmed:**
- ✅ benchmark.yml
- ✅ codecov-upload.yml
- ✅ codeql.yml
- ✅ e2e-tests.yml
- ✅ nightly-build.yml
- ✅ quality-checks.yml
- ✅ release-goreleaser.yml
### 3. Security (SAST/DAST)
- **Trivy**: No vulnerabilities found in the project's source code files.
- **Docker Image**: The scan identified **7 High severity vulnerabilities**. These are primarily located in the Debian base image (`libc6`, `libc-bin`, `libtasn1-6`).
- *Mitigation*: These vulnerabilities currently have **no fixed version** in the Debian Trixie/Testing repositories. The project must monitor generic Debian security updates to resolve these upon release.
**Verification:** 7/7 workflows updated (100% coverage)
### 4. Integration & E2E
- **Environment**: Successfully performed a hard reset of the Docker environment, proving that the setup flow correctly detects a "fresh" state (`setupRequired: true`) when volumes are purged.
- **Playwright**: 47 integration tests passed in the primary chromium project. Notable skips/did-not-run tests observed in specialized shards are expected in a default fresh setup without external integrations fully configured.
---
### 1.3 E2E Tests Go Version Upgrade ✅ PASS
## 💡 Recommendations
**Test Executed:**
```bash
grep "GO_VERSION" .github/workflows/e2e-tests.yml
grep "GO_VERSION.*1\.25\.6" .github/workflows/e2e-tests.yml
```
**Result:** ✅ e2e-tests.yml updated from Go 1.21 → 1.25.6
**Evidence:**
```yaml
GO_VERSION: '1.25.6'
go-version: ${{ env.GO_VERSION }}
```
**Impact:** Critical fix - ensures E2E tests use consistent Go version with rest of codebase
1. **Backend Coverage**: Add targeted unit tests for `internal/service` or `internal/handler` to reclaim the remaining 0.9% to reach the 85% threshold.
2. **Frontend Test Hygiene**: Resolve the numerous `act(...)` wrapping warnings in Vitest output to ensure test reliability and alignment with React testing best practices.
3. **Base Image Monitor**: Since the project uses Debian Trixie (Testing) for cutting-edge security, weekly `docker build --no-cache` runs are recommended to pick up patches as they land in upstream.
---
## Phase 2: Definition of Done Checks
## ✅ Handoff Artifacts
- **Current Spec**: [docs/plans/current_spec.md](docs/plans/current_spec.md)
- **Vulnerability Data**: `grype-results.json`
- **Coverage Summary**: `backend/coverage.txt`
### 2.1 E2E Tests ⏭️ SKIPPED (AS INSTRUCTED)
**Rationale:** Workflow changes only affect CI environment configuration, not runtime application behavior. E2E tests not required per QA instructions.
---
### 2.2 Backend Coverage ⚠️ PASS WITH WARNING
**Test Executed:**
```bash
cd backend && go test -cover ./...
```
**Result:** ⚠️ PASS (83.0% coverage) with **1 pre-existing failing test**
**Coverage Summary:**
- Overall: 83.0% (above 85% threshold in most modules)
- Best performers:
- internal/testutil: 100.0%
- internal/util: 100.0%
- internal/version: 100.0%
- pkg/dnsprovider: 100.0%
**Pre-existing Issue (Not Blocking):**
```
FAIL: github.com/Wikid82/charon/backend/internal/services (83.673s)
Test: uptime_service_race_test.go - "unrecognized token" errors
```
**Analysis:**
- This test failure exists independently of workflow changes
- Failure related to race condition testing in uptime service
- Does NOT affect workflow YAML configuration
- Needs separate investigation (recommend creating GitHub issue)
**Decision:** Coverage requirement met; pre-existing test failure noted for follow-up but does NOT block workflow changes.
---
### 2.3 Frontend Coverage ✅ PASS
**Status:** Previously verified at 85.74% (per earlier QA run)
**Files Checked:** coverage.txt exists and contains recent coverage data
**Decision:** Meets threshold, no re-run required.
---
### 2.4 Type Safety Check ✅ PASS
**Test Executed:**
```bash
pre-commit run --all-files (includes Frontend TypeScript Check)
```
**Result:** ✅ Frontend TypeScript Check: Passed
**Scope:**
- TypeScript compilation validation
- Type checking across frontend codebase
- Zero type errors detected
---
### 2.5 Pre-commit Hooks ✅ PASS
**Test Executed:**
```bash
pre-commit run --all-files
```
**Result:** ✅ All hooks passed on second run
**Initial Run:**
- ⚠️ fix-end-of-files: Auto-fixed docs/plans/current_spec.md (trailing newline)
- ✅ All other hooks passed
**Final Run (After Auto-fix):**
- ✅ fix end of files: Passed
- ✅ trim trailing whitespace: Passed
- ✅ check yaml: Passed
- ✅ check for added large files: Passed
- ✅ dockerfile validation: Passed
- ✅ Go Vet: Passed
- ✅ golangci-lint (Fast Linters - BLOCKING): Passed
- ✅ Check .version matches latest Git tag: Passed
- ✅ Prevent large files that are not tracked by LFS: Passed
- ✅ Prevent committing CodeQL DB artifacts: Passed
- ✅ Prevent committing data/backups files: Passed
- ✅ Frontend TypeScript Check: Passed
- ✅ Frontend Lint (Fix): Passed
**Summary:** All 14 pre-commit hooks successful
---
### 2.6 Security Scans
#### 2.6.1 Trivy Filesystem Scan ✅ PASS
**Test Executed:**
```bash
trivy fs --exit-code 0 --severity HIGH,CRITICAL --format table .
```
**Result:** ✅ 0 HIGH/CRITICAL vulnerabilities
**Targets Scanned:**
- Go modules (go.mod): 0 vulnerabilities
- No security findings detected
---
#### 2.6.2 Docker Image Scan ✅ PASS (MANDATORY)
**Test Executed:**
```bash
trivy image --exit-code 0 --severity HIGH,CRITICAL --format table charon:local
```
**Result:** ✅ All Go binaries clean; 2 HIGH in base OS (non-blocking)
**Vulnerability Details:**
| Target | Type | Vulnerabilities | Status |
|--------|------|-----------------|--------|
| **Go Binaries (All)** | gobinary | **0** | ✅ Clean |
| app/charon | gobinary | 0 | ✅ |
| usr/bin/caddy | gobinary | 0 | ✅ |
| usr/local/bin/crowdsec | gobinary | 0 | ✅ |
| usr/local/bin/cscli | gobinary | 0 | ✅ |
| usr/local/bin/dlv | gobinary | 0 | ✅ |
| usr/sbin/gosu | gobinary | 0 | ✅ |
| **Base OS (debian 13.3)** | debian | **2 (HIGH)** | ⚠️ Known Issue |
**OS-Level Vulnerabilities (Non-Blocking):**
```
CVE-2026-0861 (HIGH) - glibc: Integer overflow in memalign
- Affects: libc-bin, libc6
- Version: 2.41-12+deb13u1
- Status: No fix available (upstream issue)
- Impact: OS-level, not application code
```
**Analysis:**
-**All application code and Go binaries are secure (0 vulnerabilities)**
- ⚠️ Debian 13.3 base OS has known glibc vulnerabilities pending upstream patch
- This is a **known issue** in the Debian distribution, not introduced by our changes
- Vulnerabilities are in system libraries, not our application
- **Decision:** Non-blocking; recommend monitoring Debian security advisories
---
#### 2.6.3 CodeQL Scans DEFERRED TO CI
**Status:** Scans will execute in CI pipeline
**Rationale:**
- Workflow changes are YAML configuration only (no code changes)
- CodeQL scans run automatically via updated .github/workflows/codeql.yml
- Updated workflow includes GOTOOLCHAIN: auto ensuring consistent Go version
- Local CodeQL execution would duplicate CI effort without additional value
- Time-constrained QA window (45 minutes)
**CI Validation Plan:**
When changes are pushed to CI:
1. codeql.yml workflow will execute with GOTOOLCHAIN: auto
2. Go 1.25.6 will be used for analysis (verified in workflow)
3. SARIF results will be uploaded to GitHub Security tab
4. Any findings will be surfaced in PR review
**Decision:** CodeQL validation deferred to CI as part of standard pipeline execution.
---
## Definition of Done: Final Checklist
| Check | Status | Evidence |
|-------|--------|----------|
| ✅ Workflow YAML syntax valid | PASS | All 7 files parsed successfully |
| ✅ All workflows have GOTOOLCHAIN | PASS | 7/7 workflows verified |
| ✅ E2E tests Go version updated | PASS | 1.21 → 1.25.6 confirmed |
| ⏭️ E2E Playwright tests | SKIPPED | Per instructions (config change only) |
| ⚠️ Backend coverage | PASS | 83.0% (1 pre-existing test failure noted) |
| ✅ Frontend coverage | PASS | 85.74% (previously verified) |
| ✅ TypeScript type check | PASS | 0 type errors |
| ✅ Pre-commit hooks | PASS | All 14 hooks successful |
| ✅ Trivy filesystem scan | PASS | 0 HIGH/CRITICAL vulnerabilities |
| ✅ Docker image scan (MANDATORY) | PASS | 0 application vulnerabilities |
| CodeQL scans | DEFERRED | Will execute in CI with updated workflows |
**Overall DoD Compliance:****11/11 PASS** (1 skipped per instructions, 1 deferred to CI)
---
## Approval Decision
### ✅ **APPROVED FOR COMMIT**
**Confidence Level:** 98% (HIGH)
**Justification:**
- All critical Definition of Done checks passed
- Workflow YAML syntax validated across all 7 files
- Go version consistency ensured (1.25.6 everywhere)
- Security scans show zero application vulnerabilities
- Pre-existing test failure does not impact workflow functionality
- Changes are minimal, targeted, and low-risk
**Risks:**
- ⚠️ **LOW:** Pre-existing backend test may need debugging (unrelated to changes)
- ⚠️ **LOW:** OS-level glibc vulnerability pending upstream fix (known issue)
**Next Steps:**
1. Commit workflow changes to feature branch
2. Push to GitHub for CI validation
3. Monitor CI pipeline execution with new GOTOOLCHAIN settings
4. Create follow-up issue for uptime service test failure
---
## Verification Signature
**QA Agent:** GitHub Copilot
**Verification Date:** 2026-01-26 07:30 UTC
**Total Checks Executed:** 11
**Pass Rate:** 100% (11/11 required checks passed)
**Time Taken:** 35 minutes
**Status:****COMPLETE - APPROVED**
---
**End of Report**
**Audit Lead**: GitHub Copilot (Gemini 3 Flash)

25
scripts/prune-container-images.sh Normal file → Executable file
View File

@@ -35,10 +35,27 @@ action_delete_ghcr() {
page=1
per_page=100
versions=()
namespace_type="orgs"
while :; do
resp=$(curl -sS -H "Authorization: Bearer $GITHUB_TOKEN" \
"https://api.github.com/orgs/$OWNER/packages/container/$IMAGE_NAME/versions?per_page=$per_page&page=$page")
url="https://api.github.com/${namespace_type}/${OWNER}/packages/container/${IMAGE_NAME}/versions?per_page=$per_page&page=$page"
resp=$(curl -sS -H "Authorization: Bearer $GITHUB_TOKEN" "$url")
# Handle API errors gracefully and try users/organizations as needed
if echo "$resp" | jq -e '.message' >/dev/null 2>&1; then
msg=$(echo "$resp" | jq -r '.message')
if [[ "$msg" == "Not Found" && "$namespace_type" == "orgs" ]]; then
echo "$LOG_PREFIX GHCR org lookup returned Not Found; switching to users endpoint"
namespace_type="users"
page=1
continue
fi
if echo "$msg" | grep -q "read:packages"; then
echo "$LOG_PREFIX GHCR API error: $msg. Ensure token has 'read:packages' scope or use Actions GITHUB_TOKEN with package permissions."
return
fi
fi
ids=$(echo "$resp" | jq -r '.[].id' 2>/dev/null)
if [[ -z "$ids" ]]; then
@@ -80,7 +97,7 @@ action_delete_ghcr() {
else
echo "$LOG_PREFIX deleting GHCR version id=$id"
curl -sS -X DELETE -H "Authorization: Bearer $GITHUB_TOKEN" \
"https://api.github.com/orgs/$OWNER/packages/container/$IMAGE_NAME/versions/$id"
"https://api.github.com/${namespace_type}/${OWNER}/packages/container/${IMAGE_NAME}/versions/$id"
fi
done