
GitHub Actions 3336aae2a0 chore: enforce local patch coverage as a blocking DoD gate

- Added ~40 backend tests covering uncovered branches in CrowdSec
  dashboard handlers (error paths, validation, export edge cases)
- Patch coverage improved from 81.5% to 98.3%, exceeding 90% threshold
- Fixed DoD ordering: coverage tests now run before the patch report
  (the report requires coverage artifacts as input)
- Rewrote the local patch coverage DoD step in both the Management agent
  and testing instructions to clarify purpose, prerequisites, required
  action on findings, and blocking gate semantics
- Eliminated ambiguous "advisory" language that allowed agents to skip
  acting on uncovered lines

2026-03-25 19:33:19 +00:00

13 KiB



applyTo	description
**	Strict protocols for test execution, debugging, and coverage validation.

Testing Protocols

Governance Note: This file is subject to the precedence hierarchy defined in .github/instructions/copilot-instructions.md. When conflicts arise, canonical instruction files take precedence over agent files and operator documentation.

0. E2E Verification First (Playwright)

MANDATORY: Before running unit tests, verify the application UI/UX functions correctly end-to-end.

0.5 Local Patch Coverage Report (After Coverage Tests)

MANDATORY: After running backend and frontend coverage tests (which generate backend/coverage.txt and frontend/coverage/lcov.info), run the local patch report to identify uncovered lines in changed files.

Purpose: Overall coverage can be healthy while the specific lines you changed are untested. This step catches that gap. If uncovered lines are found in feature code, add targeted tests before completing the task.

Prerequisites: Coverage artifacts must exist before running the report:

backend/coverage.txt — generated by scripts/go-test-coverage.sh
frontend/coverage/lcov.info — generated by scripts/frontend-test-coverage.sh

Run one of the following from /projects/Charon:

# Preferred (task)
Test: Local Patch Report

# Script
bash scripts/local-patch-report.sh

Required output artifacts:

test-results/local-patch-report.md
test-results/local-patch-report.json

Action on results: If patch coverage for any changed file is below 90%, add tests targeting the uncovered changed lines. Re-run coverage and this report to verify improvement. Artifact generation is required for DoD regardless of threshold results.
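The ordering in this step (coverage artifacts first, report second, output artifacts last) can be sketched as a small preflight check. This is a minimal sketch, not a script from the repository: the function names `missing_files`, `patch_report_inputs`, and `patch_report_outputs` are hypothetical, and only the file paths come from this document.

```shell
# Hypothetical helper (not part of the repo): print, space-separated, every
# listed path that is not a regular file under the root directory in $1.
missing_files() {
  root=$1
  shift
  missing=""
  for rel in "$@"; do
    [ -f "$root/$rel" ] || missing="$missing $rel"
  done
  # Strip the leading space added by the loop.
  printf '%s\n' "${missing# }"
}

# Coverage inputs that must exist before running the report.
patch_report_inputs() {
  missing_files "$1" backend/coverage.txt frontend/coverage/lcov.info
}

# Report artifacts that must exist after running the report.
patch_report_outputs() {
  missing_files "$1" test-results/local-patch-report.md test-results/local-patch-report.json
}
```

An empty result from `patch_report_inputs /projects/Charon` would mean the report can be run; an empty result from `patch_report_outputs /projects/Charon` afterwards would confirm both artifacts were written.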

PREREQUISITE: Start E2E Environment

CRITICAL: Rebuild the E2E container when application or Docker build inputs change. If changes are test-only and the container is already healthy, reuse it. If the container is not running or state is suspect, rebuild.

Rebuild required (application/runtime changes):

Application code or dependencies: backend/, frontend/, backend/go.mod, backend/go.sum, package.json, package-lock.json.
Container build/runtime configuration: Dockerfile, .docker/**, .docker/compose/docker-compose.playwright-*.yml, .docker/docker-entrypoint.sh.
Runtime behavior changes baked into the image.

Rebuild optional (test-only changes):

Playwright tests and fixtures: tests/**.
Playwright config and runners: playwright.config.js, playwright.caddy-debug.config.js.
Documentation or planning files: docs/**, requirements.md, design.md, tasks.md.
CI/workflow changes that do not affect runtime images: .github/workflows/**.
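The two lists above amount to a path-based decision, which might be sketched as follows. The function name `e2e_rebuild_decision` is hypothetical, and treating paths that appear in neither list as "rebuild" is an assumption (the conservative choice when container state may be suspect).

```shell
# Hypothetical helper: print "rebuild" if any changed path requires a new
# E2E image, otherwise "reuse". Paths are passed as arguments.
e2e_rebuild_decision() {
  for path in "$@"; do
    case "$path" in
      # Rebuild optional: test-only, docs, and workflow changes.
      tests/*|playwright.config.js|playwright.caddy-debug.config.js) ;;
      docs/*|requirements.md|design.md|tasks.md) ;;
      .github/workflows/*) ;;
      # Rebuild required: application code, dependencies, container config.
      backend/*|frontend/*|package.json|package-lock.json) echo "rebuild"; return ;;
      Dockerfile|.docker/*) echo "rebuild"; return ;;
      # Assumption: any unlisted path is treated as requiring a rebuild.
      *) echo "rebuild"; return ;;
    esac
  done
  echo "reuse"
}
```

A "reuse" result only applies when the container is already running and healthy; otherwise rebuild regardless of which files changed.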

When a rebuild is required (or the container is not running), use:

.github/skills/scripts/skill-runner.sh docker-rebuild-e2e

This step:

Builds the latest Docker image with your code changes
Starts the charon-e2e container with proper environment variables from .env
Exposes required ports: 8080 (app), 2020 (emergency), 2019 (Caddy admin)
Waits for health check to pass

Without this step, tests will fail with:

connect ECONNREFUSED ::1:2020 - Emergency server not running
connect ECONNREFUSED ::1:8080 - Application not running
501 Not Implemented - Container missing required env vars
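As a rough triage aid, the three failure signatures above can be mapped back to their causes. This is only a sketch: `diagnose_e2e_failure` is a hypothetical name, and the substring patterns assume the error text looks like the lines listed above.

```shell
# Hypothetical helper: map one line of test output to the likely cause
# from the list above. Patterns are matched as substrings.
diagnose_e2e_failure() {
  case "$1" in
    *ECONNREFUSED*:2020*) echo "emergency server not running" ;;
    *ECONNREFUSED*:8080*) echo "application not running" ;;
    *"501 Not Implemented"*) echo "container missing required env vars" ;;
    *) echo "unknown" ;;
  esac
}
```

For all three known causes the remedy is the same: rebuild the E2E container with the docker-rebuild-e2e skill.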

Testing Scope Clarification

Playwright E2E Tests (UI/UX):

Test user interactions with the React frontend
Verify UI state changes when settings are toggled
Ensure forms submit correctly
Check navigation and page rendering
Port: 8080 (Charon Management Interface)
Default Browser: Firefox (provides best cross-browser compatibility baseline)

Integration Tests (Middleware Enforcement):

Test Cerberus security module enforcement
Verify ACL, WAF, Rate Limiting, CrowdSec actually block/allow requests
Test requests routing through Caddy proxy with full middleware
Port: 80 (User Traffic via Caddy)
Location: backend/integration/ with //go:build integration tag
CI: Runs in separate workflows (cerberus-integration.yml, waf-integration.yml, etc.)

Two Modes: Docker vs Vite

Playwright E2E tests can run in two modes with different capabilities:

Mode	Base URL	Coverage Support	When to Use
Docker	`http://localhost:8080`	❌ No (0% reported)	Integration testing, CI validation
Vite Dev	`http://localhost:5173`	✅ Yes (real coverage)	Local development, coverage collection

Why? The @bgotink/playwright-coverage library uses V8 coverage, which requires access to source files. Only the Vite dev server exposes the source maps and raw source files needed for coverage instrumentation.

Running E2E Tests (Integration Mode)

For general integration testing without coverage:

# Against Docker container (default)
cd /projects/Charon && npx playwright test --project=chromium --project=firefox --project=webkit

# With explicit base URL
PLAYWRIGHT_BASE_URL=http://localhost:8080 npx playwright test --project=chromium --project=firefox --project=webkit

Running E2E Tests with Coverage

IMPORTANT: Use the dedicated skill for coverage collection:

# Recommended: Uses skill that starts Vite and runs against localhost:5173
.github/skills/scripts/skill-runner.sh test-e2e-playwright-coverage

The coverage skill:

Starts Vite dev server on port 5173
Sets PLAYWRIGHT_BASE_URL=http://localhost:5173
Runs tests with V8 coverage collection
Generates reports in coverage/e2e/ (LCOV, HTML, JSON)

DO NOT expect coverage when running against Docker:

# ❌ WRONG: Coverage will show "Unknown% (0/0)"
PLAYWRIGHT_BASE_URL=http://localhost:8080 npx playwright test --coverage

# ✅ CORRECT: Use the coverage skill
.github/skills/scripts/skill-runner.sh test-e2e-playwright-coverage
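A guard along these lines could catch the mistake before a long test run. It is a sketch under one assumption: the mode is inferred from the port in the base URL alone (5173 for Vite dev, anything else treated as no coverage). The name `coverage_supported` is hypothetical.

```shell
# Hypothetical helper: print "yes" if the given Playwright base URL points
# at the Vite dev server (port 5173), otherwise "no".
# Assumption: the port alone identifies the mode.
coverage_supported() {
  case "$1" in
    *:5173|*:5173/*) echo "yes" ;;
    *) echo "no" ;;
  esac
}
```

For example, a script might check `coverage_supported "$PLAYWRIGHT_BASE_URL"` and stop with a pointer to the coverage skill when the answer is "no".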

Verifying Coverage Locally Before CI

Before pushing code, verify E2E coverage:

Run the coverage skill:

.github/skills/scripts/skill-runner.sh test-e2e-playwright-coverage

Check coverage output:

# View HTML report
open coverage/e2e/index.html

# Check LCOV file exists for Codecov
ls -la coverage/e2e/lcov.info

Verify non-zero coverage:

# lcov.info holds raw records, not percentages;
# look for DA:<line>,<hits> entries with non-zero hit counts
head -20 coverage/e2e/lcov.info
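Because an LCOV file stores per-file LF (lines found) and LH (lines hit) records rather than a percentage, a single overall figure has to be computed. A minimal sketch, with the hypothetical function name `lcov_line_coverage`:

```shell
# Hypothetical helper: print overall line coverage (one decimal place) for
# an LCOV file by summing its LF (lines found) and LH (lines hit) records.
lcov_line_coverage() {
  awk -F: '
    /^LF:/ { found += $2 }
    /^LH:/ { hit += $2 }
    END {
      if (found == 0) { print "0.0" }
      else { printf "%.1f\n", 100 * hit / found }
    }
  ' "$1"
}
```

Running `lcov_line_coverage coverage/e2e/lcov.info` after the coverage skill should then print a non-zero value; "0.0" suggests the run was made against Docker rather than Vite.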

General Guidelines

No Truncation: Never pipe Playwright test output through head, tail, or other truncating commands. Playwright runs interactively and requires user input to quit when piped, causing the command to hang indefinitely.
Why First: If the application is broken at the E2E level, unit tests may need updates. Playwright catches integration issues early.
On Failure: Analyze failures, trace root cause through frontend → backend flow, then fix before proceeding to unit tests.
Scope: Run relevant test files for the feature being modified (e.g., tests/manual-dns-provider.spec.ts).

1. Execution Environment

No Truncation: Never use pipe commands (e.g., head, tail) or flags that limit stdout/stderr. If a test hangs, it is likely waiting for interactive input or caught in a loop; analyze the full output to identify where it is blocked.
Task-Based Execution: Do not manually construct test strings. Use existing project tasks (e.g., npm test, go test ./...). If a specific sub-module requires frequent testing, generate a new task definition in the project's configuration file (e.g., .vscode/tasks.json) before proceeding.

2. Failure Analysis & Logic Integrity

Evidence-Based Debugging: When a test fails, you must quote the specific error message or stack trace before suggesting a fix.
Bug vs. Test Flaw: Treat the test as the "Source of Truth." If a test fails, assume the code is broken until proven otherwise. Research the original requirement or PR description to verify if the test logic itself is outdated before modifying it.
Zero-Hallucination Policy: Only use file paths and identifiers discovered via the ls or search tools. Never guess a path based on naming conventions.

3. Coverage & Completion

Coverage Gate: A task is not "Complete" until a coverage report is generated.
Threshold Compliance: You must compare the final coverage percentage against the project's threshold (Default: 85% unless specified otherwise). If coverage drops, you must identify the "uncovered lines" and add targeted tests.
Patch Coverage (Suggestion): Codecov reports patch coverage on the PR as an indicator. Developers should aim for 100% coverage of modified lines, but the Codecov figure is not a hard requirement and will not block PR approval. This does not relax the local patch report in section 0.5, which remains mandatory: if it shows any changed file below 90%, add targeted tests.
Review Patch Coverage: When reviewing Codecov patch coverage reports, assess whether missing lines represent genuine gaps or are acceptable (e.g., error handling branches, deprecated code paths). Use the Codecov report to inform testing decisions, not as an absolute gate.
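The Threshold Compliance rule above can be checked mechanically. The sketch below makes two assumptions: backend/coverage.txt is a Go cover profile readable by `go tool cover -func`, and the function names `go_total_coverage` and `coverage_gate` are hypothetical.

```shell
# Hypothetical helper: read `go tool cover -func` output on stdin and print
# the total statement coverage without the trailing percent sign.
go_total_coverage() {
  awk '$1 == "total:" { sub(/%$/, "", $NF); print $NF }'
}

# Hypothetical helper: print "pass" when coverage ($1) meets the threshold
# ($2, defaulting to the 85% project default), otherwise "fail".
coverage_gate() {
  awk -v cov="$1" -v min="${2:-85}" \
    'BEGIN { if (cov + 0 >= min + 0) print "pass"; else print "fail" }'
}
```

One possible use: `coverage_gate "$(go tool cover -func=backend/coverage.txt | go_total_coverage)"`. A "fail" result means uncovered lines must be identified and targeted tests added before the task is complete.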

4. GORM Security Validation (Manual Stage)

Requirement: For any change that touches backend models or database-related logic, the GORM Security Scanner is a mandatory local DoD gate and must pass with zero CRITICAL/HIGH findings.

Policy vs. Automation Reconciliation: "Manual stage" describes the execution mechanism only (it is not an automated pre-commit hook); policy enforcement remains process-blocking for DoD. Gate decisions must use check semantics (./scripts/scan-gorm-security.sh --check or equivalent task wiring).

When to Run (Conditional Trigger Matrix)

Mandatory Trigger Paths (Include):

backend/internal/models/** — GORM model definitions
Backend services/repositories with GORM query logic
Database migrations or seeding logic affecting model persistence behavior

Explicit Exclusions:

Docs-only changes (**/*.md, governance documentation)
Frontend-only changes (frontend/**)

Gate Decision Rule: IF any Include path matches, THEN scanner execution in check mode is a mandatory DoD gate. IF only Exclude paths match, THEN the GORM gate is not required for that change set.
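The decision rule can be sketched as a path classifier. Only backend/internal/models/** is a purely path-based Include trigger; the other Include entries depend on file contents, so this sketch prints "review" for paths in neither list. That third outcome and the name `gorm_gate_decision` are assumptions, not part of the policy above.

```shell
# Hypothetical helper: classify a change set for the GORM gate.
# Prints "required" if any path is under backend/internal/models/,
# "not-required" if every path matches an Exclude pattern, and "review"
# otherwise (assumption: unlisted paths need a manual check for GORM logic).
gorm_gate_decision() {
  decision="not-required"
  for path in "$@"; do
    case "$path" in
      backend/internal/models/*) echo "required"; return ;;
      *.md|frontend/*) ;;
      *) decision="review" ;;
    esac
  done
  echo "$decision"
}
```

A "review" result means a person or agent still has to decide whether the changed services, repositories, or migrations contain GORM query logic.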

Definition of Done

Before Committing: When modifying trigger paths listed above
Before Opening PR: Verify no security issues introduced
After Code Review: If model-related changes were requested
Blocking Gate: Scanner must pass with zero CRITICAL/HIGH issues before task completion

Running the Scanner

Via VS Code (Recommended for Development):

Open Command Palette (Cmd/Ctrl+Shift+P)
Select "Tasks: Run Task"
Choose "Lint: GORM Security Scan"

Via Pre-commit (Manual Stage):

# Run on all Go files
pre-commit run --hook-stage manual gorm-security-scan --all-files

# Run on staged files only
pre-commit run --hook-stage manual gorm-security-scan

Direct Execution:

# Report mode - Show all issues, exit 0 (always)
./scripts/scan-gorm-security.sh --report

# Check mode - Exit 1 if issues found (use in CI)
./scripts/scan-gorm-security.sh --check

Expected Behavior

Pass (Exit Code 0):

No security issues detected
Proceed with commit/PR

Fail (Exit Code 1):

Issues detected (ID leaks, exposed secrets, DTO embedding, etc.)
Review scanner output for file:line references
Fix issues before committing
See GORM Security Scanner Documentation
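The exit-code contract above can be wrapped so the gate outcome is explicit. A minimal sketch: `gorm_gate` is a hypothetical name, and the scanner command is passed in as arguments rather than hard-coded.

```shell
# Hypothetical wrapper: run the given scanner command and translate its
# exit code into the gate outcome described above.
gorm_gate() {
  if "$@"; then
    echo "pass: proceed with commit/PR"
  else
    echo "fail: fix reported issues before committing"
    return 1
  fi
}
```

It would be invoked as `gorm_gate ./scripts/scan-gorm-security.sh --check`; report mode is unsuitable here because it always exits 0.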

Common Issues Detected

🔴 CRITICAL: ID Leak — Numeric ID with json:"id" tag
- Fix: Change to json:"-", use UUID for external reference
🔴 CRITICAL: Exposed Secret — APIKey/Token/Password with JSON tag
- Fix: Change to json:"-" to hide sensitive field
🟡 HIGH: DTO Embedding — Response struct embeds model with exposed ID
- Fix: Use explicit field definitions instead of embedding

Integration Status

Current Stage: Manual (soft launch)

Scanner available for manual invocation
Does not block commits automatically
Developers should run proactively

Future Stage: Blocking (after remediation)

Scanner will block commits with CRITICAL/HIGH issues
CI integration will enforce on all PRs
See GORM Scanner Roadmap

Performance

Execution Time: ~2 seconds per full scan
Fast enough for pre-commit use
No impact on commit workflow when passing

Documentation

Implementation Details: docs/implementation/gorm_security_scanner_complete.md
Specification: docs/plans/gorm_security_scanner_spec.md
QA Report: docs/reports/gorm_scanner_qa_report.md
