| applyTo | description |
|---|---|
| ** | Strict protocols for test execution, debugging, and coverage validation. |
Testing Protocols
Governance Note: This file is subject to the precedence hierarchy defined in
.github/instructions/copilot-instructions.md. When conflicts arise, canonical
instruction files take precedence over agent files and operator documentation.
0. E2E Verification First (Playwright)
MANDATORY: Before running unit tests, verify the application UI/UX functions correctly end-to-end.
0.5 Local Patch Coverage Preflight (Before Unit Tests)
MANDATORY: After E2E and before backend/frontend unit coverage runs, generate a local patch report so uncovered changed lines are visible early.
Run one of the following from /projects/Charon:
# Preferred (task)
Test: Local Patch Report
# Script
bash scripts/local-patch-report.sh
Required artifacts:
- test-results/local-patch-report.md
- test-results/local-patch-report.json
This preflight is advisory for thresholds during rollout, but artifact generation is required in DoD.
PREREQUISITE: Start E2E Environment
CRITICAL: Rebuild the E2E container when application or Docker build inputs change. If changes are test-only and the container is already healthy, reuse it. If the container is not running or state is suspect, rebuild.
Rebuild required (application/runtime changes):
- Application code or dependencies: backend/, frontend/, backend/go.mod, backend/go.sum, package.json, package-lock.json.
- Container build/runtime configuration: Dockerfile, .docker/**, .docker/compose/docker-compose.playwright-*.yml, .docker/docker-entrypoint.sh.
- Runtime behavior changes baked into the image.
Rebuild optional (test-only changes):
- Playwright tests and fixtures: tests/**.
- Playwright config and runners: playwright.config.js, playwright.caddy-debug.config.js.
- Documentation or planning files: docs/**, requirements.md, design.md, tasks.md.
- CI/workflow changes that do not affect runtime images: .github/workflows/**.
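The two lists above can be sketched as a small path classifier. This is a hypothetical helper, not a script that exists in the repository; the function names are illustrative, and treating unlisted paths as rebuild-required is a conservative assumption rather than stated policy. It also cannot detect "runtime behavior changes baked into the image", which still needs human judgment.

```shell
# Hypothetical helper: classify one changed path using the lists above.
# Prints "required" or "optional".
rebuild_class() {
  case "$1" in
    backend/*|frontend/*|package.json|package-lock.json|Dockerfile|.docker/*)
      echo required ;;
    tests/*|docs/*|.github/workflows/*|playwright.config.js|playwright.caddy-debug.config.js|requirements.md|design.md|tasks.md)
      echo optional ;;
    *)
      # Assumption: paths in neither list are treated conservatively.
      echo required ;;
  esac
}

# Prints "required" if any given path needs a rebuild, otherwise "optional".
needs_rebuild() {
  local path
  for path in "$@"; do
    if [ "$(rebuild_class "$path")" = "required" ]; then
      echo required
      return 0
    fi
  done
  echo optional
}
```

One possible use is `needs_rebuild $(git diff --name-only main...HEAD)`, assuming `main` is the base branch. A result of `optional` only means the container may be reused if it is already healthy.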
When a rebuild is required (or the container is not running), use:
.github/skills/scripts/skill-runner.sh docker-rebuild-e2e
This step:
- Builds the latest Docker image with your code changes
- Starts the charon-e2e container with proper environment variables from .env
- Exposes required ports: 8080 (app), 2020 (emergency), 2019 (Caddy admin)
- Waits for health check to pass
Without this step, tests will fail with:
- connect ECONNREFUSED ::1:2020 - Emergency server not running
- connect ECONNREFUSED ::1:8080 - Application not running
- 501 Not Implemented - Container missing required env vars
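As a rough triage aid, the failure messages above can be mapped back to their causes. The function below is a hypothetical sketch (its name is not from the repository) and matches only the three messages listed here.

```shell
# Hypothetical triage helper: map an error line from a failed run to the
# cause listed above. Anything else is reported as unrecognized.
diagnose_e2e_error() {
  case "$1" in
    *ECONNREFUSED*:2020*)    echo "Emergency server not running" ;;
    *ECONNREFUSED*:8080*)    echo "Application not running" ;;
    *"501 Not Implemented"*) echo "Container missing required env vars" ;;
    *)                       echo "Unrecognized error" ;;
  esac
}
```

In all three recognized cases the remedy is the same: run the docker-rebuild-e2e skill and wait for the health check to pass.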
Testing Scope Clarification
Playwright E2E Tests (UI/UX):
- Test user interactions with the React frontend
- Verify UI state changes when settings are toggled
- Ensure forms submit correctly
- Check navigation and page rendering
- Port: 8080 (Charon Management Interface)
- Default Browser: Firefox (provides best cross-browser compatibility baseline)
Integration Tests (Middleware Enforcement):
- Test Cerberus security module enforcement
- Verify ACL, WAF, Rate Limiting, CrowdSec actually block/allow requests
- Test requests routing through Caddy proxy with full middleware
- Port: 80 (User Traffic via Caddy)
- Location: backend/integration/ with //go:build integration tag
- CI: Runs in separate workflows (cerberus-integration.yml, waf-integration.yml, etc.)
Two Modes: Docker vs Vite
Playwright E2E tests can run in two modes with different capabilities:
| Mode | Base URL | Coverage Support | When to Use |
|---|---|---|---|
| Docker | http://localhost:8080 | ❌ No (0% reported) | Integration testing, CI validation |
| Vite Dev | http://localhost:5173 | ✅ Yes (real coverage) | Local development, coverage collection |
Why? The @bgotink/playwright-coverage library uses V8 coverage, which requires access to source files. Only the Vite dev server exposes the source maps and raw source files needed for coverage instrumentation.
Running E2E Tests (Integration Mode)
For general integration testing without coverage:
# Against Docker container (default)
cd /projects/Charon && npx playwright test --project=chromium --project=firefox --project=webkit
# With explicit base URL
PLAYWRIGHT_BASE_URL=http://localhost:8080 npx playwright test --project=chromium --project=firefox --project=webkit
Running E2E Tests with Coverage
IMPORTANT: Use the dedicated skill for coverage collection:
# Recommended: Uses skill that starts Vite and runs against localhost:5173
.github/skills/scripts/skill-runner.sh test-e2e-playwright-coverage
The coverage skill:
- Starts Vite dev server on port 5173
- Sets PLAYWRIGHT_BASE_URL=http://localhost:5173
- Runs tests with V8 coverage collection
- Generates reports in coverage/e2e/ (LCOV, HTML, JSON)
DO NOT expect coverage when running against Docker:
# ❌ WRONG: Coverage will show "Unknown% (0/0)"
PLAYWRIGHT_BASE_URL=http://localhost:8080 npx playwright test --coverage
# ✅ CORRECT: Use the coverage skill
.github/skills/scripts/skill-runner.sh test-e2e-playwright-coverage
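To make the distinction explicit before a run, a guard along these lines could report whether coverage is expected for a given base URL. It is a hypothetical sketch: the function name is invented, and it recognizes only the two base URLs documented above, falling back to the Docker URL when none is given.

```shell
# Hypothetical guard: report whether E2E coverage can be expected for a
# base URL. With no argument, the Docker default is assumed.
coverage_expected() {
  case "${1:-http://localhost:8080}" in
    http://localhost:5173|http://localhost:5173/*) echo yes ;;      # Vite dev server
    http://localhost:8080|http://localhost:8080/*) echo no ;;       # Docker container
    *)                                             echo unknown ;;  # not covered by this document
  esac
}
```

For example, `coverage_expected "${PLAYWRIGHT_BASE_URL:-}"` prints `no` when the variable is unset, which is the signal to switch to the coverage skill.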
Verifying Coverage Locally Before CI
Before pushing code, verify E2E coverage:
- Run the coverage skill:
.github/skills/scripts/skill-runner.sh test-e2e-playwright-coverage
- Check coverage output:
# View HTML report
open coverage/e2e/index.html
# Check LCOV file exists for Codecov
ls -la coverage/e2e/lcov.info
- Verify non-zero coverage:
# Should show real percentages, not "0%"
head -20 coverage/e2e/lcov.info
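Inspecting the first lines of lcov.info by eye works, but the non-zero check can also be computed. The sketch below sums the standard LCOV `LF` (lines found) and `LH` (lines hit) records; the function name is hypothetical and the script is not part of the repository.

```shell
# Hypothetical helper: print overall line coverage (one decimal place)
# for an LCOV file by summing its LF and LH records.
lcov_line_coverage() {
  awk -F: '
    $1 == "LF" { found += $2 }   # lines instrumented in this record
    $1 == "LH" { hit += $2 }     # lines executed at least once
    END {
      if (found == 0) { print "0.0"; exit }
      printf "%.1f\n", 100 * hit / found
    }
  ' "$1"
}
```

Running `lcov_line_coverage coverage/e2e/lcov.info` after the coverage skill should print a value above 0.0; a result of 0.0 suggests the run was made against the Docker container instead of Vite.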
General Guidelines
- No Truncation: Never pipe Playwright test output through head, tail, or other truncating commands. Playwright runs interactively and requires user input to quit when piped, causing the command to hang indefinitely.
- Why First: If the application is broken at the E2E level, unit tests may need updates. Playwright catches integration issues early.
- On Failure: Analyze failures, trace root cause through frontend → backend flow, then fix before proceeding to unit tests.
- Scope: Run relevant test files for the feature being modified (e.g., tests/manual-dns-provider.spec.ts).
1. Execution Environment
- No Truncation: Never use pipe commands (e.g., head, tail) or flags that limit stdout/stderr. If a test hangs, it likely requires an interactive input or is caught in a loop; analyze the full output to identify the block.
- Task-Based Execution: Do not manually construct test strings. Use existing project tasks (e.g., npm test, go test ./...). If a specific sub-module requires frequent testing, generate a new task definition in the project's configuration file (e.g., .vscode/tasks.json) before proceeding.
2. Failure Analysis & Logic Integrity
- Evidence-Based Debugging: When a test fails, you must quote the specific error message or stack trace before suggesting a fix.
- Bug vs. Test Flaw: Treat the test as the "Source of Truth." If a test fails, assume the code is broken until proven otherwise. Research the original requirement or PR description to verify if the test logic itself is outdated before modifying it.
- Zero-Hallucination Policy: Only use file paths and identifiers discovered via the ls or search tools. Never guess a path based on naming conventions.
3. Coverage & Completion
- Coverage Gate: A task is not "Complete" until a coverage report is generated.
- Threshold Compliance: You must compare the final coverage percentage against the project's threshold (Default: 85% unless specified otherwise). If coverage drops, you must identify the "uncovered lines" and add targeted tests.
- Patch Coverage (Suggestion): Codecov reports patch coverage as an indicator. While developers should aim for 100% coverage of modified lines, patch coverage is not a hard requirement and will not block PR approval. If patch coverage is low, consider adding targeted tests to improve the metric.
- Review Patch Coverage: When reviewing patch coverage reports, assess whether missing lines represent genuine gaps or are acceptable (e.g., error handling branches, deprecated code paths). Use the report to inform testing decisions, not as an absolute gate.
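The threshold comparison above can be sketched as a small check. The function name is hypothetical; the default of 85 comes from the Threshold Compliance rule, and awk is used only because plain shell arithmetic cannot compare decimals.

```shell
# Hypothetical helper: succeed (exit 0) when a coverage percentage meets
# the threshold. The second argument overrides the default of 85.
meets_threshold() {
  awk -v cov="$1" -v min="${2:-85}" 'BEGIN { exit !(cov + 0 >= min + 0) }'
}
```

For example, `meets_threshold 87.3 || echo "coverage below threshold"` prints nothing, while `meets_threshold 84.9 || echo "coverage below threshold"` prints the warning, at which point the uncovered lines need targeted tests.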
4. GORM Security Validation (Manual Stage)
Requirement: For any change that touches backend models or database-related logic, the GORM Security Scanner is a mandatory local DoD gate and must pass with zero CRITICAL/HIGH findings.
Policy vs. Automation Reconciliation: "Manual stage" describes execution
mechanism only (not automated pre-commit hook); policy enforcement remains
process-blocking for DoD. Gate decisions must use check semantics
(./scripts/scan-gorm-security.sh --check or equivalent task wiring).
When to Run (Conditional Trigger Matrix)
Mandatory Trigger Paths (Include):
- backend/internal/models/**: GORM model definitions
- Backend services/repositories with GORM query logic
- Database migrations or seeding logic affecting model persistence behavior
Explicit Exclusions:
- Docs-only changes (**/*.md, governance documentation)
- Frontend-only changes (frontend/**)
Gate Decision Rule: IF any Include path matches, THEN scanner execution in check mode is mandatory DoD gate. IF only Exclude paths match, THEN GORM gate is not required for that change set.
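The decision rule can be sketched as a path check. This is a hypothetical helper, not the project's own tooling. Only backend/internal/models/** is given as an explicit path above, so the sketch reports any other non-excluded path as "review" (an assumption) for a human to decide whether it contains GORM query, migration, or seeding logic.

```shell
# Hypothetical helper: apply the gate decision rule to a list of changed
# paths. Prints "required", "not-required", or "review".
gorm_gate() {
  local path result="not-required"
  for path in "$@"; do
    case "$path" in
      *.md|frontend/*) ;;                                     # explicit exclusions
      backend/internal/models/*) echo required; return 0 ;;   # mandatory trigger path
      *) result="review" ;;                                   # assumption: needs a manual look
    esac
  done
  echo "$result"
}
```

A result of `required` means the scanner must be run in check mode before the task can be considered done.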
Definition of Done
- Before Committing: When modifying trigger paths listed above
- Before Opening PR: Verify no security issues introduced
- After Code Review: If model-related changes were requested
- Blocking Gate: Scanner must pass with zero CRITICAL/HIGH issues before task completion
Running the Scanner
Via VS Code (Recommended for Development):
- Open Command Palette (Cmd/Ctrl+Shift+P)
- Select "Tasks: Run Task"
- Choose "Lint: GORM Security Scan"
Via Pre-commit (Manual Stage):
# Run on all Go files
pre-commit run --hook-stage manual gorm-security-scan --all-files
# Run on staged files only
pre-commit run --hook-stage manual gorm-security-scan
Direct Execution:
# Report mode - Show all issues, exit 0 (always)
./scripts/scan-gorm-security.sh --report
# Check mode - Exit 1 if issues found (use in CI)
./scripts/scan-gorm-security.sh --check
Expected Behavior
Pass (Exit Code 0):
- No security issues detected
- Proceed with commit/PR
Fail (Exit Code 1):
- Issues detected (ID leaks, exposed secrets, DTO embedding, etc.)
- Review scanner output for file:line references
- Fix issues before committing
- See GORM Security Scanner Documentation
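The pass/fail behavior above can be wrapped so the exit code drives the next step. The wrapper name and messages are hypothetical; the only facts taken from this document are the script path, the --check flag, and the meaning of exit codes 0 and 1.

```shell
# Hypothetical wrapper: run the scanner in check mode and translate its
# exit code into a gate result. The scanner command can be overridden by
# passing it as the first argument.
run_gorm_gate() {
  local scanner="${1:-./scripts/scan-gorm-security.sh}"
  if "$scanner" --check; then
    echo "GORM gate: pass"
  else
    echo "GORM gate: fail"
    return 1
  fi
}
```

Calling `run_gorm_gate` from the repository root returns non-zero on failure, so it can be chained as `run_gorm_gate && git commit`.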
Common Issues Detected
- 🔴 CRITICAL: ID Leak - Numeric ID with json:"id" tag
  - Fix: Change to json:"-", use UUID for external reference
- 🔴 CRITICAL: Exposed Secret - APIKey/Token/Password with JSON tag
  - Fix: Change to json:"-" to hide sensitive field
- 🟡 HIGH: DTO Embedding - Response struct embeds model with exposed ID
  - Fix: Use explicit field definitions instead of embedding
Integration Status
Current Stage: Manual (soft launch)
- Scanner available for manual invocation
- Does not block commits automatically
- Developers should run proactively
Future Stage: Blocking (after remediation)
- Scanner will block commits with CRITICAL/HIGH issues
- CI integration will enforce on all PRs
- See GORM Scanner Roadmap
Performance
- Execution Time: ~2 seconds per full scan
- Fast enough for pre-commit use
- No impact on commit workflow when passing
Documentation
- Implementation Details: docs/implementation/gorm_security_scanner_complete.md
- Specification: docs/plans/gorm_security_scanner_spec.md
- QA Report: docs/reports/gorm_scanner_qa_report.md