Charon/docs/plans/backend_coverage_fix_plan.md
GitHub Actions 29d2ec9cbf fix(ci): resolve E2E workflow failures and boost test coverage
E2E Workflow Fixes:

Add frontend dependency installation step (missing npm ci in frontend/)
Remove incorrect working-directory from backend build step
Update Node.js version from v18 to v20 (dependency requirements)
Backend Coverage: 84.9% → 85.0% (20+ new test functions):

Access list service validation and templates
Backup service error handling and edge cases
Security audit logs and rule sets
Auth service edge cases and token validation
Certificate service upload and sync error paths
Frontend Coverage: 85.06% → 85.66% (27 new tests):

Tabs component accessibility and keyboard navigation
Plugins page status badges and error handling
SecurityHeaders CRUD operations and presets
API wrappers for credentials and encryption endpoints
E2E Infrastructure:

Enhanced global-setup with emergency security module reset
Added retry logic and verification for settings propagation
Known Issues:

19 E2E tests still failing (ACL blocking security APIs - Issue #16)
7 Plugins modal UI tests failing (non-critical)
To be addressed in follow-up PR
Fixes #550 E2E workflow failures
Related to #16 ACL implementation
2026-01-26 04:09:57 +00:00


Backend Coverage Recovery Plan

Status: 🔴 CRITICAL - Coverage at 84.9% (Threshold: 85%)
Created: 2026-01-26
Priority: IMMEDIATE


Executive Summary

Root Cause Analysis

Backend coverage dropped to 84.9% (0.1 percentage points below the 85% threshold). The main contributors:

  1. cmd/seed package: 68.2% coverage (295 lines, main function hard to test)
  2. services package: 82.4% average (73 functions below 85% threshold)
  3. utils package: 74.2% coverage
  4. builtin DNS providers: 30.4% coverage (test coverage gap)

Impact Assessment

  • Severity: Low (0.1 percentage points below threshold; covering ~10-15 more statements closes the gap)
  • Cause: Recent development branch merge brought in new features:
    • Break-glass security reset (892b89fc)
    • Cerberus enabled by default (1ac3e5a4)
    • User management UI features
    • CrowdSec resilience improvements

Fastest Path to 85%

  • Option A (RECOMMENDED): Target 10 critical service functions → 85.2% in 1-2 hours
  • Option B: Add cmd/seed integration tests → 85.5% in 3-4 hours
  • Option C: Comprehensive service coverage → 86%+ in 4-6 hours


Option A: Surgical Service Function Coverage (FASTEST)

Strategy

Target the top 10 lowest-coverage service functions that are:

  • Actually executed in production (not just error paths)
  • Easy to test (no complex mocking)
  • High statement count (max coverage gain per test)
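One way to build that shortlist is to rank the output of `go tool cover -func` by percentage. The sketch below assumes the usual three-column layout of that report (location, function name, percentage) and a trailing `total:` row; `funcCoverage` and `lowestCovered` are illustrative names, not part of the Charon codebase.

```go
package main

import (
	"sort"
	"strconv"
	"strings"
)

// funcCoverage is one row of `go tool cover -func` output.
type funcCoverage struct {
	Location string  // e.g. "access_list_service.go:103:"
	Name     string  // function name
	Percent  float64 // statement coverage reported for that function
}

// lowestCovered parses the report text and returns the functions whose
// coverage is below threshold, lowest first.
func lowestCovered(report string, threshold float64) []funcCoverage {
	var rows []funcCoverage
	for _, line := range strings.Split(report, "\n") {
		fields := strings.Fields(line)
		// Skip blank lines and the trailing "total: (statements) NN.N%" row.
		if len(fields) != 3 || fields[0] == "total:" {
			continue
		}
		pct, err := strconv.ParseFloat(strings.TrimSuffix(fields[2], "%"), 64)
		if err != nil {
			continue // not a coverage row
		}
		if pct < threshold {
			rows = append(rows, funcCoverage{Location: fields[0], Name: fields[1], Percent: pct})
		}
	}
	// Stable sort keeps the report order for functions with equal coverage.
	sort.SliceStable(rows, func(i, j int) bool { return rows[i].Percent < rows[j].Percent })
	return rows
}
```

The ranking only answers "lowest coverage"; whether a function is executed in production and cheap to test still has to be judged by hand.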

Target Functions (Prioritized by Impact)

Phase 1: Critical Service Functions (30-45 min)

  1. access_list_service.go:103 - GetByID (83.3% → 100%)

    // Add test: TestAccessListService_GetByID_NotFound
    // Add test: TestAccessListService_GetByID_Success
    

    Lines: 8 statements | Effort: 15 min | Gain: +0.05%

  2. access_list_service.go:115 - GetByUUID (83.3% → 100%)

    // Add test: TestAccessListService_GetByUUID_NotFound
    // Add test: TestAccessListService_GetByUUID_Success
    

    Lines: 8 statements | Effort: 15 min | Gain: +0.05%

  3. auth_service.go:30 - Register (83.3% → 100%)

    // Add test: TestAuthService_Register_ValidationError
    // Add test: TestAuthService_Register_DuplicateEmail
    

    Lines: 8 statements | Effort: 15 min | Gain: +0.05%

Phase 2: Medium Impact Functions (30-45 min)

  1. backup_service.go:217 - addToZip (76.9% → 95%)

    // Add test: TestBackupService_AddToZip_FileError
    // Add test: TestBackupService_AddToZip_Success
    

    Lines: 7 statements | Effort: 20 min | Gain: +0.04%

  2. backup_service.go:304 - unzip (71.0% → 95%)

    // Add test: TestBackupService_Unzip_InvalidZip
    // Add test: TestBackupService_Unzip_PathTraversal
    

    Lines: 7 statements | Effort: 20 min | Gain: +0.04%

  3. certificate_service.go:49 - NewCertificateService (0% → 100%)

    // Add test: TestNewCertificateService_Initialization
    

    Lines: 8 statements | Effort: 10 min | Gain: +0.05%
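For TestBackupService_Unzip_PathTraversal, the fixture needs an archive with an entry name that climbs out of the destination directory. A minimal sketch, assuming the check under test behaves like a `filepath.Rel` comparison; `buildZip`, `escapesDir`, and `unsafeEntries` are hypothetical helpers, not functions from backup_service.go.

```go
package main

import (
	"archive/zip"
	"bytes"
	"path/filepath"
	"strings"
)

// buildZip returns an in-memory archive with one small entry per name.
func buildZip(names ...string) ([]byte, error) {
	var buf bytes.Buffer
	w := zip.NewWriter(&buf)
	for _, name := range names {
		f, err := w.Create(name)
		if err != nil {
			return nil, err
		}
		if _, err := f.Write([]byte("data")); err != nil {
			return nil, err
		}
	}
	if err := w.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// escapesDir reports whether joining name onto dest would land outside dest.
func escapesDir(dest, name string) bool {
	target := filepath.Join(dest, name)
	rel, err := filepath.Rel(dest, target)
	if err != nil {
		return true
	}
	return rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator))
}

// unsafeEntries lists the archive entries that would be written outside dest.
func unsafeEntries(archive []byte, dest string) ([]string, error) {
	r, err := zip.NewReader(bytes.NewReader(archive), int64(len(archive)))
	if err != nil {
		return nil, err
	}
	var bad []string
	for _, f := range r.File {
		if escapesDir(dest, f.Name) {
			bad = append(bad, f.Name)
		}
	}
	return bad, nil
}
```

A test would write the bytes from `buildZip("data/ok.txt", "../evil.txt")` to a temp file, call unzip on it, and assert that an error is returned and nothing is created outside the target directory.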

Phase 3: Quick Wins (20-30 min)

  1. access_list_service.go:233 - testGeoIP (9.1% → 90%)

    // Add test: TestAccessList_TestGeoIP_AllowedCountry
    // Add test: TestAccessList_TestGeoIP_BlockedCountry
    

    Lines: 9 statements | Effort: 15 min | Gain: +0.05%

  2. backup_service.go:363 - GetAvailableSpace (78.6% → 100%)

    // Add test: TestBackupService_GetAvailableSpace_Error
    

    Lines: 7 statements | Effort: 10 min | Gain: +0.04%

  3. access_list_service.go:127 - List (75.0% → 95%)

    // Add test: TestAccessListService_List_Pagination
    

    Lines: 7 statements | Effort: 10 min | Gain: +0.04%

  4. access_list_service.go:159 - Delete (71.8% → 95%)

    // Add test: TestAccessListService_Delete_NotFound
    

    Lines: 8 statements | Effort: 10 min | Gain: +0.05%

Total Impact: Option A

  • Coverage Gain: +0.46% (84.9% → 85.36%)
  • Total Time: 1h 45min - 2h 30min
  • Tests Added: ~15-18 test cases
  • Files Modified: 4-5 test files

Success Criteria: Backend coverage ≥ 85.2%


Option B: cmd/seed Integration Tests (MODERATE)

Strategy

Add integration-style tests for the seed command to cover the main function logic.

Implementation

File: backend/cmd/seed/main_integration_test.go

//go:build integration

package main

import (
    "path/filepath"
    "testing"
)

func TestSeedCommand_FullExecution(t *testing.T) {
    // Set up a temp database; t.TempDir is removed automatically after the test
    tmpDir := t.TempDir()
    dbPath := filepath.Join(tmpDir, "test.db")

    // Set environment; t.Setenv restores the previous value when the test ends
    t.Setenv("CHARON_DB_PATH", dbPath)

    // Run seed (requires main() to be refactored into runSeed() first)
    if err := runSeed(dbPath); err != nil {
        t.Fatalf("runSeed(%q) failed: %v", dbPath, err)
    }

    // TODO: assert that all seed data was created
}

func TestLogSeedResult_AllCases(t *testing.T) {
    // Test success case
    // Test error case
    // Test already exists case
}

Refactoring Required

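// main.go - Extract testable function
package main

import "log"

// runSeed holds the logic that currently lives in main(). It returns an
// error instead of calling log.Fatal so tests can observe failures.
func runSeed(dbPath string) error {
    // Move main() logic here, opening the database at dbPath
    return nil
}

func main() {
    if err := runSeed("./data/charon.db"); err != nil {
        log.Fatal(err)
    }
}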

Total Impact: Option B

  • Coverage Gain: +0.6% (84.9% → 85.5%)
  • Total Time: 3-4 hours (includes refactoring)
  • Tests Added: 3-5 integration tests
  • Files Modified: 2 files (main.go + main_integration_test.go)
  • Risk: Medium (requires refactoring production code)

Option C: Comprehensive Service Coverage (THOROUGH)

Strategy

Systematically increase all service package functions to ≥85% coverage.

Scope

  • 73 functions currently below 85%
  • Average coverage increase: 10-15% per function
  • Focus on:
    • Error path coverage
    • Edge case handling
    • Validation logic

Total Impact: Option C

  • Coverage Gain: +1.1% (84.9% → 86.0%)
  • Total Time: 6-8 hours
  • Tests Added: 80-100 test cases
  • Files Modified: 15-20 test files

Recommendation: Option A

Rationale

  1. Fastest to 85%: 1h 45min - 2h 30min
  2. Low Risk: No production code changes
  3. High ROI: 0.46% coverage gain with minimal tests
  4. Debuggable: Small, focused changes easy to review
  5. Maintainable: Tests follow existing patterns

Implementation Order

# Phase 1: Critical Functions (30-45 min)
1. backend/internal/services/access_list_service_test.go
   - Add GetByID tests
   - Add GetByUUID tests
2. backend/internal/services/auth_service_test.go
   - Add Register validation tests

# Phase 2: Medium Impact (30-45 min)
3. backend/internal/services/backup_service_test.go
   - Add addToZip tests
   - Add unzip tests
4. backend/internal/services/certificate_service_test.go
   - Add NewCertificateService test

# Phase 3: Quick Wins (20-30 min)
5. backend/internal/services/access_list_service_test.go
   - Add testGeoIP tests
   - Add List pagination test
   - Add Delete NotFound test
6. backend/internal/services/backup_service_test.go
   - Add GetAvailableSpace test

# Validation (10 min)
7. Run: .github/skills/scripts/skill-runner.sh test-backend-coverage
8. Verify: Coverage ≥ 85.2%
9. Commit and push

E2E ACL Fix Plan (Separate Issue)

Current State

  • global-setup.ts already has emergencySecurityReset()
  • docker-compose.e2e.yml has CHARON_EMERGENCY_TOKEN set
  • Tests should NOT be blocked by ACL

Issue Diagnosis

The emergency reset is working, but:

  1. Some tests may be enabling ACL during execution
  2. Cleanup may not run if a test crashes
  3. Emergency token may need verification

Fix Strategy (15-20 min)

// tests/global-setup.ts - Enhance emergency reset
import type { APIRequestContext } from '@playwright/test';

async function emergencySecurityReset(requestContext: APIRequestContext): Promise<void> {
  console.log('🚨 Emergency security reset...');

  // Prefer the token from the environment; fall back to the E2E default
  const emergencyToken = process.env.CHARON_EMERGENCY_TOKEN || 'test-emergency-token-for-e2e-32chars';

  const modules = [
    { key: 'security.acl.enabled', value: 'false' },
    { key: 'security.waf.enabled', value: 'false' },
    { key: 'security.crowdsec.enabled', value: 'false' },
    { key: 'security.rate_limit.enabled', value: 'false' },
    { key: 'feature.cerberus.enabled', value: 'false' },
  ];

  for (const { key, value } of modules) {
    try {
      // post() resolves on non-2xx responses by default, so check ok() explicitly
      let response = await requestContext.post('/api/v1/settings', {
        data: { key, value },
        headers: { 'X-Emergency-Token': emergencyToken },
      });
      let usedToken = true;

      if (!response.ok()) {
        // Retry without the token (for backwards compatibility)
        response = await requestContext.post('/api/v1/settings', { data: { key, value } });
        usedToken = false;
      }

      if (response.ok()) {
        console.log(`  ✓ Disabled: ${key}${usedToken ? '' : ' (no token)'}`);
      } else {
        console.log(`  ⚠ Could not disable ${key}: HTTP ${response.status()}`);
      }
    } catch (e) {
      // Network-level failures (for example, server not reachable) land here
      console.log(`  ⚠ Could not disable ${key}: ${e}`);
    }
  }
}

Verification Steps

  1. Test emergency reset: Manually enable ACL, then run the E2E tests
  2. Check token: Verify emergency token is being passed correctly
  3. Add debug logs: Confirm reset is executing before tests

Estimated Time: 15-20 minutes


Frontend Plugins Test Decision

Current State

  • Working: __tests__/Plugins.test.tsx (312 lines, 18 tests)
  • Skip: Plugins.test.tsx.skip (710 lines, 34 tests)
  • Coverage: Plugins.tsx @ 56.6% (working tests)

Analysis

| Metric | Working Tests | Skip File | Delta |
|--------|---------------|-----------|-------|
| Lines of Code | 312 | 710 | +398 (128% more) |
| Test Count | 18 | 34 | +16 (89% more) |
| Current Coverage | 56.6% | Unknown | ? |
| Mocking Complexity | Low | High | Complex setup |

Recommendation: KEEP WORKING TESTS

Rationale:

  1. Coverage Gain Unknown: Skip file may only add 5-10% coverage (20-30 statements)
  2. High Risk: 710 lines of complex mocking to debug (1-2 hours minimum)
  3. Diminishing Returns: 18 tests already cover critical paths
  4. Frontend Plan Exists: Current plan targets 86.5% without Plugins fixes

Alternative: Hybrid Approach (If Needed)

If frontend falls short of 86.5% after current plan:

  1. Extract 5-6 tests from skip file (highest value, lowest mock complexity)
  2. Focus on: Error path coverage, edge cases
  3. Estimated Gain: +3-5% coverage on Plugins.tsx
  4. Time: 30-45 minutes

Recommendation: Only pursue if frontend coverage < 85.5% after Phase 3


Complete Implementation Timeline

Phase 1: Backend Critical Functions (45 min)

  • access_list_service: GetByID, GetByUUID (30 min)
  • auth_service: Register validation (15 min)
  • Checkpoint: Run tests, verify +0.15%

Phase 2: Backend Medium Impact (45 min)

  • backup_service: addToZip, unzip (40 min)
  • certificate_service: NewCertificateService (5 min)
  • Checkpoint: Run tests, verify +0.13%

Phase 3: Backend Quick Wins (30 min)

  • access_list_service: testGeoIP, List, Delete (20 min)
  • backup_service: GetAvailableSpace (10 min)
  • Checkpoint: Run tests, verify +0.18%

Phase 4: E2E Fix (20 min)

  • Enhance emergency reset with token support (15 min)
  • Verify with manual ACL test (5 min)

Phase 5: Validation & CI (15 min)

  • Run full backend test suite with coverage
  • Verify coverage ≥ 85.2%
  • Commit and push
  • Monitor CI for green build

Total Timeline: 2h 35min

Breakdown:

  • Backend tests: 2h 0min
  • E2E fix: 20 min
  • Validation: 15 min

Success Criteria & DoD

Backend Coverage

  • Overall coverage ≥ 85.2%
  • All service functions in target list ≥ 85%
  • No new coverage regressions
  • All tests pass with zero failures

E2E Tests

  • Emergency reset executes successfully
  • No ACL blocking issues during test runs
  • All E2E tests pass (chromium)

CI/CD

  • Backend coverage check passes (≥85%)
  • Frontend coverage check passes (≥85%)
  • E2E tests pass
  • All linting passes
  • Security scans pass

Risk Assessment

Low Risk

  • Service test additions: Following existing patterns
  • Test-only changes: No production code modified
  • Emergency reset enhancement: Backwards compatible

Medium Risk

  • cmd/seed refactoring (Option B only): Requires production code changes

Mitigation

  • Start with Option A (low risk, fast)
  • Only pursue Option B/C if Option A insufficient
  • Run tests after each phase (fail fast)

Appendix: Coverage Analysis Details

Current Backend Test Statistics

Test Files: 215
Source Files: 164
Test:Source Ratio: 1.31:1 ✅ (healthy)
Total Coverage: 84.9%

Package Breakdown

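| Package | Coverage | Status | Priority |
|---------|----------|--------|----------|
| handlers | 85.7% | Pass | - |
| routes | 87.5% | Pass | - |
| middleware | 99.1% | Pass | - |
| services | 82.4% | ⚠️ Fail | HIGH |
| utils | 74.2% | ⚠️ Fail | MEDIUM |
| cmd/seed | 68.2% | ⚠️ Fail | LOW |
| builtin | 30.4% | ⚠️ Fail | MEDIUM |
| caddy | 97.8% | Pass | - |
| cerberus | 83.8% | ⚠️ Borderline | LOW |
| crowdsec | 85.2% | Pass | - |
| database | 91.3% | Pass | - |
| models | 96.8% | Pass | - |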

Weighted Coverage Calculation

Total Statements: ~15,000
Covered Statements: ~12,735
Uncovered Statements: ~2,265

To reach 85%: Need +15 statements covered (0.1% gap)
To reach 86%: Need +165 statements covered (1.1% gap)
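The gap figures follow directly from the statement counts above. A small sketch of the arithmetic, taking the approximate totals from this appendix as inputs; `statementsNeeded` and `gainPoints` are illustrative helpers, not part of the coverage tooling.

```go
package main

// statementsNeeded returns how many additional statements must be covered to
// reach targetBP, a coverage target in basis points (8500 = 85.00%).
// Integer math avoids floating-point rounding right at the threshold.
func statementsNeeded(total, covered, targetBP int) int {
	required := (total*targetBP + 9999) / 10000 // ceiling division
	if required <= covered {
		return 0
	}
	return required - covered
}

// gainPoints converts newly covered statements into percentage points of
// overall coverage.
func gainPoints(newlyCovered, total int) float64 {
	return float64(newlyCovered) * 100 / float64(total)
}
```

With the figures above, `statementsNeeded(15000, 12735, 8500)` gives 15 and `statementsNeeded(15000, 12735, 8600)` gives 165. Because the totals are approximate, the real gap may differ by a few statements.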

Next Actions

Immediate (You):

  1. Review and approve this plan
  2. Choose option (A recommended)
  3. Authorize implementation start

Implementation (Agent):

  1. Execute Plan Option A (Phases 1-3)
  2. Execute E2E fix
  3. Validate and commit
  4. Monitor CI

Timeline: Start → Finish = 2h 35min