diff --git a/.github/agents/QA_Security.agent.md b/.github/agents/QA_Security.agent.md index f72dd0e5..8cdcd14c 100644 --- a/.github/agents/QA_Security.agent.md +++ b/.github/agents/QA_Security.agent.md @@ -75,9 +75,14 @@ The task is not complete until ALL of the following pass with zero issues: - **Why First**: If the app is broken at E2E level, unit tests may need updates. Catch integration issues early. - **Scope**: Run tests relevant to modified features (e.g., `tests/manual-dns-provider.spec.ts`) - **On Failure**: Trace root cause through frontend → backend flow, report to Management or Dev subagent - - **Base URL**: Uses `PLAYWRIGHT_BASE_URL` or default `http://100.98.12.109:8080` - **MANDATORY**: All E2E tests must pass before proceeding + **E2E Coverage Mode** (when coverage is required): + - **Docker Mode** (`localhost:8080`): ❌ No coverage - use for quick integration testing only + - **Vite Mode** (`localhost:5173`): ✅ Real coverage - required for coverage collection + - **For Coverage**: Run `.github/skills/scripts/skill-runner.sh test-e2e-playwright-coverage` (starts Vite automatically) + - **Why**: The `@bgotink/playwright-coverage` library uses V8 coverage which requires Vite's source file access + 2. **Security Scans**: - CodeQL: Run VS Code task "Security: CodeQL All (CI-Aligned)" or individual Go/JS tasks - Trivy: Run VS Code task "Security: Trivy Scan" diff --git a/.github/agents/playwright-tester.agent.md b/.github/agents/playwright-tester.agent.md index a27d76d1..e807ba10 100644 --- a/.github/agents/playwright-tester.agent.md +++ b/.github/agents/playwright-tester.agent.md @@ -17,3 +17,31 @@ model: Claude Sonnet 4 - **No Truncation**: Never pipe Playwright test output through `head`, `tail`, or other truncating commands. Playwright runs interactively and requires user input to quit when piped, causing the command to hang indefinitely. - **Full Output**: Always capture the complete test output to analyze failures accurately. + +## E2E Coverage Collection + +**IMPORTANT**: E2E coverage ONLY works when running against the Vite dev server, NOT Docker. + +| Mode | Base URL | Coverage Support | +|------|----------|-----------------| +| Docker (`localhost:8080`) | ❌ No coverage (0% reported) | +| Vite Dev (`localhost:5173`) | ✅ Real coverage data | + +### When Coverage is Required + +Use the dedicated skill that starts Vite and collects coverage: + +```bash +# Recommended for coverage collection +.github/skills/scripts/skill-runner.sh test-e2e-playwright-coverage +``` + +### When Coverage is NOT Required + +For quick integration testing, run directly against Docker: + +```bash +npx playwright test --project=chromium +``` + +**Why?** The `@bgotink/playwright-coverage` library uses V8 coverage which requires source files only available via Vite dev server. diff --git a/.github/instructions/testing.instructions.md b/.github/instructions/testing.instructions.md index b779e930..990cfa31 100644 --- a/.github/instructions/testing.instructions.md +++ b/.github/instructions/testing.instructions.md @@ -8,10 +8,81 @@ description: 'Strict protocols for test execution, debugging, and coverage valid **MANDATORY**: Before running unit tests, verify the application functions correctly end-to-end. -* **Run Playwright E2E Tests**: Execute `npx playwright test --project=chromium` from the project root. -* **No Truncation**: Never pipe Playwright test output through `head`, `tail`, or other truncating commands. Playwright tests run interactively and require user input to quit when piped, causing the command to hang indefinitely. +### Two Modes: Docker vs Vite + +Playwright E2E tests can run in two modes with different capabilities: + +| Mode | Base URL | Coverage Support | When to Use | +|------|----------|-----------------|-------------| +| **Docker** | `http://localhost:8080` | ❌ No (0% reported) | Integration testing, CI validation | +| **Vite Dev** | `http://localhost:5173` | ✅ Yes (real coverage) | Local development, coverage collection | + +**Why?** The `@bgotink/playwright-coverage` library uses V8 coverage which requires access to source files. Only the Vite dev server exposes source maps and raw source files needed for coverage instrumentation. + +### Running E2E Tests (Integration Mode) + +For general integration testing without coverage: + +```bash +# Against Docker container (default) +npx playwright test --project=chromium + +# With explicit base URL +PLAYWRIGHT_BASE_URL=http://localhost:8080 npx playwright test --project=chromium +``` + +### Running E2E Tests with Coverage + +**IMPORTANT**: Use the dedicated skill for coverage collection: + +```bash +# Recommended: Uses skill that starts Vite and runs against localhost:5173 +.github/skills/scripts/skill-runner.sh test-e2e-playwright-coverage +``` + +The coverage skill: +1. Starts Vite dev server on port 5173 +2. Sets `PLAYWRIGHT_BASE_URL=http://localhost:5173` +3. Runs tests with V8 coverage collection +4. Generates reports in `coverage/e2e/` (LCOV, HTML, JSON) + +**DO NOT** expect coverage when running against Docker: +```bash +# ❌ WRONG: Coverage will show "Unknown% (0/0)" +PLAYWRIGHT_BASE_URL=http://localhost:8080 npx playwright test --coverage + +# ✅ CORRECT: Use the coverage skill +.github/skills/scripts/skill-runner.sh test-e2e-playwright-coverage +``` + +### Verifying Coverage Locally Before CI + +Before pushing code, verify E2E coverage: + +1. Run the coverage skill: + ```bash + .github/skills/scripts/skill-runner.sh test-e2e-playwright-coverage + ``` + +2. Check coverage output: + ```bash + # View HTML report + open coverage/e2e/index.html + + # Check LCOV file exists for Codecov + ls -la coverage/e2e/lcov.info + ``` + +3. Verify non-zero coverage: + ```bash + # Should show real percentages, not "0%" + head -20 coverage/e2e/lcov.info + ``` + +### General Guidelines + +* **No Truncation**: Never pipe Playwright test output through `head`, `tail`, or other truncating commands. Playwright runs interactively and requires user input to quit when piped, causing the command to hang indefinitely. * **Why First**: If the application is broken at the E2E level, unit tests may need updates. Playwright catches integration issues early. -* **Base URL**: Tests use `PLAYWRIGHT_BASE_URL` env var or default from `playwright.config.js` (Tailscale IP: `http://100.98.12.109:8080`). * **On Failure**: Analyze failures, trace root cause through frontend → backend flow, then fix before proceeding to unit tests. * **Scope**: Run relevant test files for the feature being modified (e.g., `tests/manual-dns-provider.spec.ts`). diff --git a/.github/skills/test-e2e-playwright-coverage.SKILL.md b/.github/skills/test-e2e-playwright-coverage.SKILL.md index c7d08c35..2c610971 100644 --- a/.github/skills/test-e2e-playwright-coverage.SKILL.md +++ b/.github/skills/test-e2e-playwright-coverage.SKILL.md @@ -72,6 +72,13 @@ metadata: Runs Playwright end-to-end tests with code coverage collection using `@bgotink/playwright-coverage`. This skill collects V8 coverage data during test execution and generates reports in LCOV, HTML, and JSON formats suitable for upload to Codecov. +**IMPORTANT**: This skill starts the **Vite dev server** (not Docker) because V8 coverage requires access to source files. Running coverage against the Docker container will result in `0%` coverage. + +| Mode | Base URL | Coverage Support | +|------|----------|-----------------| +| Docker (`localhost:8080`) | ❌ No - Shows "Unknown% (0/0)" | +| Vite Dev (`localhost:5173`) | ✅ Yes - Real coverage data | + ## Prerequisites - Node.js 18.0 or higher installed and in PATH diff --git a/CHANGELOG.md b/CHANGELOG.md index a9f2f99b..f4656a3a 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -7,6 +7,17 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ## [Unreleased] +### Added + +- **Phase 4: Security Module Toggle Actions**: Security dashboard toggles for ACL, WAF, and Rate Limiting are now fully functional (PR #XXX) + - **Toggle Functionality**: Enable/disable security modules directly from the Security Dashboard UI + - **Backend Cache Layer**: 60-second TTL in-memory cache for settings to minimize database queries in middleware + - **Auto Config Reload**: Caddy configuration automatically reloads when security settings change + - **Optimistic Updates**: Toggle changes reflect instantly in the UI with proper rollback on failure + - **Mode Preservation**: WAF and Rate Limiting mode settings (detection/prevention, log/block) preserved during toggles + - **8 E2E Tests Enabled**: Previously skipped security dashboard tests now pass + - See [Phase 4 Specification](docs/plans/phase4_security_toggles_spec.md) for implementation details + ### Security - **CRITICAL**: Fixed CVE-2025-68156 by upgrading expr-lang/expr to v1.17.7 diff --git a/backend/internal/api/handlers/settings_handler.go b/backend/internal/api/handlers/settings_handler.go index 6354d1f8..307abf1b 100644 --- a/backend/internal/api/handlers/settings_handler.go +++ b/backend/internal/api/handlers/settings_handler.go @@ -1,20 +1,36 @@ package handlers import ( + "context" "net/http" + "strings" + "time" "github.com/gin-gonic/gin" "gorm.io/gorm" + "github.com/Wikid82/charon/backend/internal/logger" "github.com/Wikid82/charon/backend/internal/models" "github.com/Wikid82/charon/backend/internal/security" "github.com/Wikid82/charon/backend/internal/services" "github.com/Wikid82/charon/backend/internal/utils" ) +// CaddyConfigManager interface for triggering Caddy config reload +type CaddyConfigManager interface { + ApplyConfig(ctx context.Context) error +} + +// CacheInvalidator interface for invalidating security settings cache +type CacheInvalidator interface { + InvalidateCache() +} + type SettingsHandler struct { - DB *gorm.DB - MailService *services.MailService + DB *gorm.DB + MailService *services.MailService + CaddyManager CaddyConfigManager // For triggering config reload on security settings change + Cerberus CacheInvalidator // For invalidating cache on security settings change } func NewSettingsHandler(db *gorm.DB) *SettingsHandler { @@ -24,6 +40,16 @@ func NewSettingsHandler(db *gorm.DB) *SettingsHandler { } } +// NewSettingsHandlerWithDeps creates a SettingsHandler with all dependencies for config reload +func NewSettingsHandlerWithDeps(db *gorm.DB, caddyMgr CaddyConfigManager, cerberus CacheInvalidator) *SettingsHandler { + return &SettingsHandler{ + DB: db, + MailService: services.NewMailService(db), + CaddyManager: caddyMgr, + Cerberus: cerberus, + } +} + // GetSettings returns all settings. func (h *SettingsHandler) GetSettings(c *gin.Context) { var settings []models.Setting @@ -74,6 +100,28 @@ func (h *SettingsHandler) UpdateSetting(c *gin.Context) { return } + // Trigger cache invalidation and config reload for security settings + if strings.HasPrefix(req.Key, "security.") { + // Invalidate Cerberus cache immediately so middleware uses new settings + if h.Cerberus != nil { + h.Cerberus.InvalidateCache() + } + + // Trigger async Caddy config reload (doesn't block HTTP response) + if h.CaddyManager != nil { + go func() { + ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second) + defer cancel() + + if err := h.CaddyManager.ApplyConfig(ctx); err != nil { + logger.Log().WithError(err).Warn("Failed to reload Caddy config after security setting change") + } else { + logger.Log().WithField("setting_key", req.Key).Info("Caddy config reloaded after security setting change") + } + }() + } + } + c.JSON(http.StatusOK, setting) } diff --git a/backend/internal/api/routes/routes.go b/backend/internal/api/routes/routes.go index 26fc9317..fbc5495d 100644 --- a/backend/internal/api/routes/routes.go +++ b/backend/internal/api/routes/routes.go @@ -107,8 +107,9 @@ func Register(router *gin.Engine, db *gorm.DB, cfg config.Config) error { cerb := cerberus.New(cfg.Security, db) api.Use(cerb.Middleware()) - // Caddy Manager declaration so it can be used across the entire Register function - var caddyManager *caddy.Manager + // Caddy Manager - created early so it can be used by settings handlers for config reload + caddyClient := caddy.NewClient(cfg.CaddyAdminAPI) + caddyManager := caddy.NewManager(caddyClient, db, cfg.CaddyConfigDir, cfg.FrontendDir, cfg.ACMEStaging, cfg.Security) // Auth routes authService := services.NewAuthService(db, cfg) @@ -194,8 +195,8 @@ func Register(router *gin.Engine, db *gorm.DB, cfg config.Config) error { protected.GET("/audit-logs", auditLogHandler.List) protected.GET("/audit-logs/:uuid", auditLogHandler.Get) - // Settings - settingsHandler := handlers.NewSettingsHandler(db) + // Settings - with CaddyManager and Cerberus for security settings reload + settingsHandler := handlers.NewSettingsHandlerWithDeps(db, caddyManager, cerb) protected.GET("/settings", settingsHandler.GetSettings) protected.POST("/settings", settingsHandler.UpdateSetting) @@ -411,9 +412,7 @@ func Register(router *gin.Engine, db *gorm.DB, cfg config.Config) error { c.JSON(200, gin.H{"message": "Uptime check started"}) }) - // Caddy Manager - caddyClient := caddy.NewClient(cfg.CaddyAdminAPI) - caddyManager = caddy.NewManager(caddyClient, db, cfg.CaddyConfigDir, cfg.FrontendDir, cfg.ACMEStaging, cfg.Security) + // caddyManager is already created early in Register() for use by settingsHandler // Initialize GeoIP service if database exists geoipPath := os.Getenv("CHARON_GEOIP_DB_PATH") diff --git a/backend/internal/caddy/manager.go b/backend/internal/caddy/manager.go index c933ca0b..fba937f7 100644 --- a/backend/internal/caddy/manager.go +++ b/backend/internal/caddy/manager.go @@ -639,6 +639,26 @@ func (m *Manager) computeEffectiveFlags(_ context.Context) (cerbEnabled, aclEnab } } + // runtime override for WAF enabled + s = models.Setting{} // Reset + if err := m.db.Where("key = ?", "security.waf.enabled").First(&s).Error; err == nil { + if strings.EqualFold(s.Value, "true") { + wafEnabled = true + } else if strings.EqualFold(s.Value, "false") { + wafEnabled = false + } + } + + // runtime override for Rate Limit enabled + s = models.Setting{} // Reset + if err := m.db.Where("key = ?", "security.rate_limit.enabled").First(&s).Error; err == nil { + if strings.EqualFold(s.Value, "true") { + rateLimitEnabled = true + } else if strings.EqualFold(s.Value, "false") { + rateLimitEnabled = false + } + } + // runtime override for crowdsec mode (mode value determines whether it's local/remote/enabled) var cm struct{ Value string } if err := m.db.Raw("SELECT value FROM settings WHERE key = ? LIMIT 1", "security.crowdsec.mode").Scan(&cm).Error; err == nil && cm.Value != "" { diff --git a/backend/internal/cerberus/cerberus.go b/backend/internal/cerberus/cerberus.go index 13da454d..27b846c4 100644 --- a/backend/internal/cerberus/cerberus.go +++ b/backend/internal/cerberus/cerberus.go @@ -5,6 +5,7 @@ import ( "context" "net/http" "strings" + "sync" "time" "github.com/gin-gonic/gin" @@ -23,6 +24,12 @@ type Cerberus struct { db *gorm.DB accessSvc *services.AccessListService securityNotifySvc *services.SecurityNotificationService + + // Settings cache for performance - avoids DB queries on every request + settingsCache map[string]string + settingsCacheMu sync.RWMutex + settingsCacheTime time.Time + settingsCacheTTL time.Duration } // New creates a new Cerberus instance @@ -32,9 +39,63 @@ func New(cfg config.SecurityConfig, db *gorm.DB) *Cerberus { db: db, accessSvc: services.NewAccessListService(db), securityNotifySvc: services.NewSecurityNotificationService(db), + settingsCache: make(map[string]string), + settingsCacheTTL: 60 * time.Second, } } +// getSetting retrieves a setting with in-memory caching. +// Returns the value and a boolean indicating if the key was found. +func (c *Cerberus) getSetting(key string) (string, bool) { + if c.db == nil { + return "", false + } + + // Fast path: check cache with read lock + c.settingsCacheMu.RLock() + if time.Since(c.settingsCacheTime) < c.settingsCacheTTL { + val, ok := c.settingsCache[key] + c.settingsCacheMu.RUnlock() + return val, ok + } + c.settingsCacheMu.RUnlock() + + // Slow path: refresh cache with write lock + c.settingsCacheMu.Lock() + defer c.settingsCacheMu.Unlock() + + // Double-check: another goroutine might have refreshed cache + if time.Since(c.settingsCacheTime) < c.settingsCacheTTL { + val, ok := c.settingsCache[key] + return val, ok + } + + // Refresh entire cache from DB (batch query is faster than individual queries) + var settings []models.Setting + if err := c.db.Where("key LIKE ?", "security.%").Find(&settings).Error; err != nil { + logger.Log().WithError(err).Debug("Failed to refresh settings cache") + return "", false + } + + // Update cache + c.settingsCache = make(map[string]string) + for _, s := range settings { + c.settingsCache[s.Key] = s.Value + } + c.settingsCacheTime = time.Now() + + val, ok := c.settingsCache[key] + return val, ok +} + +// InvalidateCache forces cache refresh on next access. +// Call this after updating security settings. +func (c *Cerberus) InvalidateCache() { + c.settingsCacheMu.Lock() + c.settingsCacheTime = time.Time{} // Zero time forces refresh + c.settingsCacheMu.Unlock() +} + // IsEnabled returns whether Cerberus features are enabled via config or settings. func (c *Cerberus) IsEnabled() bool { // DB-backed break-glass disable must take effect even when static config defaults to enabled. @@ -89,16 +150,25 @@ func (c *Cerberus) Middleware() gin.HandlerFunc { // WAF: The actual WAF protection is handled by the Coraza plugin at the Caddy layer. // This middleware just tracks metrics for requests when WAF is enabled. - // The naive