diff --git a/docs/plans/crowdsec_toggle_fix_plan.md b/docs/plans/crowdsec_toggle_fix_plan.md new file mode 100644 index 00000000..117264a3 --- /dev/null +++ b/docs/plans/crowdsec_toggle_fix_plan.md @@ -0,0 +1,1005 @@ +# CrowdSec Toggle Integration Fix Plan + +**Date**: December 15, 2025 +**Issue**: CrowdSec toggle stuck ON, reconciliation silently exits, process not starting +**Root Cause**: Database disconnect between frontend (Settings table) and reconciliation (SecurityConfig table) + +--- + +## Executive Summary + +The CrowdSec toggle shows "ON" but the process is NOT running. The reconciliation function silently exits without starting CrowdSec because: + +1. **Frontend writes to Settings table** (`security.crowdsec.enabled`) +2. **Backend reconciliation reads from SecurityConfig table** (`crowdsec_mode = "local"`) +3. **No synchronization** between the two tables +4. **Auto-initialization code EXISTS** (lines 46-71 in crowdsec_startup.go) but creates config with `crowdsec_mode = "disabled"` +5. **Reconciliation sees "disabled"** and exits silently with no logs + +--- + +## Root Cause Analysis (DETAILED) + +### Evidence Trail + +**Container Logs Show Silent Exit**: +``` +{"bin_path":"crowdsec","data_dir":"/app/data/crowdsec","level":"info","msg":"CrowdSec reconciliation: starting startup check","time":"2025-12-14T23:32:33-05:00"} +[NO FURTHER LOGS - Function exited here] +``` + +**Database State on Fresh Start**: +``` +SELECT * FROM security_configs → record not found +{"level":"info","msg":"CrowdSec reconciliation: no SecurityConfig found, creating default config"} +``` + +**Process Check**: +```bash +$ docker exec charon ps aux | grep -i crowdsec +[NO RESULTS - Process not running] +``` + +### Why Reconciliation Exits Silently + +**FILE**: `backend/internal/services/crowdsec_startup.go` + +**Execution Flow**: +``` +1. User clicks toggle ON in Security.tsx +2. Frontend calls updateSetting('security.crowdsec.enabled', 'true') +3. 
Settings table updated → security.crowdsec.enabled = "true" +4. Frontend calls startCrowdsec() → Handler updates SecurityConfig +5. CrowdSec starts successfully, toggle shows ON +6. Container restarts (docker restart or reboot) +7. ReconcileCrowdSecOnStartup() executes at line 26: + + Line 44: db.First(&cfg) → returns gorm.ErrRecordNotFound + + Lines 46-71: Auto-initialization block executes: + - Creates SecurityConfig with crowdsec_mode = "disabled" + - Logs "default SecurityConfig created successfully" + - Returns early (line 70) WITHOUT checking Settings table + - CrowdSec is NEVER started + + Result: Toggle shows "ON" (Settings table), but process is "OFF" (not running) +``` + +**THE BUG (Lines 46-71)**: +```go +if err == gorm.ErrRecordNotFound { + // AUTO-INITIALIZE: Create default SecurityConfig on first startup + logger.Log().Info("CrowdSec reconciliation: no SecurityConfig found, creating default config") + + defaultCfg := models.SecurityConfig{ + UUID: "default", + Name: "Default Security Config", + Enabled: false, + CrowdSecMode: "disabled", // ← PROBLEM: Ignores Settings table state + WAFMode: "disabled", + WAFParanoiaLevel: 1, + RateLimitMode: "disabled", + RateLimitBurst: 10, + RateLimitRequests: 100, + RateLimitWindowSec: 60, + } + + if err := db.Create(&defaultCfg).Error; err != nil { + logger.Log().WithError(err).Error("CrowdSec reconciliation: failed to create default SecurityConfig") + return + } + + logger.Log().Info("CrowdSec reconciliation: default SecurityConfig created successfully") + // Don't start CrowdSec on fresh install - user must enable via UI + return // ← EXITS WITHOUT checking Settings table or starting process +} +``` + +**Why This Causes the Issue**: + +1. **First Container Start**: User enables CrowdSec via toggle + - Settings: `security.crowdsec.enabled = "true"` ✅ + - SecurityConfig: `crowdsec_mode = "local"` ✅ (via Start handler) + - Process: Running ✅ + +2. 
**Container Restart**: Database persists but SecurityConfig table may be empty (migration issue or corruption) + - Reconciliation runs + - SecurityConfig table: **EMPTY** (record lost or never migrated) + - Auto-init creates SecurityConfig with `crowdsec_mode = "disabled"` + - Returns early without checking Settings table + - Settings: Still shows `"true"` (UI says ON) + - SecurityConfig: Says `"disabled"` (reconciliation source) + - Process: NOT started ❌ + +3. **Result**: **State Mismatch** + - Frontend toggle: **ON** (reads Settings table) + - Backend reconciliation: **OFF** (reads SecurityConfig table) + - Process: **NOT RUNNING** (reconciliation didn't start it) + +--- + +## Current Code Analysis + +### 1. Reconciliation Function (crowdsec_startup.go) + +**Location**: `backend/internal/services/crowdsec_startup.go` + +**Lines 44-71 (Auto-initialization - THE BUG)**: +```go +var cfg models.SecurityConfig +if err := db.First(&cfg).Error; err != nil { + if err == gorm.ErrRecordNotFound { + // AUTO-INITIALIZE: Create default SecurityConfig on first startup + logger.Log().Info("CrowdSec reconciliation: no SecurityConfig found, creating default config") + + defaultCfg := models.SecurityConfig{ + UUID: "default", + Name: "Default Security Config", + Enabled: false, + CrowdSecMode: "disabled", // ← IGNORES Settings table + WAFMode: "disabled", + WAFParanoiaLevel: 1, + RateLimitMode: "disabled", + RateLimitBurst: 10, + RateLimitRequests: 100, + RateLimitWindowSec: 60, + } + + if err := db.Create(&defaultCfg).Error; err != nil { + logger.Log().WithError(err).Error("CrowdSec reconciliation: failed to create default SecurityConfig") + return + } + + logger.Log().Info("CrowdSec reconciliation: default SecurityConfig created successfully") + // Don't start CrowdSec on fresh install - user must enable via UI + return // ← EARLY EXIT - Never checks Settings table + } + logger.Log().WithError(err).Warn("CrowdSec reconciliation: failed to read SecurityConfig") + return +} +``` 
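The effect of the early return can be modeled in isolation. The sketch below is a simplified, hypothetical model of the startup decision: the function name `currentDecision` and its parameters are illustrative assumptions, not code from `crowdsec_startup.go`. It only mirrors the branch structure quoted in this plan (auto-init returns before the Settings lookup; otherwise start when mode is "local" or the setting is "true").

```go
package main

import (
	"fmt"
	"strings"
)

// currentDecision is a hypothetical, simplified model of the reconciliation
// branches quoted in this plan. It is NOT the real ReconcileCrowdSecOnStartup.
//
//	configExists: whether a SecurityConfig row was found
//	configMode:   the stored crowdsec_mode (ignored when no row exists)
//	settingValue: the Settings value for security.crowdsec.enabled
func currentDecision(configExists bool, configMode, settingValue string) (mode string, start bool) {
	if !configExists {
		// Auto-init path: a "disabled" default is created and the function
		// returns before the Settings table is ever consulted.
		return "disabled", false
	}
	enabled := strings.EqualFold(settingValue, "true")
	// Skip only when the mode is not "local" AND the setting is not enabled.
	return configMode, configMode == "local" || enabled
}

func main() {
	// Toggle is ON in Settings, but the SecurityConfig row is missing.
	mode, start := currentDecision(false, "", "true")
	fmt.Printf("missing config:  mode=%s start=%t\n", mode, start)

	// Same Settings value with an existing row: the override is honored.
	mode, start = currentDecision(true, "disabled", "true")
	fmt.Printf("existing config: mode=%s start=%t\n", mode, start)
}
```

Under these assumptions, the same Settings value leads to opposite outcomes depending only on whether a SecurityConfig row exists, which is the mismatch this plan sets out to remove.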
+ +**Lines 74-90 (Runtime Setting Override - UNREACHABLE after auto-init)**: +```go +// Also check for runtime setting override in settings table +var settingOverride struct{ Value string } +crowdSecEnabled := false +if err := db.Raw("SELECT value FROM settings WHERE key = ? LIMIT 1", "security.crowdsec.enabled").Scan(&settingOverride).Error; err == nil && settingOverride.Value != "" { + crowdSecEnabled = strings.EqualFold(settingOverride.Value, "true") + logger.Log().WithFields(map[string]interface{}{ + "setting_value": settingOverride.Value, + "crowdsec_enabled": crowdSecEnabled, + }).Debug("CrowdSec reconciliation: found runtime setting override") +} +``` + +**This code is NEVER REACHED** when SecurityConfig doesn't exist because line 70 returns early! + +**Lines 91-98 (Decision Logic)**: +```go +// Only auto-start if CrowdSecMode is "local" OR runtime setting is enabled +if cfg.CrowdSecMode != "local" && !crowdSecEnabled { + logger.Log().WithFields(map[string]interface{}{ + "db_mode": cfg.CrowdSecMode, + "setting_enabled": crowdSecEnabled, + }).Debug("CrowdSec reconciliation skipped: mode is not 'local' and setting not enabled") + return +} +``` + +**Also UNREACHABLE** during auto-init scenario! + +### 2. 
Start Handler (crowdsec_handler.go) + +**Location**: `backend/internal/api/handlers/crowdsec_handler.go` + +**Lines 167-192 - CORRECT IMPLEMENTATION**: +```go +func (h *CrowdsecHandler) Start(c *gin.Context) { + ctx := c.Request.Context() + + // UPDATE SecurityConfig to persist user's intent + var cfg models.SecurityConfig + if err := h.DB.First(&cfg).Error; err != nil { + if err == gorm.ErrRecordNotFound { + // Create default config with CrowdSec enabled + cfg = models.SecurityConfig{ + UUID: "default", + Name: "Default Security Config", + Enabled: true, + CrowdSecMode: "local", // ← CORRECT: Sets mode to "local" + } + if err := h.DB.Create(&cfg).Error; err != nil { + logger.Log().WithError(err).Error("Failed to create SecurityConfig") + c.JSON(http.StatusInternalServerError, gin.H{"error": "Failed to persist configuration"}) + return + } + } else { + logger.Log().WithError(err).Error("Failed to read SecurityConfig") + c.JSON(http.StatusInternalServerError, gin.H{"error": "Failed to read configuration"}) + return + } + } else { + // Update existing config + cfg.CrowdSecMode = "local" + cfg.Enabled = true + if err := h.DB.Save(&cfg).Error; err != nil { + logger.Log().WithError(err).Error("Failed to update SecurityConfig") + c.JSON(http.StatusInternalServerError, gin.H{"error": "Failed to persist configuration"}) + return + } + } + + // Start the process... +} +``` + +**Analysis**: This is CORRECT. The Start handler properly updates SecurityConfig when user clicks "Start" from the CrowdSec config page (/security/crowdsec). + +### 3. Frontend Toggle (Security.tsx) + +**Location**: `frontend/src/pages/Security.tsx` + +**Lines 64-120 - THE DISCONNECT**: +```tsx +const crowdsecPowerMutation = useMutation({ + mutationFn: async (enabled: boolean) => { + // Step 1: Update Settings table + await updateSetting('security.crowdsec.enabled', enabled ? 
'true' : 'false', 'security', 'bool') + + if (enabled) { + // Step 2: Call Start() which updates SecurityConfig + const result = await startCrowdsec() + + // Step 3: Verify running + const status = await statusCrowdsec() + if (!status.running) { + await updateSetting('security.crowdsec.enabled', 'false', 'security', 'bool') + throw new Error('CrowdSec process failed to start') + } + + return result + } else { + // Step 2: Call Stop() which DOES NOT update SecurityConfig! + await stopCrowdsec() + + // Step 3: Verify stopped + await new Promise(resolve => setTimeout(resolve, 500)) + const status = await statusCrowdsec() + if (status.running) { + throw new Error('CrowdSec process still running') + } + + return { enabled: false } + } + }, +}) +``` + +**Analysis**: +- **Enable Path**: Updates Settings → Calls Start() → Start() updates SecurityConfig → ✅ Both tables synced +- **Disable Path**: Updates Settings → Calls Stop() → Stop() **does NOT always update SecurityConfig** → ❌ Tables out of sync + +Looking at the Stop handler: +```go +func (h *CrowdsecHandler) Stop(c *gin.Context) { + ctx := c.Request.Context() + if err := h.Executor.Stop(ctx, h.DataDir); err != nil { + c.JSON(http.StatusInternalServerError, gin.H{"error": err.Error()}) + return + } + + // UPDATE SecurityConfig to persist user's intent + var cfg models.SecurityConfig + if err := h.DB.First(&cfg).Error; err == nil { + cfg.CrowdSecMode = "disabled" + cfg.Enabled = false + if err := h.DB.Save(&cfg).Error; err != nil { + logger.Log().WithError(err).Warn("Failed to update SecurityConfig after stopping CrowdSec") + } + } + + c.JSON(http.StatusOK, gin.H{"status": "stopped"}) +} +``` + +**This IS CORRECT** - Stop() handler updates SecurityConfig when it can find it. BUT: + +**Scenario Where It Fails**: +1. SecurityConfig table gets corrupted/cleared/migrated incorrectly +2. User clicks toggle OFF +3. Stop() tries to update SecurityConfig → record not found → skips update +4. 
Settings table still updated to "false" +5. Container restarts → auto-init creates SecurityConfig with "disabled" +6. Both tables say "disabled" but UI might show stale state + +--- + +## Comprehensive Fix Strategy + +### Phase 1: Fix Auto-Initialization (CRITICAL - IMMEDIATE) + +**FILE**: `backend/internal/services/crowdsec_startup.go` + +**CHANGE**: Lines 46-71 (auto-initialization block) + +**AFTER** (with Settings table check): +```go +if err == gorm.ErrRecordNotFound { + // AUTO-INITIALIZE: Create default SecurityConfig by checking Settings table + logger.Log().Info("CrowdSec reconciliation: no SecurityConfig found, checking Settings table for user preference") + + // Check if user has already enabled CrowdSec via Settings table (from toggle or legacy config) + var settingOverride struct{ Value string } + crowdSecEnabledInSettings := false + if err := db.Raw("SELECT value FROM settings WHERE key = ? LIMIT 1", "security.crowdsec.enabled").Scan(&settingOverride).Error; err == nil && settingOverride.Value != "" { + crowdSecEnabledInSettings = strings.EqualFold(settingOverride.Value, "true") + logger.Log().WithFields(map[string]interface{}{ + "setting_value": settingOverride.Value, + "enabled": crowdSecEnabledInSettings, + }).Info("CrowdSec reconciliation: found existing Settings table preference") + } + + // Create SecurityConfig that matches Settings table state + crowdSecMode := "disabled" + if crowdSecEnabledInSettings { + crowdSecMode = "local" + } + + defaultCfg := models.SecurityConfig{ + UUID: "default", + Name: "Default Security Config", + Enabled: crowdSecEnabledInSettings, + CrowdSecMode: crowdSecMode, // ← NOW RESPECTS Settings table + WAFMode: "disabled", + WAFParanoiaLevel: 1, + RateLimitMode: "disabled", + RateLimitBurst: 10, + RateLimitRequests: 100, + RateLimitWindowSec: 60, + } + + if err := db.Create(&defaultCfg).Error; err != nil { + logger.Log().WithError(err).Error("CrowdSec reconciliation: failed to create default SecurityConfig") + return + 
} + + logger.Log().WithFields(map[string]interface{}{ + "crowdsec_mode": defaultCfg.CrowdSecMode, + "enabled": defaultCfg.Enabled, + "source": "settings_table", + }).Info("CrowdSec reconciliation: default SecurityConfig created from Settings preference") + + // Continue to process the config (DON'T return early) + cfg = defaultCfg +} +``` + +**KEY CHANGES**: +1. **Check Settings table** during auto-initialization +2. **Create SecurityConfig matching Settings state** (not hardcoded "disabled") +3. **Don't return early** - let the rest of the function process the config. The not-found branch must also bypass the generic "failed to read SecurityConfig" warning and `return` that follow it inside the enclosing `if err != nil` block (for example, by moving that warning into an `else` branch); otherwise the function still exits before line 74 +4. **Assign to cfg variable** so flow continues to line 74+ + +### Phase 2: Enhance Logging (IMMEDIATE) + +**FILE**: `backend/internal/services/crowdsec_startup.go` + +**CHANGE**: Lines 91-98 (decision logic - better logging) + +**AFTER**: +```go +// Start when EITHER SecurityConfig has mode="local" OR Settings table has enabled=true +// Exit only when BOTH are disabled +if cfg.CrowdSecMode != "local" && !crowdSecEnabled { + logger.Log().WithFields(map[string]interface{}{ + "db_mode": cfg.CrowdSecMode, + "setting_enabled": crowdSecEnabled, + }).Info("CrowdSec reconciliation skipped: both SecurityConfig and Settings indicate disabled") + return +} + +// Log which source triggered the start +if cfg.CrowdSecMode == "local" { + logger.Log().WithField("mode", cfg.CrowdSecMode).Info("CrowdSec reconciliation: starting based on SecurityConfig mode='local'") +} else if crowdSecEnabled { + logger.Log().WithField("setting", "true").Info("CrowdSec reconciliation: starting based on Settings table override") +} +``` + +**KEY CHANGES**: +1. **Change log level** from Debug to Info (so the decision appears at the default log level) +2. **Add source attribution** (which table triggered the start) +3. **Clarify condition** (exit only when BOTH are disabled) + +### Phase 3: Add Unified Toggle Endpoint (OPTIONAL BUT RECOMMENDED) + +**WHY**: Currently the toggle updates Settings, then calls Start/Stop which updates SecurityConfig.
This creates potential race conditions. A unified endpoint is safer. + +**FILE**: `backend/internal/api/handlers/crowdsec_handler.go` + +**ADD**: New method (after Stop(), around line 260) + +```go +// ToggleCrowdSec enables or disables CrowdSec, synchronizing Settings and SecurityConfig atomically +func (h *CrowdsecHandler) ToggleCrowdSec(c *gin.Context) { + var payload struct { + Enabled bool `json:"enabled"` + } + if err := c.ShouldBindJSON(&payload); err != nil { + c.JSON(http.StatusBadRequest, gin.H{"error": "invalid payload"}) + return + } + + logger.Log().WithField("enabled", payload.Enabled).Info("CrowdSec toggle: received request") + + // Use a transaction to ensure Settings and SecurityConfig stay in sync + tx := h.DB.Begin() + defer func() { + if r := recover(); r != nil { + tx.Rollback() + } + }() + + // STEP 1: Update Settings table + settingKey := "security.crowdsec.enabled" + settingValue := "false" + if payload.Enabled { + settingValue = "true" + } + + var settingModel models.Setting + if err := tx.Where("key = ?", settingKey).FirstOrCreate(&settingModel, models.Setting{ + Key: settingKey, + Value: settingValue, + Type: "bool", + Category: "security", + }).Error; err != nil { + tx.Rollback() + logger.Log().WithError(err).Error("CrowdSec toggle: failed to update Settings table") + c.JSON(http.StatusInternalServerError, gin.H{"error": "failed to update settings"}) + return + } + settingModel.Value = settingValue + if err := tx.Save(&settingModel).Error; err != nil { + tx.Rollback() + logger.Log().WithError(err).Error("CrowdSec toggle: failed to save Settings table") + c.JSON(http.StatusInternalServerError, gin.H{"error": "failed to update settings"}) + return + } + + // STEP 2: Update SecurityConfig table + var cfg models.SecurityConfig + if err := tx.First(&cfg).Error; err != nil { + if err == gorm.ErrRecordNotFound { + // Create config matching toggle state + crowdSecMode := "disabled" + if payload.Enabled { + crowdSecMode = "local" + } + + cfg = 
models.SecurityConfig{ + UUID: "default", + Name: "Default Security Config", + Enabled: payload.Enabled, + CrowdSecMode: crowdSecMode, + WAFMode: "disabled", + WAFParanoiaLevel: 1, + RateLimitMode: "disabled", + RateLimitBurst: 10, + RateLimitRequests: 100, + RateLimitWindowSec: 60, + } + if err := tx.Create(&cfg).Error; err != nil { + tx.Rollback() + logger.Log().WithError(err).Error("CrowdSec toggle: failed to create SecurityConfig") + c.JSON(http.StatusInternalServerError, gin.H{"error": "failed to persist configuration"}) + return + } + } else { + tx.Rollback() + logger.Log().WithError(err).Error("CrowdSec toggle: failed to read SecurityConfig") + c.JSON(http.StatusInternalServerError, gin.H{"error": "failed to read configuration"}) + return + } + } else { + // Update existing config + if payload.Enabled { + cfg.CrowdSecMode = "local" + cfg.Enabled = true + } else { + cfg.CrowdSecMode = "disabled" + cfg.Enabled = false + } + if err := tx.Save(&cfg).Error; err != nil { + tx.Rollback() + logger.Log().WithError(err).Error("CrowdSec toggle: failed to update SecurityConfig") + c.JSON(http.StatusInternalServerError, gin.H{"error": "failed to persist configuration"}) + return + } + } + + // Commit the transaction before starting/stopping process + if err := tx.Commit().Error; err != nil { + logger.Log().WithError(err).Error("CrowdSec toggle: transaction commit failed") + c.JSON(http.StatusInternalServerError, gin.H{"error": "failed to commit changes"}) + return + } + + logger.Log().WithFields(map[string]interface{}{ + "enabled": cfg.Enabled, + "crowdsec_mode": cfg.CrowdSecMode, + }).Info("CrowdSec toggle: synchronized Settings and SecurityConfig successfully") + + // STEP 3: Start or stop the process + ctx := c.Request.Context() + if payload.Enabled { + // Start CrowdSec + pid, err := h.Executor.Start(ctx, h.BinPath, h.DataDir) + if err != nil { + logger.Log().WithError(err).Error("CrowdSec toggle: failed to start process, reverting DB changes") + + // Revert both 
tables (in new transaction); check every step so a failed revert is logged, not silently ignored + revertTx := h.DB.Begin() + cfg.CrowdSecMode = "disabled" + cfg.Enabled = false + settingModel.Value = "false" + revertErr := revertTx.Save(&cfg).Error + if revertErr == nil { + revertErr = revertTx.Save(&settingModel).Error + } + if revertErr == nil { + revertErr = revertTx.Commit().Error + } else { + revertTx.Rollback() + } + if revertErr != nil { + logger.Log().WithError(revertErr).Error("CrowdSec toggle: failed to revert DB changes after start failure") + } + + c.JSON(http.StatusInternalServerError, gin.H{"error": err.Error()}) + return + } + + // Wait for LAPI readiness + lapiReady := false + maxWait := 30 * time.Second + pollInterval := 500 * time.Millisecond + deadline := time.Now().Add(maxWait) + + for time.Now().Before(deadline) { + args := []string{"lapi", "status"} + if _, err := os.Stat(filepath.Join(h.DataDir, "config.yaml")); err == nil { + args = append([]string{"-c", filepath.Join(h.DataDir, "config.yaml")}, args...) + } + + checkCtx, cancel := context.WithTimeout(ctx, 2*time.Second) + _, err := h.CmdExec.Execute(checkCtx, "cscli", args...) + cancel() + + if err == nil { + lapiReady = true + break + } + + time.Sleep(pollInterval) + } + + logger.Log().WithFields(map[string]interface{}{ + "pid": pid, + "lapi_ready": lapiReady, + }).Info("CrowdSec toggle: started successfully") + + c.JSON(http.StatusOK, gin.H{ + "enabled": true, + "pid": pid, + "lapi_ready": lapiReady, + }) + return + } else { + // Stop CrowdSec + if err := h.Executor.Stop(ctx, h.DataDir); err != nil { + logger.Log().WithError(err).Error("CrowdSec toggle: failed to stop process") + c.JSON(http.StatusInternalServerError, gin.H{"error": err.Error()}) + return + } + + logger.Log().Info("CrowdSec toggle: stopped successfully") + c.JSON(http.StatusOK, gin.H{"enabled": false}) + return + } +} +``` + +**Register Route**: +```go +// In RegisterRoutes() method +rg.POST("/admin/crowdsec/toggle", h.ToggleCrowdSec) +``` + +**Frontend API Client** (`frontend/src/api/crowdsec.ts`): +```typescript +export async function toggleCrowdsec(enabled: boolean): Promise<{ enabled: boolean; pid?: number; lapi_ready?: boolean }> { + const response = await client.post('/admin/crowdsec/toggle', { enabled }) + return response.data +} +``` + +**Frontend Toggle 
Update** (`frontend/src/pages/Security.tsx`): +```tsx +const crowdsecPowerMutation = useMutation({ + mutationFn: async (enabled: boolean) => { + if (enabled) { + toast.info('Starting CrowdSec... This may take up to 30 seconds') + } + + // Use unified toggle endpoint (handles Settings + SecurityConfig + Process) + const result = await toggleCrowdsec(enabled) + + // Backend already verified state, just do final status check + const status = await statusCrowdsec() + if (enabled && !status.running) { + throw new Error('CrowdSec process failed to start. Check server logs for details.') + } + if (!enabled && status.running) { + throw new Error('CrowdSec process still running. Check server logs for details.') + } + + return result + }, + // ... rest remains the same +}) +``` + +--- + +## Testing Plan + +### Test 1: Fresh Install + +**Scenario**: Brand new Charon installation + +1. Start container: `docker compose up -d` +2. Navigate to Security page +3. Verify CrowdSec toggle shows OFF +4. Check status: `curl http://localhost:8080/api/v1/admin/crowdsec/status` + - Expected: `{"running": false}` +5. Check logs: `docker logs charon 2>&1 | grep "reconciliation"` + - Expected: "no SecurityConfig found, checking Settings table" + - Expected: "default SecurityConfig created from Settings preference" + - Expected: "crowdsec_mode: disabled" + +### Test 2: Toggle ON → Container Restart + +**Scenario**: User enables CrowdSec, then restarts container + +1. Enable toggle in UI (click ON) +2. Verify CrowdSec starts +3. Check status: `{"running": true, "pid": xxx}` +4. Restart: `docker restart charon` +5. Wait 10 seconds +6. Check status again: `{"running": true, "pid": xxx}` (NEW PID) +7. Check logs: + - Expected: "starting based on SecurityConfig mode='local'" + +### Test 3: Legacy Migration (Settings Table Only) + +**Scenario**: Existing install with Settings table but no SecurityConfig + +1. 
Manually set: `INSERT INTO settings (key, value, type, category) VALUES ('security.crowdsec.enabled', 'true', 'bool', 'security');` +2. Delete SecurityConfig: `DELETE FROM security_configs;` +3. Restart container +4. Check logs: + - Expected: "found existing Settings table preference" + - Expected: "default SecurityConfig created from Settings preference" + - Expected: "crowdsec_mode: local" +5. Check status: `{"running": true}` + +### Test 4: Toggle OFF → Container Restart + +**Scenario**: User disables CrowdSec, then restarts container + +1. Start with CrowdSec enabled and running +2. Click toggle OFF in UI +3. Verify process stops +4. Restart: `docker restart charon` +5. Wait 10 seconds +6. Check status: `{"running": false}` +7. Verify toggle still shows OFF + +### Test 5: Corrupted SecurityConfig Recovery + +**Scenario**: SecurityConfig gets deleted but Settings exists + +1. Enable CrowdSec via UI +2. Manually delete SecurityConfig: `DELETE FROM security_configs;` +3. Restart container +4. Verify auto-init recreates SecurityConfig matching Settings table +5. 
Verify CrowdSec auto-starts + +--- + +## Verification Checklist + +### Phase 1 (Auto-Initialization Fix) + +- [ ] Modified `crowdsec_startup.go` lines 46-71 +- [ ] Auto-init checks Settings table for existing preference +- [ ] Auto-init creates SecurityConfig matching Settings state +- [ ] Auto-init does NOT return early (continues to line 74+) +- [ ] Test 1 (Fresh Install) passes +- [ ] Test 3 (Legacy Migration) passes + +### Phase 2 (Logging Enhancement) + +- [ ] Modified `crowdsec_startup.go` lines 91-98 +- [ ] Changed log level from Debug to Info +- [ ] Added source attribution logging +- [ ] Test 2 (Toggle ON → Restart) shows correct log +- [ ] Test 4 (Toggle OFF → Restart) shows correct log + +### Phase 3 (Unified Toggle - Optional) + +- [ ] Added `ToggleCrowdSec()` method to `crowdsec_handler.go` +- [ ] Registered `/admin/crowdsec/toggle` route +- [ ] Added `toggleCrowdsec()` to `crowdsec.ts` +- [ ] Updated `crowdsecPowerMutation` in `Security.tsx` +- [ ] Test 4 (Toggle OFF → Restart) passes with the unified endpoint +- [ ] Test 5 (Corrupted SecurityConfig Recovery) passes + +### Pre-Deployment + +- [ ] Pre-commit linters pass: `pre-commit run --all-files` +- [ ] Backend tests pass: `cd backend && go test ./...` +- [ ] Frontend tests pass: `cd frontend && npm run test` +- [ ] Docker build succeeds: `docker build -t charon:local .` +- [ ] Integration test passes: `scripts/crowdsec_integration.sh` + +--- + +## Success Criteria + +✅ **Fix is complete when**: + +1. Toggle shows correct state (ON = running, OFF = stopped) +2. Toggle persists across container restarts +3. Reconciliation logs clearly show decision reason +4. Auto-initialization respects Settings table preference +5. No "stuck toggle" scenarios +6. All 5 test cases pass +7. Pre-commit checks pass +8. 
No regressions in existing CrowdSec functionality + +--- + +## Risk Assessment + +| Change | Risk Level | Mitigation | +|--------|------------|------------| +| Phase 1 (Auto-init) | **Low** | Only affects fresh installs or corrupted state recovery | +| Phase 2 (Logging) | **Very Low** | Only changes log output, no logic changes | +| Phase 3 (Unified toggle) | **Medium** | New endpoint, requires thorough testing, but backward compatible | + +--- + +## Rollback Plan + +If issues arise: + +1. **Immediate Revert**: `git revert ` (no DB changes needed) +2. **Manual Fix** (if toggle stuck): + ```sql + -- Reset SecurityConfig + UPDATE security_configs + SET crowdsec_mode = 'disabled', enabled = 0 + WHERE uuid = 'default'; + + -- Reset Settings + UPDATE settings + SET value = 'false' + WHERE key = 'security.crowdsec.enabled'; + ``` +3. **Force Stop CrowdSec**: `docker exec charon pkill -SIGTERM crowdsec` + +--- + +## Dependency Impact Analysis + +### Phase 1: Auto-Initialization Changes (crowdsec_startup.go) + +#### Files Directly Modified +- `backend/internal/services/crowdsec_startup.go` (lines 46-71) + +#### Dependencies and Required Updates + +**1. 
Unit Tests - MUST BE UPDATED** +- **File**: `backend/internal/services/crowdsec_startup_test.go` +- **Impact**: Test `TestReconcileCrowdSecOnStartup_NoSecurityConfig` expects the function to skip/return early when no SecurityConfig exists +- **Required Change**: Update test to: + - Create a Settings table entry with `security.crowdsec.enabled = 'true'` + - Verify that SecurityConfig is auto-created with `crowdsec_mode = "local"` + - Verify that CrowdSec process is started (not skipped) +- **Additional Tests Needed**: + - `TestReconcileCrowdSecOnStartup_NoSecurityConfig_SettingsDisabled` - Settings='false' → creates config with mode="disabled", does NOT start + - `TestReconcileCrowdSecOnStartup_NoSecurityConfig_SettingsEnabled` - Settings='true' → creates config with mode="local", DOES start + - `TestReconcileCrowdSecOnStartup_NoSecurityConfig_NoSettingsEntry` - No Settings entry → creates config with mode="disabled", does NOT start + +**2. Integration Tests - VERIFICATION NEEDED** +- **Files**: + - `scripts/crowdsec_integration.sh` + - `scripts/crowdsec_startup_test.sh` + - `scripts/crowdsec_decision_integration.sh` +- **Impact**: These scripts may assume specific startup behavior +- **Verification Required**: + - Do any scripts pre-populate Settings table? + - Do any scripts expect reconciliation to skip on fresh DB? + - Do any scripts verify log output from reconciliation? +- **Action**: Review scripts for assumptions about auto-initialization behavior + +**3. Migration/Upgrade Path - DATABASE CONCERN** +- **Scenario**: Existing installations with Settings='true' but missing SecurityConfig +- **Impact**: After upgrade, reconciliation will auto-create SecurityConfig from Settings (POSITIVE) +- **Risk**: Low - this is the intended fix +- **Documentation**: Should document this as expected behavior in migration guide + +**4. 
Models - NO CHANGES REQUIRED** +- **File**: `backend/internal/models/security_config.go` +- **Analysis**: SecurityConfig model structure unchanged +- **File**: `backend/internal/models/setting.go` +- **Analysis**: Setting model structure unchanged + +**5. Route Registration - NO CHANGES REQUIRED** +- **File**: `backend/internal/api/routes/routes.go` (line 360) +- **Analysis**: Already calls `ReconcileCrowdSecOnStartup`, no signature changes + +**6. Handler Dependencies - NO CHANGES REQUIRED** +- **File**: `backend/internal/api/handlers/crowdsec_handler.go` +- **Analysis**: Start/Stop handlers operate independently, no coupling to reconciliation logic + +### Phase 2: Logging Enhancement Changes (crowdsec_startup.go) + +#### Files Directly Modified +- `backend/internal/services/crowdsec_startup.go` (lines 91-98) + +#### Dependencies and Required Updates + +**1. Log Aggregation/Parsing - DOCUMENTATION UPDATE** +- **Concern**: Changing log level from Debug → Info increases log volume +- **Impact**: + - Logs will now appear in production (Info is default minimum level) + - Log aggregation tools may need filter updates if they parse specific messages +- **Required**: Update any log parsing scripts or documentation about expected log output + +**2. Integration Tests - POTENTIAL GREP PATTERNS** +- **Files**: `scripts/crowdsec_*.sh` +- **Impact**: If scripts `grep` for specific log messages, they may need updates +- **Action**: Search for log message expectations in scripts + +**3. 
Documentation - UPDATE REQUIRED**
+- **File**: `docs/features.md`
+- **Section**: CrowdSec Integration (line 167+)
+- **Required Change**: Add note about reconciliation behavior:
+  ```markdown
+  #### Startup Behavior
+
+  CrowdSec automatically starts on container restart if:
+  - SecurityConfig has `crowdsec_mode = "local"` OR
+  - Settings table has `security.crowdsec.enabled = "true"`
+
+  Check container logs for reconciliation decisions:
+  - "CrowdSec reconciliation: starting based on SecurityConfig mode='local'"
+  - "CrowdSec reconciliation: starting based on Settings table override"
+  - "CrowdSec reconciliation skipped: both SecurityConfig and Settings indicate disabled"
+  ```
+
+**4. Troubleshooting Guide - UPDATE RECOMMENDED**
+- **File**: `docs/troubleshooting/` (if exists) or `docs/security.md`
+- **Required Change**: Add section on "CrowdSec Not Starting After Restart"
+  - Explain reconciliation logic
+  - Show how to check Settings and SecurityConfig tables
+  - Show example log output
+
+### Phase 3: Unified Toggle Endpoint (OPTIONAL)
+
+#### Files Directly Modified
+- `backend/internal/api/handlers/crowdsec_handler.go` (new method)
+- `backend/internal/api/handlers/crowdsec_handler.go` (RegisterRoutes)
+- `frontend/src/api/crowdsec.ts` (new function)
+- `frontend/src/pages/Security.tsx` (mutation update)
+
+#### Dependencies and Required Updates
+
+**1. Handler Tests - NEW TESTS REQUIRED**
+- **File**: `backend/internal/api/handlers/crowdsec_handler_test.go`
+- **Required Tests**:
+  - `TestCrowdsecHandler_Toggle_EnableSuccess`
+  - `TestCrowdsecHandler_Toggle_DisableSuccess`
+  - `TestCrowdsecHandler_Toggle_TransactionRollback` (if Start fails)
+  - `TestCrowdsecHandler_Toggle_VerifyBothTablesUpdated`
+
+**2. Existing Handlers - DEPRECATION CONSIDERATION**
+- **Files**:
+  - Start handler (line ~167 in crowdsec_handler.go)
+  - Stop handler (line ~260 in crowdsec_handler.go)
+- **Impact**: New toggle endpoint duplicates Start/Stop functionality
+- **Decision Required**:
+  - **Option A**: Keep both for backward compatibility (RECOMMENDED)
+  - **Option B**: Deprecate Start/Stop, add deprecation warnings
+  - **Option C**: Remove Start/Stop entirely (BREAKING CHANGE - NOT RECOMMENDED)
+- **Recommendation**: Keep Start/Stop handlers unchanged, document toggle as "preferred method"
+
+**3. Frontend API Layer - MIGRATION PATH**
+- **File**: `frontend/src/api/crowdsec.ts`
+- **Current Exports**: `startCrowdsec`, `stopCrowdsec`, `statusCrowdsec`
+- **After Change**: Add `toggleCrowdsec` to exports (line 75)
+- **Backward Compatibility**: Keep existing functions, don't remove them
+
+**4. Frontend Component - LIMITED SCOPE**
+- **File**: `frontend/src/pages/Security.tsx`
+- **Impact**: Only `crowdsecPowerMutation` needs updating (lines 86-125)
+- **Other Components**: No other components import these functions (verified)
+- **Risk**: Low - isolated change
+
+**5. API Documentation - NEW ENDPOINT**
+- **File**: `docs/api.md` (if exists)
+- **Required Addition**: Document `/admin/crowdsec/toggle` endpoint
+
+**6. Integration Tests - NEW TEST CASE**
+- **Files**: `scripts/crowdsec_integration.sh`
+- **Required Addition**: Test toggle endpoint directly
+
+**7. Backward Compatibility - ANALYSIS**
+- **Frontend**: Existing `/admin/crowdsec/start` and `/admin/crowdsec/stop` endpoints remain functional
+- **API Consumers**: External tools using Start/Stop continue to work
+- **Risk**: None - purely additive change
+
+### Cross-Cutting Concerns
+
+#### Database Migration
+- **No schema changes required** - both Settings and SecurityConfig tables already exist
+- **Data migration**: None needed - changes are behavioral only
+
+#### Configuration Files
+- **No changes required** - no new environment variables or config files
+
+#### Docker/Deployment
+- **No Dockerfile changes** - all changes are code-level
+- **No docker-compose changes** - no new services or volumes
+
+#### Security Implications
+- **Phase 1**: Improves security by respecting user's intent across restarts
+- **Phase 2**: No security impact (logging only)
+- **Phase 3**: Transaction safety prevents partial updates (improvement)
+
+#### Performance Considerations
+- **Phase 1**: Adds one SQL query during auto-initialization (one-time, on startup)
+- **Phase 2**: Minimal - only adds log statements
+- **Phase 3**: Minimal - wraps existing logic in transaction
+
+#### Rollback Safety
+- **All phases**: No database schema changes, can be rolled back via git revert
+- **Data safety**: No data loss risk - only affects process startup behavior
+
+### Summary of Required File Updates
+
+| Phase | Files to Modify | Files to Create | Tests to Add | Docs to Update |
+|-------|----------------|-----------------|--------------|----------------|
+| **Phase 1** | `crowdsec_startup.go` | None | 3 new unit tests | None (covered in Phase 2) |
+| **Phase 2** | `crowdsec_startup.go` | None | None | `features.md`, troubleshooting docs |
+| **Phase 3** | `crowdsec_handler.go`, `crowdsec.ts`, `Security.tsx` | None | 4 new handler tests | `api.md` (if exists) |
+
+### Testing Matrix
+
+| Scenario | Phase 1 | Phase 2 | Phase 3 |
+|----------|---------|---------|---------|
+| Fresh install → toggle ON → restart | ✅ Fixes | ✅ Better logs | ✅ Cleaner code |
+| Existing install with Settings='true', missing SecurityConfig | ✅ Fixes | ✅ Better logs | N/A |
+| Toggle ON → restart → verify logs | ✅ Works | ✅ MUST verify new messages | ✅ Works |
+| Toggle OFF → restart → verify logs | ✅ Works | ✅ MUST verify new messages | ✅ Works |
+| Start/Stop handlers (backward compat) | N/A | N/A | ✅ MUST verify still work |
+
+### Missing from Original Plan
+
+The original plan DID NOT explicitly mention:
+
+1. **Unit test updates required** - Critical for Phase 1 (`TestReconcileCrowdSecOnStartup_NoSecurityConfig` needs major refactoring)
+2. **Integration script verification** - May break if they expect specific behavior
+3. **Documentation updates** - Features and troubleshooting guides need new reconciliation behavior documented
+4. **Backward compatibility analysis for Phase 3** - Need explicit decision on Start/Stop handler fate
+5. **API documentation** - New endpoint needs docs
+6. **Testing matrix for all three phases together** - Need to verify they work in combination
+
+---
+
+**END OF SPECIFICATION**
diff --git a/docs/plans/current_spec.md b/docs/plans/current_spec.md
index 117264a3..7e3cf56d 100644
--- a/docs/plans/current_spec.md
+++ b/docs/plans/current_spec.md
@@ -1,1005 +1,1846 @@
-# CrowdSec Toggle Integration Fix Plan
+# CrowdSec Integration - Complete File Inventory
 
-**Date**: December 15, 2025
-**Issue**: CrowdSec toggle stuck ON, reconciliation silently exits, process not starting
-**Root Cause**: Database disconnect between frontend (Settings table) and reconciliation (SecurityConfig table)
+**Generated:** 2025-12-15
+**Purpose:** Comprehensive documentation of all CrowdSec-related files in the Charon repository
 
 ---
 
-## Executive Summary
+## Table of Contents
 
-The CrowdSec toggle shows "ON" but the process is NOT running. The reconciliation function silently exits without starting CrowdSec because:
-
-1. 
**Frontend writes to Settings table** (`security.crowdsec.enabled`)
-2. **Backend reconciliation reads from SecurityConfig table** (`crowdsec_mode = "local"`)
-3. **No synchronization** between the two tables
-4. **Auto-initialization code EXISTS** (lines 46-71 in crowdsec_startup.go) but creates config with `crowdsec_mode = "disabled"`
-5. **Reconciliation sees "disabled"** and exits silently with no logs
+1. [Frontend Files](#frontend-files)
+2. [Backend API Surface](#backend-api-surface)
+3. [Backend Models](#backend-models)
+4. [Backend Services](#backend-services)
+5. [Caddy Integration](#caddy-integration)
+6. [Configuration Files](#configuration-files)
+7. [Scripts](#scripts)
+8. [Documentation](#documentation)
+9. [Test Coverage Summary](#test-coverage-summary)
 
 ---
 
-## Root Cause Analysis (DETAILED)
+## Frontend Files
 
-### Evidence Trail
+### 1. Pages & Components
 
-**Container Logs Show Silent Exit**:
-```
-{"bin_path":"crowdsec","data_dir":"/app/data/crowdsec","level":"info","msg":"CrowdSec reconciliation: starting startup check","time":"2025-12-14T23:32:33-05:00"}
-[NO FURTHER LOGS - Function exited here]
-```
+#### `/frontend/src/pages/CrowdSecConfig.tsx`
+**Purpose:** Main CrowdSec configuration page
+**Features:**
+- CrowdSec mode selection (disabled/local)
+- Import/Export configuration (tar.gz)
+- File browser and editor for CrowdSec config files
+- Preset management (list, pull, apply)
+- Banned IP dashboard (list, ban, unban)
+- Console enrollment UI
+- Integration with live log viewer
+**Test Files:**
+- `/frontend/src/pages/__tests__/CrowdSecConfig.spec.tsx`
+- `/frontend/src/pages/__tests__/CrowdSecConfig.test.tsx`
+- `/frontend/src/pages/__tests__/CrowdSecConfig.coverage.test.tsx`
 
-**Database State on Fresh Start**:
-```
-SELECT * FROM security_configs → record not found
-{"level":"info","msg":"CrowdSec reconciliation: no SecurityConfig found, creating default config"}
-```
+#### `/frontend/src/pages/ImportCrowdSec.tsx`
+**Purpose:** Dedicated import page for CrowdSec configuration
+**Features:**
+- File upload for CrowdSec config archives
+- Automatic backup creation before import
+- Success/error handling with redirects
+**Test Files:**
+- `/frontend/src/pages/__tests__/ImportCrowdSec.spec.tsx`
+- `/frontend/src/pages/__tests__/ImportCrowdSec.test.tsx`
 
-**Process Check**:
-```bash
-$ docker exec charon ps aux | grep -i crowdsec
-[NO RESULTS - Process not running]
-```
+#### `/frontend/src/pages/Security.tsx`
+**Purpose:** Main security dashboard with CrowdSec toggle
+**Features:**
+- Layer 1 security card for CrowdSec
+- Toggle control for start/stop CrowdSec process
+- Status display (enabled/disabled, running, PID)
+- Integration with security status API
+- Navigation to CrowdSec config page
+**Test Files:**
+- `/frontend/src/pages/__tests__/Security.spec.tsx`
+- `/frontend/src/pages/__tests__/Security.test.tsx`
+- `/frontend/src/pages/__tests__/Security.loading.test.tsx`
+- `/frontend/src/pages/__tests__/Security.dashboard.test.tsx`
+- `/frontend/src/pages/__tests__/Security.audit.test.tsx`
 
-### Why Reconciliation Exits Silently
+#### `/frontend/src/components/Layout.tsx`
+**Purpose:** Navigation layout with CrowdSec menu items
+**Features:**
+- Security menu: "CrowdSec" link to `/security/crowdsec`
+- Tasks/Import menu: "CrowdSec" link to `/tasks/import/crowdsec`
+**Test Files:**
+- `/frontend/src/components/__tests__/Layout.test.tsx` (partial coverage)
 
-**FILE**: `backend/internal/services/crowdsec_startup.go`
+#### `/frontend/src/components/LiveLogViewer.tsx`
+**Purpose:** Real-time log viewer with CrowdSec log support
+**Features:**
+- Filter by source type (includes "crowdsec" option)
+- CrowdSec-specific color coding (purple-600)
+**Test Files:**
+- `/frontend/src/components/__tests__/LiveLogViewer.test.tsx` (includes CrowdSec filter test)
 
-**Execution Flow**:
-```
-1. User clicks toggle ON in Security.tsx
-2. Frontend calls updateSetting('security.crowdsec.enabled', 'true')
-3. Settings table updated → security.crowdsec.enabled = "true"
-4. Frontend calls startCrowdsec() → Handler updates SecurityConfig
-5. CrowdSec starts successfully, toggle shows ON
-6. Container restarts (docker restart or reboot)
-7. ReconcileCrowdSecOnStartup() executes at line 26:
+#### `/frontend/src/components/LoadingStates.tsx`
+**Purpose:** Loading overlays for security operations
+**Features:**
+- `ConfigReloadOverlay`: Used for CrowdSec operations
+- Cerberus theme for security operations
+**Test Files:** None specific to CrowdSec
 
-   Line 44: db.First(&cfg) → returns gorm.ErrRecordNotFound
+#### `/frontend/src/components/AccessListForm.tsx`
+**Purpose:** ACL form with CrowdSec guidance
+**Features:**
+- Help text suggesting CrowdSec for IP blocklists
+**Test Files:** None specific to CrowdSec
 
-   Lines 46-71: Auto-initialization block executes:
-   - Creates SecurityConfig with crowdsec_mode = "disabled"
-   - Logs "default SecurityConfig created successfully"
-   - Returns early (line 70) WITHOUT checking Settings table
-   - CrowdSec is NEVER started
+### 2. Hooks (React Query)
 
-   Result: Toggle shows "ON" (Settings table), but process is "OFF" (not running)
-```
+#### `/frontend/src/hooks/useSecurity.ts`
+**Purpose:** React Query hooks for security API
+**Exports:**
+- `useSecurityStatus()` - Fetches overall security status including CrowdSec
+- `useSecurityConfig()` - Fetches security config (includes CrowdSec mode/URL)
+- `useUpdateSecurityConfig()` - Mutation for updating security config
+- `useDecisions(limit?)` - Fetches CrowdSec decisions (banned IPs)
+- `useCreateDecision()` - Mutation for creating manual decisions
+**Test Files:**
+- `/frontend/src/hooks/__tests__/useSecurity.test.tsx` (includes useDecisions tests)
 
-**THE BUG (Lines 46-71)**:
-```go
-if err == gorm.ErrRecordNotFound {
-    // AUTO-INITIALIZE: Create default SecurityConfig on first startup
-    logger.Log().Info("CrowdSec reconciliation: no SecurityConfig found, creating default config")
+#### `/frontend/src/hooks/useConsoleEnrollment.ts`
+**Purpose:** React Query hooks for CrowdSec Console enrollment
+**Exports:**
+- `useConsoleStatus(enabled?)` - Fetches console enrollment status
+- `useEnrollConsole()` - Mutation for enrolling with CrowdSec Console
+**Test Files:** None
 
-    defaultCfg := models.SecurityConfig{
-        UUID: "default",
-        Name: "Default Security Config",
-        Enabled: false,
-        CrowdSecMode: "disabled", // ← PROBLEM: Ignores Settings table state
-        WAFMode: "disabled",
-        WAFParanoiaLevel: 1,
-        RateLimitMode: "disabled",
-        RateLimitBurst: 10,
-        RateLimitRequests: 100,
-        RateLimitWindowSec: 60,
-    }
+### 3. API Clients
 
-    if err := db.Create(&defaultCfg).Error; err != nil {
-        logger.Log().WithError(err).Error("CrowdSec reconciliation: failed to create default SecurityConfig")
-        return
-    }
+#### `/frontend/src/api/crowdsec.ts`
+**Purpose:** Primary CrowdSec API client
+**Exports:**
+- **Types:**
+  - `CrowdSecDecision` - Decision/ban record interface
+  - `CrowdSecStatus` - Process status interface
+- **Functions:**
+  - `startCrowdsec()` - Start CrowdSec process
+  - `stopCrowdsec()` - Stop CrowdSec process
+  - `statusCrowdsec()` - Get process status
+  - `importCrowdsecConfig(file)` - Upload config archive
+  - `exportCrowdsecConfig()` - Download config archive
+  - `listCrowdsecFiles()` - List config files
+  - `readCrowdsecFile(path)` - Read config file content
+  - `writeCrowdsecFile(path, content)` - Write config file
+  - `listCrowdsecDecisions()` - List banned IPs
+  - `banIP(ip, duration, reason)` - Add ban decision
+  - `unbanIP(ip)` - Remove ban decision
+**Test Files:**
+- `/frontend/src/api/__tests__/crowdsec.test.ts`
 
-    logger.Log().Info("CrowdSec reconciliation: default SecurityConfig created successfully")
-    // Don't start CrowdSec on fresh install - user must enable via UI
-    return // ← EXITS WITHOUT checking Settings table or starting process
-}
-```
+#### `/frontend/src/api/presets.ts`
+**Purpose:** CrowdSec preset management API client
+**Exports:**
+- **Types:**
+  - `CrowdsecPresetSummary` - Preset metadata
+  - `PullCrowdsecPresetResponse` - Pull operation response
+  - `ApplyCrowdsecPresetResponse` - Apply operation response
+  - `CachedCrowdsecPresetPreview` - Cached preset data
+- **Functions:**
+  - `listCrowdsecPresets()` - List available presets
+  - `getCrowdsecPresets()` - Alias for list
+  - `pullCrowdsecPreset(slug)` - Fetch preset from remote
+  - `applyCrowdsecPreset(payload)` - Apply preset to config
+  - `getCrowdsecPresetCache(slug)` - Get cached preset preview
+**Test Files:** None
 
-**Why This Causes the Issue**:
+#### `/frontend/src/api/consoleEnrollment.ts`
+**Purpose:** CrowdSec Console enrollment API client
+**Exports:**
+- **Types:**
+  - `ConsoleEnrollmentStatus` - Enrollment status interface
+  - `ConsoleEnrollPayload` - Enrollment request payload
+- **Functions:**
+  - `getConsoleStatus()` - Fetch enrollment status
+  - `enrollConsole(payload)` - Enroll with CrowdSec Console
+**Test Files:** None
 
-1. **First Container Start**: User enables CrowdSec via toggle
-   - Settings: `security.crowdsec.enabled = "true"` ✅
-   - SecurityConfig: `crowdsec_mode = "local"` ✅ (via Start handler)
-   - Process: Running ✅
+#### `/frontend/src/api/security.ts`
+**Purpose:** General security API client (includes CrowdSec-related types)
+**Exports (CrowdSec-related):**
+- **Types:**
+  - `SecurityStatus` - Includes `crowdsec` object with mode, enabled, api_url
+  - `SecurityConfigPayload` - Includes `crowdsec_mode`, `crowdsec_api_url`
+  - `CreateDecisionPayload` - Manual decision creation
+- **Functions:**
+  - `getSecurityStatus()` - Fetch security status (includes CrowdSec state)
+  - `getSecurityConfig()` - Fetch security config
+  - `updateSecurityConfig(payload)` - Update security config
+  - `getDecisions(limit?)` - Fetch decisions list
+  - `createDecision(payload)` - Create manual decision
+**Test Files:**
+- Various security test files reference CrowdSec status
 
-2. **Container Restart**: Database persists but SecurityConfig table may be empty (migration issue or corruption)
-   - Reconciliation runs
-   - SecurityConfig table: **EMPTY** (record lost or never migrated)
-   - Auto-init creates SecurityConfig with `crowdsec_mode = "disabled"`
-   - Returns early without checking Settings table
-   - Settings: Still shows `"true"` (UI says ON)
-   - SecurityConfig: Says `"disabled"` (reconciliation source)
-   - Process: NOT started ❌
+### 4. Data & Utilities
 
-3. **Result**: **State Mismatch**
-   - Frontend toggle: **ON** (reads Settings table)
-   - Backend reconciliation: **OFF** (reads SecurityConfig table)
-   - Process: **NOT RUNNING** (reconciliation didn't start it)
+#### `/frontend/src/data/crowdsecPresets.ts`
+**Purpose:** Static CrowdSec preset definitions (local fallback)
+**Exports:**
+- `CrowdsecPreset` - Preset interface
+- `CROWDSEC_PRESETS` - Array of built-in presets:
+  - bot-mitigation-essentials
+  - honeypot-friendly-defaults
+  - geolocation-aware
+- `findCrowdsecPreset(slug)` - Lookup function
+**Test Files:** None
+
+#### `/frontend/src/utils/crowdsecExport.ts`
+**Purpose:** Utility functions for CrowdSec export operations
+**Exports:**
+- `buildCrowdsecExportFilename()` - Generate timestamped filename
+- `promptCrowdsecFilename(default?)` - User prompt with sanitization
+- `downloadCrowdsecExport(blob, filename)` - Trigger browser download
+**Test Files:** None
 
 ---
 
-## Current Code Analysis
+## Backend API Surface
 
-### 1. Reconciliation Function (crowdsec_startup.go)
+### 1. 
Main Handler
 
-**Location**: `backend/internal/services/crowdsec_startup.go`
+#### `/backend/internal/api/handlers/crowdsec_handler.go`
+**Purpose:** Primary CrowdSec API handler with all endpoints
+**Type:** `CrowdsecHandler`
+**Dependencies:**
+- `db *gorm.DB` - Database connection
+- `Executor CrowdsecExecutor` - Process control interface
+- `BinPath string` - Path to crowdsec binary
+- `DataDir string` - CrowdSec data directory path
+- `Security *SecurityService` - Security config service
 
-**Lines 44-71 (Auto-initialization - THE BUG)**:
-```go
-var cfg models.SecurityConfig
-if err := db.First(&cfg).Error; err != nil {
-    if err == gorm.ErrRecordNotFound {
-        // AUTO-INITIALIZE: Create default SecurityConfig on first startup
-        logger.Log().Info("CrowdSec reconciliation: no SecurityConfig found, creating default config")
+**Methods (26 total):**
 
-        defaultCfg := models.SecurityConfig{
-            UUID: "default",
-            Name: "Default Security Config",
-            Enabled: false,
-            CrowdSecMode: "disabled", // ← IGNORES Settings table
-            WAFMode: "disabled",
-            WAFParanoiaLevel: 1,
-            RateLimitMode: "disabled",
-            RateLimitBurst: 10,
-            RateLimitRequests: 100,
-            RateLimitWindowSec: 60,
-        }
+1. **Process Control:**
+   - `Start(c *gin.Context)` - POST `/admin/crowdsec/start`
+   - `Stop(c *gin.Context)` - POST `/admin/crowdsec/stop`
+   - `Status(c *gin.Context)` - GET `/admin/crowdsec/status`
 
-        if err := db.Create(&defaultCfg).Error; err != nil {
-            logger.Log().WithError(err).Error("CrowdSec reconciliation: failed to create default SecurityConfig")
-            return
-        }
+2. **Configuration Management:**
+   - `ImportConfig(c *gin.Context)` - POST `/admin/crowdsec/import`
+   - `ExportConfig(c *gin.Context)` - GET `/admin/crowdsec/export`
+   - `ListFiles(c *gin.Context)` - GET `/admin/crowdsec/files`
+   - `ReadFile(c *gin.Context)` - GET `/admin/crowdsec/file?path=...`
+   - `WriteFile(c *gin.Context)` - POST `/admin/crowdsec/file`
+   - `GetAcquisitionConfig(c *gin.Context)` - GET `/admin/crowdsec/acquisition`
+   - `UpdateAcquisitionConfig(c *gin.Context)` - POST `/admin/crowdsec/acquisition`
 
-        logger.Log().Info("CrowdSec reconciliation: default SecurityConfig created successfully")
-        // Don't start CrowdSec on fresh install - user must enable via UI
-        return // ← EARLY EXIT - Never checks Settings table
-    }
-    logger.Log().WithError(err).Warn("CrowdSec reconciliation: failed to read SecurityConfig")
-    return
-}
-```
+3. **Preset Management:**
+   - `ListPresets(c *gin.Context)` - GET `/admin/crowdsec/presets`
+   - `PullPreset(c *gin.Context)` - POST `/admin/crowdsec/presets/pull`
+   - `ApplyPreset(c *gin.Context)` - POST `/admin/crowdsec/presets/apply`
+   - `GetCachedPreset(c *gin.Context)` - GET `/admin/crowdsec/presets/cache/:slug`
 
-**Lines 74-90 (Runtime Setting Override - UNREACHABLE after auto-init)**:
-```go
-// Also check for runtime setting override in settings table
-var settingOverride struct{ Value string }
-crowdSecEnabled := false
-if err := db.Raw("SELECT value FROM settings WHERE key = ? LIMIT 1", "security.crowdsec.enabled").Scan(&settingOverride).Error; err == nil && settingOverride.Value != "" {
-    crowdSecEnabled = strings.EqualFold(settingOverride.Value, "true")
-    logger.Log().WithFields(map[string]interface{}{
-        "setting_value": settingOverride.Value,
-        "crowdsec_enabled": crowdSecEnabled,
-    }).Debug("CrowdSec reconciliation: found runtime setting override")
-}
-```
+4. **Console Enrollment:**
+   - `ConsoleEnroll(c *gin.Context)` - POST `/admin/crowdsec/console/enroll`
+   - `ConsoleStatus(c *gin.Context)` - GET `/admin/crowdsec/console/status`
 
-**This code is NEVER REACHED** when SecurityConfig doesn't exist because line 70 returns early!
+5. **Decision Management (Banned IPs):**
+   - `ListDecisions(c *gin.Context)` - GET `/admin/crowdsec/decisions` (via cscli)
+   - `GetLAPIDecisions(c *gin.Context)` - GET `/admin/crowdsec/decisions/lapi` (via LAPI)
+   - `CheckLAPIHealth(c *gin.Context)` - GET `/admin/crowdsec/lapi/health`
+   - `BanIP(c *gin.Context)` - POST `/admin/crowdsec/ban`
+   - `UnbanIP(c *gin.Context)` - DELETE `/admin/crowdsec/ban/:ip`
 
-**Lines 91-98 (Decision Logic)**:
-```go
-// Only auto-start if CrowdSecMode is "local" OR runtime setting is enabled
-if cfg.CrowdSecMode != "local" && !crowdSecEnabled {
-    logger.Log().WithFields(map[string]interface{}{
-        "db_mode": cfg.CrowdSecMode,
-        "setting_enabled": crowdSecEnabled,
-    }).Debug("CrowdSec reconciliation skipped: mode is not 'local' and setting not enabled")
-    return
-}
-```
+6. **Bouncer Registration:**
+   - `RegisterBouncer(c *gin.Context)` - POST `/admin/crowdsec/bouncer/register`
 
-**Also UNREACHABLE** during auto-init scenario!
+7. **Helper Methods:**
+   - `isCerberusEnabled() bool` - Check if Cerberus feature flag is enabled
+   - `isConsoleEnrollmentEnabled() bool` - Check if console enrollment is enabled
+   - `hubEndpoints() []string` - Return hub API URLs
+   - `RegisterRoutes(rg *gin.RouterGroup)` - Route registration
 
-### 2. Start Handler (crowdsec_handler.go)
+**Test Files:**
+- `/backend/internal/api/handlers/crowdsec_handler_test.go` - Core unit tests
+- `/backend/internal/api/handlers/crowdsec_handler_comprehensive_test.go` - Comprehensive tests
+- `/backend/internal/api/handlers/crowdsec_handler_coverage_test.go` - Coverage boost
+- `/backend/internal/api/handlers/crowdsec_coverage_boost_test.go` - Additional coverage
+- `/backend/internal/api/handlers/crowdsec_coverage_target_test.go` - Target tests
+- `/backend/internal/api/handlers/crowdsec_cache_verification_test.go` - Cache tests
+- `/backend/internal/api/handlers/crowdsec_decisions_test.go` - Decision endpoint tests
+- `/backend/internal/api/handlers/crowdsec_lapi_test.go` - LAPI tests
+- `/backend/internal/api/handlers/crowdsec_presets_handler_test.go` - Preset tests
+- `/backend/internal/api/handlers/crowdsec_pull_apply_integration_test.go` - Integration tests
 
-**Location**: `backend/internal/api/handlers/crowdsec_handler.go`
+#### `/backend/internal/api/handlers/crowdsec_exec.go`
+**Purpose:** Process executor interface and implementation
+**Exports:**
+- `CrowdsecExecutor` interface - Process control abstraction
+- `DefaultCrowdsecExecutor` - Production implementation
+- Helper functions for process management
+**Test Files:**
+- `/backend/internal/api/handlers/crowdsec_exec_test.go`
 
-**Lines 167-192 - CORRECT IMPLEMENTATION**:
-```go
-func (h *CrowdsecHandler) Start(c *gin.Context) {
-    ctx := c.Request.Context()
+### 2. Security Handler (CrowdSec Integration)
 
-    // UPDATE SecurityConfig to persist user's intent
-    var cfg models.SecurityConfig
-    if err := h.DB.First(&cfg).Error; err != nil {
-        if err == gorm.ErrRecordNotFound {
-            // Create default config with CrowdSec enabled
-            cfg = models.SecurityConfig{
-                UUID: "default",
-                Name: "Default Security Config",
-                Enabled: true,
-                CrowdSecMode: "local", // ← CORRECT: Sets mode to "local"
-            }
-            if err := h.DB.Create(&cfg).Error; err != nil {
-                logger.Log().WithError(err).Error("Failed to create SecurityConfig")
-                c.JSON(http.StatusInternalServerError, gin.H{"error": "Failed to persist configuration"})
-                return
-            }
-        } else {
-            logger.Log().WithError(err).Error("Failed to read SecurityConfig")
-            c.JSON(http.StatusInternalServerError, gin.H{"error": "Failed to read configuration"})
-            return
-        }
-    } else {
-        // Update existing config
-        cfg.CrowdSecMode = "local"
-        cfg.Enabled = true
-        if err := h.DB.Save(&cfg).Error; err != nil {
-            logger.Log().WithError(err).Error("Failed to update SecurityConfig")
-            c.JSON(http.StatusInternalServerError, gin.H{"error": "Failed to persist configuration"})
-            return
-        }
-    }
+#### `/backend/internal/api/handlers/security_handler.go`
+**Purpose:** Security configuration handler (includes CrowdSec mode)
+**Methods (CrowdSec-related):**
+- `GetStatus(c *gin.Context)` - Returns security status including CrowdSec enabled flag
+- `GetConfig(c *gin.Context)` - Returns SecurityConfig including CrowdSec mode/URL
+- `UpdateConfig(c *gin.Context)` - Updates security config (can change CrowdSec mode)
+- `ListDecisions(c *gin.Context)` - Lists security decisions (includes CrowdSec decisions)
+- `CreateDecision(c *gin.Context)` - Creates manual decision
 
-    // Start the process...
-}
-```
+**Routes:**
+- GET `/security/status` - Security status
+- GET `/security/config` - Security config
+- POST `/security/config` - Update config
+- GET `/security/decisions` - List decisions
+- POST `/security/decisions` - Create decision
 
-**Analysis**: This is CORRECT. The Start handler properly updates SecurityConfig when user clicks "Start" from the CrowdSec config page (/security/crowdsec).
-
-### 3. Frontend Toggle (Security.tsx)
-
-**Location**: `frontend/src/pages/Security.tsx`
-
-**Lines 64-120 - THE DISCONNECT**:
-```tsx
-const crowdsecPowerMutation = useMutation({
-  mutationFn: async (enabled: boolean) => {
-    // Step 1: Update Settings table
-    await updateSetting('security.crowdsec.enabled', enabled ? 'true' : 'false', 'security', 'bool')
-
-    if (enabled) {
-      // Step 2: Call Start() which updates SecurityConfig
-      const result = await startCrowdsec()
-
-      // Step 3: Verify running
-      const status = await statusCrowdsec()
-      if (!status.running) {
-        await updateSetting('security.crowdsec.enabled', 'false', 'security', 'bool')
-        throw new Error('CrowdSec process failed to start')
-      }
-
-      return result
-    } else {
-      // Step 2: Call Stop() which DOES NOT update SecurityConfig!
-      await stopCrowdsec()
-
-      // Step 3: Verify stopped
-      await new Promise(resolve => setTimeout(resolve, 500))
-      const status = await statusCrowdsec()
-      if (status.running) {
-        throw new Error('CrowdSec process still running')
-      }
-
-      return { enabled: false }
-    }
-  },
-})
-```
-
-**Analysis**:
-- **Enable Path**: Updates Settings → Calls Start() → Start() updates SecurityConfig → ✅ Both tables synced
-- **Disable Path**: Updates Settings → Calls Stop() → Stop() **does NOT always update SecurityConfig** → ❌ Tables out of sync
-
-Looking at the Stop handler:
-```go
-func (h *CrowdsecHandler) Stop(c *gin.Context) {
-    ctx := c.Request.Context()
-    if err := h.Executor.Stop(ctx, h.DataDir); err != nil {
-        c.JSON(http.StatusInternalServerError, gin.H{"error": err.Error()})
-        return
-    }
-
-    // UPDATE SecurityConfig to persist user's intent
-    var cfg models.SecurityConfig
-    if err := h.DB.First(&cfg).Error; err == nil {
-        cfg.CrowdSecMode = "disabled"
-        cfg.Enabled = false
-        if err := h.DB.Save(&cfg).Error; err != nil {
-            logger.Log().WithError(err).Warn("Failed to update SecurityConfig after stopping CrowdSec")
-        }
-    }
-
-    c.JSON(http.StatusOK, gin.H{"status": "stopped"})
-}
-```
-
-**This IS CORRECT** - Stop() handler updates SecurityConfig when it can find it. BUT:
-
-**Scenario Where It Fails**:
-1. SecurityConfig table gets corrupted/cleared/migrated incorrectly
-2. User clicks toggle OFF
-3. Stop() tries to update SecurityConfig → record not found → skips update
-4. Settings table still updated to "false"
-5. Container restarts → auto-init creates SecurityConfig with "disabled"
-6. Both tables say "disabled" but UI might show stale state
+**Test Files:**
+- Various security handler tests
 
 ---
 
-## Comprehensive Fix Strategy
+## Backend Models
 
-### Phase 1: Fix Auto-Initialization (CRITICAL - IMMEDIATE)
+### 1. 
Database Models
 
-**FILE**: `backend/internal/services/crowdsec_startup.go`
+#### `/backend/internal/models/crowdsec_preset_event.go`
+**Purpose:** Audit trail for preset operations
+**Table:** `crowdsec_preset_events`
+**Fields:**
+- `ID uint` - Primary key
+- `Slug string` - Preset slug identifier
+- `Action string` - "pull" or "apply"
+- `Status string` - "success", "failed"
+- `CacheKey string` - Cache identifier
+- `BackupPath string` - Backup file path (for apply)
+- `Error string` - Error message if failed
+- `CreatedAt time.Time`
+- `UpdatedAt time.Time`
+**Test Files:** None
 
-**CHANGE**: Lines 46-71 (auto-initialization block)
+#### `/backend/internal/models/crowdsec_console_enrollment.go`
+**Purpose:** CrowdSec Console enrollment state
+**Table:** `crowdsec_console_enrollments`
+**Fields:**
+- `ID uint` - Primary key
+- `UUID string` - Unique identifier
+- `Status string` - "pending", "enrolled", "failed"
+- `Tenant string` - Console tenant name
+- `AgentName string` - Agent display name
+- `EncryptedEnrollKey string` - Encrypted enrollment key
+- `LastError string` - Last error message
+- `LastCorrelationID string` - Last API correlation ID
+- `LastAttemptAt *time.Time` - Last enrollment attempt
+- `EnrolledAt *time.Time` - Successful enrollment timestamp
+- `LastHeartbeatAt *time.Time` - Last heartbeat from console
+- `CreatedAt time.Time`
+- `UpdatedAt time.Time`
+**Test Files:** None
 
-**AFTER** (with Settings table check):
-```go
-if err == gorm.ErrRecordNotFound {
-    // AUTO-INITIALIZE: Create default SecurityConfig by checking Settings table
-    logger.Log().Info("CrowdSec reconciliation: no SecurityConfig found, checking Settings table for user preference")
+#### `/backend/internal/models/security_config.go`
+**Purpose:** Global security configuration (includes CrowdSec settings)
+**Table:** `security_configs`
+**Fields (CrowdSec-related):**
+- `CrowdSecMode string` - "disabled" or "local"
+- `CrowdSecAPIURL string` - LAPI URL (default: http://127.0.0.1:8085)
+**Other Fields:** WAF, Rate Limit, ACL, admin whitelist, break glass
+**Test Files:** Various security tests
 
-    // Check if user has already enabled CrowdSec via Settings table (from toggle or legacy config)
-    var settingOverride struct{ Value string }
-    crowdSecEnabledInSettings := false
-    if err := db.Raw("SELECT value FROM settings WHERE key = ? LIMIT 1", "security.crowdsec.enabled").Scan(&settingOverride).Error; err == nil && settingOverride.Value != "" {
-        crowdSecEnabledInSettings = strings.EqualFold(settingOverride.Value, "true")
-        logger.Log().WithFields(map[string]interface{}{
-            "setting_value": settingOverride.Value,
-            "enabled": crowdSecEnabledInSettings,
-        }).Info("CrowdSec reconciliation: found existing Settings table preference")
-    }
+#### `/backend/internal/models/security_decision.go`
+**Purpose:** Security decisions from CrowdSec/WAF/Rate Limit
+**Table:** `security_decisions`
+**Fields:**
+- `ID uint` - Primary key
+- `UUID string` - Unique identifier
+- `Source string` - "crowdsec", "waf", "ratelimit", "manual"
+- `Action string` - "allow", "block", "challenge"
+- `IP string` - IP address
+- `Host string` - Hostname (optional)
+- `RuleID string` - Rule or scenario ID
+- `Details string` - JSON details
+- `CreatedAt time.Time`
+**Test Files:** None
 
-    // Create SecurityConfig that matches Settings table state
-    crowdSecMode := "disabled"
-    if crowdSecEnabledInSettings {
-        crowdSecMode = "local"
-    }
+---
 
-    defaultCfg := models.SecurityConfig{
-        UUID: "default",
-        Name: "Default Security Config",
-        Enabled: crowdSecEnabledInSettings,
-        CrowdSecMode: crowdSecMode, // ← NOW RESPECTS Settings table
-        WAFMode: "disabled",
-        WAFParanoiaLevel: 1,
-        RateLimitMode: "disabled",
-        RateLimitBurst: 10,
-        RateLimitRequests: 100,
-        RateLimitWindowSec: 60,
-    }
+## Backend Services
 
-    if err := db.Create(&defaultCfg).Error; err != nil {
-        logger.Log().WithError(err).Error("CrowdSec reconciliation: failed to create default SecurityConfig")
-        return
-    }
+### 1. Startup Reconciliation
 
-    logger.Log().WithFields(map[string]interface{}{
-        "crowdsec_mode": defaultCfg.CrowdSecMode,
-        "enabled": defaultCfg.Enabled,
-        "source": "settings_table",
-    }).Info("CrowdSec reconciliation: default SecurityConfig created from Settings preference")
+#### `/backend/internal/services/crowdsec_startup.go`
+**Purpose:** Reconcile CrowdSec state on container restart
+**Exports:**
+- `CrowdsecProcessManager` interface - Process management abstraction
+- `ReconcileCrowdSecOnStartup(db, executor, binPath, dataDir)` - Main reconciliation function
 
-    // Continue to process the config (DON'T return early)
-    cfg = defaultCfg
-}
-```
+**Logic:**
+1. Check if SecurityConfig table exists
+2. Check if CrowdSecMode = "local" in SecurityConfig
+3. Fallback: Check Settings table for "security.crowdsec.enabled"
+4. If enabled, start CrowdSec process
+5. Log all actions for debugging
 
-**KEY CHANGES**:
-1. **Check Settings table** during auto-initialization
-2. **Create SecurityConfig matching Settings state** (not hardcoded "disabled")
-3. **Don't return early** - let the rest of the function process the config
-4. **Assign to cfg variable** so flow continues to line 74+
+**Called From:** `/backend/internal/api/routes/routes.go` (on server startup)
+**Test Files:**
+- `/backend/internal/services/crowdsec_startup_test.go`
 
-### Phase 2: Enhance Logging (IMMEDIATE)
+---
 
-**FILE**: `backend/internal/services/crowdsec_startup.go`
+## Caddy Integration
 
-**CHANGE**: Lines 91-98 (decision logic - better logging)
+### 1. Config Generation
 
-**AFTER**:
-```go
-// Start when EITHER SecurityConfig has mode="local" OR Settings table has enabled=true
-// Exit only when BOTH are disabled
-if cfg.CrowdSecMode != "local" && !crowdSecEnabled {
-    logger.Log().WithFields(map[string]interface{}{
-        "db_mode": cfg.CrowdSecMode,
-        "setting_enabled": crowdSecEnabled,
-    }).Info("CrowdSec reconciliation skipped: both SecurityConfig and Settings indicate disabled")
-    return
-}
+#### `/backend/internal/caddy/config.go`
+**Purpose:** Generate Caddy config with CrowdSec bouncer
+**Function:** `generateConfig(..., crowdsecEnabled bool, ...)`
+**Logic:**
+- If `crowdsecEnabled=true`, inject CrowdSec bouncer handler into route chain
+- Handler: `{"handler": "crowdsec"}` (Caddy bouncer module)
+**Test Files:**
+- `/backend/internal/caddy/config_crowdsec_test.go`
 
-// Log which source triggered the start
-if cfg.CrowdSecMode == "local" {
-    logger.Log().WithField("mode", cfg.CrowdSecMode).Info("CrowdSec reconciliation: starting based on SecurityConfig mode='local'")
-} else if crowdSecEnabled {
-    logger.Log().WithField("setting", "true").Info("CrowdSec reconciliation: starting based on Settings table override")
-}
-```
+#### `/backend/internal/caddy/manager.go`
+**Purpose:** Caddy config manager with CrowdSec flag computation
+**Method:** `computeEffectiveFlags(ctx)`
+**Returns:** `cerberusEnabled, aclEnabled, wafEnabled, rateLimitEnabled, crowdsecEnabled`
+**Logic:**
+1. Check SecurityConfig.CrowdSecMode
+2. Check Settings table "security.crowdsec.enabled" override
+3. Merge with cerberusEnabled flag
+**Test Files:**
+- `/backend/internal/caddy/manager_additional_test.go` (includes CrowdSec toggle test)
 
-**KEY CHANGES**:
-1. **Change log level** from Debug to Info (so we see it in logs)
-2. **Add source attribution** (which table triggered the start)
-3. **Clarify condition** (exit only when BOTH are disabled)
+---
 
-### Phase 3: Add Unified Toggle Endpoint (OPTIONAL BUT RECOMMENDED)
+## Configuration Files
 
-**WHY**: Currently the toggle updates Settings, then calls Start/Stop which updates SecurityConfig. This creates potential race conditions. A unified endpoint is safer.
+### 1. CrowdSec Config Templates
 
-**FILE**: `backend/internal/api/handlers/crowdsec_handler.go`
+#### `/configs/crowdsec/acquis.yaml`
+**Purpose:** CrowdSec acquisition config (log sources)
+**Content:**
+- Defines Caddy access log path: `/data/charon/data/logs/access.log`
+- Log type: `caddy`
 
-**ADD**: New method (after Stop(), around line 260)
+#### `/configs/crowdsec/install_hub_items.sh`
+**Purpose:** Install CrowdSec hub items (parsers, scenarios, collections)
+**Content:**
+- Install parsers: http-logs, nginx-logs, apache2-logs, syslog-logs, geoip-enrich
+- Install scenarios: http-probing, sensitive-files, backdoors-attempts, path-traversal
+- Install collections: base-http-scenarios
+**Usage:** Called during CrowdSec setup
 
-```go
-// ToggleCrowdSec enables or disables CrowdSec, synchronizing Settings and SecurityConfig atomically
-func (h *CrowdsecHandler) ToggleCrowdSec(c *gin.Context) {
-    var payload struct {
-        Enabled bool `json:"enabled"`
-    }
-    if err := c.ShouldBindJSON(&payload); err != nil {
-        c.JSON(http.StatusBadRequest, gin.H{"error": "invalid payload"})
-        return
-    }
+#### `/configs/crowdsec/register_bouncer.sh`
+**Purpose:** Register Caddy bouncer with CrowdSec LAPI
+**Content:**
+- Script to register bouncer and save API key
+**Usage:** Called during bouncer registration
 
-    logger.Log().WithField("enabled", payload.Enabled).Info("CrowdSec toggle: received request")
+---
 
-    // Use a transaction to ensure Settings and SecurityConfig stay in sync
-    tx := h.DB.Begin()
-    defer func() {
-        if r := recover(); r != nil {
-            tx.Rollback()
-        }
-    }()
+## Scripts
 
-    // STEP 1: Update Settings table
-    settingKey := "security.crowdsec.enabled"
-
settingValue := "false" - if payload.Enabled { - settingValue = "true" - } +### 1. Integration Tests - var settingModel models.Setting - if err := tx.Where("key = ?", settingKey).FirstOrCreate(&settingModel, models.Setting{ - Key: settingKey, - Value: settingValue, - Type: "bool", - Category: "security", - }).Error; err != nil { - tx.Rollback() - logger.Log().WithError(err).Error("CrowdSec toggle: failed to update Settings table") - c.JSON(http.StatusInternalServerError, gin.H{"error": "failed to update settings"}) - return - } - settingModel.Value = settingValue - if err := tx.Save(&settingModel).Error; err != nil { - tx.Rollback() - logger.Log().WithError(err).Error("CrowdSec toggle: failed to save Settings table") - c.JSON(http.StatusInternalServerError, gin.H{"error": "failed to update settings"}) - return - } +#### `/scripts/crowdsec_integration.sh` +**Purpose:** Integration test for CrowdSec start/stop/status +**Test Cases:** +- Start CrowdSec process +- Check status endpoint +- Stop CrowdSec process +**Usage:** Run via task "Integration: Coraza WAF" - // STEP 2: Update SecurityConfig table - var cfg models.SecurityConfig - if err := tx.First(&cfg).Error; err != nil { - if err == gorm.ErrRecordNotFound { - // Create config matching toggle state - crowdSecMode := "disabled" - if payload.Enabled { - crowdSecMode = "local" - } +#### `/scripts/crowdsec_decision_integration.sh` +**Purpose:** Integration test for decision management +**Test Cases:** +- List decisions +- Ban IP +- Unban IP +- LAPI health check +**Usage:** Run via task "Integration: CrowdSec Decisions" - cfg = models.SecurityConfig{ - UUID: "default", - Name: "Default Security Config", - Enabled: payload.Enabled, - CrowdSecMode: crowdSecMode, - WAFMode: "disabled", - WAFParanoiaLevel: 1, - RateLimitMode: "disabled", - RateLimitBurst: 10, - RateLimitRequests: 100, - RateLimitWindowSec: 60, - } - if err := tx.Create(&cfg).Error; err != nil { - tx.Rollback() - logger.Log().WithError(err).Error("CrowdSec 
toggle: failed to create SecurityConfig") - c.JSON(http.StatusInternalServerError, gin.H{"error": "failed to persist configuration"}) - return - } - } else { - tx.Rollback() - logger.Log().WithError(err).Error("CrowdSec toggle: failed to read SecurityConfig") - c.JSON(http.StatusInternalServerError, gin.H{"error": "failed to read configuration"}) - return - } - } else { - // Update existing config - if payload.Enabled { - cfg.CrowdSecMode = "local" - cfg.Enabled = true - } else { - cfg.CrowdSecMode = "disabled" - cfg.Enabled = false - } - if err := tx.Save(&cfg).Error; err != nil { - tx.Rollback() - logger.Log().WithError(err).Error("CrowdSec toggle: failed to update SecurityConfig") - c.JSON(http.StatusInternalServerError, gin.H{"error": "failed to persist configuration"}) - return - } - } +#### `/scripts/crowdsec_startup_test.sh` +**Purpose:** Test CrowdSec startup reconciliation +**Test Cases:** +- Verify ReconcileCrowdSecOnStartup works +- Check process starts on container restart +**Usage:** Run via task "Integration: CrowdSec Startup" - // Commit the transaction before starting/stopping process - if err := tx.Commit().Error; err != nil { - logger.Log().WithError(err).Error("CrowdSec toggle: transaction commit failed") - c.JSON(http.StatusInternalServerError, gin.H{"error": "failed to commit changes"}) - return - } +--- - logger.Log().WithFields(map[string]interface{}{ - "enabled": cfg.Enabled, - "crowdsec_mode": cfg.CrowdSecMode, - }).Info("CrowdSec toggle: synchronized Settings and SecurityConfig successfully") +## Documentation - // STEP 3: Start or stop the process - ctx := c.Request.Context() - if payload.Enabled { - // Start CrowdSec - pid, err := h.Executor.Start(ctx, h.BinPath, h.DataDir) - if err != nil { - logger.Log().WithError(err).Error("CrowdSec toggle: failed to start process, reverting DB changes") +### 1. 
Plans & Specifications - // Revert both tables (in new transaction) - revertTx := h.DB.Begin() - cfg.CrowdSecMode = "disabled" - cfg.Enabled = false - revertTx.Save(&cfg) - settingModel.Value = "false" - revertTx.Save(&settingModel) - revertTx.Commit() +#### `/docs/plans/crowdsec_full_implementation.md` +**Purpose:** Complete implementation plan for CrowdSec integration - c.JSON(http.StatusInternalServerError, gin.H{"error": err.Error()}) - return - } +#### `/docs/plans/crowdsec_testing_plan.md` +**Purpose:** Comprehensive testing strategy - // Wait for LAPI readiness - lapiReady := false - maxWait := 30 * time.Second - pollInterval := 500 * time.Millisecond - deadline := time.Now().Add(maxWait) +#### `/docs/plans/crowdsec_reconciliation_failure.md` +**Purpose:** Troubleshooting guide for startup reconciliation - for time.Now().Before(deadline) { - args := []string{"lapi", "status"} - if _, err := os.Stat(filepath.Join(h.DataDir, "config.yaml")); err == nil { - args = append([]string{"-c", filepath.Join(h.DataDir, "config.yaml")}, args...) - } +#### `/docs/plans/crowdsec_lapi_error_diagnostic.md` +**Purpose:** LAPI connectivity diagnostic plan - checkCtx, cancel := context.WithTimeout(ctx, 2*time.Second) - _, err := h.CmdExec.Execute(checkCtx, "cscli", args...) - cancel() +#### `/docs/plans/crowdsec_toggle_fix_plan.md` +**Purpose:** Toggle fix implementation plan (backed up from current_spec.md) - if err == nil { - lapiReady = true - break - } +### 2. 
Reports & QA - time.Sleep(pollInterval) - } +#### `/docs/reports/crowdsec_integration_summary.md` +**Purpose:** Summary of CrowdSec integration work - logger.Log().WithFields(map[string]interface{}{ - "pid": pid, - "lapi_ready": lapiReady, - }).Info("CrowdSec toggle: started successfully") +#### `/docs/reports/qa_crowdsec_toggle_fix_summary.md` +**Purpose:** QA report for toggle fix - c.JSON(http.StatusOK, gin.H{ - "enabled": true, - "pid": pid, - "lapi_ready": lapiReady, - }) - return - } else { - // Stop CrowdSec - if err := h.Executor.Stop(ctx, h.DataDir); err != nil { - logger.Log().WithError(err).Error("CrowdSec toggle: failed to stop process") - c.JSON(http.StatusInternalServerError, gin.H{"error": err.Error()}) - return - } +#### `/docs/reports/qa_report_crowdsec_architecture.md` +**Purpose:** Architecture review report - logger.Log().Info("CrowdSec toggle: stopped successfully") - c.JSON(http.StatusOK, gin.H{"enabled": false}) - return - } -} -``` +#### `/docs/reports/crowdsec-preset-fix-summary.md` +**Purpose:** Preset system fix report -**Register Route**: -```go -// In RegisterRoutes() method -rg.POST("/admin/crowdsec/toggle", h.ToggleCrowdSec) -``` +#### `/docs/reports/crowdsec_migration_qa_report.md` +**Purpose:** Migration QA report -**Frontend API Client** (`frontend/src/api/crowdsec.ts`): +#### `/docs/reports/qa_report_crowdsec_markdownlint_20251212.md` +**Purpose:** Linting QA report + +#### `/docs/reports/qa_crowdsec_lapi_availability_fix.md` +**Purpose:** LAPI availability fix report + +#### `/docs/reports/crowdsec-preset-pull-apply-debug.md` +**Purpose:** Preset pull/apply debugging report + +#### `/docs/reports/qa_crowdsec_implementation.md` +**Purpose:** Implementation QA report + +### 3. 
User Documentation + +#### `/docs/troubleshooting/crowdsec.md` +**Purpose:** User troubleshooting guide for CrowdSec + +--- + +## Test Coverage Summary + +### Frontend Tests + +| File | Test Files | Coverage Level | +|------|-----------|----------------| +| `CrowdSecConfig.tsx` | 3 test files (spec, test, coverage) | **High** | +| `ImportCrowdSec.tsx` | 2 test files (spec, test) | **High** | +| `Security.tsx` | 4 test files (spec, test, loading, dashboard) | **High** | +| `api/crowdsec.ts` | 1 test file | **Partial** | +| `api/presets.ts` | **None** | **None** | +| `api/consoleEnrollment.ts` | **None** | **None** | +| `hooks/useSecurity.ts` | 1 test file (partial CrowdSec coverage) | **Partial** | +| `hooks/useConsoleEnrollment.ts` | **None** | **None** | +| `data/crowdsecPresets.ts` | **None** | **None** | +| `utils/crowdsecExport.ts` | **None** | **None** | +| `components/Layout.tsx` | 1 test file (partial) | **Partial** | +| `components/LiveLogViewer.tsx` | 1 test file (includes CrowdSec filter) | **Partial** | + +### Backend Tests + +| File | Test Files | Coverage Level | +|------|-----------|----------------| +| `crowdsec_handler.go` | 10 test files | **Excellent** | +| `crowdsec_exec.go` | 1 test file | **High** | +| `crowdsec_startup.go` | 1 test file | **High** | +| `security_handler.go` | Multiple (includes CrowdSec tests) | **Partial** | +| `models/crowdsec_preset_event.go` | **None** | **None** | +| `models/crowdsec_console_enrollment.go` | **None** | **None** | +| `caddy/config.go` | 1 test file (crowdsec specific) | **High** | +| `caddy/manager.go` | 1 test file (includes CrowdSec toggle) | **High** | + +### Integration Tests + +| File | Purpose | Coverage Level | +|------|---------|----------------| +| `crowdsec_integration_test.go` | Process control integration | **High** | +| `crowdsec_decisions_integration_test.go` | Decision management integration | **High** | +| `crowdsec_pull_apply_integration_test.go` | Preset system integration | **High** | + 
+### Script Tests + +| Script | Status | +|--------|--------| +| `crowdsec_integration.sh` | Available, manual execution | +| `crowdsec_decision_integration.sh` | Available, manual execution | +| `crowdsec_startup_test.sh` | Available, manual execution | + +--- + +## Testing Gaps + +### Critical (No Tests) + +1. **Frontend:** + - `/frontend/src/api/presets.ts` - Preset API client + - `/frontend/src/api/consoleEnrollment.ts` - Console enrollment API + - `/frontend/src/hooks/useConsoleEnrollment.ts` - Console enrollment hook + - `/frontend/src/data/crowdsecPresets.ts` - Static preset data + - `/frontend/src/utils/crowdsecExport.ts` - Export utilities + +2. **Backend:** + - `/backend/internal/models/crowdsec_preset_event.go` - Audit model + - `/backend/internal/models/crowdsec_console_enrollment.go` - Enrollment model + +### Medium (Partial Coverage) + +1. **Frontend:** + - `api/crowdsec.ts` - Only basic tests, missing edge cases + - `hooks/useSecurity.ts` - useDecisions tests exist, but limited + - `components/Layout.tsx` - CrowdSec navigation only partially tested + +2. 
**Backend:** + - `security_handler.go` - CrowdSec-related methods have partial coverage + +--- + +## API Endpoint Reference + +### CrowdSec-Specific Endpoints + +| Method | Path | Handler | Purpose | +|--------|------|---------|---------| +| POST | `/admin/crowdsec/start` | `CrowdsecHandler.Start` | Start process | +| POST | `/admin/crowdsec/stop` | `CrowdsecHandler.Stop` | Stop process | +| GET | `/admin/crowdsec/status` | `CrowdsecHandler.Status` | Get status | +| POST | `/admin/crowdsec/import` | `CrowdsecHandler.ImportConfig` | Import config | +| GET | `/admin/crowdsec/export` | `CrowdsecHandler.ExportConfig` | Export config | +| GET | `/admin/crowdsec/files` | `CrowdsecHandler.ListFiles` | List files | +| GET | `/admin/crowdsec/file` | `CrowdsecHandler.ReadFile` | Read file | +| POST | `/admin/crowdsec/file` | `CrowdsecHandler.WriteFile` | Write file | +| GET | `/admin/crowdsec/acquisition` | `CrowdsecHandler.GetAcquisitionConfig` | Get acquis.yaml | +| POST | `/admin/crowdsec/acquisition` | `CrowdsecHandler.UpdateAcquisitionConfig` | Update acquis.yaml | +| GET | `/admin/crowdsec/presets` | `CrowdsecHandler.ListPresets` | List presets | +| POST | `/admin/crowdsec/presets/pull` | `CrowdsecHandler.PullPreset` | Pull preset | +| POST | `/admin/crowdsec/presets/apply` | `CrowdsecHandler.ApplyPreset` | Apply preset | +| GET | `/admin/crowdsec/presets/cache/:slug` | `CrowdsecHandler.GetCachedPreset` | Get cached | +| POST | `/admin/crowdsec/console/enroll` | `CrowdsecHandler.ConsoleEnroll` | Enroll console | +| GET | `/admin/crowdsec/console/status` | `CrowdsecHandler.ConsoleStatus` | Console status | +| GET | `/admin/crowdsec/decisions` | `CrowdsecHandler.ListDecisions` | List (cscli) | +| GET | `/admin/crowdsec/decisions/lapi` | `CrowdsecHandler.GetLAPIDecisions` | List (LAPI) | +| GET | `/admin/crowdsec/lapi/health` | `CrowdsecHandler.CheckLAPIHealth` | LAPI health | +| POST | `/admin/crowdsec/ban` | `CrowdsecHandler.BanIP` | Ban IP | +| DELETE | 
`/admin/crowdsec/ban/:ip` | `CrowdsecHandler.UnbanIP` | Unban IP | +| POST | `/admin/crowdsec/bouncer/register` | `CrowdsecHandler.RegisterBouncer` | Register bouncer | + +### Security Endpoints (CrowdSec-related) + +| Method | Path | Handler | Purpose | +|--------|------|---------|---------| +| GET | `/security/status` | `SecurityHandler.GetStatus` | Security status (includes CrowdSec) | +| GET | `/security/config` | `SecurityHandler.GetConfig` | Security config (includes CrowdSec) | +| POST | `/security/config` | `SecurityHandler.UpdateConfig` | Update config (can change CrowdSec mode) | +| GET | `/security/decisions` | `SecurityHandler.ListDecisions` | List decisions (includes CrowdSec) | +| POST | `/security/decisions` | `SecurityHandler.CreateDecision` | Create manual decision | + +--- + +## Dependencies & External Integrations + +### External Services +- **CrowdSec Hub API** - `https://hub-api.crowdsec.net/v1/` (preset downloads) +- **CrowdSec Console API** - Enrollment and heartbeat endpoints +- **CrowdSec LAPI** - Local API for decision queries (default: `http://127.0.0.1:8085`) + +### External Binaries +- `crowdsec` - CrowdSec security engine +- `cscli` - CrowdSec command-line interface + +### Caddy Modules +- `http.handlers.crowdsec` - CrowdSec bouncer module for Caddy + +--- + +## Related Features + +### Cerberus Security Suite +CrowdSec is Layer 1 of the Cerberus security suite: +- **Layer 1:** CrowdSec (IP reputation) +- **Layer 2:** WAF (Web Application Firewall) +- **Layer 3:** Rate Limiting +- **Layer 4:** ACL (Access Control Lists) + +All layers are controlled via `SecurityConfig` and can be enabled/disabled independently. + +--- + +## Notes + +1. **Testing Priority:** + - Frontend preset API client needs tests + - Console enrollment needs full test coverage + - Export utilities need tests + +2. 
**Architecture:** + - CrowdSec runs as a separate process managed by Charon + - Configuration stored in `data/crowdsec/` directory + - Process state reconciled on container restart + - Caddy bouncer communicates with CrowdSec LAPI + +3. **Configuration Flow:** + - User changes mode in UI → `SecurityConfig` updated + - `security.crowdsec.enabled` Setting can override mode + - Caddy config regenerated with bouncer enabled/disabled + - Startup reconciliation ensures process state matches config + +4. **Future Improvements:** + - Add webhook support for CrowdSec alerts + - Implement preset validation before apply + - Add telemetry for preset usage + - Improve LAPI health monitoring + +--- + +## PHASE 2: FRONTEND TEST STRATEGY + +**Generated:** 2025-12-15 +**Goal:** Achieve 100% frontend CrowdSec test coverage and expose existing bugs +**Context:** Last 13 commits were CrowdSec hotfixes addressing: +- Toggle state mismatch after restart +- LAPI readiness/availability issues +- 500 errors on startup +- Database migration and verification failures +- Post-rebuild state sync issues + +### Priority Order + +**Phase 2A: API Clients (Foundation Layer)** +1. `frontend/src/api/presets.ts` +2. `frontend/src/api/consoleEnrollment.ts` + +**Phase 2B: Data & Utilities** +3. `frontend/src/data/crowdsecPresets.ts` +4. `frontend/src/utils/crowdsecExport.ts` + +**Phase 2C: React Query Hooks** +5. `frontend/src/hooks/useConsoleEnrollment.ts` + +**Phase 2D: Integration Tests** +6. Cross-component integration scenarios + +--- + +### PHASE 2A: API Client Tests + +#### 1. 
`/frontend/src/api/__tests__/presets.test.ts` + +**Purpose:** Test preset API client with focus on caching and error handling + +**Test Scenarios:** + +##### Basic Happy Path ```typescript -export async function toggleCrowdsec(enabled: boolean): Promise<{ enabled: boolean; pid?: number; lapi_ready?: boolean }> { - const response = await client.post('/admin/crowdsec/toggle', { enabled }) - return response.data +describe('listCrowdsecPresets', () => { + it('should fetch presets list with cached flags', async () => { + // Mock: API returns presets with various cache states + const mockPresets = [ + { slug: 'bot-mitigation', cached: true, cache_key: 'abc123' }, + { slug: 'honeypot', cached: false } + ] + // Assert: Proper GET call to /admin/crowdsec/presets + // Assert: Returns data.presets array + }) +}) + +describe('pullCrowdsecPreset', () => { + it('should pull preset and return preview with cache_key', async () => { + // Mock: API returns preview content and cache metadata + const mockResponse = { + status: 'success', + slug: 'bot-mitigation', + preview: '# Config content...', + cache_key: 'xyz789', + etag: '"abc"', + retrieved_at: '2025-12-15T10:00:00Z' + } + // Assert: POST to /admin/crowdsec/presets/pull with { slug } + }) +}) + +describe('applyCrowdsecPreset', () => { + it('should apply preset with cache_key when available', async () => { + // Mock: API returns success with backup path + const payload = { slug: 'bot-mitigation', cache_key: 'xyz789' } + const mockResponse = { + status: 'success', + backup: '/data/backups/preset-backup-123.tar.gz', + reload_hint: true + } + // Assert: POST to /admin/crowdsec/presets/apply + }) + + it('should apply preset without cache_key (fallback mode)', async () => { + // Test scenario: User applies preset without pulling first + const payload = { slug: 'bot-mitigation' } + // Assert: Backend should fetch from hub API directly + }) +}) + +describe('getCrowdsecPresetCache', () => { + it('should fetch cached preset preview', 
async () => { + // Mock: Returns previously cached preview + const mockCache = { + preview: '# Cached content...', + cache_key: 'xyz789', + etag: '"abc"' + } + // Assert: GET to /admin/crowdsec/presets/cache/:slug with URL encoding + }) +}) +``` + +##### Edge Cases & Bug Exposure + +**🐛 BUG TARGET: Cache Key Mismatch** +```typescript +it('should handle stale cache_key gracefully', async () => { + // Scenario: User pulls preset, backend updates cache, user applies with old key + // Expected: Backend should detect mismatch and re-fetch or error clearly + // Current Bug: May apply wrong version silently + const stalePayload = { slug: 'bot-mitigation', cache_key: 'old_key_123' } + // Assert: Should return error or warning about stale cache +}) +``` + +**🐛 BUG TARGET: Network Failures During Pull** +```typescript +it('should handle hub API timeout during pull', async () => { + // Scenario: CrowdSec Hub API is slow or unreachable + // Mock: API returns 504 Gateway Timeout + // Assert: Client should throw descriptive error + // Assert: Should NOT cache partial/failed response +}) + +it('should handle ETAG validation failure', async () => { + // Scenario: Hub API returns 304 Not Modified but local cache missing + // Expected: Should re-fetch or error clearly +}) +``` + +**🐛 BUG TARGET: Apply Without CrowdSec Running** +```typescript +it('should error when applying preset with CrowdSec stopped', async () => { + // Scenario: User tries to apply preset but CrowdSec process is not running + // Mock: API returns error about cscli being unavailable + // Assert: Clear error message about starting CrowdSec first +}) +``` + +**Mock Strategies:** +```typescript +// Mock client.get/post with vi.mock +vi.mock('../client') + +// Mock data structures +const mockPresetList = { + presets: [ + { + slug: 'bot-mitigation-essentials', + title: 'Bot Mitigation Essentials', + summary: 'Core HTTP parsers...', + source: 'hub', + tags: ['bots', 'web'], + requires_hub: true, + available: true, + 
cached: false + } + ] +} + +const mockPullResponse = { + status: 'success', + slug: 'bot-mitigation-essentials', + preview: 'configs:\n collections:\n - crowdsecurity/base-http-scenarios', + cache_key: 'abc123def456', + etag: '"w/12345"', + retrieved_at: new Date().toISOString(), + source: 'hub' } ``` -**Frontend Toggle Update** (`frontend/src/pages/Security.tsx`): -```tsx -const crowdsecPowerMutation = useMutation({ - mutationFn: async (enabled: boolean) => { - if (enabled) { - toast.info('Starting CrowdSec... This may take up to 30 seconds') - } +--- - // Use unified toggle endpoint (handles Settings + SecurityConfig + Process) - const result = await toggleCrowdsec(enabled) +#### 2. `/frontend/src/api/__tests__/consoleEnrollment.test.ts` - // Backend already verified state, just do final status check - const status = await statusCrowdsec() - if (enabled && !status.running) { - throw new Error('CrowdSec process failed to start. Check server logs for details.') - } - if (!enabled && status.running) { - throw new Error('CrowdSec process still running. 
Check server logs for details.') - } +**Purpose:** Test CrowdSec Console enrollment API with security focus - return result +**Test Scenarios:** + +##### Basic Operations +```typescript +describe('getConsoleStatus', () => { + it('should fetch enrollment status with pending state', async () => { + const mockStatus = { + status: 'pending', + tenant: 'my-org', + agent_name: 'charon-prod', + key_present: true, + last_attempt_at: '2025-12-15T09:00:00Z' + } + // Assert: GET to /admin/crowdsec/console/status + }) + + it('should fetch enrolled status with heartbeat', async () => { + const mockStatus = { + status: 'enrolled', + tenant: 'my-org', + agent_name: 'charon-prod', + key_present: true, + enrolled_at: '2025-12-14T10:00:00Z', + last_heartbeat_at: '2025-12-15T09:55:00Z' + } + // Assert: Shows successful enrollment state + }) + + it('should fetch failed status with error message', async () => { + const mockStatus = { + status: 'failed', + tenant: 'my-org', + agent_name: 'charon-prod', + key_present: false, + last_error: 'Invalid enrollment key', + last_attempt_at: '2025-12-15T09:00:00Z', + correlation_id: 'req-abc123' + } + // Assert: Error details are surfaced for debugging + }) +}) + +describe('enrollConsole', () => { + it('should enroll with valid payload', async () => { + const payload = { + enrollment_key: 'cs-enroll-abc123xyz', + tenant: 'my-org', + agent_name: 'charon-prod', + force: false + } + // Mock: Returns enrolled status + // Assert: POST to /admin/crowdsec/console/enroll + }) + + it('should force re-enrollment when force=true', async () => { + const payload = { + enrollment_key: 'cs-enroll-new-key', + agent_name: 'charon-updated', + force: true + } + // Assert: Overwrites existing enrollment + }) +}) +``` + +##### Security & Error Cases + +**🐛 BUG TARGET: Enrollment Key Exposure** +```typescript +it('should NOT return enrollment key in status response', async () => { + // Security test: Ensure key is never exposed in API responses + const mockStatus = 
await getConsoleStatus() + expect(mockStatus).not.toHaveProperty('enrollment_key') + expect(mockStatus).not.toHaveProperty('encrypted_enroll_key') + // Only key_present boolean should be exposed +}) +``` + +**🐛 BUG TARGET: Enrollment Retry Logic** +```typescript +it('should handle transient network errors during enrollment', async () => { + // Scenario: CrowdSec Console API temporarily unavailable + // Mock: First call fails with network error, second succeeds + // Assert: Should NOT mark as permanently failed + // Assert: Should retry on next status poll or manual retry +}) + +it('should handle invalid enrollment key format', async () => { + // Scenario: User pastes malformed key + const payload = { + enrollment_key: 'not-a-valid-key', + agent_name: 'test' + } + // Mock: API returns 400 Bad Request + // Assert: Clear validation error message +}) +``` + +**🐛 BUG TARGET: Tenant Name Validation** +```typescript +it('should sanitize tenant name with special characters', async () => { + // Scenario: Tenant name has spaces/special chars + const payload = { + enrollment_key: 'valid-key', + tenant: 'My Org (Production)', // Invalid chars + agent_name: 'agent1' + } + // Expected: Backend should sanitize or reject + // Assert: Should not cause SQL injection or path traversal +}) +``` + +**Mock Strategies:** +```typescript +const mockEnrollmentStatuses = { + pending: { + status: 'pending', + key_present: true, + last_attempt_at: new Date().toISOString() }, - // ... rest remains the same + enrolled: { + status: 'enrolled', + tenant: 'test-tenant', + agent_name: 'test-agent', + key_present: true, + enrolled_at: new Date().toISOString(), + last_heartbeat_at: new Date().toISOString() + }, + failed: { + status: 'failed', + key_present: false, + last_error: 'Enrollment key expired', + correlation_id: 'err-123' + } +} +``` + +--- + +### PHASE 2B: Data & Utility Tests + +#### 3. 
`/frontend/src/data/__tests__/crowdsecPresets.test.ts` + +**Purpose:** Test static preset definitions and lookup logic + +**Test Scenarios:** + +```typescript +describe('CROWDSEC_PRESETS', () => { + it('should contain all expected presets', () => { + expect(CROWDSEC_PRESETS).toHaveLength(3) + expect(CROWDSEC_PRESETS.map(p => p.slug)).toEqual([ + 'bot-mitigation-essentials', + 'honeypot-friendly-defaults', + 'geolocation-aware' + ]) + }) + + it('should have valid YAML content for each preset', () => { + CROWDSEC_PRESETS.forEach(preset => { + expect(preset.content).toContain('configs:') + expect(preset.content).toMatch(/collections:|parsers:|scenarios:/) + }) + }) + + it('should have required metadata fields', () => { + CROWDSEC_PRESETS.forEach(preset => { + expect(preset).toHaveProperty('slug') + expect(preset).toHaveProperty('title') + expect(preset).toHaveProperty('description') + expect(preset).toHaveProperty('content') + expect(preset.slug).toMatch(/^[a-z0-9-]+$/) // Slug format validation + }) + }) + + it('should have warnings for production-critical presets', () => { + const botMitigation = CROWDSEC_PRESETS.find(p => p.slug === 'bot-mitigation-essentials') + expect(botMitigation?.warning).toBeDefined() + expect(botMitigation?.warning).toContain('allowlist') + }) +}) + +describe('findCrowdsecPreset', () => { + it('should find preset by slug', () => { + const preset = findCrowdsecPreset('bot-mitigation-essentials') + expect(preset).toBeDefined() + expect(preset?.slug).toBe('bot-mitigation-essentials') + }) + + it('should return undefined for non-existent slug', () => { + const preset = findCrowdsecPreset('non-existent-preset') + expect(preset).toBeUndefined() + }) + + it('should be case-sensitive', () => { + const preset = findCrowdsecPreset('BOT-MITIGATION-ESSENTIALS') + expect(preset).toBeUndefined() + }) +}) +``` + +**🐛 BUG TARGET: Preset Content Validation** +```typescript +describe('preset content integrity', () => { + it('should have valid CrowdSec YAML 
structure', () => { + // Test that content can be parsed as YAML + CROWDSEC_PRESETS.forEach(preset => { + expect(() => { + // Simple validation: check for basic structure + const lines = preset.content.split('\n') + expect(lines[0]).toMatch(/^configs:/) + }).not.toThrow() + }) + }) + + it('should reference valid CrowdSec hub items', () => { + // Validate collection/parser/scenario names follow hub naming conventions + CROWDSEC_PRESETS.forEach(preset => { + const collections = preset.content.match(/- crowdsecurity\/[\w-]+/g) || [] + collections.forEach(item => { + expect(item).toMatch(/^- crowdsecurity\/[a-z0-9-]+$/) + }) + }) + }) }) ``` --- -## Testing Plan +#### 4. `/frontend/src/utils/__tests__/crowdsecExport.test.ts` -### Test 1: Fresh Install +**Purpose:** Test export filename generation and download utilities -**Scenario**: Brand new Charon installation +**Test Scenarios:** -1. Start container: `docker compose up -d` -2. Navigate to Security page -3. Verify CrowdSec toggle shows OFF -4. Check status: `curl http://localhost:8080/api/v1/admin/crowdsec/status` - - Expected: `{"running": false}` -5. 
Check logs: `docker logs charon 2>&1 | grep "reconciliation"` - - Expected: "no SecurityConfig found, checking Settings table" - - Expected: "default SecurityConfig created from Settings preference" - - Expected: "crowdsec_mode: disabled" +```typescript +describe('buildCrowdsecExportFilename', () => { + it('should generate filename with ISO timestamp', () => { + const filename = buildCrowdsecExportFilename() + expect(filename).toMatch(/^crowdsec-export-\d{4}-\d{2}-\d{2}T\d{2}-\d{2}-\d{2}.*\.tar\.gz$/) + }) -### Test 2: Toggle ON → Container Restart + it('should replace colons with hyphens in timestamp', () => { + const filename = buildCrowdsecExportFilename() + expect(filename).not.toContain(':') + }) -**Scenario**: User enables CrowdSec, then restarts container + it('should always end with .tar.gz', () => { + const filename = buildCrowdsecExportFilename() + expect(filename.endsWith('.tar.gz')).toBe(true) + }) +}) -1. Enable toggle in UI (click ON) -2. Verify CrowdSec starts -3. Check status: `{"running": true, "pid": xxx}` -4. Restart: `docker restart charon` -5. Wait 10 seconds -6. Check status again: `{"running": true, "pid": xxx}` (NEW PID) -7. Check logs: - - Expected: "starting based on SecurityConfig mode='local'" +describe('promptCrowdsecFilename', () => { + beforeEach(() => { + vi.stubGlobal('prompt', vi.fn()) + }) -### Test 3: Legacy Migration (Settings Table Only) + afterEach(() => { + vi.unstubAllGlobals() + }) -**Scenario**: Existing install with Settings table but no SecurityConfig + it('should return null when user cancels', () => { + vi.mocked(window.prompt).mockReturnValue(null) + const result = promptCrowdsecFilename('default.tar.gz') + expect(result).toBeNull() + }) -1. Manually set: `INSERT INTO settings (key, value, type, category) VALUES ('security.crowdsec.enabled', 'true', 'bool', 'security');` -2. Delete SecurityConfig: `DELETE FROM security_configs;` -3. Restart container -4. 
Check logs: - - Expected: "found existing Settings table preference" - - Expected: "default SecurityConfig created from Settings preference" - - Expected: "crowdsec_mode: local" -5. Check status: `{"running": true}` + it('should sanitize user input by replacing slashes', () => { + vi.mocked(window.prompt).mockReturnValue('backup/prod/config') + const result = promptCrowdsecFilename() + expect(result).toBe('backup-prod-config.tar.gz') + }) -### Test 4: Toggle OFF → Container Restart + it('should replace spaces with hyphens', () => { + vi.mocked(window.prompt).mockReturnValue('crowdsec backup 2025') + const result = promptCrowdsecFilename() + expect(result).toBe('crowdsec-backup-2025.tar.gz') + }) -**Scenario**: User disables CrowdSec, then restarts container + it('should append .tar.gz if missing', () => { + vi.mocked(window.prompt).mockReturnValue('my-backup') + const result = promptCrowdsecFilename() + expect(result).toBe('my-backup.tar.gz') + }) -1. Start with CrowdSec enabled and running -2. Click toggle OFF in UI -3. Verify process stops -4. Restart: `docker restart charon` -5. Wait 10 seconds -6. Check status: `{"running": false}` -7. Verify toggle still shows OFF + it('should not double-append .tar.gz', () => { + vi.mocked(window.prompt).mockReturnValue('my-backup.tar.gz') + const result = promptCrowdsecFilename() + expect(result).toBe('my-backup.tar.gz') + }) -### Test 5: Corrupted SecurityConfig Recovery + it('should handle empty string by using default', () => { + vi.mocked(window.prompt).mockReturnValue(' ') + const result = promptCrowdsecFilename('default.tar.gz') + expect(result).toBe('default.tar.gz') + }) +}) -**Scenario**: SecurityConfig gets deleted but Settings exists +describe('downloadCrowdsecExport', () => { + let createObjectURLSpy: ReturnType<typeof vi.fn> + let revokeObjectURLSpy: ReturnType<typeof vi.fn> + let clickSpy: ReturnType<typeof vi.fn> -1. Enable CrowdSec via UI -2. Manually delete SecurityConfig: `DELETE FROM security_configs;` -3. Restart container -4. 
Verify auto-init recreates SecurityConfig matching Settings table -5. Verify CrowdSec auto-starts + beforeEach(() => { + createObjectURLSpy = vi.fn(() => 'blob:mock-url') + revokeObjectURLSpy = vi.fn() + vi.stubGlobal('URL', { + createObjectURL: createObjectURLSpy, + revokeObjectURL: revokeObjectURLSpy + }) + + clickSpy = vi.fn() + // Capture the real implementation first; calling document.createElement inside the mock would recurse + const originalCreateElement = document.createElement.bind(document) + vi.spyOn(document, 'createElement').mockImplementation((tag) => { + if (tag === 'a') { + return { + click: clickSpy, + remove: vi.fn(), + href: '', + download: '' + } as any + } + return originalCreateElement(tag) + }) + + vi.spyOn(document.body, 'appendChild').mockImplementation(() => null as any) + }) + + afterEach(() => { + vi.unstubAllGlobals() + vi.restoreAllMocks() + }) + + it('should create blob URL and trigger download', () => { + const blob = new Blob(['test data'], { type: 'application/gzip' }) + downloadCrowdsecExport(blob, 'test.tar.gz') + + expect(createObjectURLSpy).toHaveBeenCalled() + expect(clickSpy).toHaveBeenCalled() + expect(revokeObjectURLSpy).toHaveBeenCalledWith('blob:mock-url') + }) + + it('should set correct filename on anchor element', () => { + const blob = new Blob(['data']) + const createElementSpy = vi.spyOn(document, 'createElement') + + downloadCrowdsecExport(blob, 'my-backup.tar.gz') + + const anchorCall = createElementSpy.mock.results.find( + result => result.value.tagName === 'A' + ) + // Note: Detailed assertion requires DOM manipulation mocking + }) + + it('should clean up by removing anchor element', () => { + const blob = new Blob(['data']) + const removeSpy = vi.fn() + vi.spyOn(document, 'createElement').mockReturnValue({ + click: vi.fn(), + remove: removeSpy, + href: '', + download: '' + } as any) + + downloadCrowdsecExport(blob, 'test.tar.gz') + + expect(removeSpy).toHaveBeenCalled() + }) +}) +``` + +**🐛 BUG TARGET: Path Traversal in Filename** +```typescript +describe('security: path traversal prevention', () => { + it('should sanitize directory traversal attempts', () => { + 
vi.mocked(window.prompt).mockReturnValue('../../etc/passwd') + const result = promptCrowdsecFilename() + expect(result).toBe('..-..-etc-passwd.tar.gz') + expect(result).not.toContain('../') + }) + + it('should handle absolute paths', () => { + vi.mocked(window.prompt).mockReturnValue('/etc/crowdsec/backup') + const result = promptCrowdsecFilename() + expect(result).not.toMatch(/^\//) + }) +}) +``` --- -## Verification Checklist +### PHASE 2C: React Query Hook Tests -### Phase 1 (Auto-Initialization Fix) +#### 5. `/frontend/src/hooks/__tests__/useConsoleEnrollment.test.tsx` -- [ ] Modified `crowdsec_startup.go` lines 46-71 -- [ ] Auto-init checks Settings table for existing preference -- [ ] Auto-init creates SecurityConfig matching Settings state -- [ ] Auto-init does NOT return early (continues to line 74+) -- [ ] Test 1 (Fresh Install) passes -- [ ] Test 3 (Legacy Migration) passes +**Purpose:** Test React Query hooks for CrowdSec Console enrollment -### Phase 2 (Logging Enhancement) +**Test Scenarios:** -- [ ] Modified `crowdsec_startup.go` lines 91-98 -- [ ] Changed log level from Debug to Info -- [ ] Added source attribution logging -- [ ] Test 2 (Toggle ON → Restart) shows correct log -- [ ] Test 4 (Toggle OFF → Restart) shows correct log +```typescript +import { describe, it, expect, vi, beforeEach } from 'vitest' +import { renderHook, waitFor } from '@testing-library/react' +import { QueryClient, QueryClientProvider } from '@tanstack/react-query' +import { useConsoleStatus, useEnrollConsole } from '../useConsoleEnrollment' +import * as consoleEnrollmentApi from '../../api/consoleEnrollment' -### Phase 3 (Unified Toggle - Optional) +vi.mock('../../api/consoleEnrollment') -- [ ] Added `ToggleCrowdSec()` method to `crowdsec_handler.go` -- [ ] Registered `/admin/crowdsec/toggle` route -- [ ] Added `toggleCrowdsec()` to `crowdsec.ts` -- [ ] Updated `crowdsecPowerMutation` in `Security.tsx` -- [ ] Test 4 (Toggle synchronization) passes -- [ ] Test 5 
(Corrupted recovery) passes +describe('useConsoleEnrollment hooks', () => { + let queryClient: QueryClient -### Pre-Deployment + beforeEach(() => { + queryClient = new QueryClient({ + defaultOptions: { + queries: { retry: false }, + mutations: { retry: false } + } + }) + vi.clearAllMocks() + }) -- [ ] Pre-commit linters pass: `pre-commit run --all-files` -- [ ] Backend tests pass: `cd backend && go test ./...` -- [ ] Frontend tests pass: `cd frontend && npm run test` -- [ ] Docker build succeeds: `docker build -t charon:local .` -- [ ] Integration test passes: `scripts/crowdsec_integration.sh` + const wrapper = ({ children }: { children: React.ReactNode }) => ( + <QueryClientProvider client={queryClient}>{children}</QueryClientProvider> + ) + + describe('useConsoleStatus', () => { + it('should fetch console enrollment status when enabled', async () => { + const mockStatus = { + status: 'enrolled', + tenant: 'test-org', + agent_name: 'charon-1', + key_present: true, + enrolled_at: '2025-12-14T10:00:00Z', + last_heartbeat_at: '2025-12-15T09:00:00Z' + } + vi.mocked(consoleEnrollmentApi.getConsoleStatus).mockResolvedValue(mockStatus) + + const { result } = renderHook(() => useConsoleStatus(), { wrapper }) + + await waitFor(() => expect(result.current.isSuccess).toBe(true)) + expect(result.current.data).toEqual(mockStatus) + expect(consoleEnrollmentApi.getConsoleStatus).toHaveBeenCalledTimes(1) + }) + + it('should NOT fetch when enabled=false', async () => { + const { result } = renderHook(() => useConsoleStatus(false), { wrapper }) + + await waitFor(() => expect(result.current.isLoading).toBe(false)) + expect(consoleEnrollmentApi.getConsoleStatus).not.toHaveBeenCalled() + expect(result.current.data).toBeUndefined() + }) + + it('should use correct query key for invalidation', () => { + renderHook(() => useConsoleStatus(), { wrapper }) + const queries = queryClient.getQueryCache().getAll() + const consoleQuery = queries.find(q => + JSON.stringify(q.queryKey) === JSON.stringify(['crowdsec-console-status']) + ) + 
expect(consoleQuery).toBeDefined() + }) + }) + + describe('useEnrollConsole', () => { + it('should enroll console and invalidate status query', async () => { + const mockResponse = { + status: 'enrolled', + tenant: 'my-org', + agent_name: 'charon-prod', + key_present: true, + enrolled_at: new Date().toISOString() + } + vi.mocked(consoleEnrollmentApi.enrollConsole).mockResolvedValue(mockResponse) + + const { result } = renderHook(() => useEnrollConsole(), { wrapper }) + + const payload = { + enrollment_key: 'cs-enroll-key-123', + tenant: 'my-org', + agent_name: 'charon-prod' + } + + result.current.mutate(payload) + + await waitFor(() => expect(result.current.isSuccess).toBe(true)) + expect(consoleEnrollmentApi.enrollConsole).toHaveBeenCalledWith(payload) + expect(result.current.data).toEqual(mockResponse) + }) + + it('should invalidate console status query on success', async () => { + const mockResponse = { status: 'enrolled', key_present: true } + vi.mocked(consoleEnrollmentApi.enrollConsole).mockResolvedValue(mockResponse) + + // Set up initial status query + queryClient.setQueryData(['crowdsec-console-status'], { status: 'pending' }) + + const { result } = renderHook(() => useEnrollConsole(), { wrapper }) + + result.current.mutate({ + enrollment_key: 'key', + agent_name: 'agent' + }) + + await waitFor(() => expect(result.current.isSuccess).toBe(true)) + + // Verify invalidation happened + const state = queryClient.getQueryState(['crowdsec-console-status']) + expect(state?.isInvalidated).toBe(true) + }) + + it('should handle enrollment errors', async () => { + const error = new Error('Invalid enrollment key') + vi.mocked(consoleEnrollmentApi.enrollConsole).mockRejectedValue(error) + + const { result } = renderHook(() => useEnrollConsole(), { wrapper }) + + result.current.mutate({ + enrollment_key: 'invalid', + agent_name: 'test' + }) + + await waitFor(() => expect(result.current.isError).toBe(true)) + expect(result.current.error).toEqual(error) + }) + }) +}) +``` 
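The hook tests above build enrollment status objects inline in several places. A minimal sketch of a factory that could cut that repetition is shown here. `ConsoleEnrollmentStatus` and `makeConsoleStatus` are hypothetical names, and the field list is inferred from the mock objects used in this plan; the real types exported by `api/consoleEnrollment` may differ.

```typescript
// Hypothetical shape, inferred from the fields that appear in this plan's mocks.
// The real type in api/consoleEnrollment is not shown here and may differ.
interface ConsoleEnrollmentStatus {
  status: 'none' | 'pending' | 'enrolled' | 'failed'
  key_present: boolean
  tenant?: string
  agent_name?: string
  enrolled_at?: string
  last_heartbeat_at?: string
  last_attempt_at?: string
  last_error?: string
  correlation_id?: string
}

// Hypothetical helper: starts from the "not enrolled" defaults and applies overrides.
// A fresh object is returned on every call, so tests cannot leak state into each other.
function makeConsoleStatus(
  overrides: Partial<ConsoleEnrollmentStatus> = {}
): ConsoleEnrollmentStatus {
  return {
    status: 'none',
    key_present: false,
    ...overrides
  }
}

// Example: the enrolled status used by the first useConsoleStatus test.
const enrolledStatus = makeConsoleStatus({
  status: 'enrolled',
  tenant: 'test-org',
  agent_name: 'charon-1',
  key_present: true,
  enrolled_at: '2025-12-14T10:00:00Z',
  last_heartbeat_at: '2025-12-15T09:00:00Z'
})
```

A test could then pass `enrolledStatus` to `mockResolvedValue` in place of the inline literal, keeping only the fields that matter for that case visible in the test body.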
+ +**🐛 BUG TARGET: Polling and Stale Data** +```typescript +describe('polling behavior', () => { + it('should refetch status on window focus', async () => { + vi.mocked(consoleEnrollmentApi.getConsoleStatus).mockResolvedValue({ + status: 'pending', + key_present: true + }) + + const { result } = renderHook(() => useConsoleStatus(), { wrapper }) + + await waitFor(() => expect(result.current.isSuccess).toBe(true)) + + // Simulate window focus + window.dispatchEvent(new Event('focus')) + + // Should trigger refetch + await waitFor(() => { + expect(consoleEnrollmentApi.getConsoleStatus).toHaveBeenCalledTimes(2) + }) + }) + + it('should NOT poll when enrollment is complete', async () => { + // Scenario: Avoid unnecessary API calls after successful enrollment + vi.mocked(consoleEnrollmentApi.getConsoleStatus).mockResolvedValue({ + status: 'enrolled', + enrolled_at: '2025-12-15T10:00:00Z', + key_present: true + }) + + const { result } = renderHook(() => useConsoleStatus(), { wrapper }) + + await waitFor(() => expect(result.current.isSuccess).toBe(true)) + + // Wait and verify no additional calls + await new Promise(resolve => setTimeout(resolve, 100)) + expect(consoleEnrollmentApi.getConsoleStatus).toHaveBeenCalledTimes(1) + }) +}) +``` + +**🐛 BUG TARGET: Race Conditions** +```typescript +describe('concurrent enrollment attempts', () => { + it('should handle multiple enrollment mutations gracefully', async () => { + // Scenario: User clicks enroll button multiple times rapidly + const mockResponse = { status: 'enrolled', key_present: true } + vi.mocked(consoleEnrollmentApi.enrollConsole).mockResolvedValue(mockResponse) + + const { result } = renderHook(() => useEnrollConsole(), { wrapper }) + + // Trigger multiple mutations + result.current.mutate({ enrollment_key: 'key1', agent_name: 'agent' }) + result.current.mutate({ enrollment_key: 'key2', agent_name: 'agent' }) + + await waitFor(() => expect(result.current.isSuccess).toBe(true)) + + // Only the last mutation should 
be active + expect(consoleEnrollmentApi.enrollConsole).toHaveBeenCalledWith( + expect.objectContaining({ enrollment_key: 'key2' }) + ) + }) +}) +``` --- -## Success Criteria +### PHASE 2D: Integration Tests -✅ **Fix is complete when**: +#### 6. Cross-Component Integration Testing -1. Toggle shows correct state (ON = running, OFF = stopped) -2. Toggle persists across container restarts -3. Reconciliation logs clearly show decision reason -4. Auto-initialization respects Settings table preference -5. No "stuck toggle" scenarios -6. All 5 test cases pass -7. Pre-commit checks pass -8. No regressions in existing CrowdSec functionality +**File:** `/frontend/src/__tests__/integration/crowdsec-preset-flow.test.tsx` + +**Purpose:** Test complete preset selection and application workflow + +**Test Scenarios:** + +```typescript +describe('CrowdSec Preset Integration Flow', () => { + it('should complete full preset workflow: list → pull → preview → apply', async () => { + // 1. List presets + const presets = await listCrowdsecPresets() + expect(presets.presets).toHaveLength(3) + + // 2. Pull specific preset + const pullResult = await pullCrowdsecPreset('bot-mitigation-essentials') + expect(pullResult.cache_key).toBeDefined() + + // 3. Preview cached preset + const cachedPreset = await getCrowdsecPresetCache('bot-mitigation-essentials') + expect(cachedPreset.preview).toEqual(pullResult.preview) + expect(cachedPreset.cache_key).toEqual(pullResult.cache_key) + + // 4. 
Apply preset + const applyResult = await applyCrowdsecPreset({ + slug: 'bot-mitigation-essentials', + cache_key: pullResult.cache_key + }) + expect(applyResult.status).toBe('success') + expect(applyResult.backup).toBeDefined() + }) + + it('should handle preset application without prior pull (direct mode)', async () => { + // Scenario: User applies preset from static list without pulling first + const applyResult = await applyCrowdsecPreset({ + slug: 'honeypot-friendly-defaults' + // No cache_key provided + }) + expect(applyResult.status).toBe('success') + expect(applyResult.used_cscli).toBe(true) // Backend fetched directly + }) +}) +``` + +**File:** `/frontend/src/__tests__/integration/crowdsec-console-enrollment.test.tsx` + +**Purpose:** Test console enrollment UI flow with hooks + +```typescript +describe('CrowdSec Console Enrollment Integration', () => { + it('should complete enrollment flow with status updates', async () => { + const { result: statusHook } = renderHook(() => useConsoleStatus(), { wrapper }) + const { result: enrollHook } = renderHook(() => useEnrollConsole(), { wrapper }) + + // Initial status: not enrolled + await waitFor(() => expect(statusHook.current.data?.status).toBe('none')) + + // Trigger enrollment + enrollHook.current.mutate({ + enrollment_key: 'cs-enroll-valid-key', + tenant: 'test-org', + agent_name: 'charon-test' + }) + + await waitFor(() => expect(enrollHook.current.isSuccess).toBe(true)) + + // Status should update to enrolled + await waitFor(() => { + expect(statusHook.current.data?.status).toBe('enrolled') + expect(statusHook.current.data?.tenant).toBe('test-org') + }) + }) +}) +``` + +**File:** `/frontend/src/__tests__/integration/crowdsec-api-hook-consistency.test.tsx` + +**Purpose:** Verify API client and hook return consistent data shapes + +```typescript +describe('API-Hook Consistency', () => { + it('should return consistent data shapes between API and hooks', async () => { + // Test: Verify that useConsoleStatus returns 
same shape as getConsoleStatus + const apiResponse = await consoleEnrollmentApi.getConsoleStatus() + + const { result } = renderHook(() => useConsoleStatus(), { wrapper }) + await waitFor(() => expect(result.current.isSuccess).toBe(true)) + + expect(result.current.data).toMatchObject(apiResponse) + expect(Object.keys(result.current.data!)).toEqual(Object.keys(apiResponse)) + }) + + it('should handle API errors consistently across hooks', async () => { + // Test: Error shape is preserved through React Query + const apiError = new Error('Network failure') + vi.mocked(consoleEnrollmentApi.getConsoleStatus).mockRejectedValue(apiError) + + const { result } = renderHook(() => useConsoleStatus(), { wrapper }) + await waitFor(() => expect(result.current.isError).toBe(true)) + + expect(result.current.error).toEqual(apiError) + }) +}) +``` --- -## Risk Assessment +### BUG EXPOSURE STRATEGY -| Change | Risk Level | Mitigation | -|--------|------------|------------| -| Phase 1 (Auto-init) | **Low** | Only affects fresh installs or corrupted state recovery | -| Phase 2 (Logging) | **Very Low** | Only changes log output, no logic changes | -| Phase 3 (Unified toggle) | **Medium** | New endpoint, requires thorough testing, but backward compatible | +**Based on Recent Hotfixes:** + +#### 🐛 Focus Area 1: Toggle State Mismatch After Restart +**Files to Test:** `useSecurityStatus`, `Security.tsx`, preset application +**Test Strategy:** +- Simulate container restart (clear React Query cache) +- Verify that CrowdSec toggle state matches backend SecurityConfig +- Test that LAPI availability doesn't cause UI state mismatch +- Check that preset application doesn't break toggle state + +**Specific Test:** +```typescript +it('should sync toggle state after simulated restart', async () => { + // 1. Set CrowdSec to enabled + queryClient.setQueryData(['security-status'], { crowdsec: { enabled: true } }) + + // 2. Clear cache (simulate restart) + queryClient.clear() + + // 3. 
Refetch status + const { result } = renderHook(() => useSecurityStatus(), { wrapper }) + await waitFor(() => expect(result.current.isSuccess).toBe(true)) + + // 4. Verify state is still correct + expect(result.current.data.crowdsec.enabled).toBe(true) +}) +``` + +#### 🐛 Focus Area 2: LAPI Readiness/Availability Issues +**Files to Test:** `api/crowdsec.ts`, `statusCrowdsec()`, decision endpoints +**Test Strategy:** +- Test scenarios where CrowdSec process is running but LAPI not ready +- Verify error handling when LAPI returns 503/504 +- Test decision fetching when LAPI is unavailable +- Ensure UI doesn't show stale decision data + +**Specific Test:** +```typescript +it('should handle LAPI not ready when fetching decisions', async () => { + // Mock: CrowdSec running but LAPI not ready + vi.mocked(client.get).mockRejectedValueOnce({ + response: { status: 503, data: { error: 'LAPI not ready' } } + }) + + await expect(listCrowdsecDecisions()).rejects.toThrow() + // UI should show loading state, not error +}) +``` + +#### 🐛 Focus Area 3: Preset Cache Invalidation +**Files to Test:** `api/presets.ts`, `getCrowdsecPresetCache()` +**Test Strategy:** +- Test stale cache key scenarios +- Verify that applying preset invalidates cache properly +- Check that cache survives page refresh but not backend restart +- Test cache hit/miss logic + +**Specific Test:** +```typescript +it('should invalidate cache after preset application', async () => { + // 1. Pull preset (populates cache) + const pullResult = await pullCrowdsecPreset('bot-mitigation-essentials') + const cacheKey1 = pullResult.cache_key + + // 2. Apply preset + await applyCrowdsecPreset({ slug: 'bot-mitigation-essentials', cache_key: cacheKey1 }) + + // 3. 
Pull again (should get fresh data, new cache key) + const pullResult2 = await pullCrowdsecPreset('bot-mitigation-essentials') + expect(pullResult2.cache_key).not.toBe(cacheKey1) +}) +``` + +#### 🐛 Focus Area 4: Console Enrollment Retry Logic +**Files to Test:** `useConsoleEnrollment.ts`, `api/consoleEnrollment.ts` +**Test Strategy:** +- Test enrollment failure followed by retry +- Verify that transient errors don't mark enrollment as permanently failed +- Check that correlation_id is tracked for debugging +- Ensure enrollment key is never logged or exposed + +**Specific Test:** +```typescript +it('should allow retry after transient enrollment failure', async () => { + const { result } = renderHook(() => useEnrollConsole(), { wrapper }) + + // First attempt fails + vi.mocked(consoleEnrollmentApi.enrollConsole).mockRejectedValueOnce( + new Error('Network timeout') + ) + result.current.mutate({ enrollment_key: 'key', agent_name: 'agent' }) + await waitFor(() => expect(result.current.isError).toBe(true)) + + // Second attempt succeeds + vi.mocked(consoleEnrollmentApi.enrollConsole).mockResolvedValueOnce({ + status: 'enrolled', + key_present: true + }) + result.current.mutate({ enrollment_key: 'key', agent_name: 'agent' }) + await waitFor(() => expect(result.current.isSuccess).toBe(true)) +}) +``` --- -## Rollback Plan +### MOCK DATA STRUCTURES -If issues arise: +**Comprehensive Mock Library:** -1. **Immediate Revert**: `git revert ` (no DB changes needed) -2. **Manual Fix** (if toggle stuck): - ```sql - -- Reset SecurityConfig - UPDATE security_configs - SET crowdsec_mode = 'disabled', enabled = 0 - WHERE uuid = 'default'; +```typescript +// frontend/src/__tests__/mocks/crowdsecMocks.ts - -- Reset Settings - UPDATE settings - SET value = 'false' - WHERE key = 'security.crowdsec.enabled'; - ``` -3. 
**Force Stop CrowdSec**: `docker exec charon pkill -SIGTERM crowdsec` +export const mockPresets = { + list: { + presets: [ + { + slug: 'bot-mitigation-essentials', + title: 'Bot Mitigation Essentials', + summary: 'Core HTTP parsers and scenarios', + source: 'hub', + tags: ['bots', 'web', 'auth'], + requires_hub: true, + available: true, + cached: false + }, + { + slug: 'honeypot-friendly-defaults', + title: 'Honeypot Friendly Defaults', + summary: 'Lightweight defaults for honeypots', + source: 'builtin', + tags: ['low-noise', 'ssh'], + requires_hub: false, + available: true, + cached: true, + cache_key: 'builtin-honeypot-123' + }, + { + slug: 'geolocation-aware', + title: 'Geolocation Aware', + summary: 'Geo-enrichment and region-aware scenarios', + source: 'hub', + tags: ['geo', 'access-control'], + requires_hub: true, + available: false, // Not available (requires GeoIP DB) + cached: false + } + ] + }, + + pullResponse: { + success: { + status: 'success', + slug: 'bot-mitigation-essentials', + preview: 'configs:\n collections:\n - crowdsecurity/base-http-scenarios', + cache_key: 'hub-bot-mitigation-abc123', + etag: '"w/12345-abcdef"', + retrieved_at: '2025-12-15T10:00:00Z', + source: 'hub' + }, + failure: { + status: 'error', + slug: 'invalid-preset', + error: 'Preset not found in CrowdSec Hub' + } + }, + + applyResponse: { + success: { + status: 'success', + backup: '/data/charon/data/backups/preset-backup-20251215-100000.tar.gz', + reload_hint: true, + used_cscli: true, + cache_key: 'hub-bot-mitigation-abc123', + slug: 'bot-mitigation-essentials' + }, + failureCrowdSecNotRunning: { + status: 'error', + error: 'CrowdSec is not running. Start CrowdSec before applying presets.' 
+ } + } +} + +export const mockConsoleEnrollment = { + statusNone: { + status: 'none', + key_present: false + }, + + statusPending: { + status: 'pending', + tenant: 'test-org', + agent_name: 'charon-prod', + key_present: true, + last_attempt_at: '2025-12-15T09:00:00Z', + correlation_id: 'req-abc123' + }, + + statusEnrolled: { + status: 'enrolled', + tenant: 'test-org', + agent_name: 'charon-prod', + key_present: true, + enrolled_at: '2025-12-14T10:00:00Z', + last_heartbeat_at: '2025-12-15T09:55:00Z' + }, + + statusFailed: { + status: 'failed', + tenant: 'test-org', + agent_name: 'charon-prod', + key_present: false, + last_error: 'Invalid enrollment key', + last_attempt_at: '2025-12-15T09:00:00Z', + correlation_id: 'err-xyz789' + } +} + +export const mockCrowdSecStatus = { + stopped: { + running: false, + pid: 0, + lapi_available: false + }, + + runningLAPIReady: { + running: true, + pid: 12345, + lapi_available: true, + lapi_url: 'http://127.0.0.1:8085' + }, + + runningLAPINotReady: { + running: true, + pid: 12345, + lapi_available: false, + lapi_url: 'http://127.0.0.1:8085', + lapi_error: 'Connection refused' + } +} +``` --- -## Dependency Impact Analysis +### SUCCESS CRITERIA -### Phase 1: Auto-Initialization Changes (crowdsec_startup.go) +#### Test Coverage Metrics +- [ ] All API clients have 100% function coverage +- [ ] All hooks have 100% branch coverage +- [ ] All utility functions have edge case tests +- [ ] Integration tests cover full workflows -#### Files Directly Modified -- `backend/internal/services/crowdsec_startup.go` (lines 46-71) +#### Bug Detection +- [ ] Tests FAIL when toggle state mismatch occurs +- [ ] Tests FAIL when LAPI availability isn't checked +- [ ] Tests FAIL when cache keys are stale +- [ ] Tests FAIL when enrollment retry logic breaks -#### Dependencies and Required Updates - -**1. 
Unit Tests - MUST BE UPDATED** -- **File**: `backend/internal/services/crowdsec_startup_test.go` -- **Impact**: Test `TestReconcileCrowdSecOnStartup_NoSecurityConfig` expects the function to skip/return early when no SecurityConfig exists -- **Required Change**: Update test to: - - Create a Settings table entry with `security.crowdsec.enabled = 'true'` - - Verify that SecurityConfig is auto-created with `crowdsec_mode = "local"` - - Verify that CrowdSec process is started (not skipped) -- **Additional Tests Needed**: - - `TestReconcileCrowdSecOnStartup_NoSecurityConfig_SettingsDisabled` - Settings='false' → creates config with mode="disabled", does NOT start - - `TestReconcileCrowdSecOnStartup_NoSecurityConfig_SettingsEnabled` - Settings='true' → creates config with mode="local", DOES start - - `TestReconcileCrowdSecOnStartup_NoSecurityConfig_NoSettingsEntry` - No Settings entry → creates config with mode="disabled", does NOT start - -**2. Integration Tests - VERIFICATION NEEDED** -- **Files**: - - `scripts/crowdsec_integration.sh` - - `scripts/crowdsec_startup_test.sh` - - `scripts/crowdsec_decision_integration.sh` -- **Impact**: These scripts may assume specific startup behavior -- **Verification Required**: - - Do any scripts pre-populate Settings table? - - Do any scripts expect reconciliation to skip on fresh DB? - - Do any scripts verify log output from reconciliation? -- **Action**: Review scripts for assumptions about auto-initialization behavior - -**3. Migration/Upgrade Path - DATABASE CONCERN** -- **Scenario**: Existing installations with Settings='true' but missing SecurityConfig -- **Impact**: After upgrade, reconciliation will auto-create SecurityConfig from Settings (POSITIVE) -- **Risk**: Low - this is the intended fix -- **Documentation**: Should document this as expected behavior in migration guide - -**4. 
Models - NO CHANGES REQUIRED** -- **File**: `backend/internal/models/security_config.go` -- **Analysis**: SecurityConfig model structure unchanged -- **File**: `backend/internal/models/setting.go` -- **Analysis**: Setting model structure unchanged - -**5. Route Registration - NO CHANGES REQUIRED** -- **File**: `backend/internal/api/routes/routes.go` (line 360) -- **Analysis**: Already calls `ReconcileCrowdSecOnStartup`, no signature changes - -**6. Handler Dependencies - NO CHANGES REQUIRED** -- **File**: `backend/internal/api/handlers/crowdsec_handler.go` -- **Analysis**: Start/Stop handlers operate independently, no coupling to reconciliation logic - -### Phase 2: Logging Enhancement Changes (crowdsec_startup.go) - -#### Files Directly Modified -- `backend/internal/services/crowdsec_startup.go` (lines 91-98) - -#### Dependencies and Required Updates - -**1. Log Aggregation/Parsing - DOCUMENTATION UPDATE** -- **Concern**: Changing log level from Debug → Info increases log volume -- **Impact**: - - Logs will now appear in production (Info is default minimum level) - - Log aggregation tools may need filter updates if they parse specific messages -- **Required**: Update any log parsing scripts or documentation about expected log output - -**2. Integration Tests - POTENTIAL GREP PATTERNS** -- **Files**: `scripts/crowdsec_*.sh` -- **Impact**: If scripts `grep` for specific log messages, they may need updates -- **Action**: Search for log message expectations in scripts - -**3. 
Documentation - UPDATE REQUIRED** -- **File**: `docs/features.md` -- **Section**: CrowdSec Integration (line 167+) -- **Required Change**: Add note about reconciliation behavior: - ```markdown - #### Startup Behavior - - CrowdSec automatically starts on container restart if: - - SecurityConfig has `crowdsec_mode = "local"` OR - - Settings table has `security.crowdsec.enabled = "true"` - - Check container logs for reconciliation decisions: - - "CrowdSec reconciliation: starting based on SecurityConfig mode='local'" - - "CrowdSec reconciliation: starting based on Settings table override" - - "CrowdSec reconciliation skipped: both SecurityConfig and Settings indicate disabled" - ``` - -**4. Troubleshooting Guide - UPDATE RECOMMENDED** -- **File**: `docs/troubleshooting/` (if exists) or `docs/security.md` -- **Required Change**: Add section on "CrowdSec Not Starting After Restart" - - Explain reconciliation logic - - Show how to check Settings and SecurityConfig tables - - Show example log output - -### Phase 3: Unified Toggle Endpoint (OPTIONAL) - -#### Files Directly Modified -- `backend/internal/api/handlers/crowdsec_handler.go` (new method) -- `backend/internal/api/handlers/crowdsec_handler.go` (RegisterRoutes) -- `frontend/src/api/crowdsec.ts` (new function) -- `frontend/src/pages/Security.tsx` (mutation update) - -#### Dependencies and Required Updates - -**1. Handler Tests - NEW TESTS REQUIRED** -- **File**: `backend/internal/api/handlers/crowdsec_handler_test.go` -- **Required Tests**: - - `TestCrowdsecHandler_Toggle_EnableSuccess` - - `TestCrowdsecHandler_Toggle_DisableSuccess` - - `TestCrowdsecHandler_Toggle_TransactionRollback` (if Start fails) - - `TestCrowdsecHandler_Toggle_VerifyBothTablesUpdated` - -**2. 
Existing Handlers - DEPRECATION CONSIDERATION** -- **Files**: - - Start handler (line ~167 in crowdsec_handler.go) - - Stop handler (line ~260 in crowdsec_handler.go) -- **Impact**: New toggle endpoint duplicates Start/Stop functionality -- **Decision Required**: - - **Option A**: Keep both for backward compatibility (RECOMMENDED) - - **Option B**: Deprecate Start/Stop, add deprecation warnings - - **Option C**: Remove Start/Stop entirely (BREAKING CHANGE - NOT RECOMMENDED) -- **Recommendation**: Keep Start/Stop handlers unchanged, document toggle as "preferred method" - -**3. Frontend API Layer - MIGRATION PATH** -- **File**: `frontend/src/api/crowdsec.ts` -- **Current Exports**: `startCrowdsec`, `stopCrowdsec`, `statusCrowdsec` -- **After Change**: Add `toggleCrowdsec` to exports (line 75) -- **Backward Compatibility**: Keep existing functions, don't remove them - -**4. Frontend Component - LIMITED SCOPE** -- **File**: `frontend/src/pages/Security.tsx` -- **Impact**: Only `crowdsecPowerMutation` needs updating (lines 86-125) -- **Other Components**: No other components import these functions (verified) -- **Risk**: Low - isolated change - -**5. API Documentation - NEW ENDPOINT** -- **File**: `docs/api.md` (if exists) -- **Required Addition**: Document `/admin/crowdsec/toggle` endpoint - -**6. Integration Tests - NEW TEST CASE** -- **Files**: `scripts/crowdsec_integration.sh` -- **Required Addition**: Test toggle endpoint directly - -**7. 
Backward Compatibility - ANALYSIS** -- **Frontend**: Existing `/admin/crowdsec/start` and `/admin/crowdsec/stop` endpoints remain functional -- **API Consumers**: External tools using Start/Stop continue to work -- **Risk**: None - purely additive change - -### Cross-Cutting Concerns - -#### Database Migration -- **No schema changes required** - both Settings and SecurityConfig tables already exist -- **Data migration**: None needed - changes are behavioral only - -#### Configuration Files -- **No changes required** - no new environment variables or config files - -#### Docker/Deployment -- **No Dockerfile changes** - all changes are code-level -- **No docker-compose changes** - no new services or volumes - -#### Security Implications -- **Phase 1**: Improves security by respecting user's intent across restarts -- **Phase 2**: No security impact (logging only) -- **Phase 3**: Transaction safety prevents partial updates (improvement) - -#### Performance Considerations -- **Phase 1**: Adds one SQL query during auto-initialization (one-time, on startup) -- **Phase 2**: Minimal - only adds log statements -- **Phase 3**: Minimal - wraps existing logic in transaction - -#### Rollback Safety -- **All phases**: No database schema changes, can be rolled back via git revert -- **Data safety**: No data loss risk - only affects process startup behavior - -### Summary of Required File Updates - -| Phase | Files to Modify | Files to Create | Tests to Add | Docs to Update | -|-------|----------------|-----------------|--------------|----------------| -| **Phase 1** | `crowdsec_startup.go` | None | 3 new unit tests | None (covered in Phase 2) | -| **Phase 2** | `crowdsec_startup.go` | None | None | `features.md`, troubleshooting docs | -| **Phase 3** | `crowdsec_handler.go`, `crowdsec.ts`, `Security.tsx` | None | 4 new handler tests | `api.md` (if exists) | - -### Testing Matrix - -| Scenario | Phase 1 | Phase 2 | Phase 3 | -|----------|---------|---------|---------| -| Fresh 
install → toggle ON → restart | ✅ Fixes | ✅ Better logs | ✅ Cleaner code | -| Existing install with Settings='true', missing SecurityConfig | ✅ Fixes | ✅ Better logs | N/A | -| Toggle ON → restart → verify logs | ✅ Works | ✅ MUST verify new messages | ✅ Works | -| Toggle OFF → restart → verify logs | ✅ Works | ✅ MUST verify new messages | ✅ Works | -| Start/Stop handlers (backward compat) | N/A | N/A | ✅ MUST verify still work | - -### Missing from Original Plan - -The original plan DID NOT explicitly mention: - -1. **Unit test updates required** - Critical for Phase 1 (`TestReconcileCrowdSecOnStartup_NoSecurityConfig` needs major refactoring) -2. **Integration script verification** - May break if they expect specific behavior -3. **Documentation updates** - Features and troubleshooting guides need new reconciliation behavior documented -4. **Backward compatibility analysis for Phase 3** - Need explicit decision on Start/Stop handler fate -5. **API documentation** - New endpoint needs docs -6. **Testing matrix for all three phases together** - Need to verify they work in combination +#### Code Quality +- [ ] All tests follow existing patterns (vitest, @testing-library/react) +- [ ] Mock data structures are reusable +- [ ] Tests are isolated (no shared state) +- [ ] Error messages are descriptive --- -**END OF SPECIFICATION** +**End of Inventory**