feat: Rename WAF to Coraza in UI and update related tests
- Updated UI components to reflect the renaming of "WAF (Coraza)" to "Coraza". - Removed WAF controls from the Security page and adjusted related tests. - Verified that all frontend tests pass after updating assertions to match the new UI. - Added a test script to package.json for running tests with Vitest. - Adjusted imports for jest-dom to be compatible with Vitest. - Updated TypeScript configuration to include Vitest types for testing.
This commit is contained in:
@@ -1,784 +1,219 @@
|
||||
# CrowdSec Preset Apply Failure - Fix Plan
|
||||
# Implementation Plan: Rename WAF Card to Coraza on Cerberus Dashboard
|
||||
|
||||
**Date:** December 12, 2025
|
||||
**Status:** Analysis Complete - Ready for Implementation
|
||||
**Severity:** High
|
||||
## Overview
|
||||
|
||||
---
|
||||
Modify the WAF card on the Cerberus Dashboard to:
|
||||
|
||||
## Issue Summary
|
||||
1. Rename "WAF" to "Coraza" (consistent with how CrowdSec is named - the card uses the product name)
|
||||
2. Remove the Mode and Rule Set dropdowns from the card (these are handled on the config page)
|
||||
|
||||
User reported error when applying a CrowdSec preset:
|
||||
## Reference: CrowdSec Card Pattern
|
||||
|
||||
```
|
||||
Apply failed: read archive: open /app/data/crowdsec/hub_cache/crowdsecurity/caddy/bundle.tgz: no such file or directory. Backup created at /app/data/crowdsec.backup.20251211-194408
|
||||
```
|
||||
The CrowdSec card naming convention shows that we use the **product name** (CrowdSec) rather than generic terms. Similarly, WAF should become "Coraza" since Coraza is the underlying product.
|
||||
|
||||
---
|
||||
**CrowdSec Card Structure:**
|
||||
|
||||
## Root Cause Analysis
|
||||
- Title: "CrowdSec" (not "IPS" or "Intrusion Prevention System")
|
||||
- Simple enabled/disabled status
|
||||
- Config button navigates to `/security/crowdsec` for detailed configuration
|
||||
|
||||
### The Bug
|
||||
**Current WAF Card Issues:**
|
||||
|
||||
The `Apply()` function in [hub_sync.go](../../backend/internal/crowdsec/hub_sync.go#L535) has a **fatal ordering bug** that destroys the cache before reading from it.
|
||||
|
||||
### Detailed Flow
|
||||
|
||||
1. **Pull Phase (Works Correctly)**
|
||||
- User pulls preset `crowdsecurity/caddy`
|
||||
- `HubCache.Store()` writes to: `/app/data/crowdsec/hub_cache/crowdsecurity/caddy/bundle.tgz`
|
||||
- `CachedPreset.ArchivePath` stores this absolute path
|
||||
|
||||
2. **Apply Phase (Bug Occurs)**
|
||||
```
|
||||
Step 1: loadCacheMeta() → Returns meta.ArchivePath = "/app/data/crowdsec/hub_cache/.../bundle.tgz"
|
||||
Step 2: backupExisting() → RENAMES "/app/data/crowdsec" to "/app/data/crowdsec.backup.TIMESTAMP"
|
||||
⚠️ THIS MOVES THE CACHE TOO! hub_cache is INSIDE crowdsec/
|
||||
Step 3: cscli fails (not available or preset not in hub)
|
||||
Step 4: os.ReadFile(meta.ArchivePath) → FILE NOT FOUND!
|
||||
The path still points to "/app/data/crowdsec/..." but that directory was renamed!
|
||||
```
|
||||
|
||||
### Visual Representation
|
||||
|
||||
**Before Backup:**
|
||||
```
|
||||
/app/data/crowdsec/
|
||||
├── hub_cache/
|
||||
│ └── crowdsecurity/
|
||||
│ └── caddy/
|
||||
│ ├── bundle.tgz ← meta.ArchivePath points here
|
||||
│ ├── preview.yaml
|
||||
│ └── metadata.json
|
||||
├── config.yaml
|
||||
└── other_files/
|
||||
```
|
||||
|
||||
**After `backupExisting()` (line 535):**
|
||||
```
|
||||
/app/data/crowdsec.backup.20251211-194408/ ← Renamed!
|
||||
├── hub_cache/
|
||||
│ └── crowdsecurity/
|
||||
│ └── caddy/
|
||||
│ ├── bundle.tgz ← File is now HERE
|
||||
│ ├── preview.yaml
|
||||
│ └── metadata.json
|
||||
├── config.yaml
|
||||
└── other_files/
|
||||
|
||||
/app/data/crowdsec/ ← Directory no longer exists!
|
||||
```
|
||||
|
||||
**Result:** `os.ReadFile(meta.ArchivePath)` fails because the path `/app/data/crowdsec/hub_cache/.../bundle.tgz` no longer exists.
|
||||
|
||||
---
|
||||
|
||||
## Why This Wasn't Caught Earlier
|
||||
|
||||
1. **Tests use temp directories** - Each test creates fresh directories, so the race condition doesn't manifest
|
||||
2. **cscli path succeeds in CI** - When `cscli` is available and works, the code returns early before hitting the bug
|
||||
3. **Recent changes to backup logic** - The copy-based fallback and backup improvements may have introduced this ordering issue
|
||||
4. **Cache directory nested inside DataDir** - The architecture decision to put `hub_cache` inside `DataDir` (crowdsec config) creates this coupling
|
||||
|
||||
---
|
||||
|
||||
## Fix Options
|
||||
|
||||
### Option A: Read Archive Before Backup (Recommended)
|
||||
|
||||
**Rationale:** Simple, minimal change, maintains existing backup behavior.
|
||||
|
||||
**File:** [backend/internal/crowdsec/hub_sync.go](../../backend/internal/crowdsec/hub_sync.go)
|
||||
|
||||
**Changes:**
|
||||
|
||||
```go
|
||||
func (s *HubService) Apply(ctx context.Context, slug string) (ApplyResult, error) {
|
||||
// ... existing validation code ...
|
||||
|
||||
result := ApplyResult{AppliedPreset: cleanSlug, Status: "failed"}
|
||||
meta, metaErr := s.loadCacheMeta(applyCtx, cleanSlug)
|
||||
if metaErr == nil {
|
||||
result.CacheKey = meta.CacheKey
|
||||
}
|
||||
hasCS := s.hasCSCLI(applyCtx)
|
||||
|
||||
// === NEW: Read archive BEFORE backup ===
|
||||
var archive []byte
|
||||
var archiveErr error
|
||||
if metaErr == nil {
|
||||
archive, archiveErr = os.ReadFile(meta.ArchivePath)
|
||||
if archiveErr != nil {
|
||||
logger.Log().WithError(archiveErr).WithField("archive_path", meta.ArchivePath).Warn("failed to read cached archive before backup")
|
||||
}
|
||||
}
|
||||
// === END NEW ===
|
||||
|
||||
backupPath := filepath.Clean(s.DataDir) + ".backup." + time.Now().Format("20060102-150405")
|
||||
if err := s.backupExisting(backupPath); err != nil {
|
||||
return result, fmt.Errorf("backup: %w", err)
|
||||
}
|
||||
result.BackupPath = backupPath
|
||||
|
||||
// Try cscli first
|
||||
if hasCS {
|
||||
cscliErr := s.runCSCLI(applyCtx, cleanSlug)
|
||||
if cscliErr == nil {
|
||||
result.Status = "applied"
|
||||
result.ReloadHint = true
|
||||
result.UsedCSCLI = true
|
||||
return result, nil
|
||||
}
|
||||
logger.Log().WithField("slug", cleanSlug).WithError(cscliErr).Warn("cscli install failed; attempting cache fallback")
|
||||
}
|
||||
|
||||
// === MODIFIED: Use pre-loaded archive or refresh ===
|
||||
if metaErr != nil || archiveErr != nil {
|
||||
refreshed, refreshErr := s.refreshCache(applyCtx, cleanSlug, metaErr)
|
||||
if refreshErr != nil {
|
||||
_ = s.rollback(backupPath)
|
||||
return result, fmt.Errorf("load cache for %s: %w", cleanSlug, refreshErr)
|
||||
}
|
||||
meta = refreshed
|
||||
result.CacheKey = meta.CacheKey
|
||||
// Re-read archive from refreshed cache location
|
||||
archive, archiveErr = os.ReadFile(meta.ArchivePath)
|
||||
if archiveErr != nil {
|
||||
_ = s.rollback(backupPath)
|
||||
return result, fmt.Errorf("read archive: %w", archiveErr)
|
||||
}
|
||||
}
|
||||
|
||||
// Use the pre-loaded archive bytes
|
||||
if err := s.extractTarGz(applyCtx, archive, s.DataDir); err != nil {
|
||||
_ = s.rollback(backupPath)
|
||||
return result, fmt.Errorf("extract: %w", err)
|
||||
}
|
||||
// === END MODIFIED ===
|
||||
|
||||
result.Status = "applied"
|
||||
result.ReloadHint = true
|
||||
result.UsedCSCLI = false
|
||||
return result, nil
|
||||
}
|
||||
```
|
||||
|
||||
### Option B: Move Cache Outside DataDir
|
||||
|
||||
**Rationale:** Architectural fix - separates transient cache from operational config.
|
||||
|
||||
**Files to modify:**
|
||||
- [backend/internal/api/handlers/crowdsec_handler.go](../../backend/internal/api/handlers/crowdsec_handler.go) - Change cache location
|
||||
- [backend/internal/crowdsec/hub_sync.go](../../backend/internal/crowdsec/hub_sync.go) - Add cache dir parameter
|
||||
|
||||
**Changes:**
|
||||
```go
|
||||
// In NewCrowdsecHandler:
|
||||
// BEFORE:
|
||||
cacheDir := filepath.Join(dataDir, "hub_cache")
|
||||
|
||||
// AFTER:
|
||||
cacheDir := filepath.Join(filepath.Dir(dataDir), "hub_cache")
|
||||
// Results in: /app/data/hub_cache (sibling of crowdsec, not child)
|
||||
```
|
||||
|
||||
**Pros:** Clean separation, cache survives config resets
|
||||
**Cons:** Breaking change for existing installs, requires migration
|
||||
|
||||
### Option C: Selective Backup (Exclude Cache)
|
||||
|
||||
**Rationale:** Only backup config files, not cache.
|
||||
|
||||
**Changes to `backupExisting()`:**
|
||||
```go
|
||||
func (s *HubService) backupExisting(backupPath string) error {
|
||||
// ... existing checks ...
|
||||
|
||||
// Skip hub_cache during backup - it's transient
|
||||
return filepath.WalkDir(s.DataDir, func(path string, d fs.DirEntry, err error) error {
|
||||
if strings.Contains(path, "hub_cache") {
|
||||
return filepath.SkipDir
|
||||
}
|
||||
// ... copy logic ...
|
||||
})
|
||||
}
|
||||
```
|
||||
|
||||
**Pros:** Faster backups, cache preserved
|
||||
**Cons:** More complex, backup is no longer complete snapshot
|
||||
|
||||
---
|
||||
|
||||
## Recommended Implementation
|
||||
|
||||
**Choose Option A** for these reasons:
|
||||
|
||||
1. **Minimal code change** - Single function modification
|
||||
2. **No breaking changes** - Existing cache paths remain valid
|
||||
3. **No migration needed** - Works immediately
|
||||
4. **Maintains complete backups** - Backup still captures full state
|
||||
5. **Easy to test** - Clear before/after behavior
|
||||
|
||||
---
|
||||
- Title: "WAF (Coraza)" - should just be "Coraza"
|
||||
- Contains Mode and Rule Set dropdowns (should be on config page only)
|
||||
- Inconsistent with CrowdSec card simplicity
|
||||
|
||||
## Files to Modify
|
||||
|
||||
| File | Change |
|
||||
|------|--------|
|
||||
| [backend/internal/crowdsec/hub_sync.go](../../backend/internal/crowdsec/hub_sync.go) | Reorder archive read before backup in `Apply()` |
|
||||
| [backend/internal/crowdsec/hub_sync_test.go](../../backend/internal/crowdsec/hub_sync_test.go) | Add test for apply with backup scenario |
|
||||
| [backend/internal/crowdsec/hub_pull_apply_test.go](../../backend/internal/crowdsec/hub_pull_apply_test.go) | Add regression test |
|
||||
### 1. Frontend: Security Dashboard Page
|
||||
|
||||
---
|
||||
**File:** `frontend/src/pages/Security.tsx`
|
||||
|
||||
## Specific Code Changes
|
||||
**Changes Required:**
|
||||
|
||||
### Change 1: hub_sync.go - Apply() Function
|
||||
| Line(s) | Current Text/Code | New Text/Code | Notes |
|
||||
|---------|------------------|---------------|-------|
|
||||
| 171 | `Cerberus powers CrowdSec, WAF, ACLs, and Rate Limiting.` | `Cerberus powers CrowdSec, Coraza, ACLs, and Rate Limiting.` | Info banner text |
|
||||
| 317 | `{/* WAF - Layer 3: Request Inspection */}` | `{/* Coraza - Layer 3: Request Inspection */}` | Comment update |
|
||||
| 321 | `<h3 className="text-sm font-medium text-white">WAF (Coraza)</h3>` | `<h3 className="text-sm font-medium text-white">Coraza</h3>` | Card title |
|
||||
| 337-340 | Protection text using "WAF" terminology | Keep as-is | Still valid for "Coraza" |
|
||||
| **341-379** | **WAF Mode and Rule Set dropdowns (entire block)** | **REMOVE** | Delete the entire conditional block that renders dropdowns when WAF is enabled |
|
||||
| 383 | `onClick={() => navigate('/security/waf')}` | Keep same path | Route stays the same, just config page |
|
||||
| 385 | `{status.waf.enabled ? 'Manage Rule Sets' : 'Configure'}` | `{status.waf.enabled ? 'Configure' : 'Configure'}` or just `Configure` | Simplify button text |
|
||||
|
||||
**Location:** Lines 514-580
|
||||
**Specific Block to Remove (Lines ~341-379):**
|
||||
|
||||
**Before:**
|
||||
```go
|
||||
func (s *HubService) Apply(ctx context.Context, slug string) (ApplyResult, error) {
|
||||
cleanSlug := sanitizeSlug(slug)
|
||||
// ... validation ...
|
||||
|
||||
result := ApplyResult{AppliedPreset: cleanSlug, Status: "failed"}
|
||||
meta, metaErr := s.loadCacheMeta(applyCtx, cleanSlug)
|
||||
if metaErr == nil {
|
||||
result.CacheKey = meta.CacheKey
|
||||
}
|
||||
hasCS := s.hasCSCLI(applyCtx)
|
||||
|
||||
backupPath := filepath.Clean(s.DataDir) + ".backup." + time.Now().Format("20060102-150405")
|
||||
if err := s.backupExisting(backupPath); err != nil {
|
||||
return result, fmt.Errorf("backup: %w", err)
|
||||
}
|
||||
result.BackupPath = backupPath
|
||||
|
||||
// Try cscli first
|
||||
if hasCS {
|
||||
// ... cscli logic ...
|
||||
}
|
||||
|
||||
if metaErr != nil {
|
||||
// ... refresh cache logic ...
|
||||
}
|
||||
|
||||
archive, err := os.ReadFile(meta.ArchivePath) // ❌ FAILS - file moved by backup!
|
||||
if err != nil {
|
||||
_ = s.rollback(backupPath)
|
||||
return result, fmt.Errorf("read archive: %w", err)
|
||||
}
|
||||
// ...
|
||||
}
|
||||
```
|
||||
|
||||
**After:**
|
||||
```go
|
||||
func (s *HubService) Apply(ctx context.Context, slug string) (ApplyResult, error) {
|
||||
cleanSlug := sanitizeSlug(slug)
|
||||
// ... validation ...
|
||||
|
||||
result := ApplyResult{AppliedPreset: cleanSlug, Status: "failed"}
|
||||
meta, metaErr := s.loadCacheMeta(applyCtx, cleanSlug)
|
||||
if metaErr == nil {
|
||||
result.CacheKey = meta.CacheKey
|
||||
}
|
||||
hasCS := s.hasCSCLI(applyCtx)
|
||||
|
||||
// ✅ NEW: Read archive into memory BEFORE backup moves the files
|
||||
var archive []byte
|
||||
var archiveReadErr error
|
||||
if metaErr == nil {
|
||||
archive, archiveReadErr = os.ReadFile(meta.ArchivePath)
|
||||
if archiveReadErr != nil {
|
||||
logger.Log().WithError(archiveReadErr).WithField("archive_path", meta.ArchivePath).
|
||||
Warn("failed to read cached archive before backup")
|
||||
}
|
||||
}
|
||||
|
||||
backupPath := filepath.Clean(s.DataDir) + ".backup." + time.Now().Format("20060102-150405")
|
||||
if err := s.backupExisting(backupPath); err != nil {
|
||||
return result, fmt.Errorf("backup: %w", err)
|
||||
}
|
||||
result.BackupPath = backupPath
|
||||
|
||||
// Try cscli first
|
||||
if hasCS {
|
||||
cscliErr := s.runCSCLI(applyCtx, cleanSlug)
|
||||
if cscliErr == nil {
|
||||
result.Status = "applied"
|
||||
result.ReloadHint = true
|
||||
result.UsedCSCLI = true
|
||||
return result, nil
|
||||
}
|
||||
logger.Log().WithField("slug", cleanSlug).WithError(cscliErr).
|
||||
Warn("cscli install failed; attempting cache fallback")
|
||||
}
|
||||
|
||||
// ✅ MODIFIED: Handle cache miss OR failed archive read
|
||||
if metaErr != nil || archiveReadErr != nil {
|
||||
// Need to refresh cache (either wasn't cached or file was unreadable)
|
||||
originalErr := metaErr
|
||||
if originalErr == nil {
|
||||
originalErr = archiveReadErr
|
||||
}
|
||||
refreshed, refreshErr := s.refreshCache(applyCtx, cleanSlug, originalErr)
|
||||
if refreshErr != nil {
|
||||
_ = s.rollback(backupPath)
|
||||
logger.Log().WithError(refreshErr).WithField("slug", cleanSlug).
|
||||
WithField("backup_path", backupPath).
|
||||
Warn("cache refresh failed; rolled back backup")
|
||||
result.ErrorMessage = fmt.Sprintf("load cache for %s: %v", cleanSlug, refreshErr)
|
||||
return result, fmt.Errorf("load cache for %s: %w", cleanSlug, refreshErr)
|
||||
}
|
||||
meta = refreshed
|
||||
result.CacheKey = meta.CacheKey
|
||||
|
||||
// Read from the newly refreshed cache
|
||||
archive, archiveReadErr = os.ReadFile(meta.ArchivePath)
|
||||
if archiveReadErr != nil {
|
||||
_ = s.rollback(backupPath)
|
||||
return result, fmt.Errorf("read archive after refresh: %w", archiveReadErr)
|
||||
}
|
||||
}
|
||||
|
||||
// ✅ Use pre-loaded archive bytes (no file read here)
|
||||
if err := s.extractTarGz(applyCtx, archive, s.DataDir); err != nil {
|
||||
_ = s.rollback(backupPath)
|
||||
return result, fmt.Errorf("extract: %w", err)
|
||||
}
|
||||
|
||||
result.Status = "applied"
|
||||
result.ReloadHint = true
|
||||
result.UsedCSCLI = false
|
||||
return result, nil
|
||||
}
|
||||
```
|
||||
|
||||
### Change 2: Add Regression Test
|
||||
|
||||
**File:** [backend/internal/crowdsec/hub_pull_apply_test.go](../../backend/internal/crowdsec/hub_pull_apply_test.go)
|
||||
|
||||
**New test:**
|
||||
```go
|
||||
func TestApplyReadsArchiveBeforeBackup(t *testing.T) {
|
||||
// This test verifies the fix for the bug where Apply() would:
|
||||
// 1. Load cache metadata (getting archive path)
|
||||
// 2. Backup DataDir (moving the cache!)
|
||||
// 3. Try to read archive from original path (FAIL!)
|
||||
|
||||
baseDir := t.TempDir()
|
||||
dataDir := filepath.Join(baseDir, "crowdsec")
|
||||
cacheDir := filepath.Join(dataDir, "hub_cache")
|
||||
|
||||
// Create cache
|
||||
cache, err := NewHubCache(cacheDir, time.Hour)
|
||||
require.NoError(t, err)
|
||||
|
||||
// Create a mock hub server
|
||||
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
|
||||
if strings.Contains(r.URL.Path, ".tgz") {
|
||||
// Return a valid tar.gz
|
||||
var buf bytes.Buffer
|
||||
gw := gzip.NewWriter(&buf)
|
||||
tw := tar.NewWriter(gw)
|
||||
content := []byte("test: value\n")
|
||||
tw.WriteHeader(&tar.Header{Name: "test.yaml", Size: int64(len(content)), Mode: 0644})
|
||||
tw.Write(content)
|
||||
tw.Close()
|
||||
gw.Close()
|
||||
w.Write(buf.Bytes())
|
||||
return
|
||||
}
|
||||
if strings.Contains(r.URL.Path, ".yaml") {
|
||||
w.Write([]byte("preview: content"))
|
||||
return
|
||||
}
|
||||
// Index
|
||||
w.Write([]byte(`{"items":[{"name":"test/preset","version":"1.0"}]}`))
|
||||
}))
|
||||
defer server.Close()
|
||||
|
||||
hub := &HubService{
|
||||
Cache: cache,
|
||||
DataDir: dataDir,
|
||||
HTTPClient: server.Client(),
|
||||
HubBaseURL: server.URL,
|
||||
MirrorBaseURL: server.URL,
|
||||
PullTimeout: 10 * time.Second,
|
||||
ApplyTimeout: 10 * time.Second,
|
||||
}
|
||||
|
||||
ctx := context.Background()
|
||||
|
||||
// Pull to populate cache
|
||||
_, err = hub.Pull(ctx, "test/preset")
|
||||
require.NoError(t, err, "pull should succeed")
|
||||
|
||||
// Verify cache exists
|
||||
_, err = cache.Load(ctx, "test/preset")
|
||||
require.NoError(t, err, "cache should exist after pull")
|
||||
|
||||
// Add some extra files to DataDir to make backup more realistic
|
||||
require.NoError(t, os.WriteFile(filepath.Join(dataDir, "config.yaml"), []byte("test: config"), 0644))
|
||||
|
||||
// Apply - this should NOT fail with "read archive: no such file"
|
||||
result, err := hub.Apply(ctx, "test/preset")
|
||||
require.NoError(t, err, "apply should succeed - archive should be read before backup")
|
||||
assert.Equal(t, "applied", result.Status)
|
||||
assert.NotEmpty(t, result.BackupPath)
|
||||
|
||||
// Verify backup was created
|
||||
_, err = os.Stat(result.BackupPath)
|
||||
assert.NoError(t, err, "backup should exist")
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Edge Cases to Consider
|
||||
|
||||
| Scenario | Current Behavior | Fixed Behavior |
|
||||
|----------|-----------------|----------------|
|
||||
| First-time apply (no cache) | Fails with cache miss | Attempts refresh, same behavior |
|
||||
| cscli available and works | Returns early, never hits bug | Same - returns early |
|
||||
| cscli fails, cache exists | **FAILS** - archive moved | Succeeds - archive pre-loaded |
|
||||
| Archive file corrupted | Fails on read | Same - fails on read, but before backup |
|
||||
| Network down during refresh | Fails | Same - fails with clear error |
|
||||
| Large archive (>25MB) | Limited by maxArchiveSize | Same - memory is fine for 25MB |
|
||||
| Concurrent applies | Potential race | Still potential race (separate issue) |
|
||||
|
||||
---
|
||||
|
||||
## Testing Plan
|
||||
|
||||
1. **Unit Tests**
|
||||
- [ ] `TestApplyReadsArchiveBeforeBackup` - New regression test
|
||||
- [ ] Existing `TestPullThenApplyFlow` should still pass
|
||||
- [ ] `TestApplyWithoutPullFails` should still pass
|
||||
|
||||
2. **Integration Tests**
|
||||
- [ ] Manual test in Docker container
|
||||
- [ ] Pull preset via UI
|
||||
- [ ] Apply preset via UI
|
||||
- [ ] Verify no "read archive" error
|
||||
|
||||
3. **Edge Case Tests**
|
||||
- [ ] Apply with expired cache (should refresh)
|
||||
- [ ] Apply with network failure (should error gracefully)
|
||||
- [ ] Apply with cscli available (should use cscli path)
|
||||
|
||||
---
|
||||
|
||||
## Rollout Plan
|
||||
|
||||
1. **Implement fix** in `hub_sync.go`
|
||||
2. **Add regression test** in `hub_pull_apply_test.go`
|
||||
3. **Run full test suite**: `go test ./...`
|
||||
4. **Run pre-commit**: `pre-commit run --all-files`
|
||||
5. **Build and test locally**: `docker build -t charon:local .`
|
||||
6. **Manual verification in container**
|
||||
7. **Commit with**: `fix: read archive before backup in CrowdSec preset apply`
|
||||
|
||||
---
|
||||
|
||||
## Related Files Reference
|
||||
|
||||
| File | Purpose |
|
||||
|------|---------|
|
||||
| [hub_sync.go](../../backend/internal/crowdsec/hub_sync.go) | HubService.Apply() - main fix location |
|
||||
| [hub_cache.go](../../backend/internal/crowdsec/hub_cache.go) | Cache storage, stores ArchivePath |
|
||||
| [crowdsec_handler.go](../../backend/internal/api/handlers/crowdsec_handler.go) | HTTP handler, initializes cache |
|
||||
| [routes.go](../../backend/internal/api/routes/routes.go) | Sets crowdsecDataDir from config |
|
||||
| [config.go](../../backend/internal/config/config.go) | CrowdSecConfigDir default |
|
||||
|
||||
---
|
||||
|
||||
## Summary
|
||||
|
||||
**Root Cause:** The `Apply()` function backs up the entire DataDir (which includes the cache) before reading the cached archive, resulting in a "file not found" error.
|
||||
|
||||
**Fix:** Read the archive into memory before creating the backup.
|
||||
|
||||
**Impact:** Low risk - the fix only changes the order of operations and doesn't affect the backup or extraction logic.
|
||||
|
||||
**Effort:** ~30 minutes implementation + testing
|
||||
| 1 | Cerberus shows ON by default on first load (should be OFF) | High |
|
||||
| 2 | Cerberus dashboard header shows "disabled" even when enabled | Medium |
|
||||
| 3 | CrowdSec toggle auto-enables when Cerberus is enabled | Medium |
|
||||
| 4 | CrowdSec toggle unresponsive + Config button grayed out | High |
|
||||
|
||||
---
|
||||
|
||||
## Root Cause Analysis
|
||||
|
||||
### Issue 1: Cerberus Shows ON by Default
|
||||
|
||||
**Root Cause:** The `feature_flags_handler.go` has a default value of `true` for all feature flags including `feature.cerberus.enabled`.
|
||||
|
||||
**File:** [backend/internal/api/handlers/feature_flags_handler.go#L39-L42](../../backend/internal/api/handlers/feature_flags_handler.go#L39-L42)
|
||||
|
||||
```go
|
||||
// Line 39-42
|
||||
for _, key := range defaultFlags {
|
||||
defaultVal := true // <-- THIS IS THE BUG
|
||||
if v, ok := defaultFlagValues[key]; ok {
|
||||
defaultVal = v
|
||||
}
|
||||
```
|
||||
|
||||
**Problem:** The code sets `defaultVal := true` for all flags, then only overrides it if the key exists in `defaultFlagValues`. However, `feature.cerberus.enabled` is NOT in `defaultFlagValues`:
|
||||
|
||||
```go
|
||||
// Line 29-31
|
||||
var defaultFlagValues = map[string]bool{
|
||||
"feature.crowdsec.console_enrollment": false,
|
||||
}
|
||||
```
|
||||
|
||||
**Result:** On first load with an empty database, `feature.cerberus.enabled` defaults to `true` instead of `false`.
|
||||
|
||||
**Additional Context:**
|
||||
- The [backend/internal/config/config.go#L60](../../backend/internal/config/config.go#L60) correctly defaults `CerberusEnabled` to `false`:
|
||||
```go
|
||||
CerberusEnabled: getEnvAny("false", "CERBERUS_SECURITY_CERBERUS_ENABLED", ...) == "true"
|
||||
```
|
||||
- However, the feature flags handler ignores this config and uses its own default.
|
||||
|
||||
---
|
||||
|
||||
### Issue 2: Dashboard Header Shows "Disabled" Even When Enabled
|
||||
|
||||
**Root Cause:** The header banner logic in `Security.tsx` checks `status.cerberus?.enabled` which comes from the security status API, but there's a **data source mismatch**.
|
||||
|
||||
**Files:**
|
||||
- [frontend/src/pages/Security.tsx#L141-L153](../../frontend/src/pages/Security.tsx#L141-L153) - Header banner logic
|
||||
- [backend/internal/api/handlers/security_handler.go#L35-L49](../../backend/internal/api/handlers/security_handler.go#L35-L49) - Security status API
|
||||
|
||||
**Problem Flow:**
|
||||
|
||||
1. **Security.tsx** checks `status.cerberus?.enabled` from `/api/v1/security/status`
|
||||
2. **security_handler.go** reads from config AND settings table:
|
||||
```go
|
||||
// Line 36-48
|
||||
enabled := h.cfg.CerberusEnabled
|
||||
var settingKey = "security.cerberus.enabled" // <-- WRONG KEY!
|
||||
if h.db != nil {
|
||||
var setting struct{ Value string }
|
||||
if err := h.db.Raw("SELECT value FROM settings WHERE key = ? LIMIT 1", settingKey).Scan(&setting).Error; ...
|
||||
```
|
||||
3. **SystemSettings.tsx** toggles `feature.cerberus.enabled` (via feature flags API)
|
||||
|
||||
**The Mismatch:**
|
||||
|
||||
| Component | Key Used |
|
||||
|-----------|----------|
|
||||
| SystemSettings toggle | `feature.cerberus.enabled` |
|
||||
| Security status API | `security.cerberus.enabled` |
|
||||
|
||||
The toggle writes to `feature.cerberus.enabled` but the security status reads from `security.cerberus.enabled` - **two different keys!**
|
||||
|
||||
---
|
||||
|
||||
### Issue 3: CrowdSec Auto-Enables When Cerberus is Enabled
|
||||
|
||||
**Root Cause:** The `docker-compose.override.yml` and `docker-compose.local.yml` both set `CHARON_SECURITY_CROWDSEC_MODE=local`:
|
||||
|
||||
**File:** [docker-compose.override.yml#L21](../../docker-compose.override.yml#L21)
|
||||
```yaml
|
||||
- CHARON_SECURITY_CROWDSEC_MODE=local
|
||||
```
|
||||
|
||||
**Problem:** When the container starts:
|
||||
1. Config loads with `CrowdSecMode: "local"` from env var
|
||||
2. Security status API returns `crowdsec.enabled: true` because mode is "local"
|
||||
3. Frontend shows CrowdSec as enabled
|
||||
|
||||
**File:** [backend/internal/api/handlers/security_handler.go#L59-L62](../../backend/internal/api/handlers/security_handler.go#L59-L62)
|
||||
```go
|
||||
// Allow runtime override for CrowdSec enabled flag via settings table
|
||||
crowdsecEnabled := mode == "local" // <-- Auto-true if mode is "local"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Issue 4: CrowdSec Toggle Unresponsive + Config Button Grayed Out
|
||||
|
||||
**Root Cause:** Multiple issues combine to break the toggle:
|
||||
|
||||
**A. Toggle Disabled Logic:**
|
||||
|
||||
**File:** [frontend/src/pages/Security.tsx#L127](../../frontend/src/pages/Security.tsx#L127)
|
||||
```tsx
|
||||
const crowdsecToggleDisabled = cerberusDisabled || crowdsecPowerMutation.isPending
|
||||
{status.waf.enabled && (
|
||||
<div className="mt-3 space-y-3">
|
||||
<div>
|
||||
<label className="text-xs text-gray-400 block mb-1">WAF Mode</label>
|
||||
<select
|
||||
value={securityConfig?.config?.waf_mode || 'block'}
|
||||
onChange={(e) => updateSecurityConfigMutation.mutate({ name: 'default', waf_mode: e.target.value })}
|
||||
className="w-full bg-gray-800 border border-gray-700 rounded px-2 py-1 text-sm text-white"
|
||||
data-testid="waf-mode-select"
|
||||
>
|
||||
<option value="block">Block (deny malicious requests)</option>
|
||||
<option value="monitor">Monitor (log only, don't block)</option>
|
||||
</select>
|
||||
</div>
|
||||
<div>
|
||||
<label className="text-xs text-gray-400 block mb-1">Active Rule Set</label>
|
||||
<select
|
||||
value={securityConfig?.config?.waf_rules_source || ''}
|
||||
onChange={(e) => updateSecurityConfigMutation.mutate({ name: 'default', waf_rules_source: e.target.value || undefined })}
|
||||
className="w-full bg-gray-800 border border-gray-700 rounded px-2 py-1 text-sm text-white"
|
||||
data-testid="waf-ruleset-select"
|
||||
>
|
||||
<option value="">None (all rule sets)</option>
|
||||
{ruleSetsData?.rulesets?.map((rs) => (
|
||||
<option key={rs.id} value={rs.name}>
|
||||
{rs.name} ({rs.mode === 'blocking' ? 'blocking' : 'detection'})
|
||||
</option>
|
||||
))}
|
||||
</select>
|
||||
{(!ruleSetsData?.rulesets || ruleSetsData.rulesets.length === 0) && (
|
||||
<p className="text-xs text-yellow-500 mt-1">
|
||||
No rule sets configured. Add one below.
|
||||
</p>
|
||||
)}
|
||||
</div>
|
||||
</div>
|
||||
)}
|
||||
```
|
||||
|
||||
**File:** [frontend/src/pages/Security.tsx#L126](../../frontend/src/pages/Security.tsx#L126)
|
||||
```tsx
|
||||
const cerberusDisabled = !status.cerberus?.enabled
|
||||
```
|
||||
### 2. Frontend: Layout Navigation
|
||||
|
||||
Since `status.cerberus?.enabled` is `false` due to Issue 2 (wrong settings key), `cerberusDisabled` is `true`, making the toggle disabled.
|
||||
**File:** `frontend/src/components/Layout.tsx`
|
||||
|
||||
**B. Config Button Disabled:**
|
||||
**Changes Required:**
|
||||
|
||||
**File:** [frontend/src/pages/Security.tsx#L128](../../frontend/src/pages/Security.tsx#L128)
|
||||
```tsx
|
||||
const crowdsecControlsDisabled = cerberusDisabled || crowdsecPowerMutation.isPending
|
||||
```
|
||||
| Line | Current Text | New Text |
|
||||
|------|-------------|----------|
|
||||
| 70 | `{ name: 'WAF (Coraza)', path: '/security/waf', icon: '🛡️' }` | `{ name: 'Coraza', path: '/security/waf', icon: '🛡️' }` |
|
||||
|
||||
Same logic - the controls are disabled because Cerberus appears disabled.
|
||||
### 3. Test Files to Update
|
||||
|
||||
**C. Switch Component Event Handling:**
|
||||
#### 3.1 Security Page Tests
|
||||
|
||||
**File:** [frontend/src/components/ui/Switch.tsx#L17-L20](../../frontend/src/components/ui/Switch.tsx#L17-L20)
|
||||
**File:** `frontend/src/pages/__tests__/Security.spec.tsx`
|
||||
|
||||
The Switch component passes `disabled` to the native checkbox input, which prevents click events. This is correct behavior - the issue is the `disabled` prop is incorrectly `true`.
|
||||
**Changes Required:**
|
||||
|
||||
---
|
||||
| Test Name | Change Description |
|
||||
|-----------|-------------------|
|
||||
| `shows WAF mode selector when WAF is enabled` | **DELETE entire test** - Mode selector no longer on dashboard |
|
||||
| `shows WAF ruleset selector with available rulesets` | **DELETE entire test** - Ruleset selector no longer on dashboard |
|
||||
| `calls updateSecurityConfig when WAF mode is changed` | **DELETE entire test** - No mode selector on dashboard |
|
||||
| `calls updateSecurityConfig when WAF ruleset is changed` | **DELETE entire test** - No ruleset selector on dashboard |
|
||||
| `shows warning when no rulesets are configured` | **DELETE entire test** - Warning no longer on dashboard |
|
||||
| `displays correct WAF threat protection summary when enabled` | Keep but update any "WAF" string references if needed |
|
||||
| `does not show WAF controls when WAF is disabled` | **DELETE entire test** - Controls never shown on dashboard now |
|
||||
|
||||
## Recommended Fixes
|
||||
**Tests to delete (Lines ~189-344):**
|
||||
|
||||
### Fix 1: Update Feature Flag Defaults
|
||||
- `it('shows WAF mode selector when WAF is enabled', ...)`
|
||||
- `it('shows WAF ruleset selector with available rulesets', ...)`
|
||||
- `it('calls updateSecurityConfig when WAF mode is changed', ...)`
|
||||
- `it('calls updateSecurityConfig when WAF ruleset is changed', ...)`
|
||||
- `it('shows warning when no rulesets are configured', ...)`
|
||||
- `it('does not show WAF controls when WAF is disabled', ...)`
|
||||
|
||||
**File:** `backend/internal/api/handlers/feature_flags_handler.go`
|
||||
Keep:
|
||||
|
||||
```go
|
||||
// Change defaultFlagValues to include cerberus.enabled as false
|
||||
var defaultFlagValues = map[string]bool{
|
||||
"feature.cerberus.enabled": false, // ADD THIS
|
||||
"feature.crowdsec.console_enrollment": false,
|
||||
"feature.uptime.enabled": true, // Uptime can default ON
|
||||
}
|
||||
```
|
||||
- `it('displays correct WAF threat protection summary when enabled', ...)` - This tests the protection description text, not the dropdowns
|
||||
|
||||
### Fix 2: Align Settings Keys
|
||||
#### 3.2 Security Audit Tests
|
||||
|
||||
**Option A (Recommended):** Update security_handler.go to read from feature flags key
|
||||
**File:** `frontend/src/pages/__tests__/Security.audit.test.tsx`
|
||||
|
||||
**File:** `backend/internal/api/handlers/security_handler.go`
|
||||
**Changes Required:**
|
||||
|
||||
```go
|
||||
// Line 37: Change from
|
||||
var settingKey = "security.cerberus.enabled"
|
||||
// To
|
||||
var settingKey = "feature.cerberus.enabled"
|
||||
```
|
||||
| Line | Current Text | New Text |
|
||||
|------|-------------|----------|
|
||||
| 287 | `expect(screen.getByText('WAF (Coraza)')).toBeInTheDocument()` | `expect(screen.getByText('Coraza')).toBeInTheDocument()` |
|
||||
| 306-315 | `it('WAF controls have proper test IDs when enabled', ...)` | **DELETE entire test** - WAF controls no longer on dashboard |
|
||||
| 344 | `expect(cardNames).toEqual(['CrowdSec', 'Access Control', 'WAF (Coraza)', 'Rate Limiting', 'Live Security Logs'])` | `expect(cardNames).toEqual(['CrowdSec', 'Access Control', 'Coraza', 'Rate Limiting', 'Live Security Logs'])` |
|
||||
|
||||
**Option B:** Create a sync mechanism between feature flags and security settings
|
||||
### 4. WAF Config Page (NO CHANGES NEEDED)
|
||||
|
||||
### Fix 3: Remove CrowdSec Mode Override from Docker Compose
|
||||
**File:** `frontend/src/pages/WafConfig.tsx`
|
||||
|
||||
**Files:**
|
||||
- `docker-compose.override.yml`
|
||||
- `docker-compose.local.yml`
|
||||
The WAF config page already properly uses "WAF" terminology in its title ("WAF Configuration") and references "Coraza" where appropriate. Since this is the configuration page, the Mode and Rule Set selections should remain here. **No changes required.**
|
||||
|
||||
```yaml
|
||||
# Remove or comment out:
|
||||
# - CHARON_SECURITY_CROWDSEC_MODE=local
|
||||
# Or change to:
|
||||
- CHARON_SECURITY_CROWDSEC_MODE=disabled
|
||||
```
|
||||
### 5. WAF Config Tests (NO CHANGES NEEDED)
|
||||
|
||||
### Fix 4: No Additional Fix Needed
|
||||
**File:** `frontend/src/pages/__tests__/WafConfig.spec.tsx`
|
||||
|
||||
Issue 4 is a symptom of Issues 1-2. Once those are fixed:
|
||||
- `cerberusDisabled` will be `false` when Cerberus is enabled
|
||||
- `crowdsecToggleDisabled` will be `false`
|
||||
- `crowdsecControlsDisabled` will be `false`
|
||||
- Toggle and Config button will be interactive
|
||||
These tests test the WafConfig page which is unaffected. **No changes required.**
|
||||
|
||||
---
|
||||
## Summary of Changes
|
||||
|
||||
## Test Scenarios
|
||||
| File | Change Type | Description |
|
||||
|------|-------------|-------------|
|
||||
| `frontend/src/pages/Security.tsx` | Modify | Rename "WAF" → "Coraza", remove Mode/RuleSet dropdowns |
|
||||
| `frontend/src/components/Layout.tsx` | Modify | Rename nav item "WAF (Coraza)" → "Coraza" |
|
||||
| `frontend/src/pages/__tests__/Security.spec.tsx` | Delete tests | Remove 6 tests for WAF dropdown controls |
|
||||
| `frontend/src/pages/__tests__/Security.audit.test.tsx` | Modify | Update card name assertions, remove dropdown test |
|
||||
|
||||
### Test 1: Fresh Install Default State
|
||||
```
|
||||
Given: Clean database, no env vars set
|
||||
When: User loads the Settings > System page
|
||||
Then: Cerberus toggle should be OFF
|
||||
And: /api/v1/feature-flags returns { "feature.cerberus.enabled": false }
|
||||
```
|
||||
## API Changes
|
||||
|
||||
### Test 2: Cerberus Toggle Sync
|
||||
```
|
||||
Given: User is on Settings > System page
|
||||
When: User enables Cerberus toggle
|
||||
Then: /api/v1/security/status returns { "cerberus": { "enabled": true } }
|
||||
And: Security dashboard header banner is NOT displayed
|
||||
```
|
||||
**None required.** This is purely a frontend UI change. The backend API endpoints, types, and data structures remain unchanged.
|
||||
|
||||
### Test 3: CrowdSec Toggle Interaction
|
||||
```
|
||||
Given: Cerberus is enabled
|
||||
And: User is on Security dashboard
|
||||
When: User clicks CrowdSec toggle
|
||||
Then: Toggle should respond to click
|
||||
And: CrowdSec enabled state should change
|
||||
And: Toast notification should appear
|
||||
```
|
||||
## Type Definition Changes
|
||||
|
||||
### Test 4: CrowdSec Config Button
|
||||
```
|
||||
Given: Cerberus is enabled
|
||||
And: User is on Security dashboard
|
||||
When: User clicks CrowdSec "Config" button
|
||||
Then: User should navigate to /security/crowdsec
|
||||
And: Button should NOT be grayed out
|
||||
```
|
||||
**None required.** The SecurityStatus type and related interfaces don't need modification.
|
||||
|
||||
### Test 5: Environment Variable Override
|
||||
```
|
||||
Given: CERBERUS_SECURITY_CERBERUS_ENABLED=true set
|
||||
When: User loads Settings > System (fresh DB)
|
||||
Then: Cerberus toggle should be ON (env override)
|
||||
```
|
||||
## Text/Label Changes Summary
|
||||
|
||||
---
|
||||
| Location | From | To |
|
||||
|----------|------|-----|
|
||||
| Card title (Security.tsx) | `WAF (Coraza)` | `Coraza` |
|
||||
| Nav sidebar (Layout.tsx) | `WAF (Coraza)` | `Coraza` |
|
||||
| Info banner (Security.tsx) | `CrowdSec, WAF, ACLs` | `CrowdSec, Coraza, ACLs` |
|
||||
| Comment (Security.tsx) | `/* WAF - Layer 3 */` | `/* Coraza - Layer 3 */` |
|
||||
| Test assertions | `WAF (Coraza)` | `Coraza` |
|
||||
|
||||
## Implementation Priority
|
||||
## Button Simplification
|
||||
|
||||
| Priority | Fix | Effort | Impact |
|
||||
|----------|-----|--------|--------|
|
||||
| P0 | Fix 2 (Key alignment) | Low | High - Fixes Issues 2, 4 |
|
||||
| P1 | Fix 1 (Default values) | Low | High - Fixes Issue 1 |
|
||||
| P2 | Fix 3 (Docker compose) | Low | Medium - Fixes Issue 3 |
|
||||
The button text on the Coraza card should be simplified:
|
||||
|
||||
---
|
||||
- **Current:** `{status.waf.enabled ? 'Manage Rule Sets' : 'Configure'}`
|
||||
- **New:** `Configure` (always)
|
||||
|
||||
## Files to Modify
|
||||
This matches the CrowdSec card pattern which just shows "Config" regardless of enabled state.
|
||||
|
||||
1. **backend/internal/api/handlers/feature_flags_handler.go** - Add default value for cerberus
|
||||
2. **backend/internal/api/handlers/security_handler.go** - Change settings key to `feature.cerberus.enabled`
|
||||
3. **docker-compose.override.yml** - Remove or change CrowdSec mode
|
||||
4. **docker-compose.local.yml** - Remove or change CrowdSec mode
|
||||
## Implementation Order
|
||||
|
||||
---
|
||||
1. Update `Security.tsx`:
|
||||
- Change card title from "WAF (Coraza)" to "Coraza"
|
||||
- Update banner text
|
||||
- Update comment
|
||||
- Remove the dropdown controls block
|
||||
- Simplify button text
|
||||
|
||||
## Additional Observations
|
||||
2. Update `Layout.tsx`:
|
||||
- Change nav item name
|
||||
|
||||
1. **Dual Control Systems:** There are two overlapping control systems:
|
||||
- Feature flags (`feature.cerberus.enabled`) - toggled in SystemSettings.tsx
|
||||
- Security config (`SecurityConfig.Enabled` in DB) - used by Enable/Disable endpoints
|
||||
3. Update test files:
|
||||
- `Security.spec.tsx`: Remove obsolete tests
|
||||
- `Security.audit.test.tsx`: Update assertions, remove dropdown test
|
||||
|
||||
Consider consolidating to one source of truth.
|
||||
4. Run tests to verify: `cd frontend && npm test`
|
||||
|
||||
2. **Config vs Settings:** The `config.SecurityConfig` struct loaded from env vars is separate from DB-backed `SecurityConfig` model. This creates confusion about which takes precedence.
|
||||
5. Run type check: `cd frontend && npm run type-check`
|
||||
|
||||
3. **No Migration:** When updating default values, existing users may need a migration or reset to see the new defaults.
|
||||
6. Run pre-commit checks
|
||||
|
||||
---
|
||||
## Verification Checklist
|
||||
|
||||
## Code Reference Summary
|
||||
|
||||
| File | Line | Purpose |
|
||||
|------|------|---------|
|
||||
| `feature_flags_handler.go` | L29-31 | Missing cerberus default |
|
||||
| `feature_flags_handler.go` | L39 | `defaultVal := true` bug |
|
||||
| `security_handler.go` | L37 | Wrong settings key |
|
||||
| `Security.tsx` | L126-128 | Disabled state logic |
|
||||
| `SystemSettings.tsx` | L99-105 | Feature toggle UI |
|
||||
| `docker-compose.override.yml` | L21 | CrowdSec mode env var |
|
||||
| `config.go` | L60 | Correct cerberus default |
|
||||
- [ ] Card title shows "Coraza" (not "WAF (Coraza)")
|
||||
- [ ] Nav sidebar shows "Coraza"
|
||||
- [ ] No Mode dropdown on dashboard card
|
||||
- [ ] No Rule Set dropdown on dashboard card
|
||||
- [ ] "Configure" button navigates to `/security/waf` config page
|
||||
- [ ] Mode/Rule Set controls still available on `/security/waf` config page
|
||||
- [ ] All tests pass
|
||||
- [ ] TypeScript compiles without errors
|
||||
- [ ] Pre-commit hooks pass
|
||||
|
||||
Reference in New Issue
Block a user