# Cerberus Technical Documentation
This document is for developers and advanced users who want to understand how Cerberus works under the hood.
**Looking for the user guide?** See [Security Features](security.md) instead.
---
## What Is Cerberus?
Cerberus is the optional security suite built into Charon. It includes:
- **WAF (Web Application Firewall)** — Inspects requests for malicious payloads
- **CrowdSec** — Blocks IPs based on behavior and reputation
- **Access Lists** — Static allow/deny rules (IP, CIDR, geo)
- **Rate Limiting** — Volume-based abuse prevention (placeholder)
All components are disabled by default and can be enabled independently.
---
## Architecture
### Request Flow
When a request hits Charon:
1. **Check if Cerberus is enabled** (global setting + dynamic database flag)
2. **WAF evaluation** (if `waf_mode != disabled`)
   - Increment `charon_waf_requests_total` metric
   - Check payload against loaded rulesets
   - If suspicious:
     - `block` mode: Return 403 + increment `charon_waf_blocked_total`
     - `monitor` mode: Log + increment `charon_waf_monitored_total`
3. **ACL evaluation** (if enabled)
   - Test client IP against active access lists
   - First denial = 403 response
4. **CrowdSec check** (placeholder for future)
5. **Rate limit check** (placeholder for future)
6. **Pass to downstream handler** (if not blocked)
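
The flow above can be sketched as an ordered chain of stages in which the first blocking decision short-circuits the rest. This is a minimal illustration, not Charon's actual middleware: `Decision`, `Check`, `WAFCheck`, `DenyIP`, and `Evaluate` are hypothetical names, and the WAF stage reuses the `<script>` check from the prototype described later in this document.

```go
package main

import (
	"fmt"
	"strings"
)

// Decision is a hypothetical result type for a single security check.
type Decision struct {
	Source string // component that decided, e.g. "waf" or "acl"
	Block  bool   // true when the request should receive a 403
	Reason string
}

// Check is a hypothetical signature for one stage of the flow.
type Check func(clientIP, query string) Decision

// WAFCheck returns a stage that flags "<script>" in the query string.
// It blocks only in "block" mode; "monitor" mode records a reason without blocking.
func WAFCheck(mode string) Check {
	return func(_, query string) Decision {
		if mode == "disabled" || !strings.Contains(query, "<script>") {
			return Decision{Source: "waf"}
		}
		return Decision{Source: "waf", Block: mode == "block", Reason: "XSS pattern"}
	}
}

// DenyIP returns an ACL-like stage that blocks one exact client IP.
func DenyIP(denied string) Check {
	return func(clientIP, _ string) Decision {
		if clientIP == denied {
			return Decision{Source: "acl", Block: true, Reason: "ip denied"}
		}
		return Decision{Source: "acl"}
	}
}

// Evaluate runs the stages in order and stops at the first blocking decision.
// When Cerberus is disabled, no stage runs and the request passes.
func Evaluate(enabled bool, stages []Check, clientIP, query string) Decision {
	if !enabled {
		return Decision{Source: "cerberus", Reason: "disabled"}
	}
	for _, stage := range stages {
		if d := stage(clientIP, query); d.Block {
			return d
		}
	}
	return Decision{Source: "cerberus", Reason: "passed"}
}

func main() {
	stages := []Check{WAFCheck("block"), DenyIP("203.0.113.50")}
	fmt.Printf("%+v\n", Evaluate(true, stages, "198.51.100.10", "q=<script>"))
	fmt.Printf("%+v\n", Evaluate(true, stages, "203.0.113.50", "q=hello"))
}
```

Keeping each stage behind the same function signature means the CrowdSec and rate-limit placeholders could later be added to the slice without changing `Evaluate`.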
### Middleware Integration
Cerberus runs as Gin middleware on all `/api/v1` routes:
```go
r.Use(cerberusMiddleware.RequestLogger())
```
This means it protects the management API but does not directly inspect traffic to proxied websites (that happens in Caddy).
---
## Threat Model & Protection Coverage
### What Cerberus Protects
| Threat Category | CrowdSec | ACL | WAF | Rate Limit |
|-----------------|----------|-----|-----|------------|
| Known attackers (IP reputation) | ✅ | ❌ | ❌ | ❌ |
| Geo-based attacks | ❌ | ✅ | ❌ | ❌ |
| SQL Injection (SQLi) | ❌ | ❌ | ✅ | ❌ |
| Cross-Site Scripting (XSS) | ❌ | ❌ | ✅ | ❌ |
| Remote Code Execution (RCE) | ❌ | ❌ | ✅ | ❌ |
| **Zero-Day Web Exploits** | ⚠️ | ❌ | ✅ | ❌ |
| DDoS / Volume attacks | ❌ | ❌ | ❌ | ✅ |
| Brute-force login attempts | ✅ | ❌ | ❌ | ✅ |
| Credential stuffing | ✅ | ❌ | ❌ | ✅ |
**Legend:**
- ✅ Full protection
- ⚠️ Partial protection (time-delayed)
- ❌ Not designed for this threat
## Zero-Day Exploit Protection (WAF)
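With a full ruleset loaded, the WAF provides **pattern-based detection** for zero-day exploits (the current prototype only checks for `<script>` tags; see Current Implementation below):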
**How It Works:**
1. Attacker discovers new vulnerability (e.g., SQLi in your login form)
2. Attacker crafts exploit: `' OR 1=1--`
3. WAF inspects request → matches SQL injection pattern → **BLOCKED**
4. Your application never sees the malicious input
**Limitations:**
- Only protects HTTP/HTTPS traffic
- Cannot detect completely novel attack patterns (rare)
- Does not protect against logic bugs in application code
**Effectiveness:**
- **~90% of zero-day web exploits** use known patterns (SQLi, XSS, RCE)
- **~10% are truly novel** and may bypass WAF until rules are updated
## Request Processing Pipeline
```
1. [CrowdSec] Check IP reputation → Block if known attacker
2. [ACL] Check IP/Geo rules → Block if not allowed
3. [WAF] Inspect request payload → Block if malicious pattern
4. [Rate Limit] Count requests → Block if too many
5. [Proxy] Forward to upstream service
```
## Configuration Model
### Database Schema
**SecurityConfig** table:
```go
type SecurityConfig struct {
	ID                 uint   `gorm:"primaryKey"`
	Name               string `json:"name"`
	Enabled            bool   `json:"enabled"`
	AdminWhitelist     string `json:"admin_whitelist"`  // CSV of IPs/CIDRs
	CrowdsecMode       string `json:"crowdsec_mode"`    // disabled, local, external
	CrowdsecAPIURL     string `json:"crowdsec_api_url"`
	CrowdsecAPIKey     string `json:"crowdsec_api_key"`
	WafMode            string `json:"waf_mode"`         // disabled, monitor, block
	WafRulesSource     string `json:"waf_rules_source"` // Ruleset identifier
	WafLearning        bool   `json:"waf_learning"`
	RateLimitEnable    bool   `json:"rate_limit_enable"`
	RateLimitBurst     int    `json:"rate_limit_burst"`
	RateLimitRequests  int    `json:"rate_limit_requests"`
	RateLimitWindowSec int    `json:"rate_limit_window_sec"`
}
```
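
Rate limiting enforcement is still a placeholder, so there is no implementation to document. As a rough illustration of how `RateLimitRequests` and `RateLimitWindowSec` could drive a limiter, the sketch below uses a simple fixed-window counter per client IP. The `FixedWindowLimiter` type is hypothetical, the choice of a fixed window is an assumption, and `RateLimitBurst` is deliberately left out because its intended semantics are not specified here.

```go
package main

import "fmt"

// FixedWindowLimiter is a hypothetical limiter: each client may make up to
// limit requests per window of windowSec seconds.
type FixedWindowLimiter struct {
	limit     int
	windowSec int64
	start     map[string]int64 // window start (Unix seconds) per client
	count     map[string]int   // requests seen in the current window per client
}

// NewFixedWindowLimiter takes the values of RateLimitRequests and RateLimitWindowSec.
func NewFixedWindowLimiter(requests, windowSec int) *FixedWindowLimiter {
	return &FixedWindowLimiter{
		limit:     requests,
		windowSec: int64(windowSec),
		start:     make(map[string]int64),
		count:     make(map[string]int),
	}
}

// Allow records one request from clientIP at nowUnix and reports whether it
// is still within the limit. Not safe for concurrent use without a mutex.
func (l *FixedWindowLimiter) Allow(clientIP string, nowUnix int64) bool {
	start, seen := l.start[clientIP]
	if !seen || nowUnix-start >= l.windowSec {
		// First request from this client, or the previous window has expired.
		l.start[clientIP] = nowUnix
		l.count[clientIP] = 0
	}
	l.count[clientIP]++
	return l.count[clientIP] <= l.limit
}

func main() {
	limiter := NewFixedWindowLimiter(2, 60)
	for _, t := range []int64{0, 1, 2, 60} {
		fmt.Println(t, limiter.Allow("203.0.113.50", t))
	}
}
```

Passing the timestamp in explicitly keeps the limiter deterministic under test; a real middleware would supply `time.Now().Unix()` and guard the maps with a lock.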
### Environment Variables (Fallbacks)
If no database config exists, Charon reads from environment:
- `CERBERUS_SECURITY_WAF_MODE`: `disabled` | `monitor` | `block`
- `CERBERUS_SECURITY_CROWDSEC_MODE`: `disabled` | `local` | `external`
- `CERBERUS_SECURITY_CROWDSEC_API_URL`: URL for external CrowdSec bouncer
- `CERBERUS_SECURITY_CROWDSEC_API_KEY`: API key for external bouncer
- `CERBERUS_SECURITY_ACL_ENABLED`: `true` | `false`
- `CERBERUS_SECURITY_RATELIMIT_ENABLED`: `true` | `false`
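
A minimal sketch of reading these fallbacks is shown below. It is not Charon's loader: the `Lookup` type and the function names are hypothetical, and treating an unset or unrecognized value as `disabled` (or `false`) is an assumption based on components being disabled by default.

```go
package main

import (
	"fmt"
	"os"
)

// Lookup matches the signature of os.LookupEnv so a fake environment can be
// supplied in tests.
type Lookup func(key string) (string, bool)

// modeFromEnv returns the value of key if it is one of allowed, else "disabled".
func modeFromEnv(lookup Lookup, key string, allowed ...string) string {
	value, ok := lookup(key)
	if !ok {
		return "disabled"
	}
	for _, a := range allowed {
		if value == a {
			return value
		}
	}
	return "disabled" // unrecognized value: fall back to the safe default (assumption)
}

// WAFMode reads CERBERUS_SECURITY_WAF_MODE.
func WAFMode(lookup Lookup) string {
	return modeFromEnv(lookup, "CERBERUS_SECURITY_WAF_MODE", "disabled", "monitor", "block")
}

// CrowdSecMode reads CERBERUS_SECURITY_CROWDSEC_MODE.
func CrowdSecMode(lookup Lookup) string {
	return modeFromEnv(lookup, "CERBERUS_SECURITY_CROWDSEC_MODE", "disabled", "local", "external")
}

// ACLEnabled reads CERBERUS_SECURITY_ACL_ENABLED; only the literal "true" enables it.
func ACLEnabled(lookup Lookup) bool {
	value, _ := lookup("CERBERUS_SECURITY_ACL_ENABLED")
	return value == "true"
}

func main() {
	fmt.Println("waf_mode:", WAFMode(os.LookupEnv))
	fmt.Println("crowdsec_mode:", CrowdSecMode(os.LookupEnv))
	fmt.Println("acl_enabled:", ACLEnabled(os.LookupEnv))
}
```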
---
## WAF (Web Application Firewall)
### Current Implementation
**Status:** Prototype with placeholder detection
The current WAF checks for `<script>` tags as a proof-of-concept. Full OWASP CRS integration is planned.
```go
func (w *WAF) EvaluateRequest(r *http.Request) (Decision, error) {
	if strings.Contains(r.URL.Query().Get("q"), "<script>") {
		return Decision{Action: "block", Reason: "XSS detected"}, nil
	}
	return Decision{Action: "allow"}, nil
}
```
### Future: Coraza Integration
Planned integration with [Coraza WAF](https://coraza.io/) and OWASP Core Rule Set:
```go
waf, err := coraza.NewWAF(coraza.NewWAFConfig().
	WithDirectives(loadedRuleContent))
```
This will provide production-grade detection of:
- SQL injection
- Cross-site scripting (XSS)
- Remote code execution
- File inclusion attacks
- And more
### Rulesets
**SecurityRuleSet** table stores rule definitions:
```go
type SecurityRuleSet struct {
	ID        uint   `gorm:"primaryKey"`
	Name      string `json:"name"`
	SourceURL string `json:"source_url"` // Optional URL for rule updates
	Mode      string `json:"mode"`       // owasp, custom
	Content   string `json:"content"`    // Raw rule text
}
```
Manage via `/api/v1/security/rulesets`.
### Prometheus Metrics
```
charon_waf_requests_total{mode="block|monitor"} — Total requests evaluated
charon_waf_blocked_total{mode="block"} — Requests blocked
charon_waf_monitored_total{mode="monitor"} — Requests logged but not blocked
```
Scrape from `/metrics` endpoint (no auth required).
### Structured Logging
WAF decisions emit JSON-like structured logs:
```json
{
  "source": "waf",
  "decision": "block",
  "mode": "block",
  "path": "/api/v1/proxy-hosts",
  "query": "name=<script>alert(1)</script>",
  "ip": "203.0.113.50"
}
```
Use these for dashboard creation and alerting.
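
As one way to consume these logs, the sketch below parses log lines shaped like the example above and counts blocked requests per client IP. The `WAFLogEntry` type and `BlocksByIP` function are hypothetical, and it assumes each log line is a standalone JSON object with exactly the field names shown.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// WAFLogEntry mirrors the fields of the example log line; the type name is hypothetical.
type WAFLogEntry struct {
	Source   string `json:"source"`
	Decision string `json:"decision"`
	Mode     string `json:"mode"`
	Path     string `json:"path"`
	Query    string `json:"query"`
	IP       string `json:"ip"`
}

// BlocksByIP counts WAF entries with decision "block" per client IP.
// Lines that are not valid JSON are skipped; the second return value is how many.
func BlocksByIP(lines []string) (map[string]int, int) {
	counts := make(map[string]int)
	skipped := 0
	for _, line := range lines {
		var entry WAFLogEntry
		if err := json.Unmarshal([]byte(line), &entry); err != nil {
			skipped++
			continue
		}
		if entry.Source == "waf" && entry.Decision == "block" {
			counts[entry.IP]++
		}
	}
	return counts, skipped
}

func main() {
	lines := []string{
		`{"source":"waf","decision":"block","mode":"block","path":"/api/v1/proxy-hosts","query":"name=<script>alert(1)</script>","ip":"203.0.113.50"}`,
		`this line is not JSON`,
	}
	counts, skipped := BlocksByIP(lines)
	fmt.Println(counts, "skipped:", skipped)
}
```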
---
## Access Control Lists (ACLs)
### How They Work
Each `AccessList` defines:
- **Type:** `whitelist` | `blacklist` | `geo_whitelist` | `geo_blacklist` | `local_only`
- **IPs:** Comma-separated IPs or CIDR blocks
- **Countries:** Comma-separated ISO country codes (US, GB, FR, etc.)
**Evaluation logic:**
- **Whitelist:** If IP matches list → allow; else → deny
- **Blacklist:** If IP matches list → deny; else → allow
- **Geo Whitelist:** If country matches → allow; else → deny
- **Geo Blacklist:** If country matches → deny; else → allow
- **Local Only:** If RFC1918 private IP → allow; else → deny
Multiple ACLs can be assigned to a proxy host. The first denial wins.
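
The IP-based rules above can be sketched with the standard `net/netip` package. This is an illustration under stated assumptions rather than Charon's implementation: the `AccessList` struct here is a simplified stand-in, geo types are omitted because they need the GeoIP database, unknown types and unparseable client IPs are denied, and `netip.Addr.IsPrivate` also accepts IPv6 unique local addresses (RFC 4193) in addition to RFC 1918 ranges.

```go
package main

import (
	"fmt"
	"net/netip"
	"strings"
)

// AccessList is a simplified, hypothetical stand-in for the real model.
type AccessList struct {
	Type string // "whitelist", "blacklist", or "local_only" in this sketch
	IPs  string // comma-separated IPs or CIDR blocks
}

// matches reports whether ip equals, or falls inside, any entry of csv.
func matches(ip netip.Addr, csv string) bool {
	for _, entry := range strings.Split(csv, ",") {
		entry = strings.TrimSpace(entry)
		if prefix, err := netip.ParsePrefix(entry); err == nil {
			if prefix.Contains(ip) {
				return true
			}
			continue
		}
		if addr, err := netip.ParseAddr(entry); err == nil && addr.Unmap() == ip {
			return true
		}
	}
	return false
}

// Allows applies the evaluation rule for a single list.
func (l AccessList) Allows(ip netip.Addr) bool {
	switch l.Type {
	case "whitelist":
		return matches(ip, l.IPs)
	case "blacklist":
		return !matches(ip, l.IPs)
	case "local_only":
		return ip.IsPrivate()
	default:
		return false // geo and unknown types are not handled in this sketch
	}
}

// AllowedByAll evaluates every list assigned to a host; the first denial wins.
func AllowedByAll(lists []AccessList, clientIP string) bool {
	ip, err := netip.ParseAddr(clientIP)
	if err != nil {
		return false // unparseable client address is denied (assumption)
	}
	ip = ip.Unmap()
	for _, l := range lists {
		if !l.Allows(ip) {
			return false
		}
	}
	return true
}

func main() {
	lists := []AccessList{
		{Type: "whitelist", IPs: "10.0.0.0/8, 203.0.113.5"},
		{Type: "blacklist", IPs: "10.9.9.9"},
	}
	for _, ip := range []string{"10.1.2.3", "10.9.9.9", "198.51.100.1"} {
		fmt.Println(ip, AllowedByAll(lists, ip))
	}
}
```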
### GeoIP Database
Uses MaxMind GeoLite2-Country database:
- Path configured via `CHARON_GEOIP_DB_PATH`
- Default: `/app/data/GeoLite2-Country.mmdb` (Docker)
- Update monthly from MaxMind for accuracy
---
## CrowdSec Integration
### Current Status
**Placeholder.** Configuration models exist but bouncer integration is not yet implemented.
### Planned Implementation
**Local mode:**
- Run CrowdSec agent inside Charon container
- Parse logs from Caddy
- Make decisions locally
**External mode:**
- Connect to existing CrowdSec bouncer via API
- Query IP reputation before allowing requests
---
## Security Decisions
The `SecurityDecision` table logs all security actions:
```go
type SecurityDecision struct {
	ID        uint      `gorm:"primaryKey"`
	Source    string    `json:"source"` // waf, crowdsec, acl, ratelimit, manual
	IPAddress string    `json:"ip_address"`
	Action    string    `json:"action"` // allow, block, challenge
	Reason    string    `json:"reason"`
	Timestamp time.Time `json:"timestamp"`
}
```
**Use cases:**
- Audit trail for compliance
- UI visibility into recent blocks
- Manual override tracking
---
## Self-Lockout Prevention
### Admin Whitelist
**Purpose:** Prevent admins from blocking themselves
**Implementation:**
- Stored in `SecurityConfig.admin_whitelist` as CSV
- Checked before applying any block decision
- If requesting IP matches whitelist → always allow
**Recommendation:** Add your VPN IP, Tailscale IP, or home network before enabling Cerberus.
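
A minimal sketch of the whitelist check, assuming the CSV format described above: entries are parsed once into prefixes, and a block decision is dropped when the client falls inside any of them. `ParseWhitelist` and `EnforceBlock` are hypothetical names, and keeping the block for an unparseable client address is an assumption.

```go
package main

import (
	"fmt"
	"net/netip"
	"strings"
)

// ParseWhitelist converts a CSV of IPs/CIDRs into prefixes.
// A bare IP becomes a single-address prefix (/32 for IPv4, /128 for IPv6).
func ParseWhitelist(csv string) ([]netip.Prefix, error) {
	var prefixes []netip.Prefix
	for _, entry := range strings.Split(csv, ",") {
		entry = strings.TrimSpace(entry)
		if entry == "" {
			continue
		}
		if strings.Contains(entry, "/") {
			p, err := netip.ParsePrefix(entry)
			if err != nil {
				return nil, fmt.Errorf("invalid CIDR %q: %w", entry, err)
			}
			prefixes = append(prefixes, p.Masked())
			continue
		}
		a, err := netip.ParseAddr(entry)
		if err != nil {
			return nil, fmt.Errorf("invalid IP %q: %w", entry, err)
		}
		prefixes = append(prefixes, netip.PrefixFrom(a, a.BitLen()))
	}
	return prefixes, nil
}

// EnforceBlock reports whether a block decision should still apply to clientIP
// after the admin whitelist is consulted. Whitelisted clients are never blocked.
func EnforceBlock(blocked bool, clientIP string, whitelist []netip.Prefix) bool {
	if !blocked {
		return false
	}
	ip, err := netip.ParseAddr(clientIP)
	if err != nil {
		return true // unparseable client address: keep the block (assumption)
	}
	ip = ip.Unmap()
	for _, p := range whitelist {
		if p.Contains(ip) {
			return false
		}
	}
	return true
}

func main() {
	whitelist, err := ParseWhitelist("198.51.100.10, 203.0.113.0/24")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(EnforceBlock(true, "203.0.113.77", whitelist))
	fmt.Println(EnforceBlock(true, "192.0.2.1", whitelist))
}
```

Rejecting a malformed entry at parse time, rather than silently skipping it, makes it harder to enable Cerberus with a whitelist that does not cover the address the admin intended.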
### Break-Glass Token
**Purpose:** Emergency disable when locked out
**How it works:**
1. Generate via `POST /api/v1/security/breakglass/generate`
2. Returns a one-time token in plaintext (shown only once; only its hash is stored)
3. Token can be used in `POST /api/v1/security/disable` to turn off Cerberus
4. Token expires after first use
**Storage:** Tokens are hashed in database using bcrypt.
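
The token lifecycle can be sketched as follows. To stay within the standard library, this sketch hashes with SHA-256 as a stand-in for the bcrypt hashing described above, and it keeps the hash in memory rather than in the database; `TokenStore`, `Generate`, and `Redeem` are hypothetical names.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"crypto/subtle"
	"encoding/hex"
	"fmt"
)

// TokenStore is a hypothetical in-memory stand-in for the database row that
// holds the token hash. It is not safe for concurrent use.
type TokenStore struct {
	hash []byte // nil when no token is outstanding
}

// Generate creates a random token, stores only its hash, and returns the
// plaintext so it can be shown to the admin once.
func (s *TokenStore) Generate() (string, error) {
	raw := make([]byte, 32)
	if _, err := rand.Read(raw); err != nil {
		return "", fmt.Errorf("generate token: %w", err)
	}
	token := hex.EncodeToString(raw)
	sum := sha256.Sum256([]byte(token)) // stand-in for bcrypt in this sketch
	s.hash = sum[:]
	return token, nil
}

// Redeem reports whether token matches the stored hash. A successful match
// invalidates the token, so it cannot be used a second time.
func (s *TokenStore) Redeem(token string) bool {
	if s.hash == nil {
		return false
	}
	sum := sha256.Sum256([]byte(token))
	if subtle.ConstantTimeCompare(sum[:], s.hash) != 1 {
		return false
	}
	s.hash = nil
	return true
}

func main() {
	store := &TokenStore{}
	token, err := store.Generate()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("first use accepted:", store.Redeem(token))
	fmt.Println("second use accepted:", store.Redeem(token))
}
```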
### Localhost Bypass
Requests from `127.0.0.1` or `::1` may bypass security checks (configurable). Allows local management access even when locked out.
---
## API Reference
### Status
```http
GET /api/v1/security/status
```
Returns:
```json
{
  "enabled": true,
  "waf_mode": "monitor",
  "crowdsec_mode": "local",
  "acl_enabled": true,
  "ratelimit_enabled": false
}
```
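
A client can decode this response into a small struct. The sketch below assumes the body has exactly the fields shown above; the `Status` type and the `ActiveComponents` helper are hypothetical and only illustrate how the flags might be interpreted.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Status mirrors the JSON body shown above; the Go type name is hypothetical.
type Status struct {
	Enabled          bool   `json:"enabled"`
	WAFMode          string `json:"waf_mode"`
	CrowdsecMode     string `json:"crowdsec_mode"`
	ACLEnabled       bool   `json:"acl_enabled"`
	RateLimitEnabled bool   `json:"ratelimit_enabled"`
}

// DecodeStatus parses the response body of GET /api/v1/security/status.
func DecodeStatus(body []byte) (Status, error) {
	var s Status
	if err := json.Unmarshal(body, &s); err != nil {
		return Status{}, fmt.Errorf("decode security status: %w", err)
	}
	return s, nil
}

// ActiveComponents lists the components the status reports as switched on.
// It returns nothing when Cerberus itself is disabled.
func ActiveComponents(s Status) []string {
	var active []string
	if !s.Enabled {
		return active
	}
	if s.WAFMode != "" && s.WAFMode != "disabled" {
		active = append(active, "waf")
	}
	if s.CrowdsecMode != "" && s.CrowdsecMode != "disabled" {
		active = append(active, "crowdsec")
	}
	if s.ACLEnabled {
		active = append(active, "acl")
	}
	if s.RateLimitEnabled {
		active = append(active, "ratelimit")
	}
	return active
}

func main() {
	body := []byte(`{"enabled":true,"waf_mode":"monitor","crowdsec_mode":"local","acl_enabled":true,"ratelimit_enabled":false}`)
	status, err := DecodeStatus(body)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(ActiveComponents(status))
}
```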
### Enable Cerberus
```http
POST /api/v1/security/enable
Content-Type: application/json

{
  "admin_whitelist": "198.51.100.10,203.0.113.0/24"
}
```
Requires either:
- `admin_whitelist` with at least one IP/CIDR
- OR valid break-glass token in header
### Disable Cerberus
```http
POST /api/v1/security/disable
```
Requires either:
- Request from localhost
- OR valid break-glass token in header
### Get/Update Config
```http
GET /api/v1/security/config
POST /api/v1/security/config
```
See SecurityConfig schema above.
### Rulesets
```http
GET /api/v1/security/rulesets
POST /api/v1/security/rulesets
DELETE /api/v1/security/rulesets/:id
```
### Decisions (Audit Log)
```http
GET /api/v1/security/decisions?limit=50
POST /api/v1/security/decisions # Manual override
```
---
## Testing
### Integration Test
Run the Coraza integration test:
```bash
bash scripts/coraza_integration.sh
```
Or via Go:
```bash
cd backend
go test -tags=integration ./integration -run TestCorazaIntegration -v
```
### Manual Testing
1. Enable WAF in `monitor` mode
2. Send request with `<script>` in query string
3. Check `/api/v1/security/decisions` for logged attempt
4. Switch to `block` mode
5. Repeat — should receive 403
---
## Observability
### Recommended Dashboards
**Block Rate:**
```promql
rate(charon_waf_blocked_total[5m]) / rate(charon_waf_requests_total[5m])
```
**Monitor vs Block Comparison:**
```promql
rate(charon_waf_monitored_total[5m])
rate(charon_waf_blocked_total[5m])
```
### Alerting Rules
**High block rate (potential attack):**
```yaml
alert: HighWAFBlockRate
expr: sum(rate(charon_waf_blocked_total[5m])) / sum(rate(charon_waf_requests_total[5m])) > 0.3
for: 10m
annotations:
  summary: "WAF blocking >30% of requests"
**No WAF evaluation (misconfiguration):**
```yaml
alert: WAFNotEvaluating
expr: rate(charon_waf_requests_total[10m]) == 0
for: 15m
annotations:
  summary: "WAF received zero requests, check middleware config"
---
## Development Roadmap
| Phase | Feature | Status |
|-------|---------|--------|
| 1 | WAF placeholder + metrics | ✅ Complete |
| 2 | ACL implementation | ✅ Complete |
| 3 | Break-glass token | ✅ Complete |
| 4 | Coraza CRS integration | 📋 Planned |
| 5 | CrowdSec local agent | 📋 Planned |
| 6 | Rate limiting enforcement | 📋 Planned |
| 7 | Adaptive learning/tuning | 🔮 Future |
---
## FAQ
### Why is the WAF just a placeholder?
We wanted to ship the architecture and observability first. This lets you enable monitoring, see the metrics, and prepare dashboards before the full rule engine is integrated.
### Can I use my own WAF rules?
Yes, via `/api/v1/security/rulesets`. Upload custom Coraza-compatible rules.
### Does Cerberus protect Caddy's proxy traffic?
Not yet. Currently it only protects the management API (`/api/v1`). Future versions will integrate directly with Caddy's request pipeline to protect proxied traffic.
### Why is monitor mode still blocking?
Known issue with the placeholder implementation. This will be fixed when Coraza integration is complete.
---
## See Also
- [Security Features (User Guide)](security.md)
- [API Documentation](api.md)
- [Features Overview](features.md)