
---
title: Security Incident Response Plan
description: Industry-standard incident response procedures for Charon deployments, including detection, containment, recovery, and post-incident review.
---

# Security Incident Response Plan (SIRP)

This document provides a structured approach to handling security incidents in Charon deployments. Following these procedures ensures consistent, effective responses that minimize damage and recovery time.


## Incident Classification

### Severity Levels

| Level | Name | Description | Response Time | Examples |
|-------|------|-------------|---------------|----------|
| P1 | Critical | Active exploitation, data breach, or complete service compromise | Immediate (< 15 min) | Confirmed data exfiltration, ransomware, root access compromise |
| P2 | High | Attempted exploitation, security control bypass, or significant vulnerability | < 1 hour | WAF bypass detected, brute-force attack in progress, credential stuffing |
| P3 | Medium | Suspicious activity, minor vulnerability, or policy violation | < 4 hours | Unusual traffic patterns, failed authentication spike, misconfiguration |
| P4 | Low | Informational security events, minor policy deviations | < 24 hours | Routine blocked requests, scanner traffic, expired certificates |
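The mapping above can be encoded in tooling so every alert carries its response deadline automatically. A minimal sketch, assuming the deadlines from the table; the function name is illustrative and not part of Charon:

```bash
#!/bin/sh
# Hypothetical helper: map a severity level to the maximum response time
# from the severity table. Adjust the deadlines to match your own policy.
response_deadline() {
  case "$1" in
    P1) echo "Immediate (< 15 min)" ;;
    P2) echo "< 1 hour" ;;
    P3) echo "< 4 hours" ;;
    P4) echo "< 24 hours" ;;
    *)  echo "unknown severity: $1" >&2; return 1 ;;
  esac
}

response_deadline P2   # prints "< 1 hour"
```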

### Classification Criteria

**Escalate to P1 immediately if:**

- Confirmed unauthorized access to sensitive data
- Active malware or ransomware detected
- Complete loss of security controls
- Evidence of data exfiltration
- Critical infrastructure compromise

**Escalate to P2 if:**

- ⚠️ Multiple failed bypass attempts from the same source
- ⚠️ Vulnerability actively being probed
- ⚠️ Partial security control failure
- ⚠️ Credential compromise suspected

## Detection Methods

### Automated Detection

**Cerberus Security Dashboard:**

  1. Navigate to Cerberus → Dashboard
  2. Monitor the Live Activity section for real-time events
  3. Review Security → Decisions for blocked requests
  4. Check alert notifications (Discord, Slack, email)

**Key Indicators to Monitor:**

- Sudden spike in blocked requests
- Multiple blocks from the same IP or network
- WAF rules triggering on unusual patterns
- CrowdSec decisions for known threat actors
- Rate-limiting thresholds exceeded

**Log Analysis:**

```bash
# View recent security events (docker logs writes to both stdout and stderr)
docker logs charon 2>&1 | grep -E "(BLOCK|DENY|ERROR)" | tail -100

# Check CrowdSec decisions
docker exec charon cscli decisions list

# Review WAF activity
docker exec charon tail -50 /var/log/coraza-waf.log
```
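To turn raw log lines into the "multiple blocks from the same IP" signal listed above, a quick aggregation helps. A sketch using a made-up log format (source IP in the first field, action in the third); adapt the field positions to Charon's actual log layout:

```bash
# Build a small sample log (illustrative format, not Charon's real one)
cat > /tmp/sample.log <<'EOF'
203.0.113.7 GET BLOCK /admin
203.0.113.7 GET BLOCK /wp-login.php
198.51.100.2 GET ALLOW /
203.0.113.7 POST BLOCK /api
EOF

# Count blocked requests per source IP, busiest source first
awk '$3 == "BLOCK" { count[$1]++ } END { for (ip in count) print count[ip], ip }' \
  /tmp/sample.log | sort -rn
```

With this sample the top line is `3 203.0.113.7`, flagging that address for review.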

### Manual Detection

**Regular Security Reviews:**

- Weekly review of the Cerberus Dashboard
- Monthly review of access patterns
- Quarterly penetration testing
- Annual security audit

## Containment Procedures

### Immediate Actions (All Severity Levels)

1. **Document the incident start time**
2. **Preserve evidence** — do NOT restart containers until logs are captured
3. **Assess scope** — determine affected systems and data
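The first two actions can be scripted so responders don't improvise under pressure. A sketch; the directory layout and naming scheme are assumptions, not a Charon convention:

```bash
# Record the incident start time and create an evidence directory
# before touching any containers (illustrative paths).
INCIDENT_ID="INCIDENT-$(date -u +%Y%m%d-%H%M%S)"
EVIDENCE_DIR="/tmp/${INCIDENT_ID}"

mkdir -p "$EVIDENCE_DIR"
date -u +"%Y-%m-%d %H:%M:%S UTC" > "$EVIDENCE_DIR/start-time.txt"
echo "Evidence directory: $EVIDENCE_DIR"
```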

### P1/P2 Containment

**Step 1: Isolate the Threat**

```bash
# Block the attacking IP immediately
docker exec charon cscli decisions add --ip <ATTACKER_IP> --duration 720h --reason "Incident response"

# If compromise is confirmed, stop the container
docker stop charon

# Preserve the container state for forensics
docker commit charon charon-incident-$(date +%Y%m%d%H%M%S)
```

**Step 2: Preserve Evidence**

```bash
# Export all logs
docker logs charon > /tmp/incident-logs-$(date +%Y%m%d%H%M%S).txt 2>&1

# Export CrowdSec decisions
docker exec charon cscli decisions list -o json > /tmp/crowdsec-decisions.json

# Copy the data directory
cp -r ./charon-data /tmp/incident-backup-$(date +%Y%m%d%H%M%S)
```
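Optionally, going beyond the procedure above, hash each collected artifact so its integrity can be demonstrated later if the evidence is ever questioned:

```bash
# Checksum every artifact in the evidence directory
# (path and sample artifact are illustrative).
EVIDENCE_DIR=/tmp/incident-evidence
mkdir -p "$EVIDENCE_DIR"
echo "example artifact" > "$EVIDENCE_DIR/logs.txt"

# Write a manifest of SHA-256 hashes, excluding the manifest itself...
( cd "$EVIDENCE_DIR" && find . -type f ! -name SHA256SUMS -exec sha256sum {} + > SHA256SUMS )

# ...and verify it; any later tampering makes this check fail
( cd "$EVIDENCE_DIR" && sha256sum -c SHA256SUMS )
```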

**Step 3: Notify Stakeholders**

- System administrators
- Security team (if applicable)
- Management (P1 only)
- Legal/compliance (if a data breach is involved)

### P3/P4 Containment

  1. Block offending IPs via Cerberus Dashboard
  2. Review and update access lists if needed
  3. Document the event in incident log
  4. Continue monitoring

## Recovery Steps

### Pre-Recovery Checklist

- [ ] Incident fully contained
- [ ] Evidence preserved
- [ ] Root cause identified (or investigation ongoing)
- [ ] Clean backups available

### Recovery Procedure

**Step 1: Verify Backup Integrity**

```bash
# List available backups
ls -la ./charon-data/backups/

# Verify the backup can be read
docker run --rm -v ./charon-data/backups:/backups debian:bookworm-slim ls -la /backups
```

**Step 2: Restore from Clean State**

```bash
# Stop the compromised instance
docker stop charon

# Set aside the compromised data
mv ./charon-data ./charon-data-compromised-$(date +%Y%m%d)

# Restore from backup
cp -r ./charon-data-backup-YYYYMMDD ./charon-data

# Start a fresh instance
docker-compose up -d
```
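Before starting the fresh instance, it is worth confirming the restore actually matches the backup it came from. A sketch with throwaway paths standing in for the backup and data directories:

```bash
# Stand-in directories for the backup and the restored copy (illustrative)
mkdir -p /tmp/backup-src /tmp/restored
echo "config" > /tmp/backup-src/app.conf
cp -r /tmp/backup-src/. /tmp/restored/

# diff -r exits non-zero on any mismatch, including missing files
if diff -r /tmp/backup-src /tmp/restored >/dev/null; then
  echo "restore verified: directories match"
else
  echo "restore mismatch: investigate before starting" >&2
fi
```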

**Step 3: Apply Security Hardening**

  1. Review and update all access lists
  2. Rotate any potentially compromised credentials
  3. Update Charon to latest version
  4. Enable additional security features if not already active
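For step 2, rotated secrets should be generated, not invented. A sketch using `/dev/urandom`; where the new secret is stored depends on your deployment and is not shown here:

```bash
# 32 random bytes, base64-encoded, yields a 44-character secret
NEW_SECRET="$(head -c 32 /dev/urandom | base64)"
echo "generated secret of length ${#NEW_SECRET}"
```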

**Step 4: Verify Recovery**

```bash
# Check that Charon is running
docker ps | grep charon

# Verify LAPI status
docker exec charon cscli lapi status

# Test proxy functionality
curl -I https://your-proxied-domain.com
```
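A freshly started container can fail the checks above simply because it is still booting. A retry wrapper avoids false alarms; the probe command is passed in, so the curl check above can be substituted (the function name is illustrative):

```bash
# Retry a probe command until it succeeds or the attempts run out
wait_healthy() {
  probe="$1"
  tries="${2:-10}"
  i=0
  while [ "$i" -lt "$tries" ]; do
    if sh -c "$probe" >/dev/null 2>&1; then
      echo "healthy after $((i + 1)) attempt(s)"
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "still unhealthy after $tries attempts" >&2
  return 1
}

# Example probe; substitute e.g. 'curl -fsI https://your-proxied-domain.com'
wait_healthy "true" 3
```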

### Communication During Recovery

- Update stakeholders every 30 minutes (P1) or hourly (P2)
- Document all recovery actions taken
- Prepare user communication if service was affected

## Post-Incident Review

### Review Meeting Agenda

Schedule within 48 hours of incident resolution.

**Attendees:** All involved responders, system owners, management (P1/P2)

**Agenda:**

  1. Incident timeline reconstruction
  2. What worked well?
  3. What could be improved?
  4. Action items and owners
  5. Documentation updates needed

### Post-Incident Checklist

- [ ] Incident fully documented
- [ ] Timeline created with all actions taken
- [ ] Root cause analysis completed
- [ ] Lessons learned documented
- [ ] Security controls reviewed and updated
- [ ] Monitoring/alerting improved
- [ ] Team training needs identified
- [ ] Documentation updated

### Incident Report Template

```markdown
## Incident Report: [INCIDENT-YYYY-MM-DD-###]

**Severity:** P1/P2/P3/P4
**Status:** Resolved / Under Investigation
**Duration:** [Start Time] to [End Time]

### Summary
[Brief description of what happened]

### Timeline
- [HH:MM] - Event detected
- [HH:MM] - Containment initiated
- [HH:MM] - Root cause identified
- [HH:MM] - Recovery completed

### Impact
- Systems affected: [List]
- Data affected: [Yes/No, details]
- Users affected: [Count/scope]
- Service downtime: [Duration]

### Root Cause
[Technical explanation of what caused the incident]

### Actions Taken
1. [Action 1]
2. [Action 2]
3. [Action 3]

### Lessons Learned
- [Learning 1]
- [Learning 2]

### Follow-up Actions
| Action | Owner | Due Date | Status |
|--------|-------|----------|--------|
| [Action] | [Name] | [Date] | Open |
```
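Responders can stamp out a report skeleton from this template rather than copying it by hand. A sketch; the output path and the `-001` sequence number are assumptions:

```bash
# Create a pre-filled report file with a date-based incident ID
REPORT_ID="INCIDENT-$(date -u +%Y-%m-%d)-001"
REPORT_FILE="/tmp/${REPORT_ID}.md"

cat > "$REPORT_FILE" <<EOF
## Incident Report: [$REPORT_ID]

**Severity:** P?
**Status:** Under Investigation
**Duration:** [Start Time] to [End Time]
EOF

echo "created $REPORT_FILE"
```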

## Communication Templates

### Internal Notification (P1/P2)

```text
SECURITY INCIDENT ALERT

Severity: [P1/P2]
Time Detected: [YYYY-MM-DD HH:MM UTC]
Status: [Active / Contained / Resolved]

Summary:
[Brief description]

Current Actions:
- [Action being taken]

Next Update: [Time]

Contact: [Incident Commander Name/Channel]
```

### User Communication (If Service Affected)

```text
Service Notification

We are currently experiencing [brief issue description].

Status: [Investigating / Identified / Monitoring / Resolved]
Started: [Time]
Expected Resolution: [Time or "Under investigation"]

We apologize for any inconvenience and will provide updates as available.

Last Updated: [Time]
```

### Post-Incident Summary (External)

```text
Security Incident Summary

On [Date], we identified and responded to a security incident affecting [scope].

What happened:
[Non-technical summary]

What we did:
[Response actions taken]

What we're doing to prevent this:
[Improvements being made]

Was my data affected?
[Clear statement about data impact]

Questions?
[Contact information]
```

## Emergency Contacts

Maintain an up-to-date contact list:

| Role | Contact Method | Escalation Time |
|------|----------------|-----------------|
| Primary On-Call | [Phone/Pager] | Immediate |
| Security Team | [Email/Slack] | < 15 min (P1/P2) |
| System Administrator | [Phone] | < 1 hour |
| Management | [Phone] | P1 only |

## Quick Reference Card

### P1 Critical — Immediate Response

1. ⏱️ Start timer, document everything
2. 🔒 Isolate: `docker stop charon`
3. 📋 Preserve: `docker logs charon > incident.log`
4. 📞 Notify: security team, management
5. 🔍 Investigate: determine scope and root cause
6. 🔧 Recover: restore from a clean backup
7. 📝 Review: post-incident meeting within 48h

### P2 High — Urgent Response

1. 🔒 Block the attacker: `cscli decisions add --ip <IP>`
2. 📋 Capture logs before they rotate
3. 📞 Notify: security team
4. 🔍 Investigate root cause
5. 🔧 Apply fixes
6. 📝 Document and review

### Key Commands

```bash
# Block an IP immediately
docker exec charon cscli decisions add --ip <IP> --duration 720h

# List all active blocks
docker exec charon cscli decisions list

# Export logs
docker logs charon > incident-$(date +%s).log 2>&1

# Check security status
docker exec charon cscli lapi status
```

## Document Maintenance

| Version | Date | Author | Changes |
|---------|------|--------|---------|
| 1.0 | 2025-12-21 | Security Team | Initial SIRP creation |

**Review Schedule:** Quarterly or after any P1/P2 incident

**Owner:** Security Team

**Last Reviewed:** 2025-12-21