# Alpine Base Image Migration Specification

**Version:** 1.0
**Created:** February 4, 2026
**Status:** Planning Phase
**Estimated Effort:** 40-60 hours (2-3 sprints)
**Priority:** High (Security Optimization)

---

## Table of Contents

1. [Executive Summary](#executive-summary)
2. [Research Phase](#research-phase)
3. [Compatibility Analysis](#compatibility-analysis)
4. [Dockerfile Changes](#dockerfile-changes)
5. [Testing Requirements](#testing-requirements)
6. [Rollback Plan](#rollback-plan)
7. [Implementation Phases](#implementation-phases)
8. [Risk Assessment](#risk-assessment)
9. [Success Metrics](#success-metrics)
10. [Post-Migration Monitoring](#post-migration-monitoring)

---

## Executive Summary

### Context

**Current State:**
- Base Image: `debian:trixie-slim` (Debian 13)
- Security Issues: 7 HIGH CVEs in glibc/libtasn1 (no fixes available)
- Image Size: ~350MB final image
- Attack Surface: glibc, apt ecosystem

**Historical Context:**
- Previously migrated from Alpine → Debian due to **CVE-2025-60876** (busybox heap overflow, rated CRITICAL at the time)
- CVE-2025-60876 status as of Feb 2026: Likely patched (requires verification)
- Debian CVE situation worsening: 7 HIGH CVEs with "no fix available"

**Migration Driver:**
- Reduce attack surface (musl libc vs glibc)
- Smaller base image (~7MB Alpine vs ~120MB Debian base)
- Faster security updates from Alpine Security Team
- User roadmap request (identified as priority)

### Goals

- ✅ Eliminate Debian glibc HIGH CVEs
- ✅ Reduce Docker image size by 30-40%
- ✅ Maintain 100% feature parity
- ✅ Achieve <5% performance variance
- ✅ Pass all E2E and integration tests

### Non-Goals

- ❌ Rewrite Go code for Alpine-specific optimizations
- ❌ Change application architecture
- ❌ Migrate to Distroless (considered but rejected for complexity)

---

## Research Phase

### 1.1 Alpine Security Posture Analysis

#### Historical Critical CVE: CVE-2025-60876

**Original Issue (Debian Migration Trigger):**
- **CVE ID:** CVE-2025-60876
- **Severity:** MEDIUM (originally reported as CRITICAL)
- **Affected:** busybox 1.37.0-r20, busybox-binsh 1.37.0-r20, ssl_client 1.37.0-r20
- **Type:** Heap buffer overflow (CWE-122)
- **Date Discovered:** January 2026

**Current Status (February 2026):**
- ✅ **LIKELY PATCHED** - Alpine Security typically patches within 2-4 weeks for CRITICAL/HIGH
- ⚠️ **VERIFICATION REQUIRED** - Must confirm patch before migration
- 📊 **Verification Method:** Check Alpine Security Advisory page + scan Alpine 3.23.x with Grype
- 🔗 **Source:** https://security.alpinelinux.org/vuln/busybox

**Verification Command:**
```bash
# Check the busybox version shipped in the current Alpine 3.23 image
docker run --rm alpine:3.23 /bin/sh -c "apk info busybox"
# Scan without --only-fixed so that still-unpatched CVEs are reported too;
# --fail-on takes a single severity and fails on that level or higher
grype alpine:3.23 --fail-on high
```

**Expected Result:** Zero HIGH/CRITICAL CVEs in busybox packages

#### Current Alpine 3.23 Security State

**Latest Version:** alpine:3.23.3 (as of Feb 2026)

**Known Vulnerabilities (as of January 2026 scan):**
- **Busybox CVE-2025-60876:** MEDIUM (heap overflow) - Status: PENDING VERIFICATION
- **Curl CVE-2025-15079:** MEDIUM (HTTP/2 DoS) - Status: PENDING VERIFICATION
- **Curl CVE-2025-14819:** MEDIUM (TLS validation) - Status: PENDING VERIFICATION

**Alpine vs Debian CVE Comparison:**

| Metric | Alpine 3.23 (Jan 2026) | Debian Trixie (Feb 2026) |
|--------|------------------------|--------------------------|
| CRITICAL CVEs | 0 | 0 |
| HIGH CVEs | 0 (unverified) | **7** (glibc, libtasn1) |
| MEDIUM CVEs | 8 (busybox, curl) | 20 |
| Patch Availability | Pending verification | ❌ No fixes available |
| C Library | musl (not affected by glibc CVEs) | glibc (7 HIGH CVEs) |
| Package Manager | apk (smaller, simpler) | apt (complex, larger) |
| Base Image Size | ~7MB | ~120MB |

**Recommendation:** Alpine 3.23.3+ is expected to have a significantly better security posture than Debian Trixie, pending the verification described above

#### Alpine Version Selection

**Candidates:**

1. **alpine:3.23.3** (Recommended - Stable)
   - ✅ Latest stable Alpine release
   - ✅ Long-term support through 2026-11
   - ✅ Mature ecosystem, well-tested
   - ✅ Renovate can track minor updates (3.23.x)
   - ⚠️ Must verify busybox CVE is patched

2. **alpine:edge** (Not Recommended - Rolling)
   - ⚠️ Rolling release, unstable
   - ⚠️ Breaking changes without warning
   - ⚠️ Not suitable for production
   - ❌ Rejected for reliability concerns

3. **alpine:3.22** (Not Recommended - Older)
   - ❌ Older packages, higher CVE risk
   - ❌ End-of-life approaching (Nov 2026)
   - ❌ Rejected for security reasons

**Decision:** Use **`alpine:3.23@sha256:...`** with Renovate tracking

#### musl vs glibc Compatibility

**Charon Application Profile:**
- **Language:** Go 1.25.7 (built with CGO_ENABLED=1 for SQLite, so the backend links against libc unless static link flags are used)
- **C Dependencies:** SQLite (libsqlite3-dev)
- **Go Stdlib Features:** Standard library calls only (net, crypto, http)

**musl Compatibility Assessment:**

| Component | Debian (glibc) | Alpine (musl) | Compatibility Risk |
|-----------|---------------|--------------|-------------------|
| Go Runtime | ✅ glibc-friendly | ✅ musl-friendly | 🟢 **LOW** - Go abstracts libc |
| SQLite (CGO) | ✅ Built with glibc | ✅ Built with musl | 🟢 **LOW** - API compatible |
| Caddy Server | ✅ Built with glibc | ✅ Built with musl | 🟢 **LOW** - Go binary, static |
| CrowdSec | ✅ Built with glibc | ✅ Built with musl | 🟢 **LOW** - Go binary, static |
| gosu | ✅ Built from source | ✅ Built from source | 🟢 **LOW** - Go binary |
| DNS Resolution | ✅ glibc NSS | ⚠️ musl resolver | 🟡 **MEDIUM** - See below |

**DNS Resolution Differences:**

**glibc (Debian):**
- Uses Name Service Switch (NSS) from `/etc/nsswitch.conf`
- Supports complex resolution order (DNS, mDNS, LDAP, etc.)
- Go's `net` package prefers its pure Go resolver and falls back to the cgo (glibc) resolver only when `/etc/nsswitch.conf` or `/etc/resolv.conf` requests features the Go resolver does not implement

**musl (Alpine):**
- Simple resolver, reads `/etc/resolv.conf` directly
- No NSS support (no `/etc/nsswitch.conf`)
- Faster, simpler, but less flexible

**Impact on Charon:**
- 🟢 **Minimal** - Charon only does standard DNS queries (A/AAAA records)
- 🟢 **Go DNS Fallback** - Set `GODEBUG=netdns=go` to force the pure Go resolver (no cgo), independent of the libc in the image
- ⚠️ **Test Required** - DNS provider integrations (Cloudflare, Route53, etc.) must be re-tested

**Mitigation:**
```dockerfile
# Force Go to use pure Go DNS resolver (no cgo)
ENV GODEBUG=netdns=go
```

**Reference:**
- Go DNS Resolver: https://pkg.go.dev/net#hdr-Name_Resolution
- musl DNS Limitations: https://wiki.musl-libc.org/functional-differences-from-glibc.html

### 1.2 Package Ecosystem Research

**Research Tool:**
```bash
# Analyze Debian packages currently used
docker run --rm debian:trixie-slim dpkg -l | grep ^ii

# Search Alpine equivalents (--no-cache fetches the package index,
# which the base image does not ship)
docker run --rm alpine:3.23 apk search --no-cache <package>
```

---

## Compatibility Analysis

### 2.1 Package Mapping: Debian apt → Alpine apk

#### Build Stage Packages (gosu-builder)

| Debian Package | Alpine Equivalent | Status | Notes |
|----------------|------------------|--------|-------|
| `git` | `git` | ✅ Direct match | Same package name |
| `clang` | `clang` | ✅ Direct match | LLVM toolchain |
| `lld` | `lld` | ✅ Direct match | LLVM linker |
| `gcc` | `gcc` | ✅ Direct match | GNU Compiler |
| `libc6-dev` | `musl-dev` | ⚠️ Different | musl development headers |

**Build Script Changes:**
```diff
- RUN apt-get update && apt-get install -y --no-install-recommends \
-     git clang lld && \
-     rm -rf /var/lib/apt/lists/*
- RUN xx-apt install -y gcc libc6-dev
+ RUN apk add --no-cache git clang lld
+ RUN xx-apk add gcc musl-dev
```

#### Build Stage Packages (backend-builder)

| Debian Package | Alpine Equivalent | Status | Notes |
|----------------|------------------|--------|-------|
| `clang` | `clang` | ✅ Direct match | |
| `lld` | `lld` | ✅ Direct match | |
| `gcc` | `gcc` | ✅ Direct match | |
| `libc6-dev` | `musl-dev` | ⚠️ Different | musl headers |
| `libsqlite3-dev` | `sqlite-dev` | ⚠️ Different | SQLite development |

**Build Script Changes:**
```diff
- RUN apt-get update && apt-get install -y --no-install-recommends \
-     clang lld && \
-     rm -rf /var/lib/apt/lists/*
- RUN xx-apt install -y gcc libc6-dev libsqlite3-dev
+ RUN apk add --no-cache clang lld
+ RUN xx-apk add gcc musl-dev sqlite-dev
```

#### Build Stage Packages (caddy-builder)

| Debian Package | Alpine Equivalent | Status | Notes |
|----------------|------------------|--------|-------|
| `git` | `git` | ✅ Direct match | xcaddy requires git |

**Build Script Changes:**
```diff
- RUN apt-get update && apt-get install -y --no-install-recommends git && \
-     rm -rf /var/lib/apt/lists/*
+ RUN apk add --no-cache git
```

#### Build Stage Packages (crowdsec-builder)

| Debian Package | Alpine Equivalent | Status | Notes |
|----------------|------------------|--------|-------|
| `git` | `git` | ✅ Direct match | |
| `clang` | `clang` | ✅ Direct match | |
| `lld` | `lld` | ✅ Direct match | |
| `gcc` | `gcc` | ✅ Direct match | |
| `libc6-dev` | `musl-dev` | ⚠️ Different | |

**Build Script Changes:**
```diff
- RUN apt-get update && apt-get install -y --no-install-recommends \
-     git clang lld && \
-     rm -rf /var/lib/apt/lists/*
- RUN xx-apt install -y gcc libc6-dev
+ RUN apk add --no-cache git clang lld
+ RUN xx-apk add gcc musl-dev
```

#### Build Stage Packages (crowdsec-fallback)

| Debian Package | Alpine Equivalent | Status | Notes |
|----------------|------------------|--------|-------|
| `curl` | `curl` | ✅ Direct match | |
| `ca-certificates` | `ca-certificates` | ✅ Direct match | |
| `tar` | `tar` | ✅ Direct match | Alpine has tar built-in (busybox) |

**Build Script Changes:**
```diff
# Note: Debian slim does NOT include tar by default - must be explicitly installed
- RUN apt-get update && apt-get install -y --no-install-recommends \
-     curl ca-certificates tar && \
-     rm -rf /var/lib/apt/lists/*
+ RUN apk add --no-cache curl ca-certificates
# Note: tar is already available in Alpine via busybox
```

#### Runtime Stage Packages (Final Image)

| Debian Package | Alpine Equivalent | Status | Notes |
|----------------|------------------|--------|-------|
| `bash` | `bash` | ✅ Direct match | Maintenance scripts require bash |
| `ca-certificates` | `ca-certificates` | ✅ Direct match | SSL certificates |
| `libsqlite3-0` | `sqlite-libs` | ⚠️ Different | SQLite runtime library |
| `sqlite3` | `sqlite` | ⚠️ Different | SQLite CLI tool |
| `tzdata` | `tzdata` | ✅ Direct match | Timezone database |
| `curl` | `curl` | ✅ Direct match | Healthchecks, scripts |
| `gettext-base` | `gettext` | ⚠️ Different | envsubst for templates |
| `libcap2-bin` | `libcap` | ⚠️ Different | setcap for Caddy ports |
| `libc-ares2` | `c-ares` | ⚠️ Different | DNS resolution library |
| `binutils` | `binutils` | ✅ Direct match | objdump for debug symbol check |

**Runtime Script Changes:**
```diff
- RUN apt-get update && apt-get install -y --no-install-recommends \
-     bash ca-certificates libsqlite3-0 sqlite3 tzdata curl gettext-base libcap2-bin libc-ares2 binutils && \
-     apt-get upgrade -y && \
-     rm -rf /var/lib/apt/lists/*
+ RUN apk add --no-cache \
+     bash ca-certificates sqlite-libs sqlite tzdata curl gettext libcap c-ares binutils
```

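
As a hedged illustration of the renames listed in the tables above, the mapping can be kept in a small lookup helper. The function names `alpine_pkg` and `translate_pkgs` are hypothetical and not part of the Charon repository; any package that is not listed is assumed to carry the same name on both distributions.

```bash
#!/bin/sh
# Map a Debian package name to its Alpine equivalent, per the tables above.
# Unlisted names are assumed to be identical on both distributions.
alpine_pkg() {
  case "$1" in
    libc6-dev)      echo "musl-dev" ;;
    libsqlite3-dev) echo "sqlite-dev" ;;
    libsqlite3-0)   echo "sqlite-libs" ;;
    sqlite3)        echo "sqlite" ;;
    gettext-base)   echo "gettext" ;;
    libcap2-bin)    echo "libcap" ;;
    libc-ares2)     echo "c-ares" ;;
    *)              echo "$1" ;;
  esac
}

# Translate a whole package list, preserving order.
translate_pkgs() {
  out=""
  for pkg in "$@"; do
    out="$out $(alpine_pkg "$pkg")"
  done
  # Drop the single leading space added by the loop
  echo "${out# }"
}

# Runtime package list from the Debian Dockerfile
translate_pkgs bash ca-certificates libsqlite3-0 sqlite3 tzdata curl \
  gettext-base libcap2-bin libc-ares2 binutils
```

Running the last command should reproduce the Alpine package list used in the runtime `apk add` line above, which gives a quick cross-check between the mapping tables and the Dockerfile.
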
### 2.2 Critical Integration Points

#### 1. CGO-Enabled SQLite

**Current Build (Debian):**
```dockerfile
RUN CGO_ENABLED=1 xx-go build \
    -ldflags "-s -w" \
    -o charon ./cmd/api
```

**Alpine Consideration:**
- ✅ **Compatible** - SQLite compiled against musl libc
- ✅ **No Code Changes** - Go's `mattn/go-sqlite3` driver is libc-agnostic
- ⚠️ **Test Required** - Database operations (CRUD, migrations, backups)

**Validation Test:**
```bash
# After Alpine build, verify SQLite functionality
docker exec charon sqlite3 /app/data/charon.db "PRAGMA integrity_check;"
# Expected: ok
```

#### 2. Network Calls (DNS Resolution)

**Current Behavior (Debian):**
- Go's `net` package uses its pure Go resolver by default and switches to the cgo (glibc) resolver only when `/etc/nsswitch.conf` or `/etc/resolv.conf` requires features the Go resolver does not implement
- The glibc path consults `/etc/nsswitch.conf`, then `/etc/resolv.conf`
- glibc NSS supports mDNS, LDAP, custom NSS modules

**Alpine Behavior:**
- musl libc has no NSS support
- DNS queries go directly to `/etc/resolv.conf`
- Simpler, faster, but less flexible

**Impact Assessment:**

| Feature | Risk Level | Test Required |
|---------|-----------|---------------|
| ACME DNS-01 Challenge | 🟡 MEDIUM | ✅ Test all 15 DNS providers |
| Docker Host Resolution | 🟢 LOW | ✅ Test `host.docker.internal` |
| Webhook URLs | 🟢 LOW | ✅ Test external webhook delivery |
| CrowdSec LAPI | 🟢 LOW | ✅ Test `127.0.0.1:8085` connectivity |

**Mitigation Strategy:**
```dockerfile
# Force Go to use pure Go DNS resolver (bypass cgo)
ENV GODEBUG=netdns=go
```

**Reference:** https://pkg.go.dev/net#hdr-Name_Resolution

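
A minimal sketch of how a startup script could confirm that the `GODEBUG=netdns=go` mitigation above is actually in effect. The helper names `netdns_mode` and `uses_go_resolver` are hypothetical; the parsing assumes `GODEBUG` is a comma-separated list of `key=value` pairs and that a `+N` suffix (as in `go+1`) only adds resolver debug output.

```bash
#!/bin/sh
# Extract the netdns value from a GODEBUG-style string, e.g. "http2debug=1,netdns=go".
# Prints nothing when no netdns entry is present.
netdns_mode() {
  printf '%s\n' "$1" | tr ',' '\n' | sed -n 's/^netdns=//p' | tail -n 1
}

# Succeed only when the pure Go resolver is forced ("go" or "go+<debug level>").
uses_go_resolver() {
  case "$(netdns_mode "$1")" in
    go|go+*) return 0 ;;
    *)       return 1 ;;
  esac
}

# Check the current environment and warn if the mitigation is missing.
if uses_go_resolver "${GODEBUG:-}"; then
  echo "pure Go resolver forced"
else
  echo "WARNING: GODEBUG does not force netdns=go" >&2
fi
```

A check like this is cheap to run in `docker-entrypoint.sh` and catches the case where the `ENV` line is dropped or overridden at deploy time.
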
#### 3. TLS/SSL Certificates

**Current (Debian):**
- Go's `crypto/x509` validates certificates itself; glibc is not involved
- System certificates: `/etc/ssl/certs/ca-certificates.crt`

**Alpine:**
- Go validation is unchanged under musl; the system OpenSSL is only used by non-Go tools such as `curl`
- System certificates: `/etc/ssl/certs/ca-certificates.crt` (same path)

**Impact:**
- 🟢 **No Changes Required** - Go's `crypto/tls` uses system cert pool via standard path
- ⚠️ **Test Required** - Let's Encrypt cert validation, webhook HTTPS calls

#### 4. Timezone Data

**Current (Debian):**
- Timezone database: `/usr/share/zoneinfo/`
- Package: `tzdata`

**Alpine:**
- Timezone database: `/usr/share/zoneinfo/`
- Package: `tzdata` (same structure)

**Impact:**
- 🟢 **No Changes Required** - Go's `time.LoadLocation()` uses standard paths

#### 5. Caddy Privileged Port Binding

**Current (Debian):**
- Uses `setcap` from `libcap2-bin` package
- Command: `setcap 'cap_net_bind_service=+ep' /usr/bin/caddy`

**Alpine:**
- Uses `setcap` from `libcap` package
- Same command syntax

**Build Script:**
```diff
# Runtime image - set Caddy capabilities
# No change required - same command, only the providing package differs
  RUN setcap 'cap_net_bind_service=+ep' /usr/bin/caddy
```

#### 6. Shell Scripts (docker-entrypoint.sh)

**Current Dependencies:**
- `bash` shell
- `envsubst` (from `gettext-base`)
- `gosu` (privilege dropping)
- `curl` (healthchecks)

**Alpine Changes:**
```diff
- gettext-base  # Debian package name
+ gettext       # Alpine package name (includes envsubst)
```

**Test Required:**
- ✅ Container startup sequence
- ✅ CrowdSec initialization scripts
- ✅ Database migrations

### 2.3 Known Breaking Changes

#### None Identified

Alpine migration for Go applications is typically seamless due to:
1. Go's portable standard library
2. Static binaries (minimize libc surface area)
3. Similar package ecosystem (apk vs apt naming differences only)

**Confidence Level:** 🟢 **HIGH** (95%)

---

## Dockerfile Changes

### 3.1 Current Dockerfile Structure Analysis

**Multi-Stage Build Overview:**
1. **xx** - Cross-compilation helpers (`tonistiigi/xx`)
2. **gosu-builder** - Build gosu from source (Go 1.25)
3. **frontend-builder** - Build React frontend (Node 24)
4. **backend-builder** - Build Go backend (Go 1.25)
5. **caddy-builder** - Build Caddy with plugins (Go 1.25 + xcaddy)
6. **crowdsec-builder** - Build CrowdSec (Go 1.25)
7. **crowdsec-fallback** - Download CrowdSec static binaries (amd64 only)
8. **Final Runtime** - Debian Trixie-slim runtime image

**Total Stages:** 8
**Final Image Size (Current):** ~350MB

### 3.2 Proposed Alpine Dockerfile

**Changes Required:** Stages 2, 4, 5, 6, 7, 8

#### Stage 2: gosu-builder (Debian → Alpine)

**Before (Debian):**
```dockerfile
FROM --platform=$BUILDPLATFORM golang:1.25-trixie AS gosu-builder
RUN apt-get update && apt-get install -y --no-install-recommends \
    git clang lld && \
    rm -rf /var/lib/apt/lists/*
RUN xx-apt install -y gcc libc6-dev
```

**After (Alpine):**
```dockerfile
FROM --platform=$BUILDPLATFORM golang:1.25-alpine AS gosu-builder
RUN apk add --no-cache git clang lld
RUN xx-apk add --no-cache gcc musl-dev
```

**Size Impact:** -15MB (Alpine base smaller)

#### Stage 4: backend-builder (Debian → Alpine)

**Before (Debian):**
```dockerfile
FROM --platform=$BUILDPLATFORM golang:1.25-trixie AS backend-builder
RUN apt-get update && apt-get install -y --no-install-recommends \
    clang lld && \
    rm -rf /var/lib/apt/lists/*
RUN xx-apt install -y gcc libc6-dev libsqlite3-dev
```

**After (Alpine):**
```dockerfile
FROM --platform=$BUILDPLATFORM golang:1.25-alpine AS backend-builder
RUN apk add --no-cache clang lld
RUN xx-apk add --no-cache gcc musl-dev sqlite-dev
```

**Size Impact:** -10MB

#### Stage 5: caddy-builder (Debian → Alpine)

**Before (Debian):**
```dockerfile
FROM --platform=$BUILDPLATFORM golang:1.25-trixie AS caddy-builder
RUN apt-get update && apt-get install -y --no-install-recommends git && \
    rm -rf /var/lib/apt/lists/*
```

**After (Alpine):**
```dockerfile
FROM --platform=$BUILDPLATFORM golang:1.25-alpine AS caddy-builder
RUN apk add --no-cache git
```

**Size Impact:** -8MB

#### Stage 6: crowdsec-builder (Debian → Alpine)

**Before (Debian):**
```dockerfile
FROM --platform=$BUILDPLATFORM golang:1.25.6-trixie AS crowdsec-builder
RUN apt-get update && apt-get install -y --no-install-recommends \
    git clang lld && \
    rm -rf /var/lib/apt/lists/*
RUN xx-apt install -y gcc libc6-dev
```

**After (Alpine):**
```dockerfile
FROM --platform=$BUILDPLATFORM golang:1.25.6-alpine AS crowdsec-builder
RUN apk add --no-cache git clang lld
RUN xx-apk add --no-cache gcc musl-dev
```

**Size Impact:** -12MB

#### Stage 7: crowdsec-fallback (Debian → Alpine)

**Before (Debian):**
```dockerfile
FROM debian:trixie-slim AS crowdsec-fallback
RUN apt-get update && apt-get install -y --no-install-recommends \
    curl ca-certificates tar && \
    rm -rf /var/lib/apt/lists/*
```

**After (Alpine):**
```dockerfile
FROM alpine:3.23 AS crowdsec-fallback
RUN apk add --no-cache curl ca-certificates
# tar is already available via busybox
```

**Size Impact:** -100MB (Debian slim → Alpine base)

#### Stage 8: Final Runtime (Debian → Alpine)

**Before (Debian):**
```dockerfile
FROM debian:trixie-slim
RUN apt-get update && apt-get install -y --no-install-recommends \
    bash ca-certificates libsqlite3-0 sqlite3 tzdata curl gettext-base libcap2-bin libc-ares2 binutils && \
    apt-get upgrade -y && \
    rm -rf /var/lib/apt/lists/*
```

**After (Alpine):**
```dockerfile
FROM alpine:3.23
RUN apk add --no-cache \
    bash ca-certificates sqlite-libs sqlite tzdata curl gettext libcap c-ares binutils
```

**Size Impact:** -100MB (Debian slim → Alpine runtime)

### 3.3 Complete Dockerfile Diff

**Summary of Changes:**
```diff
# Build Stages (golang base images)
- FROM --platform=$BUILDPLATFORM golang:1.25-trixie
+ FROM --platform=$BUILDPLATFORM golang:1.25-alpine

# Fallback Stage
- FROM debian:trixie-slim
+ FROM alpine:3.23

# Final Runtime Stage
- FROM debian:trixie-slim@sha256:...
+ FROM alpine:3.23@sha256:...

# Package Manager Commands
- RUN apt-get update && apt-get install -y --no-install-recommends \
-     <packages> && \
-     rm -rf /var/lib/apt/lists/*
+ RUN apk add --no-cache <packages>

# Cross-Compilation Package Install
- RUN xx-apt install -y gcc libc6-dev
+ RUN xx-apk add --no-cache gcc musl-dev

# Package Name Changes
- libsqlite3-dev → sqlite-dev
- libc6-dev → musl-dev
- gettext-base → gettext
- libsqlite3-0 → sqlite-libs
- libcap2-bin → libcap
- libc-ares2 → c-ares
```

**Lines Changed:** ~50 lines (out of ~450 total Dockerfile)

**Estimated Effort:** 4-6 hours (including testing)

### 3.4 Size Comparison (Estimated)

| Component | Debian Trixie | Alpine 3.23 | Savings |
|-----------|--------------|------------|---------|
| Base Image | 120MB | 7MB | -113MB |
| Build Stages | 850MB (intermediate) | 700MB (intermediate) | -150MB |
| **Final Runtime** | **~350MB** | **~220MB** | **-130MB (-37%)** |

**Note:** Final runtime size savings driven by:
1. Alpine base image (7MB vs 120MB)
2. Smaller runtime packages (musl vs glibc)
3. No apt cache/metadata

---

## Testing Requirements

### 4.1 Pre-Migration Verification Tests

#### Test 1: Alpine CVE Verification

**Objective:** Confirm CVE-2025-60876 (busybox) and related CVEs are patched

**Procedure:**
```bash
# Build test Alpine image with minimal packages
cat > Dockerfile.alpine-test << 'EOF'
FROM alpine:3.23
RUN apk add --no-cache busybox curl ca-certificates
EOF

docker build -t alpine-test:3.23 -f Dockerfile.alpine-test .

# Scan with Grype (no --only-fixed, so unpatched CVEs are reported as well;
# --fail-on takes one severity and fails on that level or higher)
grype alpine-test:3.23 --fail-on high --output json \
    > alpine-3.23-scan.json

# Scan with Trivy
trivy image alpine-test:3.23 --severity CRITICAL,HIGH --exit-code 1
```

**Expected Result:**
- Zero CRITICAL or HIGH CVEs in busybox packages
- Grype exit code: 0
- Trivy exit code: 0

**Abort Criteria:** If CVE-2025-60876 is still present, delay migration and escalate

**Timeline:** Before starting Phase 1 (blocking)

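
Because CVE-2025-60876 is rated MEDIUM, a severity gate on HIGH/CRITICAL will not flag it by itself. The abort criterion above can be checked directly against the saved report; the following is a minimal sketch that assumes only that the report prints CVE IDs as plain text. The helper name `cve_present` and the `REPORT` variable are hypothetical.

```bash
#!/bin/sh
# Return 0 if the given CVE ID appears in a scanner report file, 1 otherwise.
# Uses a fixed-string, whole-word match, so it does not depend on the report's JSON layout.
cve_present() {
  grep -q -F -w -- "$1" "$2"
}

# Report produced by the Grype scan above (path is overridable for illustration)
report="${REPORT:-alpine-3.23-scan.json}"

if [ ! -f "$report" ]; then
  echo "Report $report not found; run the scan first"
elif cve_present "CVE-2025-60876" "$report"; then
  echo "ABORT: CVE-2025-60876 still reported in $report"
else
  echo "CVE-2025-60876 not found in $report"
fi
```

The same helper can be reused for the two curl CVEs listed under known vulnerabilities by passing a different ID.
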
#### Test 2: Package Availability Check

**Objective:** Verify all required Alpine packages exist

**Procedure:**
```bash
# Check each package from compatibility analysis
# (apk update first: the base image ships without a package index)
docker run --rm alpine:3.23 sh -c "
  apk update && \
  apk search bash && \
  apk search ca-certificates && \
  apk search sqlite-libs && \
  apk search sqlite && \
  apk search tzdata && \
  apk search curl && \
  apk search gettext && \
  apk search libcap && \
  apk search c-ares && \
  apk search binutils && \
  apk search gcc && \
  apk search musl-dev && \
  apk search sqlite-dev
"
```

**Expected Result:** All packages found with versions listed

**Abort Criteria:** Any package missing from Alpine repository

**Timeline:** Before Phase 1 (blocking)

### 4.2 Build-Time Testing

#### Test 3: Multi-Architecture Build

**Objective:** Verify Alpine Dockerfile builds successfully on amd64 and arm64

**Procedure:**
```bash
# Build for linux/amd64
docker buildx build --platform linux/amd64 \
    --build-arg VERSION=alpine-test \
    -t charon:alpine-amd64 \
    --load .

# Build for linux/arm64
docker buildx build --platform linux/arm64 \
    --build-arg VERSION=alpine-test \
    -t charon:alpine-arm64 \
    --load .
```

**Validation:**
```bash
# Verify binaries built correctly
docker run --rm charon:alpine-amd64 /app/charon version
docker run --rm charon:alpine-arm64 /app/charon version

# Verify libc linkage (should show musl)
docker run --rm charon:alpine-amd64 ldd /app/charon
# Expected: libc.musl-x86_64.so.1 or "statically linked"
```

**Expected Result:**
- Build succeeds on both architectures
- Binary reports correct version
- No glibc dependencies (musl only)

**Timeline:** Phase 1 - Week 1

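
The linkage check above relies on reading `ldd` output by eye. As a rough sketch, the classification could be scripted for CI; `classify_libc` is a hypothetical helper, and the sample input below is illustrative text rather than captured output.

```bash
#!/bin/sh
# Classify the libc a binary links against, based on ldd-style text read from stdin.
# Anything that mentions neither musl nor glibc is reported as "static-or-unknown".
classify_libc() {
  out=$(cat)
  case "$out" in
    *musl*)                 echo "musl" ;;
    *libc.so.6*|*ld-linux*) echo "glibc" ;;
    *)                      echo "static-or-unknown" ;;
  esac
}

# Illustrative input resembling a musl-linked binary
printf 'libc.musl-x86_64.so.1 => /lib/ld-musl-x86_64.so.1\n' | classify_libc
```

In practice the function would be fed from `docker run --rm charon:alpine-amd64 ldd /app/charon 2>&1`, and the build would fail if the result is `glibc`.
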
#### Test 4: Image Size Verification

**Objective:** Confirm 30-40% size reduction

**Procedure:**
```bash
# Compare image sizes
docker images | grep "charon.*debian"
docker images | grep "charon.*alpine"

# Calculate savings (fill in the sizes reported above, in MB)
DEBIAN_MB=<debian-mb-size>
ALPINE_MB=<alpine-mb-size>
echo "Debian size: ${DEBIAN_MB} MB"
echo "Alpine size: ${ALPINE_MB} MB"
# Multiply before dividing: shell arithmetic is integer-only
echo "Savings: $(( (DEBIAN_MB - ALPINE_MB) * 100 / DEBIAN_MB ))%"
```

**Expected Result:**
- Alpine image roughly 105-140MB smaller than Debian (30-40% of ~350MB)
- 30-40% size reduction achieved

**Timeline:** Phase 1 - Week 1

### 4.3 Runtime Testing (Docker Compose)

#### Test 5: Container Startup Sequence

**Objective:** Verify docker-entrypoint.sh executes successfully

**Procedure:**
```bash
# Start Alpine container with fresh data volume
docker-compose -f .docker/compose/docker-compose.alpine-test.yml up -d

# Watch startup logs
docker logs -f charon-alpine

# Expected log sequence:
# 1. Environment variable expansion
# 2. CrowdSec initialization
# 3. Database migrations
# 4. Backend API startup
# 5. Caddy proxy startup
# 6. Health check success
```

**Validation Checks:**
```bash
# Check all processes running
docker exec charon-alpine ps aux | grep -E "charon|caddy"

# Verify health check
curl http://localhost:8080/api/v1/health
# Expected: {"status":"ok"}

# Check database file permissions
docker exec charon-alpine ls -la /app/data/charon.db
# Expected: charon:charon ownership
```

**Expected Result:** Container starts successfully, all services running, health check passes

**Timeline:** Phase 2 - Week 2

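
The health check above can fail simply because the container is still starting. One way to make the validation repeatable is a small retry wrapper; this is a sketch only, and `wait_for` is a hypothetical helper rather than an existing project script.

```bash
#!/bin/sh
# Retry a command until it succeeds or the attempt budget runs out.
# Usage: wait_for <max_attempts> <delay_seconds> <command...>
wait_for() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    # Discard the command's output; only its exit status matters here
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

For example, `wait_for 30 2 curl -fsS http://localhost:8080/api/v1/health` would poll the health endpoint for up to about a minute before the remaining validation checks run.
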
#### Test 6: Database Operations

**Objective:** Verify SQLite CGO binding works with musl libc

**Procedure:**
```bash
# Create test proxy host via API (JSON body, so set the content type explicitly)
curl -X POST http://localhost:8080/api/v1/proxy-hosts \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json" \
    -d '{
      "domain": "alpine-test.local",
      "target": "http://localhost:9000"
    }'

# Query database directly
docker exec charon-alpine sqlite3 /app/data/charon.db \
    "SELECT * FROM proxy_hosts WHERE domain='alpine-test.local';"

# Run database integrity check
docker exec charon-alpine sqlite3 /app/data/charon.db \
    "PRAGMA integrity_check;"
# Expected: ok

# Test migrations
docker exec charon-alpine /app/charon migrate
```

**Expected Result:**
- Proxy host created successfully
- Database queries return correct data
- Integrity check passes
- Migrations run without errors

**Timeline:** Phase 2 - Week 2

#### Test 7: DNS Resolution

**Objective:** Verify DNS queries work with musl libc resolver

**Procedure:**
```bash
# Test external DNS resolution
docker exec charon-alpine nslookup google.com
docker exec charon-alpine ping -c 1 google.com

# Test Docker internal DNS
docker exec charon-alpine nslookup host.docker.internal

# Test within Go application (backend)
curl -X POST http://localhost:8080/api/v1/test/dns \
    -H "Content-Type: application/json" \
    -d '{"hostname":"cloudflare.com"}'
```

**Expected Result:**
- External DNS resolves correctly
- Docker internal DNS works
- Go application DNS calls succeed

**Timeline:** Phase 2 - Week 2

### 4.4 E2E Testing (Playwright)

#### Test 8: Full E2E Test Suite

**Objective:** Verify 100% E2E test pass rate with Alpine image

**Procedure:**
```bash
# Start Alpine-based E2E environment
.github/skills/scripts/skill-runner.sh docker-rebuild-e2e-alpine

# Run full Playwright test suite
npx playwright test --project=chromium --project=firefox --project=webkit

# Run with coverage
.github/skills/scripts/skill-runner.sh test-e2e-playwright-coverage-alpine
```

**Test Coverage:**
- ✅ Proxy host CRUD operations (15 DNS provider types)
- ✅ Certificate provisioning (HTTP-01, DNS-01 challenges)
- ✅ Security settings (ACL, WAF, CrowdSec, Rate Limiting)
- ✅ User management (create, edit, delete users)
- ✅ Real-time log streaming (WebSocket)
- ✅ Docker container discovery
- ✅ Backup/restore operations
- ✅ Emergency recovery workflow

**Expected Result:**
- 100% test pass rate (544/544 tests passing)
- Zero timeout errors
- Zero element interaction failures
- Coverage matches Debian baseline (82-85%)

**Timeline:** Phase 3 - Week 2-3

#### Test 9: DNS Provider Integration Tests

**Objective:** Verify all 15 DNS provider plugins work with Alpine

**Providers to Test:**
1. Cloudflare (DNS-01)
2. Route53 (AWS DNS-01)
3. Google Cloud DNS
4. Azure DNS
5. DigitalOcean DNS
6. Linode DNS
7. Vultr DNS
8. Namecheap DNS
9. GoDaddy DNS
10. RFC2136 (BIND DNS)
11. Manual DNS
12. Webhook DNS (HTTP)
13. DuckDNS
14. acme-dns
15. PowerDNS

**Test Procedure (per provider):**
```bash
# Via E2E test
npx playwright test tests/dns-provider-{provider}.spec.ts

# Verification
docker exec charon-alpine curl http://localhost:2019/config/ | \
    jq '.apps.http.servers.srv0.tls_automation_policies[0].dns'
# Expected: Provider-specific configuration JSON
```

**Expected Result:** All 15 DNS provider tests pass

**Timeline:** Phase 3 - Week 2-3

### 4.5 Integration Testing (Go)

#### Test 10: Cerberus Security Suite

**Objective:** Verify security middleware functions correctly

**Procedure:**
```bash
# Run Cerberus integration tests
cd backend/integration
go test -v -tags=integration ./cerberus_integration_test.go

# Test WAF (Coraza)
go test -v -tags=integration ./coraza_integration_test.go

# Test CrowdSec
go test -v -tags=integration ./crowdsec_integration_test.go

# Test Rate Limiting
go test -v -tags=integration ./rate_limit_integration_test.go
```

**Expected Result:**
- All integration tests pass
- WAF blocks SQL injection/XSS payloads
- CrowdSec bans malicious IPs
- Rate limiting enforces thresholds (429 responses)

**Timeline:** Phase 3 - Week 3

#### Test 11: Backend Unit Tests

**Objective:** Ensure 85% code coverage maintained

**Procedure:**
```bash
# Run backend tests with coverage
cd backend
go test -v -cover -coverprofile=coverage.out ./...

# Generate coverage report
go tool cover -html=coverage.out -o coverage.html

# Verify threshold
go tool cover -func=coverage.out | tail -1
# Expected: total coverage >= 85%
```

**Expected Result:** Coverage ≥ 85%, all tests pass

**Timeline:** Phase 3 - Week 3

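
The threshold check above is manual. A minimal sketch of an automated gate follows; it assumes the summary line printed by `go tool cover -func` begins with `total:` and ends with a percentage, and `coverage_ok` is a hypothetical helper name.

```bash
#!/bin/sh
# Read `go tool cover -func` output on stdin and succeed only if the
# "total:" line reports coverage at or above the given threshold (percent).
coverage_ok() {
  awk -v min="$1" '
    $1 == "total:" { gsub(/%/, "", $NF); total = $NF; found = 1 }
    END { exit !(found && total + 0 >= min + 0) }
  '
}
```

It would be used as `go tool cover -func=coverage.out | coverage_ok 85`, with a non-zero exit status failing the pipeline when coverage drops below the target or no total line is found.
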
### 4.6 Performance Testing

#### Test 12: Request Latency Benchmark

**Objective:** Verify <5% performance variance vs Debian

**Procedure:**
```bash
# Debian baseline (existing image); assumes the app listens on port 8080 in the container
docker run -d --name charon-debian -p 8080:8080 wikid82/charon:latest

# Alpine candidate, published on host port 8081
docker run -d --name charon-alpine -p 8081:8080 charon:alpine-test

# Benchmark API endpoints (100 requests each)
for endpoint in /api/v1/proxy-hosts /api/v1/certificates /api/v1/users; do
  echo "Testing $endpoint"
  name=$(basename "$endpoint") # e.g. proxy-hosts; keeps slashes out of file names

  # Debian
  ab -n 100 -c 10 "http://localhost:8080$endpoint" > "debian-$name.txt"

  # Alpine
  ab -n 100 -c 10 "http://localhost:8081$endpoint" > "alpine-$name.txt"
done

# Compare results
grep "Time per request" debian-*.txt
grep "Time per request" alpine-*.txt
```

**Expected Result:**
- Alpine latency within 5% of Debian
- No significant regression in throughput (req/sec)

**Acceptable Variance:** ±5%

**Timeline:** Phase 4 - Week 3

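The ±5% criterion can be evaluated mechanically instead of comparing the `grep` output by eye. The sketch below assumes the standard ApacheBench report format, in which the first `Time per request` line ends in `(mean)`. The function names are illustrative and not part of the project.

```bash
# mean_latency_ms FILE: extract the mean "Time per request" (ms) from an ab report.
mean_latency_ms() {
  awk '/^Time per request:/ && /\(mean\)$/ { print $4; exit }' "$1"
}

# pct_change BASELINE CANDIDATE: signed percentage change relative to BASELINE.
pct_change() {
  awk -v b="$1" -v c="$2" 'BEGIN { printf "%.2f\n", (c - b) * 100 / b }'
}

# within_variance BASELINE CANDIDATE LIMIT: succeed if |change| <= LIMIT percent.
within_variance() {
  awk -v b="$1" -v c="$2" -v lim="$3" 'BEGIN {
    d = (c - b) * 100 / b
    if (d < 0) d = -d
    exit (d > lim)
  }'
}
```

For example, `within_variance "$(mean_latency_ms debian-users.txt)" "$(mean_latency_ms alpine-users.txt)" 5` succeeds only when the Alpine mean latency for that endpoint is within 5% of the Debian baseline.
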
#### Test 13: Memory Usage

**Objective:** Compare memory footprint

**Procedure:**
```bash
# Sample memory usage of both containers once a minute for 1 hour
for i in $(seq 1 60); do
  docker stats --no-stream --format '{{.MemUsage}}' charon-debian >> debian-memory.txt
  docker stats --no-stream --format '{{.MemUsage}}' charon-alpine >> alpine-memory.txt
  sleep 60
done

# Calculate average and peak per image (assumes usage is reported in MiB)
for f in debian-memory.txt alpine-memory.txt; do
  awk -v name="$f" '{v=$1+0; sum+=v; if (v>peak) peak=v}
    END {print name, "Avg:", sum/NR, "MiB | Peak:", peak, "MiB"}' "$f"
done
```

**Expected Result:**
- Alpine memory usage similar or lower than Debian
- No memory leaks (stable usage over time)

**Timeline:** Phase 4 - Week 3

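`docker stats` reports memory with a unit suffix (for example `512KiB`, `180MiB`, or `1.2GiB`), so averaging the raw numbers is only valid when every sample uses the same unit. A possible way to normalize samples before computing the average and peak is sketched here; `mem_summary` is a hypothetical helper, and it assumes each input line starts with the usage figure, as produced by `--format '{{.MemUsage}}'`.

```bash
# mem_summary FILE (hypothetical helper)
# Converts the first field of every line to MiB, then prints average and peak.
mem_summary() {
  awk '
    function to_mib(v,   n, unit) {
      n = v + 0                        # leading numeric part
      unit = v; sub(/^[0-9.]+/, "", unit)
      if (unit == "B")   return n / 1048576
      if (unit == "KiB") return n / 1024
      if (unit == "GiB") return n * 1024
      return n                         # MiB is assumed otherwise
    }
    NF { m = to_mib($1); sum += m; count++; if (m > peak) peak = m }
    END { if (count) printf "avg=%.1f peak=%.1f\n", sum / count, peak }
  ' "$1"
}
```

Running `mem_summary debian-memory.txt` and `mem_summary alpine-memory.txt` then gives directly comparable figures in MiB.
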
### 4.7 Security Testing

#### Test 14: CVE Scan (Final Alpine Image)

**Objective:** Confirm zero HIGH/CRITICAL CVEs in final image

**Procedure:**
```bash
# Scan with Grype (--fail-on takes one severity and fails at that level or above)
grype charon:alpine-test --fail-on high --output sarif \
  > grype-alpine-final.sarif

# Scan with Trivy
trivy image charon:alpine-test --severity CRITICAL,HIGH --exit-code 1 \
  --format sarif > trivy-alpine-final.sarif

# Generate comparison report
diff <(jq -r '.runs[0].results[] | .ruleId' grype-debian.sarif) \
     <(jq -r '.runs[0].results[] | .ruleId' grype-alpine-final.sarif)
```

**Acceptance Criteria:**
- Zero CRITICAL CVEs
- Zero HIGH CVEs (or documented risk acceptance)
- Significant reduction vs Debian (7 HIGH → 0)

**Timeline:** Phase 5 - Week 4

#### Test 15: SBOM Verification

**Objective:** Generate Alpine SBOM and validate no unexpected dependencies

**Procedure:**
```bash
# Generate SBOM with Syft
syft charon:alpine-test -o cyclonedx-json > sbom-alpine.cyclonedx.json

# Compare base OS packages
jq -r '.components[] | select(.type=="operating-system") | .name' \
  sbom-debian.cyclonedx.json sbom-alpine.cyclonedx.json
```

**Expected Result:**
- No unexpected third-party dependencies
- Base OS: Alpine Linux 3.23.x
- All packages from Alpine repository

**Timeline:** Phase 5 - Week 4

### 4.8 Test Pass Criteria

**Blocking Issues (Must Pass):**
- ✅ Alpine CVE verification (Test 1)
- ✅ Multi-architecture build (Test 3)
- ✅ Container startup (Test 5)
- ✅ Database operations (Test 6)
- ✅ E2E test suite 100% pass (Test 8)
- ✅ Security CVE scan (Test 14)

**Non-Blocking Issues (Can Be Mitigated):**
- ⚠️ Performance regression <10% (Test 12) - Acceptable if justified
- ⚠️ DNS resolution edge cases (Test 7) - Can be fixed with `GODEBUG=netdns=go`

---

## Rollback Plan

### 5.1 Rollback Triggers

**When to Roll Back:**
1. **Critical E2E Test Failures:** >10% test failure rate that cannot be fixed within 48 hours
2. **Security Regression:** New CRITICAL CVE introduced in Alpine 3.23
3. **Performance Degradation:** >15% latency regression in production
4. **Data Loss Risk:** Database corruption or migration failures
5. **User-Facing Bug:** Production incident affecting >50% of users

### 5.2 Rollback Procedure

#### Step 1: Immediate Traffic Diversion (5 minutes)

```bash
# Stop Alpine container
docker-compose -f .docker/compose/docker-compose.yml down

# Revert docker-compose.yml to Debian image
# (assumes the switch to Alpine was the most recent commit)
git checkout HEAD~1 -- .docker/compose/docker-compose.yml

# Start Debian container
docker-compose -f .docker/compose/docker-compose.yml up -d
```

#### Step 2: Data Backup Validation (10 minutes)

```bash
# Verify integrity of the live database
docker exec charon-debian sqlite3 /app/data/charon.db "PRAGMA integrity_check;"

# Restore from pre-Alpine backup if needed
docker exec charon-debian /app/scripts/db-recovery.sh \
  /app/data/backups/charon-pre-alpine-migration.db
```

#### Step 3: Health Verification (5 minutes)

```bash
# Check health endpoints
curl http://localhost:8080/api/v1/health

# Verify proxy routing
curl -H "Host: test.example.com" http://localhost

# Check logs for errors (docker logs relays container stderr on stderr)
docker logs charon-debian 2>&1 | grep -i error
```

**Total Rollback Time:** < 20 minutes

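Because the rollback has a fixed time budget, the health verification can be bounded by a timeout rather than run once by hand. The following is a minimal sketch of a generic retry helper; `wait_until` is a hypothetical name, and the 300-second budget in the usage note simply mirrors the 5 minutes allotted to Step 3.

```bash
# wait_until TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND... (hypothetical helper)
# Re-runs COMMAND until it succeeds or the timeout elapses.
# Returns 0 on success and 1 on timeout.
wait_until() {
  local timeout="$1" interval="$2" elapsed=0
  shift 2
  while ! "$@" >/dev/null 2>&1; do
    elapsed=$((elapsed + interval))
    if [ "$elapsed" -ge "$timeout" ]; then
      return 1
    fi
    sleep "$interval"
  done
  return 0
}
```

A possible invocation after restarting the Debian container is `wait_until 300 5 curl -fsS http://localhost:8080/api/v1/health`, which polls every 5 seconds and gives up once the 5-minute budget is spent.
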
### 5.3 Post-Rollback Actions

1. **Incident Report:** Document root cause of rollback
2. **User Communication:** Notify users of temporary Debian revert
3. **Issue Creation:** File GitHub issue with rollback details
4. **Root Cause Analysis:** RCA within 48 hours
5. **Fix Timeline:** Define timeline to address Alpine blockers

### 5.4 Rollback Testing (Pre-Migration)

**Pre-Migration Validation:**
```bash
# Practice rollback procedure in staging
docker-compose -f .docker/compose/docker-compose.alpine-staging.yml up -d
sleep 60

# Simulate rollback (stop the staging stack that was just started)
docker-compose -f .docker/compose/docker-compose.alpine-staging.yml down
docker-compose -f .docker/compose/docker-compose.yml up -d

# Verify rollback success
curl http://localhost:8080/api/v1/health
```

**Timeline:** Phase 4 - Week 3 (before production deployment)

---

## Implementation Phases

### Phase 1: Research and Spike (Week 1 - 8 hours)

**Deliverables:**
- ✅ Alpine 3.23.3 CVE scan results (Test 1)
- ✅ Package availability verification (Test 2)
- ✅ Alpine test Dockerfile (proof-of-concept)
- ✅ Multi-architecture build validation (Test 3)

**Success Criteria:**
- Zero CRITICAL/HIGH CVEs in Alpine base image
- All required packages available
- PoC Dockerfile builds successfully on amd64 and arm64

**Timeline:** February 5-8, 2026

**Assignee:** DevOps Team

**Risks:**
- 🔴 **HIGH:** CVE-2025-60876 not patched → Delay migration
- 🟡 **MEDIUM:** Missing Alpine packages → Find alternatives
- 🟢 **LOW:** Build failures → Adjust Dockerfile syntax

**Mitigation:**
- Daily monitoring of Alpine Security Advisory
- Fallback to older Alpine version (3.22) if needed
- xx toolkit documentation: https://github.com/tonistiigi/xx

### Phase 2: Dockerfile Migration (Week 2 - 12 hours)

**Tasks:**
1. **Update all build stages to Alpine** (4 hours)
   - Replace `golang:1.25-trixie` with `golang:1.25-alpine`
   - Replace `debian:trixie-slim` with `alpine:3.23`
   - Update package manager commands (apt → apk)
   - Update package names (per compatibility analysis)

2. **Test local build** (2 hours)
   - Build on amd64
   - Build on arm64 (if available)
   - Verify image size reduction

3. **Update CI/CD workflows** (3 hours)
   - Modify `.github/workflows/docker-build.yml`
   - Update image tags (add `alpine` suffix for testing)
   - Create `docker-compose.alpine-test.yml`

4. **Documentation updates** (3 hours)
   - Update `README.md` (Alpine base image)
   - Update `ARCHITECTURE.md`
   - Create migration changelog entry

**Deliverables:**
- ✅ Updated `Dockerfile` (all stages Alpine-based)
- ✅ CI workflow building Alpine image
- ✅ `docker-compose.alpine-test.yml` for testing
- ✅ Updated documentation

**Success Criteria:**
- Docker build completes without errors
- Image size reduced by ≥30%
- CI pipeline passes (build stage only)

**Timeline:** February 11-15, 2026

**Assignee:** Backend Team

**Risks:**
- 🔴 **HIGH:** CGO SQLite build failures → Adjust linker flags
- 🟡 **MEDIUM:** Cross-compilation issues with xx toolkit → Debug with ARM64 VM
- 🟢 **LOW:** Documentation drift → Use git diff to ensure completeness

### Phase 3: Comprehensive Testing (Week 2-3 - 20 hours)

**Tasks:**
1. **Runtime validation** (6 hours)
   - Container startup sequence (Test 5)
   - Database operations (Test 6)
   - DNS resolution (Test 7)
   - Health checks and monitoring

2. **E2E test execution** (10 hours)
   - Full Playwright suite (Test 8)
   - DNS provider tests (Test 9)
   - Security feature tests
   - Fix any test failures or timing issues

3. **Integration tests** (4 hours)
   - Cerberus security suite (Test 10)
   - Backend unit tests (Test 11)
   - Verify 85% coverage maintained

**Deliverables:**
- ✅ Test results documented in QA report
- ✅ 100% E2E test pass rate
- ✅ All integration tests passing
- ✅ Test failure RCA (if any)

**Success Criteria:**
- All blocking tests pass (Tests 5, 6, 8)
- No data corruption or startup failures
- Coverage threshold maintained (≥85%)

**Timeline:** February 16-22, 2026

**Assignee:** QA Team + Backend Team

**Risks:**
- 🔴 **HIGH:** E2E test failures >10% → Rollback to Debian
- 🟡 **MEDIUM:** DNS provider integration issues → Use `GODEBUG=netdns=go` workaround
- 🟡 **MEDIUM:** Performance regression → Investigate musl vs glibc trade-offs
- 🟢 **LOW:** Flaky tests → Re-run with retries, improve test stability

### Phase 4: Performance and Security Validation (Week 3 - 8 hours)

**Tasks:**
1. **Performance benchmarking** (4 hours)
   - Request latency benchmark (Test 12)
   - Memory usage analysis (Test 13)
   - Compare with Debian baseline
   - Document any regressions

2. **Security scanning** (2 hours)
   - Final CVE scan (Test 14)
   - SBOM generation and verification (Test 15)
   - Compare CVE counts with Debian

3. **Rollback testing** (2 hours)
   - Practice rollback procedure
   - Verify rollback completes in <20 minutes
   - Document rollback steps

**Deliverables:**
- ✅ Performance comparison report
- ✅ Security scan results (SARIF + reports)
- ✅ Rollback procedure validation
- ✅ Risk acceptance document (if any CVEs found)

**Success Criteria:**
- Performance within 5% of Debian (acceptable: ±10%)
- Zero HIGH/CRITICAL CVEs (or documented acceptance)
- Rollback procedure validated

**Timeline:** February 23-25, 2026

**Assignee:** DevOps + Security Teams

**Risks:**
- 🟡 **MEDIUM:** Performance regression >10% → Profile and optimize
- 🟢 **LOW:** New Alpine CVEs discovered → Document and monitor

### Phase 5: Staging Deployment (Week 4 - 4 hours)

**Tasks:**
1. **Deploy to staging environment** (1 hour)
   - Update staging `docker-compose.yml`
   - Deploy Alpine image
   - Monitor for 48 hours

2. **User acceptance testing** (2 hours)
   - Smoke test all features
   - Invite beta users to test
   - Gather feedback

3. **Documentation finalization** (1 hour)
   - Update `CHANGELOG.md`
   - Create migration announcement
   - Prepare release notes

**Deliverables:**
- ✅ Staging deployment successful
- ✅ User feedback collected
- ✅ Final documentation complete

**Success Criteria:**
- No critical bugs in staging
- Positive user feedback
- Zero production rollbacks

**Timeline:** February 26-28, 2026

**Assignee:** DevOps + Product Team

### Phase 6: Production Deployment (Week 5 - 2 hours)

**Tasks:**
1. **Production release preparation**
   - Tag Docker image: `wikid82/charon:2.x.0-alpine`
   - Create GitHub release
   - Publish release notes

2. **Gradual rollout**
   - Canary deployment (10% traffic) - 24 hours
   - Expand to 50% traffic - 24 hours
   - Full rollout - 24 hours

3. **Post-deployment monitoring**
   - Monitor error rates
   - Check performance metrics
   - Respond to user reports

**Deliverables:**
- ✅ Production deployment complete
- ✅ Alpine default for new installations
- ✅ Migration guide for existing users

**Success Criteria:**
- Zero critical incidents in first 72 hours
- <1% error rate increase
- User feedback positive

**Timeline:** March 3-5, 2026

**Assignee:** DevOps Lead

---

## Risk Assessment

### 7.1 Technical Risks

| Risk | Probability | Impact | Mitigation |
|------|-------------|--------|------------|
| **CVE-2025-60876 still present in Alpine 3.23** | 🟢 LOW (5%) | 🔴 CRITICAL | Verify with Grype scan in Phase 1 (blocking) |
| **CGO SQLite incompatibility with musl** | 🟢 LOW (10%) | 🔴 HIGH | Test database operations in Phase 3 (Test 6) |
| **DNS resolution issues with musl resolver** | 🟡 MEDIUM (30%) | 🟡 MEDIUM | Use `GODEBUG=netdns=go` workaround |
| **E2E test failures >10%** | 🟡 MEDIUM (20%) | 🔴 HIGH | Comprehensive testing in Phase 3 (Tests 8-9) |
| **Performance regression >10%** | 🟢 LOW (15%) | 🟡 MEDIUM | Benchmark in Phase 4 (Test 12), acceptable if <15% |
| **New Alpine CVEs discovered mid-migration** | 🟢 LOW (5%) | 🟡 MEDIUM | Daily CVE monitoring, risk acceptance if needed |
| **Docker Hub/GHCR Alpine image unavailable** | 🟢 VERY LOW (2%) | 🟡 MEDIUM | Pin specific SHA256, Renovate tracks updates |
| **User data corruption during migration** | 🟢 VERY LOW (1%) | 🔴 CRITICAL | No schema changes, automatic backups, rollback tested |

**Overall Risk Level:** 🟡 **MEDIUM** (manageable with comprehensive testing)

### 7.2 Business Risks

| Risk | Probability | Impact | Mitigation |
|------|-------------|--------|------------|
| **User resistance to Alpine migration** | 🟡 MEDIUM (25%) | 🟢 LOW | Clear communication, benefits highlighted |
| **Support requests increase** | 🟡 MEDIUM (30%) | 🟢 LOW | Migration guide, FAQ, troubleshooting docs |
| **Breaking change for existing users** | 🟢 LOW (10%) | 🟡 MEDIUM | No breaking changes planned, rollback available |
| **Community backlash** | 🟢 LOW (5%) | 🟢 LOW | Transparent process, user testing in staging |

### 7.3 Timeline Risks

| Risk | Probability | Impact | Mitigation |
|------|-------------|--------|------------|
| **Phase 1 delay (CVE not patched)** | 🟡 MEDIUM (20%) | 🔴 HIGH | Buffer 2 weeks, escalate to Alpine Security Team |
| **Phase 3 extended testing** | 🟡 MEDIUM (40%) | 🟡 MEDIUM | Allocate 2 weeks for comprehensive testing |
| **Production rollback required** | 🟢 LOW (10%) | 🔴 HIGH | Rollback procedure practiced, <20min downtime |

---

## Success Metrics

### 8.1 Security Metrics

| Metric | Baseline (Debian) | Target (Alpine) | Success Criteria |
|--------|------------------|-----------------|------------------|
| CRITICAL CVEs | 0 | 0 | ✅ Maintained |
| HIGH CVEs | 7 | 0 | ✅ 100% reduction |
| MEDIUM CVEs | 20 | <15 | ✅ 25% reduction |
| glibc CVEs | 7 | 0 | ✅ Eliminated (musl) |
| Attack Surface (Base Image) | 120MB | 7MB | ✅ 94% reduction |

### 8.2 Performance Metrics

| Metric | Baseline (Debian) | Target (Alpine) | Success Criteria |
|--------|------------------|-----------------|------------------|
| Image Size (Final) | 350MB | 220MB | ✅ 37% reduction |
| API Latency (P99) | 200ms | <220ms | ✅ <10% increase |
| Memory Usage (Idle) | 180MB | <200MB | ✅ <10% increase |
| Startup Time | 15s | <18s | ✅ <20% increase |

### 8.3 Quality Metrics

| Metric | Baseline (Debian) | Target (Alpine) | Success Criteria |
|--------|------------------|-----------------|------------------|
| E2E Test Pass Rate | 100% (544/544) | 100% | ✅ Maintained |
| Backend Coverage | 85% | ≥85% | ✅ Maintained |
| Frontend Coverage | 82% | ≥82% | ✅ Maintained |
| Integration Tests | 100% pass | 100% pass | ✅ Maintained |

### 8.4 User Experience Metrics

| Metric | Baseline (Debian) | Target (Alpine) | Success Criteria |
|--------|------------------|-----------------|------------------|
| Feature Parity | 100% | 100% | ✅ No regressions |
| Bug Reports (30 days) | <5 | <10 | ✅ Acceptable increase |
| User Satisfaction | 90% | ≥85% | ✅ Minor drop acceptable |

---

## Post-Migration Monitoring

### 9.1 Continuous Monitoring (First 90 Days)

**Daily Checks (Automated):**
```yaml
# .github/workflows/alpine-monitoring.yml
name: Alpine Security Monitoring
on:
  schedule:
    - cron: '0 2 * * *' # Daily at 02:00 UTC

permissions:
  contents: read
  issues: write # required for `gh issue create`

jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      # Makes grype-baseline.sarif from the repository available
      - uses: actions/checkout@v4

      - name: Install Grype
        run: curl -sSfL https://raw.githubusercontent.com/anchore/grype/main/install.sh | sudo sh -s -- -b /usr/local/bin

      - name: Pull latest Alpine image
        run: docker pull wikid82/charon:latest

      - name: Scan with Grype
        run: grype wikid82/charon:latest --fail-on high --output sarif > grype.sarif

      - name: Compare with baseline
        if: always() # still runs when the scan step fails on a HIGH finding
        env:
          GH_TOKEN: ${{ github.token }}
        run: |
          diff grype-baseline.sarif grype.sarif || \
            gh issue create --title "New CVE detected in Alpine image" \
              --body "$(cat grype.sarif)"
```

**Weekly Performance Reviews:**
- API latency percentiles (P50, P95, P99)
- Memory usage trends
- Error rate changes
- User-reported issues

**Monthly CVE Reports:**
- Count of HIGH/CRITICAL CVEs
- Comparison with Debian Trixie
- Risk acceptance review
- Security advisory updates

### 9.2 Alerting Thresholds

**Immediate Escalation (Slack + PagerDuty):**
- CRITICAL CVE discovered in Alpine base image
- Container crash loop (>3 restarts in 5 minutes)
- API error rate >5%
- Memory usage >90%

**Daily Alert (Slack):**
- New HIGH CVE in Alpine packages
- E2E test failures in CI
- Performance degradation >10% vs baseline

**Weekly Report (Email):**
- CVE scan summary
- Performance metrics trend
- User feedback summary

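The numeric thresholds above translate directly into a routing rule. The sketch below is illustrative only: `alert_level` is a hypothetical helper, it covers just the three percentage-based conditions (API error rate, memory usage, performance degradation), and it leaves the CVE, crash-loop, and CI conditions to their own checks.

```bash
# alert_level ERROR_RATE_PCT MEMORY_PCT PERF_DEGRADATION_PCT (hypothetical helper)
# Prints "immediate", "daily", or "none" according to the thresholds in 9.2.
alert_level() {
  awk -v err="$1" -v mem="$2" -v perf="$3" 'BEGIN {
    if (err > 5 || mem > 90) print "immediate"   # Slack + PagerDuty
    else if (perf > 10)      print "daily"       # Slack
    else                     print "none"
  }'
}
```

For instance, `alert_level 1 50 12` reports a daily alert, because only the performance degradation exceeds its 10% threshold.
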
### 9.3 Maintenance Schedule

**Monthly Tasks:**
1. Update Alpine base image to latest patch version (Renovate automated)
2. Re-run full E2E test suite
3. Review and update CVE risk acceptance documents
4. Check Alpine Security Advisory for upcoming patches

**Quarterly Tasks:**
1. Major Alpine version upgrade (e.g., 3.23 → 3.24)
2. Comprehensive security audit (Grype + Trivy + CodeQL)
3. Performance benchmarking vs Debian
4. SBOM regeneration and validation

---

## Appendices

### A. Alpine Security Resources

- **Alpine Security Advisories:** https://security.alpinelinux.org/
- **Alpine Package Search:** https://pkgs.alpinelinux.org/packages
- **Alpine Wiki - musl vs glibc:** https://wiki.alpinelinux.org/wiki/Comparison_with_other_distros
- **Go on Alpine:** https://wiki.alpinelinux.org/wiki/Go

### B. Related Documentation

- **Current Security Advisory:** `docs/security/advisory_2026-02-01_base_image_cves.md`
- **QA Report (Debian CVEs):** `docs/reports/qa_report.md` (Section 5.2)
- **Alpine Vulnerability Acceptance:** `docs/security/VULNERABILITY_ACCEPTANCE.md`
- **Docker Best Practices:** `.github/instructions/containerization-docker-best-practices.instructions.md`

### C. Contacts

- **Security Team Lead:** security-lead@example.com
- **DevOps Lead:** devops-lead@example.com
- **Alpine Security Team:** security@alpinelinux.org (for CVE inquiries)
- **Community Forum:** https://gitlab.alpinelinux.org/alpine/aports/-/issues

### D. Approval Sign-Off

**Planning Approval:**
- [ ] Security Team Lead
- [ ] Backend Team Lead
- [ ] DevOps Team Lead
- [ ] QA Team Lead
- [ ] Product Manager

**Implementation Approval (Phase 2 Go/No-Go):**
- [ ] Alpine CVE verification complete (Test 1 passed)
- [ ] PoC build successful (Test 3 passed)
- [ ] Rollback procedure validated

**Production Deployment Approval (Phase 6 Go/No-Go):**
- [ ] All blocking tests passed (Tests 5, 6, 8)
- [ ] Performance within acceptable range (<10% regression)
- [ ] Zero HIGH/CRITICAL CVEs (or documented risk acceptance)
- [ ] Staging deployment successful (48 hours stable)

---

**Document Status:** 📋 **DRAFT - AWAITING APPROVAL**

**Next Steps:**
1. Review this plan with Security Team (verify CVE research)
2. Obtain approvals from all stakeholders
3. Execute Phase 1 (CVE verification) - BLOCKING STEP
4. Schedule Phase 2 kickoff meeting (if Phase 1 successful)

**Estimated Start Date:** February 5, 2026 (pending approval)
**Estimated Completion Date:** March 5, 2026 (5 weeks total)