---
title: "Migration to Alpine (Issue #631)"
status: "draft"
scope: "docker/alpine-migration"
notes: "This plan is not yet finished. You may add to it, but **DO NOT** overwrite it until PR #666 is complete."
---

## 1. Introduction

This plan defines the migration of the Charon Docker image base from
Debian Trixie Slim to Alpine Linux to address inherited glibc CVEs and
reduce image size (Issue #631). The plan consolidates the prior Alpine
migration research and translates it into a minimal-change, test-first
implementation path aligned with current CI and container workflows.

Objectives:

- Replace Debian-based runtime with Alpine 3.23.x while maintaining
  feature parity.
- Eliminate Debian glibc HIGH CVEs in the runtime image.
- Keep build stages compatible with multi-arch Buildx and existing
  supply chain checks.
- Validate DNS resolution, SQLite (CGO) behavior, and security suite
  functionality under musl.
- Review and update .gitignore, codecov.yml, .dockerignore, and
  Dockerfile as needed.

## 2. Research Findings

### 2.1 Existing Plans and Security Context

- A comprehensive Alpine migration specification already exists:
  docs/plans/alpine_migration_spec.md.
- The Debian CVE acceptance is temporary and explicitly tied to the
  Alpine migration:
  docs/security/VULNERABILITY_ACCEPTANCE.md.
- Past Alpine-related issues and trade-offs are documented, including
  musl DNS differences:
  docs/analysis/crowdsec_integration_failure_analysis.md.

### 2.2 Current Docker and CI Touchpoints

Primary files that must be considered for the migration:

- Dockerfile (multi-stage build with Debian runtime base).
- .docker/docker-entrypoint.sh (uses user/group management and tools
  that differ on Alpine).
- .docker/compose/docker-compose.yml (image tag references).
- .github/workflows/docker-build.yml (base image digest resolution and
  build args).
- .github/workflows/security-pr.yml and supply-chain-pr.yml (build and
  scan behaviors depend on the container layout).
- tools/dockerfile_check.sh (package manager validation).

### 2.3 Compatibility Summary (musl vs glibc)

Based on alpine_migration_spec.md and current runtime behavior:

- The Go services, Caddy, and CrowdSec are all Go binaries and are
  compatible with musl.
- SQLite is CGO-backed; ensure CGO remains enabled and libsqlite3 is
  available under musl, then validate runtime CRUD behavior.
- DNS resolution differences are the primary operational risk;
  mitigation is available by setting GODEBUG=netdns=go.
- Entrypoint uses Debian-specific user/group tools; Alpine requires
  adduser/addgroup or the shadow package.

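GODEBUG is a comma-separated list of key=value settings, so an operator who already sets it at runtime could unintentionally drop the netdns=go mitigation. The following is a minimal sketch of merging the setting into an existing value; ensure_netdns_go is a hypothetical helper name and is not part of the current entrypoint.

```sh
#!/bin/sh
# Sketch: make sure a GODEBUG value carries netdns=go without dropping
# other settings. ensure_netdns_go is a hypothetical helper (assumption).
ensure_netdns_go() {
    current="$1"
    result=""
    # GODEBUG is a comma-separated list of key=value pairs.
    old_ifs="$IFS"
    IFS=','
    for pair in $current; do
        case "$pair" in
            netdns=*|"") ;;                          # drop old netdns entries and empty fields
            *) result="${result:+$result,}$pair" ;;  # keep every other setting
        esac
    done
    IFS="$old_ifs"
    printf '%s\n' "${result:+$result,}netdns=go"
}
```

An entrypoint could apply it with `export GODEBUG="$(ensure_netdns_go "${GODEBUG:-}")"`. If GODEBUG is only ever set through the Dockerfile ENV instruction, this helper is unnecessary.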
## 3. Technical Specifications

### 3.1 Target Base Image

- Runtime base: alpine:3.23.x pinned by digest (Renovate-managed).
- Build stages: switch to alpine-based golang/node images where required
  to use apk/xx-apk consistently.
- Build-stage images should be digest-pinned when feasible. If a digest
  pin is not practical (e.g., multi-arch tag compatibility), document
  the reason and keep the tag Renovate-managed.

### 3.2 Dockerfile Changes (Stage-by-Stage)

Stages and expected changes (paths and stage names are current):

1) gosu-builder (Dockerfile):
   - Replace apt-get with apk.
   - Replace xx-apt with xx-apk.
   - Expected packages: git, clang, lld, gcc, musl-dev.

2) frontend-builder (Dockerfile):
   - Use node:24.x-alpine.
   - Keep npm_config_rollup_skip_nodejs_native settings for cross-arch
     builds.

3) backend-builder (Dockerfile):
   - Replace apt-get with apk.
   - Replace xx-apt with xx-apk.
   - Expected packages: clang, lld, gcc, musl-dev, sqlite-dev.

4) caddy-builder (Dockerfile):
   - Replace apt-get with apk.
   - Expected packages: git.

5) crowdsec-builder (Dockerfile):
   - Replace apt-get with apk.
   - Replace xx-apt with xx-apk.
   - Expected packages: git, clang, lld, gcc, musl-dev.

6) crowdsec-fallback (Dockerfile):
   - Replace debian:trixie-slim with alpine:3.23.x.
   - Use apk add curl ca-certificates (tar is provided by busybox).

7) final runtime stage (Dockerfile):
   - Replace CADDY_IMAGE base from Debian to Alpine.
   - Replace apt-get with apk add.
   - Runtime packages: bash, ca-certificates, sqlite-libs, sqlite,
     tzdata, curl, gettext, libcap, c-ares, binutils, libc-utils
     (for getent), busybox-extras or coreutils (for timeout),
     libcap-utils (for setcap).
   - Add ENV GODEBUG=netdns=go to mitigate musl DNS edge cases.

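Because several runtime packages are installed only for a single command (getent, timeout, setcap), a quick smoke check inside the built image can confirm that those commands actually landed on PATH. The sketch below is an illustration under assumptions: missing_tools is a hypothetical helper, and the command names in the usage comment are my guess at what each listed package is expected to provide.

```sh
#!/bin/sh
# Sketch: report which expected commands are missing from PATH.
# missing_tools is a hypothetical helper (assumption); it prints one
# missing command name per line and prints nothing when all are present.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Example, run inside the built image (the command list is an assumption):
#   missing_tools bash curl sqlite3 getent timeout setcap envsubst
```

An empty result means every listed command resolved; any printed name points at a package that still needs to be added or renamed for Alpine.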
### 3.3 Entrypoint Adjustments

File: .docker/docker-entrypoint.sh

Functions and command usage that must be Alpine-safe:

- is_root(): no change.
- run_as_charon(): no change.
- Docker socket group handling:
  - Replace groupadd/usermod with addgroup/adduser if shadow tools are
    not installed.
  - If using getent, ensure libc-utils is installed or implement a
    /etc/group parsing fallback.
- CrowdSec initialization:
  - Ensure sed -i usage is compatible with busybox sed.
  - Verify timeout is available (busybox provides timeout).

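The getent fallback mentioned above can be sketched as follows. This is a minimal illustration, not the entrypoint's actual code: group_name_for_gid is a hypothetical helper, and the optional second argument exists only so the parsing path can be exercised against a test file.

```sh
#!/bin/sh
# Sketch: resolve a group name from a GID, preferring getent and falling
# back to parsing a group file. group_name_for_gid is a hypothetical
# helper (assumption), not an existing entrypoint function.
group_name_for_gid() {
    gid="$1"
    group_file="${2:-/etc/group}"
    # Use getent only for the system database; a custom file always uses awk.
    if [ "$group_file" = "/etc/group" ] && command -v getent >/dev/null 2>&1; then
        getent group "$gid" | cut -d: -f1
        return
    fi
    # Group file format: name:password:gid:member,member
    awk -F: -v gid="$gid" '$3 == gid { print $1; exit }' "$group_file"
}

# With busybox tools, the follow-up steps could look like this
# (group and user names are assumptions):
#   addgroup -g "$gid" docker     # create the group with the socket's GID
#   addgroup charon docker        # add the charon user to that group
```

The function prints nothing when no group owns the GID, which lets the caller decide whether to create one.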
### 3.4 CI and Workflow Updates

File: .github/workflows/docker-build.yml

- Replace the "Resolve Debian base image digest" step with one that
  pulls alpine:3.23.x and resolves its digest.
- Update CADDY_IMAGE build-arg to use the Alpine digest.
- Ensure buildx cache and tag logic remain unchanged.

No changes are expected to security-pr.yml and supply-chain-pr.yml
unless the container layout changes (paths used for binary extraction
and SBOM remain consistent).

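Whatever command the workflow uses to resolve the digest, it may be worth validating the result before passing it as a build-arg, so that an empty or tag-only value fails fast. A minimal sketch, assuming a hypothetical helper named is_digest_pinned:

```sh
#!/bin/sh
# Sketch: accept only image references pinned as name[:tag]@sha256:<64 hex>.
# is_digest_pinned is a hypothetical helper for a CI step (assumption).
is_digest_pinned() {
    printf '%s\n' "$1" | grep -Eq '^[^@[:space:]]+@sha256:[0-9a-f]{64}$'
}

# Example in a workflow step (the variable name is an assumption):
#   is_digest_pinned "$RESOLVED_IMAGE" || { echo "unpinned base image" >&2; exit 1; }
```

The check is purely syntactic; it does not confirm that the digest exists in the registry or matches the intended tag.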
### 3.5 Data Flow and Runtime Behavior

```mermaid
flowchart LR
    A[Docker Build] --> B[Multi-stage build on Alpine]
    B --> C[Runtime: alpine base + charon + caddy + crowdsec]
    C --> D[Entrypoint initializes volumes, CrowdSec, Caddy]
    D --> E[Charon API + UI]
```

### 3.6 Requirements (EARS Notation)

- WHEN the Docker image is built, THE SYSTEM SHALL use Alpine 3.23.x
  as the runtime base image.
- WHEN the container starts, THE SYSTEM SHALL create the charon user
  and groups using Alpine-compatible tools.
- WHEN DNS resolution is performed, THE SYSTEM SHALL use the Go DNS
  resolver to avoid musl NSS limitations.
- WHEN SQLite-backed operations run, THE SYSTEM SHALL read and write
  data with CGO enabled and no schema errors under musl.
- IF Alpine package CVEs reappear at HIGH or CRITICAL, THEN THE SYSTEM
  SHALL fail the security gate and block release.

## 4. Implementation Plan (Minimal-Request Phases)

### Phase 1: Playwright Tests (Behavior Baseline)

- Rebuild the E2E container when Docker build inputs change, then run
  E2E smoke tests before any unit or integration tests to establish the
  UI baseline (tests/). Focus on login, proxy host CRUD, and security
  toggles.
- Record baseline timings for key flows to compare after migration.

### Phase 2: Backend Implementation (Runtime and Container)

- Update Dockerfile stages to Alpine equivalents (see Section 3.2).
- Update .docker/docker-entrypoint.sh for Alpine user/group commands and
  tool availability (see Section 3.3).
- Add ENV GODEBUG=netdns=go to Dockerfile runtime stage.
- Update tools/dockerfile_check.sh to validate apk and xx-apk usage in
  Alpine-based stages, replacing any Debian-specific checks.
- Run tools/dockerfile_check.sh and capture results for apk/xx-apk
  verification.
- Validate crowdsec and caddy binaries remain in the same paths:
  /usr/bin/caddy, /usr/local/bin/crowdsec, /usr/local/bin/cscli.

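One check that tools/dockerfile_check.sh could gain is a scan for leftover Debian package tooling. The sketch below is an assumption about how such a check might look and does not reflect the script's current contents; find_debian_pm_calls is a hypothetical helper name.

```sh
#!/bin/sh
# Sketch: list Dockerfile lines that still reference Debian package tooling.
# find_debian_pm_calls is a hypothetical helper (assumption), not the real
# tools/dockerfile_check.sh logic.
find_debian_pm_calls() {
    dockerfile="$1"
    # Match apt, apt-get, xx-apt, and xx-apt-get as whole words and print
    # line numbers. apk and xx-apk do not match.
    grep -nE '(^|[^[:alnum:]_-])(xx-)?apt(-get)?([^[:alnum:]_-]|$)' "$dockerfile"
}

# Example gate:
#   if find_debian_pm_calls Dockerfile; then
#       echo "Debian package tooling still present" >&2
#       exit 1
#   fi
```

The pattern also flags comments and paths such as /var/lib/apt/lists, which is usually desirable after the migration but may need an allowlist if Debian is mentioned intentionally.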
### Phase 3: Frontend Implementation

- No application-level frontend changes expected.
- Ensure frontend build stage uses node:24.x-alpine in Dockerfile.

### Phase 4: Integration and Testing

- Rebuild E2E container and run Playwright suite (Docker mode).
- Run targeted integration tests:
  - CrowdSec integration workflows.
  - WAF and rate-limit workflows.
- Validate DNS challenges for at least one provider (Cloudflare).
- Validate SQLite CGO operations using health endpoints and basic CRUD.
- Validate multi-arch Buildx output and supply-chain workflows for the
  Docker image:
  - .github/workflows/docker-build.yml
  - .github/workflows/security-pr.yml
  - .github/workflows/supply-chain-pr.yml
- Run Trivy image scan and verify no HIGH/CRITICAL findings.

### Phase 5: Documentation and Deployment

- Update ARCHITECTURE.md to reflect Alpine base image.
- Update docs/security/VULNERABILITY_ACCEPTANCE.md to close the Debian
  CVE acceptance and note Alpine status.
- Update any Docker guidance in README or .docker/README.md if it
  references Debian.

## 5. Config Hygiene Review (Requested Files)

### 5.1 .gitignore

- No new ignore patterns required for Alpine migration.
- Verify no new build artifacts are introduced (apk cache is in-image
  only).

### 5.2 .dockerignore

- No changes required; keep excluding docs and CI artifacts to minimize
  build context size.

### 5.3 codecov.yml

- No changes required; migration does not add new code paths that should
  be excluded from coverage.

### 5.4 Dockerfile (Required)

- Update base images and package manager usage per Section 3.2.
- Add GODEBUG=netdns=go in runtime stage.
- Replace useradd/groupadd with adduser/addgroup or add shadow tools if
  preferred.

## 6. Acceptance Criteria

- The Docker image builds on Alpine with no build-stage failures.
- Runtime container starts with non-root user and no permission errors.
- All Playwright E2E tests pass against the Alpine-based container.
- Integration tests (CrowdSec, WAF, Rate Limit) pass without regressions.
- Trivy image scan reports zero HIGH/CRITICAL CVEs in the runtime image.
- tools/dockerfile_check.sh passes with apk and xx-apk checks for all
  Alpine-based stages.
- Multi-arch Buildx validation succeeds and supply-chain workflows
  (docker-build.yml, security-pr.yml, supply-chain-pr.yml) complete with
  no regressions.
- ARCHITECTURE.md and security acceptance docs reflect Alpine as the
  runtime base.

## 7. Risks and Mitigations

- Risk: musl DNS resolver differences cause ACME or webhook failures.
- Mitigation: set GODEBUG=netdns=go and run DNS provider tests.

- Risk: Alpine user/group tooling mismatch breaks Docker socket handling.
- Mitigation: adjust entrypoint to use adduser/addgroup or install
  shadow tools and libc-utils for getent.

- Risk: SQLite CGO compatibility issues.
- Mitigation: run database integrity checks and CRUD tests.

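The SQLite mitigation can be partly exercised from the shell with the sqlite3 CLI that the runtime image is expected to ship. The sketch below uses hypothetical helper names and a scratch table called migration_probe (both assumptions). It exercises the CLI's SQLite build, not the CGO driver linked into the Charon binary, so it complements the application-level CRUD tests rather than replacing them.

```sh
#!/bin/sh
# Sketch: database checks via the sqlite3 CLI. Both helpers are
# hypothetical (assumptions). Run them against a copy of the database,
# never the live file.

# Succeeds only when PRAGMA integrity_check reports "ok".
sqlite_integrity_ok() {
    db_path="$1"
    [ "$(sqlite3 "$db_path" 'PRAGMA integrity_check;')" = "ok" ]
}

# Create, read, update, and delete one row in a scratch table.
# Prints the updated value read back from the table.
sqlite_crud_roundtrip() {
    db_path="$1"
    sqlite3 "$db_path" <<'SQL'
CREATE TABLE IF NOT EXISTS migration_probe (id INTEGER PRIMARY KEY, note TEXT);
INSERT INTO migration_probe (note) VALUES ('before');
UPDATE migration_probe SET note = 'after' WHERE note = 'before';
SELECT note FROM migration_probe;
DELETE FROM migration_probe;
DROP TABLE migration_probe;
SQL
}
```

A round trip that prints the updated value, followed by a passing integrity check, suggests the musl-linked SQLite library reads and writes the file format correctly.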
## 8. Confidence Score

Confidence: 84 percent

Rationale: Alpine migration has a detailed existing spec and low code
surface change, but runtime differences (musl DNS, user/group tooling)
require careful validation.