Active Issue: Creating a Proxy Host triggers Docker socket 500
Bug report: “When trying to create a new proxy host, connection to the local docker socket is giving a 500 error.”
Status: Trace analysis complete (no code changes in this phase)
Last updated: 2025-12-22
1) Trace Analysis (MANDATORY)
This workflow has two coupled request paths:
1. Creating/saving the Proxy Host itself (`POST /api/v1/proxy-hosts`).
2. Populating the “Containers” quick-select (Docker integration) used during Proxy Host creation (`GET /api/v1/docker/containers`).
The reported 500 is thrown in (2), but it is experienced during the Proxy Host creation flow because the UI fetches containers from the local Docker socket when the user selects “Local (Docker Socket)”.
A) Frontend: UI entrypoint -> hooks
- `frontend/src/pages/ProxyHosts.tsx`
  - Component: `ProxyHosts`
  - Key functions:
    - `handleAdd()` sets `showForm=true` and clears `editingHost`.
    - `handleSubmit(data: Partial<ProxyHost>)` calls `createHost(data)` (new host) or `updateHost(uuid, data)` (edit).
  - Renders `ProxyHostForm` when `showForm` is true.
- `frontend/src/components/ProxyHostForm.tsx`
  - Component: `ProxyHostForm({ host, onSubmit, onCancel })`
  - Default form state (`formData`) is constructed with UI defaults (notably many booleans default to `true`).
  - Docker quick-select integration:
    - Local state: `connectionSource` defaults to `'custom'`.
    - Hook call: `useDocker(connectionSource === 'local' ? 'local' : undefined, connectionSource !== 'local' && connectionSource !== 'custom' ? connectionSource : undefined)`
      - When `connectionSource` is `'local'`, `useDocker(host='local', serverId=undefined)`.
      - When `connectionSource` is a remote server UUID, `useDocker(host=undefined, serverId='<uuid>')`.
  - Docker container select -> form transforms:
    - `handleContainerSelect(containerId)`:
      - chooses `forward_host` and `forward_port` from container `ip` + `private_port`, or uses `RemoteServer.host` + mapped `public_port` when a remote server source is selected.
      - auto-detects an `application` preset from `container.image`.
      - optionally auto-fills `domain_names` from a selected base domain.
  - Submit: `handleSubmit(e)` builds `payloadWithoutUptime` and calls `onSubmit(payloadWithoutUptime)`.
- `frontend/src/hooks/useProxyHosts.ts`
  - Hook: `useProxyHosts()`
  - `createHost` is `createMutation.mutateAsync` where `mutationFn: (host) => createProxyHost(host)`.
- `frontend/src/hooks/useDocker.ts`
  - Hook: `useDocker(host?: string | null, serverId?: string | null)`
  - Uses React Query:
    - `queryKey: ['docker-containers', host, serverId]`
    - `queryFn: () => dockerApi.listContainers(host || undefined, serverId || undefined)`
    - `retry: 1`
    - `enabled: host !== null || serverId !== null`
      - Important behavior: if both params are `undefined`, this expression evaluates to `true` (`undefined !== null`).
      - Result: the hook can still issue `GET /docker/containers` even when `connectionSource` is `'custom'` (because the hook is called with `undefined, undefined`).
      - This is not necessarily the reported bug, but it is an observable logic hazard that increases the frequency of local Docker socket access.
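The `enabled` hazard is easy to check in isolation. The sketch below copies the expression quoted above into a standalone function and evaluates it for the argument combinations the form can produce. The function name `currentEnabled`, the `Selector` alias, and the case labels are hypothetical and exist only for this illustration; they are not taken from the hook's source.

```typescript
// Selector values as the hook signature allows them.
type Selector = string | null | undefined;

// Hypothetical stand-in that mirrors the `enabled` expression quoted above.
function currentEnabled(host: Selector, serverId: Selector): boolean {
  // Strict inequality treats `undefined` and `null` as different values,
  // so an omitted argument still counts as "set" here.
  return host !== null || serverId !== null;
}

// Argument combinations the form can produce, plus the explicit-null case.
const cases: Array<[string, Selector, Selector]> = [
  ["custom (both undefined)", undefined, undefined],
  ["local", "local", undefined],
  ["remote server", undefined, "<uuid>"],
  ["explicitly disabled (both null)", null, null],
];

// Only the last case evaluates to false; the "custom" case stays enabled.
const enabledByCase = cases.map(([label, host, serverId]) => ({
  label,
  enabled: currentEnabled(host, serverId),
}));
```

Only the case where both arguments are explicitly `null` disables the query, which is why the "custom" source can still reach the Docker endpoint.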
B) Frontend: API client and payload shapes
- `frontend/src/api/client.ts`
  - Axios instance with `baseURL: '/api/v1'`.
  - All calls below are relative to `/api/v1`.
- `frontend/src/api/proxyHosts.ts`
  - Function: `createProxyHost(host: Partial<ProxyHost>)`
    - Request: `POST /proxy-hosts`
    - Payload shape (snake_case; subset of):
      - `name: string`
      - `domain_names: string`
      - `forward_scheme: string`
      - `forward_host: string`
      - `forward_port: number`
      - `ssl_forced: boolean`
      - `http2_support: boolean`
      - `hsts_enabled: boolean`
      - `hsts_subdomains: boolean`
      - `block_exploits: boolean`
      - `websocket_support: boolean`
      - `enable_standard_headers?: boolean`
      - `application: 'none' | ...`
      - `locations: Array<{ uuid?: string; path: string; forward_scheme: string; forward_host: string; forward_port: number }>`
      - `advanced_config?: string` (JSON string)
      - `enabled: boolean`
      - `certificate_id?: number | null`
      - `access_list_id?: number | null`
      - `security_header_profile_id?: number | null`
    - Response: `ProxyHost` (same shape) from server.
- `frontend/src/api/docker.ts`
  - Function: `dockerApi.listContainers(host?: string, serverId?: string)`
    - Request: `GET /docker/containers`
    - Query params:
      - `host=<string>` (e.g., `local`) OR `server_id=<uuid>` (remote server UUID)
    - Response payload shape (array of `DockerContainer`):
      - `id: string`
      - `names: string[]`
      - `image: string`
      - `state: string`
      - `status: string`
      - `network: string`
      - `ip: string`
      - `ports: Array<{ private_port: number; public_port: number; type: string }>`
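For reference, the Docker list contract above can be written down as types plus a small query builder. This is a sketch under stated assumptions: the field names come from the shapes listed above, but `DockerPort`, `buildContainerQuery`, and `exampleContainer` (including every value in it) are hypothetical names and placeholder data, and the real client may assemble its params differently.

```typescript
// Field names follow the response shape listed above.
interface DockerPort {
  private_port: number;
  public_port: number;
  type: string;
}

interface DockerContainer {
  id: string;
  names: string[];
  image: string;
  state: string;
  status: string;
  network: string;
  ip: string;
  ports: DockerPort[];
}

// Hypothetical helper: builds the query params for GET /docker/containers,
// leaving out any selector that is unset or empty.
function buildContainerQuery(host?: string, serverId?: string): Record<string, string> {
  const params: Record<string, string> = {};
  if (host) params.host = host;
  if (serverId) params.server_id = serverId;
  return params;
}

// Placeholder values only; they show the shape, not real data.
const exampleContainer: DockerContainer = {
  id: "abc123",
  names: ["/web"],
  image: "nginx:latest",
  state: "running",
  status: "Up 2 hours",
  network: "bridge",
  ip: "172.17.0.2",
  ports: [{ private_port: 80, public_port: 8080, type: "tcp" }],
};
```

With this helper, `buildContainerQuery("local")` yields `{ host: "local" }`, and calling it with no arguments yields an empty object, which corresponds to the "no selectors" request discussed in the behavioral hazard.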
C) Backend: route definitions -> handlers
- `backend/internal/api/routes/routes.go`
  - Route group base: `/api/v1`.

Proxy Host routes:
- The `ProxyHostHandler` is registered on `api` (not the `protected` group):
  - `proxyHostHandler := handlers.NewProxyHostHandler(db, caddyManager, notificationService, uptimeService)`
  - `proxyHostHandler.RegisterRoutes(api)`
- Routes include:
  - `POST /api/v1/proxy-hosts` (create)
  - plus list/get/update/delete/test/bulk endpoints.
C1) Auth/Authz: intended exposure of Proxy Host routes
The current route registration places Proxy Host routes on the unprotected `api` group (not the `protected` auth-required group).
- Intended behavior (needs explicit confirmation): Proxy Host CRUD is accessible without auth.
- If unintended: move `ProxyHostHandler.RegisterRoutes(...)` under the `protected` group or enforce auth/authorization within the handler layer (deny-by-default).
- Either way: document the intended access model so the frontend and deployments can assume the correct security posture.
Docker routes:
- Docker routes are registered on `protected` (auth-required) and only if `services.NewDockerService()` returns `nil` error:
  - `dockerService, err := services.NewDockerService()`
  - `if err == nil { dockerHandler.RegisterRoutes(protected) }`
- Key route: `GET /api/v1/docker/containers`.

Clarification:
- `NewDockerService()` success is a client construction success, not a reachability/health guarantee.
- Result: the Docker endpoints may register at startup even when the Docker daemon/socket is unreachable, and failures will surface later per-request in `ListContainers`.
- `backend/internal/api/handlers/proxy_host_handler.go`
  - Handler type: `ProxyHostHandler`
  - Method: `Create(c *gin.Context)`
    - Input binding: `c.ShouldBindJSON(&host)` into `models.ProxyHost`.
    - Validations/transforms:
      - If `host.advanced_config != ""`, it must parse as JSON; it is normalized via `caddy.NormalizeAdvancedConfig` then re-marshaled back to a JSON string.
      - `host.UUID` is generated server-side.
      - Each `host.locations[i].UUID` is generated server-side.
    - Persistence: `h.service.Create(&host)`.
    - Side effects:
      - If `h.caddyManager != nil`, `ApplyConfig(ctx)` is called; on error, it attempts rollback by deleting the created host.
      - Notification emit via `notificationService.SendExternal(...)`.
    - Response: `201` with the persisted host JSON.
- `backend/internal/api/handlers/docker_handler.go`
  - Handler type: `DockerHandler`
  - Method: `ListContainers(c *gin.Context)`
    - Reads query parameters:
      - `host := c.Query("host")`
      - `serverID := c.Query("server_id")`
    - If `server_id` is provided:
      - `remoteServerService.GetByUUID(serverID)`
      - Constructs host: `tcp://<server.Host>:<server.Port>`
    - Calls: `dockerService.ListContainers(ctx, host)`
    - On error:
      - Returns `500` with JSON: `{ "error": "Failed to list containers: <err>" }`.
  - Security note (SSRF/network scanning): the `host` query param currently allows the caller to influence the Docker client target.
    - If `host` is accepted as an arbitrary value, this becomes an SSRF primitive (arbitrary outbound connections) and can be used for network scanning.
    - Preferred posture: do not accept user-supplied `host` for remote selection; use `server_id` as the only selector and resolve it server-side.
D) Backend: services -> Docker client wrapper -> persistence
- `backend/internal/services/proxyhost_service.go`
  - Service: `ProxyHostService`
  - `Create(host *models.ProxyHost)`:
    - Validates domain uniqueness by exact `domain_names` string match.
    - Normalizes `advanced_config` again (duplicates handler logic).
    - Persists via `db.Create(host)`.
- `backend/internal/models/proxy_host.go` and `backend/internal/models/location.go`
  - Persistence model: `models.ProxyHost` with snake_case JSON tags.
  - Related model: `models.Location`.
- `backend/internal/services/docker_service.go`
  - Wrapper: `DockerService`
  - `NewDockerService()`:
    - Creates Docker client via `client.NewClientWithOpts(client.FromEnv, client.WithAPIVersionNegotiation())`.
    - Important: this does not guarantee the daemon is reachable; it typically succeeds even if the socket is missing/unreachable, because it does not perform an API call.
  - `ListContainers(ctx, host string)`:
    - If `host == ""` or `host == "local"`:
      - uses the default client (local Docker socket via env defaults).
    - Else:
      - creates a new client with `client.WithHost(host)` (e.g., `tcp://...`).
    - Calls Docker API: `cli.ContainerList(ctx, container.ListOptions{All: false})`.
    - Maps Docker container data to `[]DockerContainer` response DTO (still local to the service file).
- `backend/internal/services/remoteserver_service.go` and `backend/internal/models/remote_server.go`
  - `RemoteServerService.GetByUUID(uuid)` loads `models.RemoteServer` used to build the remote Docker host string.
E) Where the 500 is likely being thrown (and why)
The reported 500 is thrown in:
- `backend/internal/api/handlers/docker_handler.go` in `ListContainers` when `dockerService.ListContainers(...)` returns an error.

The most likely underlying causes for the error returned by `DockerService.ListContainers` in the “local” case are:
- Local socket missing (no Docker installed or not running): `unix:///var/run/docker.sock` not present.
- Socket permissions (common): process user is not in the `docker` group, or the socket is root-only.
- Rootless Docker: the daemon socket is under the user runtime dir (e.g., `$XDG_RUNTIME_DIR/docker.sock`) and `client.FromEnv` isn’t pointing there.
- Containerized deployment without mounting the Docker socket into Charon.
- Context timeout or daemon unresponsive.
Because the handler converts any Docker error into a generic 500, the UI sees it as an application failure rather than “Docker unavailable” / “permission denied”.
F) Explicit mismatch check: frontend vs backend payload expectations
This needs to distinguish two different “contracts”:
- Schema contract (wire format): The JSON/query parameter names and shapes align.
- Behavioral contract (when calls happen): The frontend can initiate Docker calls even when neither selector is set (both `host` and `serverId` are `undefined`).
Answer:
- Schema contract: No evidence of a mismatch for either call.
- Behavioral contract: There is a mismatch/hazard in the frontend enablement condition that can produce calls with both selectors absent.
- Proxy Host create:
  - Frontend sends snake_case fields (e.g., `domain_names`, `forward_port`, `security_header_profile_id`).
  - Backend binds into `models.ProxyHost` which uses matching snake_case JSON tags.
  - Evidence: `models.ProxyHost` includes `json:"domain_names"`, `json:"forward_port"`, etc.
  - Note: `enable_standard_headers` is a `*bool` in the backend model and a boolean-ish field in the frontend; JSON `true`/`false` binds correctly into `*bool`.
- Docker list containers:
  - Frontend sends query params `host` and/or `server_id`.
  - Backend reads `host` and `server_id` exactly.
  - Evidence: `dockerApi.listContainers` constructs `{ host, server_id }`, and `DockerHandler.ListContainers` reads those exact query keys.
Behavioral hazard detail:
- In `useDocker`, `enabled: host !== null || serverId !== null` evaluates to `true` even when both values are `undefined`.
- Result: the frontend may call `GET /docker/containers` with neither `host` nor `server_id` set (effectively “default/local”), even when the user selected “Custom / Manual”.
- Recommendation: treat “no selectors” as disabled in the frontend, and consider a backend 400/validation guardrail if both are absent.
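One way to express the frontend half of that recommendation is a predicate that only counts a selector when it is a non-empty string. This is a minimal sketch, not the project's fix: `shouldFetchContainers` and the `Selector` alias are hypothetical names introduced here.

```typescript
type Selector = string | null | undefined;

// Hypothetical replacement for the `enabled` expression: a selector counts
// as set only when it is a non-empty string, so `undefined`, `null`, and ""
// all leave the query disabled.
function shouldFetchContainers(host: Selector, serverId: Selector): boolean {
  const isSet = (value: Selector): boolean =>
    typeof value === "string" && value.length > 0;
  return isSet(host) || isSet(serverId);
}
```

In `useDocker`, the result would be passed as `enabled: shouldFetchContainers(host, serverId)`, so the “Custom / Manual” source (both arguments `undefined`) no longer triggers a request.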
2) Reproduction & Observability
Local reproduction steps (UI)
- Start Charon and log in.
- Navigate to “Proxy Hosts”.
- Click “Add Proxy Host”.
- In the form, set “Source” to “Local (Docker Socket)”.
- Observe the Containers dropdown attempts to load.
API endpoint involved
- `GET /api/v1/docker/containers?host=local`
  - (Triggered by the “Source: Local (Docker Socket)” selection.)
Expected vs actual
- Expected:
  - Containers list appears, allowing the user to pick a container and auto-fill forward host/port.
  - If Docker is unavailable, the UI should show a clear “Docker unavailable” or “permission denied” message and not treat it as a generic server failure.
- Actual:
  - API responds `500` with `{"error":"Failed to list containers: ..."}`.
  - UI shows “Failed to connect: ” under the Containers select when the source is not “Custom / Manual”.
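The gap between expected and actual behavior can be illustrated from the UI side with a small message mapper. This is a sketch under assumptions: `describeDockerError` is a hypothetical helper, the `503` branch only becomes reachable if the backend adopts the status proposed in Phase 2, and matching on the substring "permission denied" assumes the underlying Docker error text is passed through in the `error` field.

```typescript
// Hypothetical helper: turns an HTTP status plus the server's `error` string
// into the message shown under the Containers select.
function describeDockerError(httpStatus: number, serverMessage?: string): string {
  const detail = (serverMessage ?? "").toLowerCase();

  // Assumption: the backend forwards the Docker client's error text.
  if (detail.includes("permission denied")) {
    return "Docker socket permission denied";
  }
  // Assumption: 503 is returned for connectivity failures (Phase 2 proposal).
  if (httpStatus === 503) {
    return "Docker unavailable";
  }
  // Fallback keeps today's generic wording, with the detail when present.
  return serverMessage ? `Failed to connect: ${serverMessage}` : "Failed to connect";
}
```

With today's backend every failure arrives as `500`, so only the permission branch and the generic fallback would fire until the status mapping changes.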
Where to look for logs
- Backend request logging middleware is enabled in `backend/cmd/api/main.go`:
  - `router.Use(middleware.RequestID())`
  - `router.Use(middleware.RequestLogger())`
  - `router.Use(middleware.Recovery(cfg.Debug))`
  - Expect to see request logs with status/latency for `/api/v1/docker/containers`.
- `DockerHandler.ListContainers` currently returns JSON errors but does not emit a structured log line for the underlying Docker error; only request logs will show the 500 unless the error causes a panic (unlikely).
3) Proposed Plan (after Trace Analysis)
Phased remediation with minimal changes, ordered for fastest user impact.
Phase 1: Make the UI stop calling Docker unless explicitly requested
- Files:
  - `frontend/src/hooks/useDocker.ts`
  - (Optional) `frontend/src/components/ProxyHostForm.tsx`
- Intended changes (high level):
  - Ensure the Docker containers query is disabled when no `host` and no `serverId` are set.
  - Keep “Source: Custom / Manual” truly free of Docker calls.
- Tests:
  - Add/extend a frontend test to confirm no request is made when `host` and `serverId` are both `undefined` (the undefined/undefined case).
Phase 2: Improve backend error mapping and message for Docker unavailability
- Files:
  - `backend/internal/api/handlers/docker_handler.go`
  - (Optional) `backend/internal/services/docker_service.go`
- Intended changes (high level):
  - Detect common Docker connectivity errors (socket missing, permission denied, daemon unreachable) and return a more accurate status (e.g., `503 Service Unavailable`) with a clearer message.
  - Add structured logging for the underlying error, including request_id.
  - Security/SSRF hardening:
    - Prefer `server_id` as the only remote selector.
    - Remove `host` from the public API surface if feasible; if it must remain, restrict it strictly (e.g., allow only `local` and/or a strict allow-list of configured endpoints).
    - Treat arbitrary `host` values as invalid input (deny-by-default) to prevent SSRF/network scanning.
- Tests:
  - Introduce a small interface around DockerService (or a function injection) so `DockerHandler` can be unit-tested without a real Docker daemon.
  - Add unit tests in `backend/internal/api/handlers/docker_handler_test.go` covering:
    - local Docker unavailable -> 503
    - invalid `server_id` -> 404
    - remote server host build -> correct host string
    - selector validation: both `host` and `server_id` absent should be rejected if the backend adopts a stricter contract (recommended).
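The selector rules proposed in this phase amount to a small decision table. The sketch below writes that table in TypeScript purely to make the cases concrete; the backend itself is Go, and `validateSelectors`, `SelectorResult`, and the error strings are hypothetical. Rejecting a request that carries both selectors is an assumption of this sketch, not something the plan states.

```typescript
// Outcome of selector validation: a resolved target or a 400-style rejection.
type SelectorResult =
  | { ok: true; target: "local" }
  | { ok: true; target: "remote"; serverId: string }
  | { ok: false; status: 400; error: string };

// Hypothetical decision table for the stricter contract (deny-by-default).
function validateSelectors(host?: string, serverId?: string): SelectorResult {
  if (host && serverId) {
    // Assumption of this sketch: ambiguous requests are rejected.
    return { ok: false, status: 400, error: "provide either host or server_id, not both" };
  }
  if (serverId) {
    // Remote targets are resolved server-side from the stored server record.
    return { ok: true, target: "remote", serverId };
  }
  if (host === "local") {
    return { ok: true, target: "local" };
  }
  if (host) {
    // Any other host value is treated as invalid input to avoid SSRF.
    return { ok: false, status: 400, error: "unsupported host value" };
  }
  return { ok: false, status: 400, error: "host or server_id is required" };
}
```

Each branch corresponds to one of the handler test cases listed above, so the same table can double as a checklist when writing `docker_handler_test.go`.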
Phase 3: Environment guidance and configuration surface
- Files:
  - `docs/debugging-local-container.md` (or another relevant doc page)
  - (Optional) backend config docs
- Intended changes (high level):
  - Document how to mount `/var/run/docker.sock` in containerized deployments.
  - Document rootless Docker socket path and `DOCKER_HOST` usage.
  - Provide a “Docker integration status” indicator in UI (optional, later).
4) Risks & Edge Cases
- Docker socket permissions:
  - On Linux, `/var/run/docker.sock` is typically owned by `root:docker` and requires membership in the `docker` group.
  - In containers, the effective UID/GID and group mapping matters.
- Rootless Docker:
  - Socket often at `unix:///run/user/<uid>/docker.sock` and requires `DOCKER_HOST` to point there.
  - The current backend uses `client.FromEnv`; if `DOCKER_HOST` is not set, it will default to the standard rootful socket path.
- Docker-in-Docker vs host socket mount:
  - If Charon runs inside a container, Docker access requires either:
    - mounting the host socket into the container, or
    - running DinD and pointing `DOCKER_HOST` to it.
- Path differences:
  - `/var/run/docker.sock` (common) vs `/run/docker.sock` (symlinked on many distros) vs user socket paths.
- Remote server scheme/transport mismatch:
  - `DockerHandler` assumes TCP for remote Docker (`tcp://host:port`). If a remote server is configured but Docker only listens on a Unix socket or requires TLS, listing will fail.
- Security considerations:
  - SSRF/network scanning risk (high): if callers can control the Docker client target via `host`, the system can be coerced into arbitrary outbound connections.
    - Mitigation: remove `host` from the public API or restrict it to a strict allow-list; prefer `server_id` as the only remote selector.
  - Docker socket risk (high): mounting `/var/run/docker.sock` (even as `:ro`) is effectively Docker-admin.
    - Rationale: many Docker API operations are possible via read endpoints that still grant sensitive access; and “read-only bind mount” does not prevent Docker API actions if the socket is reachable.
    - Least-privilege deployment guidance: disable Docker integration unless needed, isolate Charon in a dedicated environment, avoid exposing remote Docker APIs publicly, and prefer restricted `server_id`-based selection with strict auth.
5) Tests & Validation Requirements
Required tests (definition of done for the remediation work)
- Frontend:
  - Add a test that asserts `useDocker(undefined, undefined)` does not issue a request (the undefined/undefined case).
  - Ensure the UI “Custom / Manual” path does not fetch containers implicitly.
- Backend:
  - Add handler unit tests for Docker routes using an injected/mocked docker service (no real Docker daemon required).
  - Add tests for selector validation and for error mapping (e.g., unreachable/permission denied -> 503).
Task-based validation steps (run via VS Code tasks)
- `Test: Backend with Coverage`
- `Test: Frontend with Coverage`
- `Lint: TypeScript Check`
- `Lint: Pre-commit (All Files)`
- `Security: Trivy Scan`
- `Security: Go Vulnerability Check`