Custom DNS Provider Plugin Support - Feature Specification

Status: 📋 Planning (Revised)
Priority: P2 (Medium)
Estimated Time: 48-68 hours
Author: Planning Agent
Date: January 8, 2026
Last Revised: January 8, 2026
Related: Phase 5 Custom Plugins Spec


1. Executive Summary

Problem Statement

Charon currently supports 10 built-in DNS providers for ACME DNS-01 challenges:

  • Cloudflare, Route53, DigitalOcean, Hetzner, DNSimple, Vultr, GoDaddy, Namecheap, Google Cloud DNS, Azure

Users with DNS services not on this list cannot obtain wildcard certificates or use DNS-01 challenges. This limitation affects:

  • Organizations using self-hosted DNS (BIND, PowerDNS, Knot DNS)
  • Users of regional/niche DNS providers
  • Enterprise environments with custom DNS APIs
  • Air-gapped or on-premise deployments

Proposed Solution

Implement multiple extensibility mechanisms that balance ease-of-use with flexibility:

| Option | Target User | Complexity | Automation Level |
|---|---|---|---|
| A: Webhook Plugin | DevOps, Integration teams | Medium | Full |
| B: Script Plugin | Sysadmins, Power users | Low-Medium | Full |
| C: RFC 2136 Plugin | Self-hosted DNS admins | Medium | Full |
| D: Manual Plugin | One-off certs, Testing | None | Manual |

Success Criteria

  • Users can obtain certificates using any DNS provider
  • At least one plugin option is production-ready within 2 weeks
  • Existing built-in providers continue to work unchanged
  • 85% test coverage maintained

2. User Stories

2.1 Webhook Plugin (Option A)

As a DevOps engineer with a custom DNS API, I want to provide webhook endpoints so Charon can automate DNS challenges without building a custom integration.

Acceptance Criteria:

  • I can configure URLs for create/delete TXT record operations
  • Charon sends JSON payloads with record details
  • I can set custom headers for authentication
  • Retry logic handles temporary failures

2.2 Script Plugin (Option B)

As a system administrator, I want to run a shell script when Charon needs to create/delete TXT records so I can use my existing DNS automation tools.

Acceptance Criteria:

  • I can specify a script path inside the container
  • Script receives ACTION, FQDN, TOKEN, VALUE as positional arguments
  • Script exit code determines success/failure
  • Timeout prevents hung scripts

2.3 RFC 2136 Plugin (Option C)

As a network engineer running BIND or PowerDNS, I want to use RFC 2136 Dynamic DNS Updates so Charon integrates with my existing infrastructure.

Acceptance Criteria:

  • I can configure DNS server address and TSIG key
  • Charon sends standards-compliant UPDATE messages
  • Zone detection works automatically
  • Works with BIND9, PowerDNS, Knot DNS

2.4 Manual Plugin (Option D)

As a user with an unsupported provider, I want Charon to show me the required TXT record details so I can create it manually.

Acceptance Criteria:

  • UI clearly displays the record name and value
  • I can copy values with one click
  • "Verify" button checks if record exists
  • Progress indicator shows timeout countdown

2.5 General Stories

As an administrator, I want to see all available DNS provider types (built-in + custom) in a unified list.

As a security officer, I want custom plugin configurations to be validated and logged for audit purposes.


3. Architecture Analysis

3.1 Current Plugin System

Charon already has a well-designed plugin architecture in backend/pkg/dnsprovider/:

backend/pkg/dnsprovider/
├── plugin.go          # ProviderPlugin interface (13 methods)
├── registry.go        # Thread-safe registry (Global singleton)
├── errors.go          # Custom error types
└── builtin/
    ├── init.go        # Auto-registers 10 built-in providers
    ├── cloudflare.go  # Example: implements ProviderPlugin
    ├── route53.go
    └── ... (8 more providers)

Key Interface Methods:

type ProviderPlugin interface {
    Type() string
    Metadata() ProviderMetadata
    Init() error
    Cleanup() error
    RequiredCredentialFields() []CredentialFieldSpec
    OptionalCredentialFields() []CredentialFieldSpec
    ValidateCredentials(creds map[string]string) error
    TestCredentials(creds map[string]string) error
    SupportsMultiCredential() bool
    BuildCaddyConfig(creds map[string]string) map[string]any
    BuildCaddyConfigForZone(baseDomain string, creds map[string]string) map[string]any
    PropagationTimeout() time.Duration
    PollingInterval() time.Duration
}

3.2 How Custom Plugins Integrate

The existing architecture supports custom plugins via the registry pattern:

┌────────────────────────────────────────────────────────────────────┐
│                        DNS Provider Registry                        │
│ ┌────────────┐ ┌────────────┐ ┌────────────┐ ┌────────────────────┐│
│ │ Cloudflare │ │  Route53   │ │ ... (8)    │ │  Custom Plugins    ││
│ │ (built-in) │ │ (built-in) │ │ (built-in) │ │ ┌────────────────┐ ││
│ └────────────┘ └────────────┘ └────────────┘ │ │ Webhook Plugin │ ││
│                                               │ ├────────────────┤ ││
│                                               │ │ Script Plugin  │ ││
│                                               │ ├────────────────┤ ││
│                                               │ │ RFC2136 Plugin │ ││
│                                               │ ├────────────────┤ ││
│                                               │ │ Manual Plugin  │ ││
│                                               │ └────────────────┘ ││
│                                               └────────────────────┘│
└────────────────────────────────────────────────────────────────────┘
                                    │
                    ┌───────────────┴───────────────┐
                    ▼                               ▼
          ┌─────────────────┐             ┌─────────────────┐
          │ DNS Provider    │             │ Caddy Config    │
          │ Service Layer   │             │ Builder         │
          │ (CRUD + Test)   │             │ (TLS Automation)│
          └─────────────────┘             └─────────────────┘
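The registry pattern in the diagram can be sketched as follows. This is a minimal illustration, not Charon's actual registry.go: the `Plugin` interface is a deliberately reduced stand-in for `ProviderPlugin` (which has 13 methods), and `Registry`, `NewRegistry`, and `stubPlugin` are hypothetical names chosen for this example.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Plugin is a reduced stand-in for ProviderPlugin (assumption: only
// Type() is needed to show how registration and lookup work).
type Plugin interface {
	Type() string
}

// Registry maps provider type names to plugins and is safe for
// concurrent use.
type Registry struct {
	mu      sync.RWMutex
	plugins map[string]Plugin
}

func NewRegistry() *Registry {
	return &Registry{plugins: make(map[string]Plugin)}
}

// Register adds a plugin and rejects duplicate type names.
func (r *Registry) Register(p Plugin) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, exists := r.plugins[p.Type()]; exists {
		return fmt.Errorf("provider type %q already registered", p.Type())
	}
	r.plugins[p.Type()] = p
	return nil
}

// Get looks up a plugin by its type name.
func (r *Registry) Get(providerType string) (Plugin, bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	p, ok := r.plugins[providerType]
	return p, ok
}

// Types returns all registered type names in sorted order.
func (r *Registry) Types() []string {
	r.mu.RLock()
	defer r.mu.RUnlock()
	names := make([]string, 0, len(r.plugins))
	for name := range r.plugins {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

// stubPlugin is a placeholder used only for this illustration.
type stubPlugin string

func (s stubPlugin) Type() string { return string(s) }

func main() {
	reg := NewRegistry()
	for _, name := range []string{"cloudflare", "webhook", "rfc2136", "manual"} {
		if err := reg.Register(stubPlugin(name)); err != nil {
			fmt.Println("register failed:", err)
		}
	}
	fmt.Println(reg.Types())
}
```

Because built-in and custom plugins share one map keyed by type name, the service layer and the Caddy config builder can treat both kinds identically.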

3.3 Caddy DNS Challenge Integration

Caddy's TLS automation supports custom DNS providers via its module system. To support Options A, B, and C, Charon needs one of the following integration approaches:

  1. Use Caddy's exec DNS provider - Caddy calls an external command
  2. Build a custom Caddy module - Complex, requires Caddy rebuild
  3. Use Charon as a DNS proxy - Charon handles DNS operations, returns status to Caddy

Recommended Approach: integration approach 3 (Charon as DNS proxy) for the Webhook and Script plugins, and the native Caddy module for RFC 2136.

3.3.1 Charon DNS Proxy Architecture

For Webhook and Script plugins, Charon acts as a DNS challenge proxy between Caddy and the external DNS provider:

┌─────────────────────────────────────────────────────────────────────────────┐
│                    DNS Challenge Flow (Webhook/Script)                       │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  ┌──────────┐  1. Certificate    ┌──────────┐  2. DNS-01 Challenge          │
│  │  Caddy   │  ──────────────▶  │   ACME   │  ◀─────────────────────        │
│  │  (TLS)   │                    │  Server  │                                │
│  └────┬─────┘                    └──────────┘                                │
│       │                                                                      │
│       │ 3. Create TXT record                                                 │
│       │    (via exec module or                                               │
│       │     internal API)                                                    │
│       ▼                                                                      │
│  ┌──────────┐  4. POST /internal/dns-challenge                               │
│  │  Charon  │  ─────────────────────────────────────────────────────────     │
│  │  (Proxy) │                                                                │
│  └────┬─────┘                                                                │
│       │                                                                      │
│       │ 5. Execute plugin (webhook/script)                                   │
│       ▼                                                                      │
│  ┌──────────────────────────────────────────────────────────────────────┐   │
│  │                    External DNS Provider                              │   │
│  │  (Webhook endpoint or DNS server via script)                          │   │
│  └──────────────────────────────────────────────────────────────────────┘   │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

3.3.2 Challenge Lifecycle State Machine

                              ┌─────────────┐
                              │   CREATED   │
                              │  (initial)  │
                              └──────┬──────┘
                                     │
                          Plugin executes create
                                     │
                                     ▼
                              ┌─────────────┐
        ┌─────────────────────│   PENDING   │─────────────────────┐
        │                     │ (awaiting   │                     │
        │                     │ propagation)│                     │
        │                     └──────┬──────┘                     │
        │                            │                            │
   Timeout (10 min)          DNS check passes              Plugin error
        │                            │                            │
        ▼                            ▼                            ▼
 ┌─────────────┐            ┌─────────────┐              ┌─────────────┐
 │   EXPIRED   │            │  VERIFYING  │              │   FAILED    │
 │             │            │             │              │             │
 └─────────────┘            └──────┬──────┘              └─────────────┘
                                   │
                       ┌───────────┴───────────┐
                       │                       │
                  ACME success            ACME failure
                       │                       │
                       ▼                       ▼
                ┌─────────────┐         ┌─────────────┐
                │  VERIFIED   │         │   FAILED    │
                │  (success)  │         │             │
                └─────────────┘         └─────────────┘

State Definitions:

| State | Description | Next States | TTL |
|---|---|---|---|
| CREATED | Challenge record created, plugin not yet executed | PENDING, FAILED | - |
| PENDING | Plugin executed, waiting for DNS propagation | VERIFYING, EXPIRED, FAILED | 10 min |
| VERIFYING | DNS record found, ACME validation in progress | VERIFIED, FAILED | 2 min |
| VERIFIED | Challenge completed successfully | (terminal) | 24h cleanup |
| EXPIRED | Timeout waiting for DNS propagation | (terminal) | 24h cleanup |
| FAILED | Plugin error or ACME validation failure | (terminal) | 24h cleanup |
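
The allowed transitions can be encoded as a small lookup table so that illegal state changes are rejected in one place. The sketch below is illustrative only; the type and function names (`State`, `CanTransition`, `Advance`, `IsTerminal`) are assumptions, and the lowercase state values follow the status strings used in the API examples.

```go
package main

import "fmt"

// State is a challenge lifecycle state.
type State string

const (
	Created   State = "created"
	Pending   State = "pending"
	Verifying State = "verifying"
	Verified  State = "verified"
	Expired   State = "expired"
	Failed    State = "failed"
)

// transitions lists the "Next States" column for each non-terminal
// state. States without an entry have no outgoing transitions.
var transitions = map[State][]State{
	Created:   {Pending, Failed},
	Pending:   {Verifying, Expired, Failed},
	Verifying: {Verified, Failed},
}

// CanTransition reports whether moving from one state to another is allowed.
func CanTransition(from, to State) bool {
	for _, next := range transitions[from] {
		if next == to {
			return true
		}
	}
	return false
}

// IsTerminal reports whether a state has no outgoing transitions.
func IsTerminal(s State) bool {
	return len(transitions[s]) == 0
}

// Advance returns the new state, or an error for an illegal transition.
func Advance(from, to State) (State, error) {
	if !CanTransition(from, to) {
		return from, fmt.Errorf("illegal transition %s -> %s", from, to)
	}
	return to, nil
}

func main() {
	s := Created
	for _, next := range []State{Pending, Verifying, Verified} {
		var err error
		if s, err = Advance(s, next); err != nil {
			fmt.Println(err)
			return
		}
		fmt.Println("now in state:", s)
	}
	_, err := Advance(s, Pending)
	fmt.Println(err)
}
```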

3.3.3 Caddy Communication

Charon exposes an internal API for Caddy to delegate DNS challenge operations:

POST /internal/dns-challenge/create
{
  "provider_id": "uuid",
  "fqdn": "_acme-challenge.example.com",
  "value": "token-value"
}
Response: {"challenge_id": "uuid", "status": "pending"}

DELETE /internal/dns-challenge/{challenge_id}
Response: {"status": "deleted"}

3.3.4 Error Handling When Charon is Unavailable

If Charon is unavailable during a DNS challenge:

  1. Caddy retry: Caddy's built-in retry mechanism (3 attempts, exponential backoff)
  2. Graceful degradation: If Charon remains unavailable, Caddy logs error and fails certificate issuance
  3. Health check: Caddy pre-checks Charon availability via /health before initiating challenges
  4. Circuit breaker: After 5 consecutive failures, Caddy disables the custom provider for 5 minutes

3.4 Database Model Impact

Current dns_providers table schema:

CREATE TABLE dns_providers (
    id INTEGER PRIMARY KEY,
    uuid VARCHAR(36) UNIQUE,
    name VARCHAR(255) NOT NULL,
    provider_type VARCHAR(50) NOT NULL,     -- 'cloudflare', 'webhook', 'script', etc.
    enabled BOOLEAN DEFAULT TRUE,
    is_default BOOLEAN DEFAULT FALSE,
    credentials_encrypted TEXT,              -- Encrypted JSON blob
    key_version INTEGER DEFAULT 1,
    propagation_timeout INTEGER DEFAULT 120,
    polling_interval INTEGER DEFAULT 5,
    -- ... statistics fields
);

Custom plugins will use the same table with different provider_type values and plugin-specific credentials.


4. Proposed Solutions

4.1 Option A: Generic Webhook Plugin

Overview

User provides webhook URLs for create/delete TXT records. Charon POSTs JSON payloads with record details.

Configuration

{
  "name": "My Webhook DNS",
  "provider_type": "webhook",
  "credentials": {
    "create_url": "https://api.example.com/dns/txt/create",
    "delete_url": "https://api.example.com/dns/txt/delete",
    "auth_header": "X-API-Key",
    "auth_value": "secret-token-here",
    "timeout_seconds": "30",
    "retry_count": "3"
  }
}

Request Payload (Sent to Webhook)

{
  "action": "create",
  "fqdn": "_acme-challenge.example.com",
  "domain": "example.com",
  "subdomain": "_acme-challenge",
  "value": "gZrH7wL9t3kM2nP4...",
  "ttl": 300,
  "request_id": "550e8400-e29b-41d4-a716-446655440000",
  "timestamp": "2026-01-08T15:30:00Z"
}

Expected Response

{
  "success": true,
  "message": "TXT record created",
  "record_id": "optional-id-for-deletion"
}
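
As a rough sketch of the exchange described above, the following Go code sends the request payload and decodes the expected response. The struct fields mirror the JSON shown in this section; the function `callWebhook` and the `fakeClient` test double are hypothetical names used only here, and retry handling is left out.

```go
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"strings"
	"time"
)

// WebhookRequest mirrors the request payload fields shown above.
type WebhookRequest struct {
	Action    string `json:"action"`
	FQDN      string `json:"fqdn"`
	Domain    string `json:"domain"`
	Subdomain string `json:"subdomain"`
	Value     string `json:"value"`
	TTL       int    `json:"ttl"`
	RequestID string `json:"request_id"`
	Timestamp string `json:"timestamp"`
}

// WebhookResponse mirrors the expected response fields shown above.
type WebhookResponse struct {
	Success  bool   `json:"success"`
	Message  string `json:"message"`
	RecordID string `json:"record_id,omitempty"`
}

// callWebhook POSTs the payload as JSON and decodes the reply.
// authHeader and authValue correspond to the auth_header and
// auth_value settings; timeoutSeconds to timeout_seconds.
func callWebhook(client *http.Client, timeoutSeconds int, url, authHeader, authValue string, payload WebhookRequest) (*WebhookResponse, error) {
	ctx, cancel := context.WithTimeout(context.Background(), time.Duration(timeoutSeconds)*time.Second)
	defer cancel()

	body, err := json.Marshal(payload)
	if err != nil {
		return nil, fmt.Errorf("encode payload: %w", err)
	}
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, url, bytes.NewReader(body))
	if err != nil {
		return nil, fmt.Errorf("build request: %w", err)
	}
	req.Header.Set("Content-Type", "application/json")
	if authHeader != "" {
		req.Header.Set(authHeader, authValue)
	}

	resp, err := client.Do(req)
	if err != nil {
		return nil, fmt.Errorf("call webhook: %w", err)
	}
	defer resp.Body.Close()
	if resp.StatusCode < 200 || resp.StatusCode > 299 {
		return nil, fmt.Errorf("webhook returned HTTP %d", resp.StatusCode)
	}

	var out WebhookResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return nil, fmt.Errorf("decode response: %w", err)
	}
	if !out.Success {
		return &out, fmt.Errorf("webhook reported failure: %s", out.Message)
	}
	return &out, nil
}

// roundTripFunc and fakeClient are test doubles so the sketch runs
// without a network; they are not part of the proposed design.
type roundTripFunc func(*http.Request) (*http.Response, error)

func (f roundTripFunc) RoundTrip(r *http.Request) (*http.Response, error) { return f(r) }

func fakeClient(status int, body string) *http.Client {
	return &http.Client{Transport: roundTripFunc(func(r *http.Request) (*http.Response, error) {
		return &http.Response{
			StatusCode: status,
			Header:     make(http.Header),
			Body:       io.NopCloser(strings.NewReader(body)),
			Request:    r,
		}, nil
	})}
}

func main() {
	client := fakeClient(http.StatusOK, `{"success": true, "message": "TXT record created"}`)
	resp, err := callWebhook(client, 30, "https://api.example.com/dns/txt/create", "X-API-Key", "secret-token-here", WebhookRequest{
		Action:    "create",
		FQDN:      "_acme-challenge.example.com",
		Domain:    "example.com",
		Subdomain: "_acme-challenge",
		Value:     "example-value",
		TTL:       300,
		RequestID: "550e8400-e29b-41d4-a716-446655440000",
		Timestamp: time.Now().UTC().Format(time.RFC3339),
	})
	fmt.Println(resp, err)
}
```

Treating a non-2xx status and a `"success": false` body as separate failures keeps transport errors distinguishable from provider-side rejections in the logs.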

Rate Limiting and Circuit Breaker

To prevent abuse and ensure reliability, webhook plugins enforce:

| Limit | Value | Behavior |
|---|---|---|
| Max calls per minute | 10 | Requests beyond limit return 429 Too Many Requests |
| Circuit breaker threshold | 5 consecutive failures | Provider disabled for 5 minutes |
| Circuit breaker reset | Automatic after 5 minutes | First successful call fully resets counter |

Implementation Requirements:

type WebhookRateLimiter struct {
    callsPerMinute    int           // Max 10
    consecutiveFails  int           // Track failures
    disabledUntil     time.Time     // Circuit breaker timestamp
}

func (w *WebhookProvider) executeWithRateLimit(ctx context.Context, req *WebhookRequest) error {
    if time.Now().Before(w.rateLimiter.disabledUntil) {
        return ErrProviderCircuitOpen
    }
    // ... execute webhook with rate limiting
}
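
One possible way to fill in the elided logic is a sliding one-minute window plus a failure counter. This is a sketch under assumptions: the `Guard` type, its methods, and `ErrRateLimited` are hypothetical names, and the limits are the values listed in this section (10 calls per minute, 5 consecutive failures, 5-minute cooldown).

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

var (
	ErrRateLimited         = errors.New("webhook rate limit exceeded")
	ErrProviderCircuitOpen = errors.New("provider circuit open")
)

// Guard combines the per-minute rate limit with the circuit breaker.
type Guard struct {
	mu               sync.Mutex
	now              func() time.Time // injectable clock
	maxPerMinute     int
	failThreshold    int
	cooldown         time.Duration
	recentCalls      []time.Time // call times within the last minute
	consecutiveFails int
	disabledUntil    time.Time
}

func NewGuard() *Guard {
	return &Guard{now: time.Now, maxPerMinute: 10, failThreshold: 5, cooldown: 5 * time.Minute}
}

// Allow checks the circuit breaker and rate limit, and records the
// call if it may proceed.
func (g *Guard) Allow() error {
	g.mu.Lock()
	defer g.mu.Unlock()
	now := g.now()
	if now.Before(g.disabledUntil) {
		return ErrProviderCircuitOpen
	}
	// Drop timestamps older than one minute (sliding window).
	cutoff := now.Add(-time.Minute)
	kept := g.recentCalls[:0]
	for _, t := range g.recentCalls {
		if t.After(cutoff) {
			kept = append(kept, t)
		}
	}
	g.recentCalls = kept
	if len(g.recentCalls) >= g.maxPerMinute {
		return ErrRateLimited
	}
	g.recentCalls = append(g.recentCalls, now)
	return nil
}

// Record updates the failure counter after a call completes. Only a
// success resets the counter, so a failure right after the cooldown
// reopens the circuit immediately.
func (g *Guard) Record(err error) {
	g.mu.Lock()
	defer g.mu.Unlock()
	if err == nil {
		g.consecutiveFails = 0
		return
	}
	g.consecutiveFails++
	if g.consecutiveFails >= g.failThreshold {
		g.disabledUntil = g.now().Add(g.cooldown)
	}
}

// Do wraps one webhook call with both checks.
func (g *Guard) Do(call func() error) error {
	if err := g.Allow(); err != nil {
		return err
	}
	err := call()
	g.Record(err)
	return err
}

func main() {
	g := NewGuard()
	failing := func() error { return errors.New("webhook unreachable") }
	for i := 1; i <= 6; i++ {
		fmt.Printf("attempt %d: %v\n", i, g.Do(failing))
	}
}
```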

Pros

  • Works with any HTTP-capable system
  • No code changes required on user side (just API endpoint)
  • Supports complex authentication (headers, query params)
  • Can integrate with existing automation (Terraform, Ansible AWX, etc.)

Cons

  • User must implement and host webhook endpoint
  • Network latency adds to propagation time
  • Debugging requires access to both Charon and webhook logs
  • Security: webhook credentials stored in Charon

Implementation Complexity

  • Backend: ~200 lines (WebhookProvider implementation)
  • Frontend: ~100 lines (form fields)
  • Tests: ~150 lines

4.2 Option B: Custom Script Plugin

Overview

The user provides the path to a shell script inside the container. The script receives ACTION, FQDN, TOKEN, and VALUE as positional arguments.

Configuration

{
  "name": "My Script DNS",
  "provider_type": "script",
  "credentials": {
    "script_path": "/scripts/dns-update.sh",
    "timeout_seconds": "60",
    "env_vars": "DNS_SERVER=ns1.example.com,API_KEY=${API_KEY}"
  }
}

Script Interface

#!/bin/bash
# Called by Charon for DNS-01 challenge
# Arguments:
#   $1 = ACTION: "create" or "delete"
#   $2 = FQDN: "_acme-challenge.example.com"
#   $3 = TOKEN: Challenge token (for identification)
#   $4 = VALUE: TXT record value to set

ACTION="$1"
FQDN="$2"
TOKEN="$3"
VALUE="$4"

case "$ACTION" in
  create)
    # Create TXT record
    nsupdate <<EOF
server ${DNS_SERVER}
update add ${FQDN} 300 TXT "${VALUE}"
send
EOF
    ;;
  delete)
    # Delete TXT record
    nsupdate <<EOF
server ${DNS_SERVER}
update delete ${FQDN} TXT
send
EOF
    ;;
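  *)
    # Reject unknown actions so Charon sees a failure instead of exit 0
    echo "unknown action: ${ACTION}" >&2
    exit 2
    ;;
esac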

# Exit code: 0 = success, non-zero = failure

Pros

  • Maximum flexibility - any tool/language can be used
  • Direct access to host system (if volume-mounted)
  • Familiar paradigm for sysadmins
  • Can leverage existing scripts/tooling

Cons

  • Security Risk: Script execution in container context
  • Harder to debug than API calls
  • Script must be mounted into container
  • No automatic retries (must implement in script)
  • Sandboxing limits capability

Security Mitigations

  1. Script must be in allowlisted directory (/scripts/)
  2. Scripts run with restricted permissions (no network by default)
  3. Timeout prevents resource exhaustion
  4. All executions are audit-logged
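
Mitigations 1 and 3 can be sketched in Go as follows. This is an illustration under assumptions, not the planned ScriptProvider: `validateScriptPath` and `runScript` are hypothetical names, the check is purely lexical (a real implementation would likely also resolve symlinks), and sandboxing and audit logging are omitted.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"os/exec"
	"path"
	"strings"
	"time"
)

// allowedScriptDir is the allowlisted directory from mitigation 1.
const allowedScriptDir = "/scripts"

var ErrScriptPathInvalid = errors.New("script path not in allowed directory")

// validateScriptPath normalizes the path and checks that it stays
// under allowedScriptDir, which rejects ".." traversal.
func validateScriptPath(p string) (string, error) {
	if !path.IsAbs(p) {
		return "", ErrScriptPathInvalid
	}
	clean := path.Clean(p)
	if !strings.HasPrefix(clean, allowedScriptDir+"/") {
		return "", ErrScriptPathInvalid
	}
	return clean, nil
}

// runScript executes the script with the four positional arguments
// and kills it when the timeout elapses.
func runScript(scriptPath, action, fqdn, token, value string, timeout time.Duration) error {
	clean, err := validateScriptPath(scriptPath)
	if err != nil {
		return err
	}
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	// Arguments are passed as an argv slice, never through a shell,
	// so challenge values cannot inject shell syntax.
	cmd := exec.CommandContext(ctx, clean, action, fqdn, token, value)
	out, err := cmd.CombinedOutput()
	if ctx.Err() == context.DeadlineExceeded {
		return fmt.Errorf("script timed out after %s", timeout)
	}
	if err != nil {
		return fmt.Errorf("script failed: %w (output: %q)", err, out)
	}
	return nil
}

func main() {
	for _, p := range []string{
		"/scripts/dns-update.sh",
		"/scripts/../etc/passwd",
		"relative/dns-update.sh",
	} {
		clean, err := validateScriptPath(p)
		fmt.Printf("%-28s -> %q, %v\n", p, clean, err)
	}
}
```

A call such as `runScript("/scripts/dns-update.sh", "create", fqdn, token, value, 60*time.Second)` would then map a non-zero exit code or a timeout to a Go error.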

Implementation Complexity

  • Backend: ~250 lines (ScriptProvider + executor)
  • Frontend: ~80 lines (form fields)
  • Tests: ~200 lines (including security tests)

4.3 Option C: RFC 2136 (Dynamic DNS Update) Plugin

Overview

RFC 2136 defines a standard protocol for dynamic DNS updates. Supported by BIND, PowerDNS, Knot DNS, and many self-hosted DNS servers.

Configuration

{
  "name": "My BIND Server",
  "provider_type": "rfc2136",
  "credentials": {
    "nameserver": "ns1.example.com",
    "port": "53",
    "tsig_key_name": "acme-update-key",
    "tsig_key_secret": "base64-encoded-secret",
    "tsig_algorithm": "hmac-sha256",
    "zone": "example.com"
  }
}

TSIG Algorithms Supported

  • hmac-md5 (legacy)
  • hmac-sha1
  • hmac-sha256 (recommended)
  • hmac-sha384
  • hmac-sha512
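
A `ValidateCredentials` implementation for this plugin might run static checks like the ones below before any network test. The function name `validateRFC2136` is hypothetical; the field names, defaults, and algorithm list are taken from the configuration shown in this section, and the credential map uses the `map[string]string` shape from the `ProviderPlugin` interface.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"strconv"
)

// tsigAlgorithms is the set of algorithms listed above.
var tsigAlgorithms = map[string]bool{
	"hmac-md5":    true,
	"hmac-sha1":   true,
	"hmac-sha256": true,
	"hmac-sha384": true,
	"hmac-sha512": true,
}

// validateRFC2136 performs offline checks on RFC 2136 credentials.
func validateRFC2136(creds map[string]string) error {
	for _, key := range []string{"nameserver", "tsig_key_name", "tsig_key_secret"} {
		if creds[key] == "" {
			return fmt.Errorf("missing required field %q", key)
		}
	}

	alg := creds["tsig_algorithm"]
	if alg == "" {
		alg = "hmac-sha256" // documented default
	}
	if !tsigAlgorithms[alg] {
		return fmt.Errorf("unsupported TSIG algorithm %q", alg)
	}

	// The TSIG secret must be base64-encoded.
	if _, err := base64.StdEncoding.DecodeString(creds["tsig_key_secret"]); err != nil {
		return fmt.Errorf("tsig_key_secret is not valid base64: %w", err)
	}

	// Port is optional and defaults to 53 when empty.
	if port := creds["port"]; port != "" {
		n, err := strconv.Atoi(port)
		if err != nil || n < 1 || n > 65535 {
			return fmt.Errorf("invalid port %q", port)
		}
	}
	return nil
}

func main() {
	creds := map[string]string{
		"nameserver":      "ns1.example.com",
		"port":            "53",
		"tsig_key_name":   "acme-update-key",
		"tsig_key_secret": "c2VjcmV0", // base64 of "secret", example only
		"tsig_algorithm":  "hmac-sha256",
		"zone":            "example.com",
	}
	fmt.Println("valid config:", validateRFC2136(creds))

	creds["tsig_algorithm"] = "hmac-sha3"
	fmt.Println("bad algorithm:", validateRFC2136(creds))
}
```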

DNS UPDATE Message Flow

┌──────────┐                    ┌──────────────┐
│  Charon  │                    │  DNS Server  │
│          │  DNS UPDATE        │  (BIND, etc) │
│          │  ─────────────────▶│              │
│          │  TSIG-signed       │              │
│          │                    │              │
│          │  RESPONSE          │              │
│          │  ◀─────────────────│              │
│          │  NOERROR/REFUSED   │              │
└──────────┘                    └──────────────┘

Caddy Integration

Caddy has a native RFC 2136 module: caddy-dns/rfc2136

DECISION: Charon WILL ship with the RFC 2136 Caddy module pre-built in the Docker image. Users do NOT need to rebuild Caddy.

The Charon plugin would:

  1. Store TSIG credentials encrypted
  2. Generate Caddy config with proper RFC 2136 settings
  3. Validate credentials by attempting a test query

Dockerfile Addition (Phase 2):

# Build Caddy with RFC 2136 module
FROM caddy:builder AS caddy-builder
RUN xcaddy build \
    --with github.com/caddy-dns/rfc2136

Pros

  • Industry-standard protocol
  • No custom server-side code needed
  • Works with popular DNS servers (BIND9, PowerDNS, Knot)
  • Secure with TSIG authentication
  • Native Caddy module available

Cons

  • Requires DNS server configuration for TSIG keys
  • More complex setup than webhook
  • Zone configuration required
  • Firewall rules may need updating (TCP/UDP 53)

Implementation Complexity

  • Backend: ~180 lines (RFC2136Provider)
  • Frontend: ~120 lines (TSIG configuration form)
  • Tests: ~150 lines
  • Requires: Caddy built with the caddy-dns/rfc2136 module (shipped pre-built in the Charon Docker image, so users do not rebuild Caddy themselves)

4.4 Option D: Manual/External Plugin

Overview

No automation - UI shows required TXT record details, user creates manually, clicks "Verify" when done.

UI Flow

┌─────────────────────────────────────────────────────────────────────┐
│                    Manual DNS Challenge                              │
├─────────────────────────────────────────────────────────────────────┤
│                                                                      │
│  To obtain a certificate for *.example.com, create the following    │
│  TXT record at your DNS provider:                                   │
│                                                                      │
│  ┌────────────────────────────────────────────────────────────────┐ │
│  │  Record Name:  _acme-challenge.example.com         [📋 Copy]  │ │
│  ├────────────────────────────────────────────────────────────────┤ │
│  │  Record Value: gZrH7wL9t3kM2nP4qX5yR8sT...         [📋 Copy]  │ │
│  ├────────────────────────────────────────────────────────────────┤ │
│  │  TTL: 300 (5 minutes)                                          │ │
│  └────────────────────────────────────────────────────────────────┘ │
│                                                                      │
│  ⏱️ Time remaining: 4:32                                             │
│  [━━━━━━━━━━━━━━━━━━━━━░░░░░░░░░░] 68%                              │
│                                                                      │
│  [Check DNS Now]  [I've Created the Record - Verify]                │
│                                                                      │
│   Record not yet propagated. Last check: 10 seconds ago            │
│                                                                      │
└─────────────────────────────────────────────────────────────────────┘

Configuration

{
  "name": "Manual DNS",
  "provider_type": "manual",
  "credentials": {
    "timeout_minutes": "10",
    "polling_interval_seconds": "30"
  }
}

Technical Implementation

  • Store challenge details in session/database
  • Background job periodically queries DNS
  • Polling endpoint for UI updates (10-second interval)
  • Timeout after configurable period

Note: Although Charon has existing WebSocket infrastructure (backend/internal/services/websocket_tracker.go), polling is chosen for simplicity:

  • Avoids additional WebSocket connection management complexity
  • 10-second polling interval provides acceptable UX for manual workflows
  • Reduces frontend state management burden

Polling Endpoint:

GET /api/v1/dns-providers/:id/manual-challenge/:challengeId/poll
Response (every 10s):
{
  "status": "pending|verified|expired|failed",
  "dns_propagated": false,
  "time_remaining_seconds": 432,
  "last_check_at": "2026-01-08T15:35:00Z"
}

Pros

  • Works with ANY DNS provider
  • No integration required
  • Good for testing/development
  • One-off certificate issuance

Cons

  • User must manually intervene
  • Time-sensitive (ACME challenge timeout)
  • Not suitable for automated renewals
  • Doesn't scale for multiple certificates

Implementation Complexity

  • Backend: ~150 lines (ManualProvider + verification endpoint)
  • Frontend: ~300 lines (interactive UI with copy/verify)
  • Tests: ~100 lines

5. Recommended Implementation Order

Phase 1: Manual Plugin (1 week)

Rationale: Unblocks all users immediately. Lowest risk, highest immediate value.

Deliverables:

  • ManualProvider implementation
  • Interactive challenge UI
  • DNS verification endpoint
  • User documentation

Phase 2: RFC 2136 Plugin (1 week)

Rationale: Standards-based, serves self-hosted DNS users. Caddy module already exists.

Deliverables:

  • RFC2136Provider implementation
  • TSIG credential storage
  • Caddy module integration documentation
  • BIND9/PowerDNS setup guides

Phase 3: Webhook Plugin (1 week)

Rationale: Most flexible option for custom integrations. Medium complexity.

Deliverables:

  • WebhookProvider implementation
  • Configurable retry logic
  • Request/response logging
  • Example webhook implementations (Node.js, Python)

Future Work

Phase 4: Script Plugin (Conditional)

Go/No-Go Gate: Phase 4 only proceeds if >20 user requests are received via GitHub issues requesting script plugin functionality. Track via label feature:script-plugin.

Rationale: Power-user feature with significant security implications. Implement only if demand warrants the additional security review and maintenance burden.

Deliverables:

  • ScriptProvider implementation
  • Security sandbox
  • Example scripts for common scenarios

Implementation Order Justification

User Value
    │
    │  ★ Manual Plugin (Phase 1)
    │    - Unblocks everyone immediately
    │    - Lowest implementation risk
    │
    │  ★ RFC 2136 Plugin (Phase 2)
    │    - Self-hosted DNS is common need
    │    - Industry standard
    │
    │  ★ Webhook Plugin (Phase 3)
    │    - Flexible for edge cases
    │    - Integration-focused teams
    │
    │  ○ Script Plugin (Phase 4)
    │    - Power users only
    │    - Security concerns
    │
    └────────────────────────────────▶ Implementation Effort

6. Database Schema Changes

6.1 No New Tables Required

The existing dns_providers table schema supports custom plugins. The provider_type column accepts new values, and credentials_encrypted stores plugin-specific configuration.

6.2 Provider Type Enumeration

Expand the allowed provider_type values:

// backend/pkg/dnsprovider/types.go
const (
    // Built-in providers
    TypeCloudflare    = "cloudflare"
    TypeRoute53       = "route53"
    // ... existing providers

    // Custom plugins
    TypeWebhook       = "webhook"
    TypeScript        = "script"
    TypeRFC2136       = "rfc2136"
    TypeManual        = "manual"
)

6.3 Credential Schemas Per Plugin Type

Webhook Credentials

{
  "create_url": "string (required)",
  "delete_url": "string (required)",
  "auth_header": "string (optional)",
  "auth_value": "string (optional, encrypted)",
  "content_type": "string (default: application/json)",
  "timeout_seconds": "integer (default: 30)",
  "retry_count": "integer (default: 3)",
  "custom_headers": "object (optional)"
}

Script Credentials

{
  "script_path": "string (required)",
  "timeout_seconds": "integer (default: 60)",
  "working_directory": "string (optional)",
  "env_vars": "string (optional, KEY=VALUE format)"
}

RFC 2136 Credentials

{
  "nameserver": "string (required)",
  "port": "integer (default: 53)",
  "tsig_key_name": "string (required)",
  "tsig_key_secret": "string (required, encrypted)",
  "tsig_algorithm": "string (default: hmac-sha256)",
  "zone": "string (optional, auto-detect)"
}

Manual Credentials

{
  "timeout_minutes": "integer (default: 10)",
  "polling_interval_seconds": "integer (default: 30)"
}

6.4 Challenge Cleanup Mechanism

Challenges are cleaned up via Charon's existing scheduled task infrastructure (using robfig/cron/v3, same pattern as backup_service.go):

// Cleanup job runs hourly
func (s *ManualChallengeService) scheduleCleanup() {
    _, err := s.cron.AddFunc("0 * * * *", s.cleanupExpiredChallenges)
    // ...
}

func (s *ManualChallengeService) cleanupExpiredChallenges() {
    // Mark challenges in "pending" state > 24 hours as "expired"
    // Delete challenge records > 7 days old
    cutoff := time.Now().Add(-24 * time.Hour)
    s.db.Model(&Challenge{}).
        Where("status = ? AND created_at < ?", "pending", cutoff).
        Update("status", "expired")

    // Hard delete after 7 days
    deleteCutoff := time.Now().Add(-7 * 24 * time.Hour)
    s.db.Where("created_at < ?", deleteCutoff).Delete(&Challenge{})
}

Cleanup Schedule:

| Condition | Action | Frequency |
|---|---|---|
| pending status > 24 hours | Mark as expired | Hourly |
| Any challenge > 7 days old | Hard delete | Hourly |

7. API Design

7.1 Existing Endpoints (No Changes)

| Method | Endpoint | Description |
|---|---|---|
| GET | /api/v1/dns-providers | List all providers |
| POST | /api/v1/dns-providers | Create provider |
| GET | /api/v1/dns-providers/:id | Get provider |
| PUT | /api/v1/dns-providers/:id | Update provider |
| DELETE | /api/v1/dns-providers/:id | Delete provider |
| POST | /api/v1/dns-providers/:id/test | Test credentials |
| GET | /api/v1/dns-providers/types | List provider types |

7.2 New Endpoints

Manual Challenge Status

GET /api/v1/dns-providers/:id/manual-challenge/:challengeId

Response:

{
  "id": "challenge-uuid",
  "status": "pending|verified|expired|failed",
  "fqdn": "_acme-challenge.example.com",
  "value": "gZrH7wL9t3kM2nP4...",
  "created_at": "2026-01-08T15:30:00Z",
  "expires_at": "2026-01-08T15:40:00Z",
  "last_check_at": "2026-01-08T15:35:00Z",
  "dns_propagated": false
}

Manual Challenge Verification Trigger

POST /api/v1/dns-providers/:id/manual-challenge/:challengeId/verify

Response:

{
  "success": true,
  "dns_found": true,
  "message": "TXT record verified successfully"
}
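
The DNS check behind the verify endpoint could look roughly like this. It is a sketch under assumptions: `txtRecordPresent`, `lookupTXTFunc`, and `fakeLookup` are hypothetical names, the 5-second lookup timeout is arbitrary, and the lookup function is injected so the example runs without network access.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
	"time"
)

// lookupTimeout bounds a single DNS query (assumed value).
const lookupTimeout = 5 * time.Second

// lookupTXTFunc has the same signature as (*net.Resolver).LookupTXT,
// so net.DefaultResolver.LookupTXT can be passed in production.
type lookupTXTFunc func(ctx context.Context, name string) ([]string, error)

// txtRecordPresent reports whether fqdn serves a TXT record equal to
// want. A "not found" answer is a normal negative result, not an error.
func txtRecordPresent(lookup lookupTXTFunc, fqdn, want string) (bool, error) {
	ctx, cancel := context.WithTimeout(context.Background(), lookupTimeout)
	defer cancel()

	records, err := lookup(ctx, fqdn)
	if err != nil {
		var dnsErr *net.DNSError
		if errors.As(err, &dnsErr) && dnsErr.IsNotFound {
			return false, nil
		}
		return false, fmt.Errorf("TXT lookup for %s: %w", fqdn, err)
	}
	for _, r := range records {
		if r == want {
			return true, nil
		}
	}
	return false, nil
}

// fakeLookup is a test double that serves TXT records from a map.
func fakeLookup(zone map[string][]string) lookupTXTFunc {
	return func(_ context.Context, name string) ([]string, error) {
		if recs, ok := zone[name]; ok {
			return recs, nil
		}
		return nil, &net.DNSError{Err: "no such host", Name: name, IsNotFound: true}
	}
}

func main() {
	lookup := fakeLookup(map[string][]string{
		"_acme-challenge.example.com": {"expected-value"},
	})
	found, err := txtRecordPresent(lookup, "_acme-challenge.example.com", "expected-value")
	fmt.Println(found, err)
}
```

A recursive resolver may serve cached negative answers for some time, so a production check might query the zone's authoritative nameservers directly instead.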

7.3 Error Response Codes

All manual challenge and custom plugin endpoints use consistent error codes:

| Error Code | HTTP Status | Description |
|---|---|---|
| CHALLENGE_NOT_FOUND | 404 | Challenge ID does not exist |
| CHALLENGE_EXPIRED | 410 | Challenge has timed out |
| DNS_NOT_PROPAGATED | 200 | DNS record not yet found (success: false) |
| INVALID_PROVIDER_TYPE | 400 | Unknown provider type |
| WEBHOOK_TIMEOUT | 504 | Webhook did not respond in time |
| WEBHOOK_RATE_LIMITED | 429 | Too many webhook calls (>10/min) |
| PROVIDER_CIRCUIT_OPEN | 503 | Provider disabled due to consecutive failures |
| SCRIPT_TIMEOUT | 504 | Script execution exceeded timeout |
| SCRIPT_PATH_INVALID | 400 | Script path not in allowed directory |
| TSIG_AUTH_FAILED | 401 | RFC 2136 TSIG authentication failed |

Error Response Format:

{
  "success": false,
  "error": {
    "code": "CHALLENGE_EXPIRED",
    "message": "Challenge timed out after 10 minutes",
    "details": {
      "challenge_id": "uuid",
      "expired_at": "2026-01-08T15:40:00Z"
    }
  }
}
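
A single helper can keep the error codes, HTTP statuses, and envelope format consistent across handlers. The sketch below uses only the standard library; `APIError`, `renderError`, and `writeError` are hypothetical names, and falling back to 500 for an unlisted code is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// statusByCode maps each error code in the table above to its HTTP status.
var statusByCode = map[string]int{
	"CHALLENGE_NOT_FOUND":   http.StatusNotFound,           // 404
	"CHALLENGE_EXPIRED":     http.StatusGone,               // 410
	"DNS_NOT_PROPAGATED":    http.StatusOK,                 // 200
	"INVALID_PROVIDER_TYPE": http.StatusBadRequest,         // 400
	"WEBHOOK_TIMEOUT":       http.StatusGatewayTimeout,     // 504
	"WEBHOOK_RATE_LIMITED":  http.StatusTooManyRequests,    // 429
	"PROVIDER_CIRCUIT_OPEN": http.StatusServiceUnavailable, // 503
	"SCRIPT_TIMEOUT":        http.StatusGatewayTimeout,     // 504
	"SCRIPT_PATH_INVALID":   http.StatusBadRequest,         // 400
	"TSIG_AUTH_FAILED":      http.StatusUnauthorized,       // 401
}

// APIError is the "error" object of the response format.
type APIError struct {
	Code    string         `json:"code"`
	Message string         `json:"message"`
	Details map[string]any `json:"details,omitempty"`
}

// errorEnvelope is the full error response body.
type errorEnvelope struct {
	Success bool     `json:"success"`
	Error   APIError `json:"error"`
}

// renderError returns the HTTP status and JSON body for an error.
// Unknown codes fall back to 500 (assumption).
func renderError(e APIError) (int, []byte, error) {
	status, ok := statusByCode[e.Code]
	if !ok {
		status = http.StatusInternalServerError
	}
	body, err := json.Marshal(errorEnvelope{Success: false, Error: e})
	return status, body, err
}

// writeError sends the rendered error on an http.ResponseWriter.
func writeError(w http.ResponseWriter, e APIError) {
	status, body, err := renderError(e)
	if err != nil {
		http.Error(w, "internal error", http.StatusInternalServerError)
		return
	}
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(status)
	_, _ = w.Write(body)
}

func main() {
	status, body, err := renderError(APIError{
		Code:    "CHALLENGE_EXPIRED",
		Message: "Challenge timed out after 10 minutes",
		Details: map[string]any{"challenge_id": "uuid"},
	})
	fmt.Println(status, string(body), err)
}
```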

7.4 Updated Types Endpoint Response

The existing /api/v1/dns-providers/types endpoint will include custom plugins:

{
  "types": [
    {
      "type": "cloudflare",
      "name": "Cloudflare",
      "is_built_in": true,
      "fields": [...]
    },
    {
      "type": "webhook",
      "name": "Webhook (Generic)",
      "is_built_in": false,
      "category": "custom",
      "fields": [
        {"name": "create_url", "label": "Create Record URL", "type": "text", "required": true},
        {"name": "delete_url", "label": "Delete Record URL", "type": "text", "required": true},
        {"name": "auth_header", "label": "Auth Header Name", "type": "text", "required": false},
        {"name": "auth_value", "label": "Auth Header Value", "type": "password", "required": false}
      ]
    },
    {
      "type": "rfc2136",
      "name": "RFC 2136 (Dynamic DNS)",
      "is_built_in": false,
      "category": "custom",
      "fields": [
        {"name": "nameserver", "label": "DNS Server", "type": "text", "required": true},
        {"name": "tsig_key_name", "label": "TSIG Key Name", "type": "text", "required": true},
        {"name": "tsig_key_secret", "label": "TSIG Secret", "type": "password", "required": true},
        {"name": "tsig_algorithm", "label": "TSIG Algorithm", "type": "select", "options": [...]}
      ]
    },
    {
      "type": "manual",
      "name": "Manual (No Automation)",
      "is_built_in": false,
      "category": "custom",
      "fields": [
        {"name": "timeout_minutes", "label": "Challenge Timeout (minutes)", "type": "number", "default": "10"}
      ]
    }
  ]
}

8. Frontend UI Mockups

8.1 Provider Type Selection (Updated)

┌─────────────────────────────────────────────────────────────────────┐
│                     Add DNS Provider                                 │
├─────────────────────────────────────────────────────────────────────┤
│                                                                      │
│  Select Provider Type:                                               │
│                                                                      │
│  ┌─────────────────────────────────────────────────────────────────┐│
│  │ BUILT-IN PROVIDERS                                              ││
│  │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐││
│  │ │ ☁️ Cloudflare│ │ 🔶 Route53  │ │ 💧 Digital  │ │ 🔷 Azure    │││
│  │ │             │ │             │ │    Ocean    │ │             │││
│  │ └─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘││
│  │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐││
│  │ │ 🌐 Google   │ │ 🟠 Hetzner  │ │ 📛 GoDaddy  │ │ 🔵 Namecheap│││
│  │ │   Cloud DNS │ │             │ │             │ │             │││
│  │ └─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘││
│  └─────────────────────────────────────────────────────────────────┘│
│                                                                      │
│  ┌─────────────────────────────────────────────────────────────────┐│
│  │ CUSTOM INTEGRATIONS                                             ││
│  │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐││
│  │ │ 🔗 Webhook  │ │ 📜 Script   │ │ 📡 RFC 2136 │ │ ✋ Manual   │││
│  │ │   (HTTP)    │ │   (Shell)   │ │   (DDNS)    │ │             │││
│  │ └─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘││
│  └─────────────────────────────────────────────────────────────────┘│
│                                                                      │
│                                           [Cancel]  [Next →]         │
└─────────────────────────────────────────────────────────────────────┘

8.2 Webhook Configuration Form

┌─────────────────────────────────────────────────────────────────────┐
│                   Configure Webhook Provider                         │
├─────────────────────────────────────────────────────────────────────┤
│                                                                      │
│  Provider Name:                                                      │
│  ┌─────────────────────────────────────────────────────────────────┐│
│  │ My Custom DNS Webhook                                           ││
│  └─────────────────────────────────────────────────────────────────┘│
│                                                                      │
│  Create Record URL: *                                                │
│  ┌─────────────────────────────────────────────────────────────────┐│
│  │ https://api.example.com/dns/create                              ││
│  └─────────────────────────────────────────────────────────────────┘│
│   Charon will POST JSON with record details                        │
│                                                                      │
│  Delete Record URL: *                                                │
│  ┌─────────────────────────────────────────────────────────────────┐│
│  │ https://api.example.com/dns/delete                              ││
│  └─────────────────────────────────────────────────────────────────┘│
│                                                                      │
│  ── Authentication (Optional) ──────────────────────────────────────│
│                                                                      │
│  Header Name:                    Header Value:                       │
│  ┌───────────────────┐          ┌───────────────────────────────┐  │
│  │ X-API-Key         │          │ ••••••••••••••                │  │
│  └───────────────────┘          └───────────────────────────────┘  │
│                                                                      │
│  ── Advanced Settings ──────────────────────────────────────────────│
│                                                                      │
│  Timeout (seconds):  [30 ▼]    Retry Count:  [3 ▼]                  │
│                                                                      │
│                                                                      │
│         [Test Connection]        [Cancel]  [Save Provider]           │
└─────────────────────────────────────────────────────────────────────┘

8.3 RFC 2136 Configuration Form

┌─────────────────────────────────────────────────────────────────────┐
│                   Configure RFC 2136 Provider                        │
├─────────────────────────────────────────────────────────────────────┤
│                                                                      │
│  Provider Name:                                                      │
│  ┌─────────────────────────────────────────────────────────────────┐│
│  │ Internal BIND Server                                            ││
│  └─────────────────────────────────────────────────────────────────┘│
│                                                                      │
│  DNS Server: *                          Port:                        │
│  ┌─────────────────────────────────────┐ ┌─────────────────────────┐│
│  │ ns1.internal.example.com            │ │ 53                      ││
│  └─────────────────────────────────────┘ └─────────────────────────┘│
│                                                                      │
│  ── TSIG Authentication ────────────────────────────────────────────│
│                                                                      │
│  Key Name: *                                                         │
│  ┌─────────────────────────────────────────────────────────────────┐│
│  │ acme-update-key.example.com                                     ││
│  └─────────────────────────────────────────────────────────────────┘│
│                                                                      │
│  Key Secret: *                                                       │
│  ┌─────────────────────────────────────────────────────────────────┐│
│  │ ••••••••••••••••••••••••••••••••                                ││
│  └─────────────────────────────────────────────────────────────────┘│
│   Base64-encoded TSIG secret                                       │
│                                                                      │
│  Algorithm:                                                          │
│  ┌─────────────────────────────────────────────────────────────────┐│
│  │ HMAC-SHA256 (Recommended)                                    ▼ ││
│  └─────────────────────────────────────────────────────────────────┘│
│                                                                      │
│  Zone (optional - auto-detected if empty):                          │
│  ┌─────────────────────────────────────────────────────────────────┐│
│  │                                                                 ││
│  └─────────────────────────────────────────────────────────────────┘│
│                                                                      │
│         [Test Connection]        [Cancel]  [Save Provider]           │
└─────────────────────────────────────────────────────────────────────┘

8.4 Manual Challenge UI

┌─────────────────────────────────────────────────────────────────────┐
│                   🔐 Manual DNS Challenge                            │
├─────────────────────────────────────────────────────────────────────┤
│                                                                      │
│  Certificate Request: *.example.com                                  │
│  Provider: Manual DNS (example-manual)                               │
│                                                                      │
│  ┌─────────────────────────────────────────────────────────────────┐│
│  │  📋 CREATE THIS TXT RECORD AT YOUR DNS PROVIDER                 ││
│  │                                                                  ││
│  │  Record Name:                                                    ││
│  │  ┌──────────────────────────────────────────────────┐  ┌──────┐││
│  │  │ _acme-challenge.example.com                      │  │ Copy │││
│  │  └──────────────────────────────────────────────────┘  └──────┘││
│  │                                                                  ││
│  │  Record Type: TXT                                                ││
│  │                                                                  ││
│  │  Record Value:                                                   ││
│  │  ┌──────────────────────────────────────────────────┐  ┌──────┐││
│  │  │ gZrH7wL9t3kM2nP4qX5yR8sT0uV1wZ2aB3cD4eF5gH6iJ7  │  │ Copy │││
│  │  └──────────────────────────────────────────────────┘  └──────┘││
│  │                                                                  ││
│  │  TTL: 300 seconds (5 minutes)                                   ││
│  └─────────────────────────────────────────────────────────────────┘│
│                                                                      │
│  ┌─────────────────────────────────────────────────────────────────┐│
│  │  ⏱️ Time Remaining: 7:23                                         ││
│  │  [━━━━━━━━━━━━━━━━━░░░░░░░░░░░░░░░] 52%                         ││
│  └─────────────────────────────────────────────────────────────────┘│
│                                                                      │
│  Status: ⏳ Waiting for DNS propagation...                           │
│  Last checked: 15 seconds ago                                        │
│                                                                      │
│  ┌─────────────────────┐  ┌────────────────────────────────────────┐│
│  │  🔍 Check DNS Now   │  │  ✅ I've Created the Record - Verify   ││
│  └─────────────────────┘  └────────────────────────────────────────┘│
│                                                                      │
│                                           [Cancel Challenge]         │
└─────────────────────────────────────────────────────────────────────┘
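The record name shown in the mockup follows the usual ACME DNS-01 convention: the `_acme-challenge` label is prepended to the certificate domain, with any wildcard label removed first. A minimal sketch of that derivation, assuming a hypothetical helper name `challengeRecordName` that is not part of Charon's codebase:

```go
package main

import "strings"

// challengeRecordName is a hypothetical helper (not from the Charon codebase).
// It returns the TXT record name for a DNS-01 challenge by dropping a leading
// wildcard label and any trailing dot, then prepending "_acme-challenge.".
func challengeRecordName(certDomain string) string {
    base := strings.TrimPrefix(certDomain, "*.")
    base = strings.TrimSuffix(base, ".")
    return "_acme-challenge." + base
}
```

Under this sketch, `*.example.com` and `example.com` both map to `_acme-challenge.example.com`, which is why a certificate covering both names needs two TXT values on the same record name.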

9. Security Considerations

9.1 Threat Model

| Threat | Risk Level | Mitigation |
|--------|------------|------------|
| Credential theft from database | High | AES-256-GCM encryption at rest, key rotation |
| Webhook URL SSRF | High | URL validation, internal IP blocking |
| Script path traversal | Critical | Allowlist /scripts/ directory only |
| Script command injection | Critical | Sanitize all arguments, no shell expansion |
| TSIG key exposure in logs | Medium | Redact secrets in all logs |
| DNS cache poisoning | Low | TSIG authentication for RFC 2136 |
| Webhook response injection | Low | Strict JSON parsing, no eval |

9.2 SSRF Prevention for Webhooks

Webhook URL validation MUST use Charon's existing centralized SSRF protection in backend/internal/security/url_validator.go:

// backend/internal/services/webhook_provider.go
import (
    "fmt"

    "github.com/Wikid82/charon/backend/internal/security"
)

func (w *WebhookProvider) validateWebhookURL(urlStr string) error {
    // Use existing centralized SSRF validation
    // This validates:
    // - HTTPS scheme required (production)
    // - DNS resolution with timeout
    // - All resolved IPs checked against private/reserved ranges
    // - Cloud metadata endpoints blocked (169.254.169.254)
    // - IPv4-mapped IPv6 bypass prevention
    _, err := security.ValidateExternalURL(urlStr)
    if err != nil {
        return fmt.Errorf("webhook URL validation failed: %w", err)
    }
    return nil
}

Existing security.ValidateExternalURL() provides:

  • RFC 1918 private network blocking (10.x, 172.16.x, 192.168.x)
  • Loopback blocking (127.x.x.x, ::1) unless the WithAllowLocalhost() option is set
  • Link-local blocking (169.254.x.x, fe80::), including cloud metadata endpoints
  • Reserved range blocking (0.x.x.x, 240.x.x.x)
  • IPv6 unique local blocking (fc00::)
  • IPv4-mapped IPv6 bypass prevention (::ffff:192.168.1.1)
  • Hostname length validation (RFC 1035, max 253 chars)
  • Suspicious pattern detection (..)
  • Port range validation with privileged port blocking

DO NOT duplicate SSRF validation logic. Reference the existing implementation.


9.3 Script Execution Security

```go
// backend/internal/services/script_provider.go
import (
    "context"
    "errors"
    "os"
    "os/exec"
    "path/filepath"
    "strings"
    "syscall"
    "time"
)

func executeScript(scriptPath string, args []string) error {
    // 1. Validate script path. Symlinks are resolved first so that a link
    //    placed inside /scripts/ cannot point at a file outside it.
    allowedDir := "/scripts/"
    absPath, err := filepath.Abs(scriptPath)
    if err != nil {
        return errors.New("invalid script path")
    }
    if absPath, err = filepath.EvalSymlinks(absPath); err != nil {
        return errors.New("invalid script path")
    }
    if !strings.HasPrefix(absPath, allowedDir) {
        return errors.New("script must be in /scripts/ directory")
    }

    // 2. Verify script exists, is a regular file, and is executable
    info, err := os.Stat(absPath)
    if err != nil || !info.Mode().IsRegular() || info.Mode().Perm()&0o111 == 0 {
        return errors.New("invalid script path")
    }

    // 3. Create restricted command with timeout wrapper (defense-in-depth)
    ctx, cancel := context.WithTimeout(context.Background(), 60*time.Second)
    defer cancel()

    // Use 'timeout' command as additional safeguard against hung processes.
    // Arguments are passed as an argv slice, never through a shell.
    cmdArgs := append([]string{"--signal=KILL", "55s", absPath}, args...)
    cmd := exec.CommandContext(ctx, "timeout", cmdArgs...)
    cmd.Dir = allowedDir

    // 4. Minimal but functional environment
    cmd.Env = []string{
        "PATH=/usr/local/bin:/usr/bin:/bin",
        "HOME=/tmp",
        "LANG=C.UTF-8",
    }

    // 5. Drop privileges: run as the unprivileged "nobody" user.
    //    Setting Credential only works if Charon itself is allowed to
    //    change UID/GID (for example, when it runs as root).
    cmd.SysProcAttr = &syscall.SysProcAttr{
        Credential: &syscall.Credential{
            Uid: 65534, // nobody user
            Gid: 65534,
        },
    }

    // 6. Apply resource limits (prevents resource exhaustion)
    setResourceLimits(cmd)

    // 7. Capture output for logging
    output, err := cmd.CombinedOutput()

    // 8. Audit log. ProcessState is nil if the process never started,
    //    so guard before reading the exit code.
    exitCode := -1
    if cmd.ProcessState != nil {
        exitCode = cmd.ProcessState.ExitCode()
    }
    logScriptExecution(scriptPath, args, exitCode, output)

    return err
}

// setResourceLimits applies rlimits to prevent resource exhaustion
// Note: These are set via prlimit(2) or container security context
func setResourceLimits(cmd *exec.Cmd) {
    // RLIMIT_NOFILE: Max open file descriptors (prevent fd exhaustion)
    // RLIMIT_NPROC: Max processes (prevent fork bombs)
    // RLIMIT_AS: Max address space (prevent memory exhaustion)
    //
    // Recommended values:
    // - NOFILE: 256
    // - NPROC: 64
    // - AS: 256MB
    //
    // Implementation note: In containerized deployments, these limits
    // should be enforced via container security context (securityContext
    // in Kubernetes, --ulimit in Docker) for stronger isolation.
}
```

Security Layers (Defense-in-Depth):

| Layer | Protection | Implementation |
|-------|------------|----------------|
| 1. Path validation | Restrict to /scripts/ | filepath.Abs() + prefix check |
| 2. Timeout | Prevent hung scripts | context.WithTimeout + timeout command |
| 3. Resource limits | Prevent resource exhaustion | rlimit (NOFILE=256, NPROC=64, AS=256MB) |
| 4. Minimal environment | Reduce attack surface | Explicit PATH, no secrets |
| 5. Non-root execution | Limit privilege | nobody user (UID 65534) |
| 6. Container isolation | Strongest isolation | seccomp profile (see below) |
| 7. Audit logging | Forensics | All executions logged |

Container Security (seccomp profile):

For production deployments, scripts run within Charon's container which should have a restrictive seccomp profile. Document this requirement:

# docker-compose.yml (recommended)
services:
  charon:
    security_opt:
      - seccomp:seccomp-profile.json  # Or use default Docker profile
    # Alternative: Use --cap-drop=ALL --cap-add=<minimal>

Note: Full seccomp profile customization is out of scope for this feature. Users relying on script plugins in high-security environments should review container security configuration.


9.4 Audit Logging

All custom plugin operations MUST be logged:

```go
type PluginAuditEvent struct {
    Timestamp   time.Time
    PluginType  string // "webhook", "script", "rfc2136", "manual"
    Action      string // "create_record", "delete_record", "verify"
    ProviderID  uint
    Domain      string
    Success     bool
    Duration    time.Duration
    ErrorMsg    string
    Details     map[string]any // Redacted credentials
}
```
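The `Details` field is expected to carry redacted credentials. One way to do that is to mask values by key name before the event is written. The sketch below is illustrative only: `redactDetails`, `isSensitiveKey`, and the list of sensitive key fragments are assumptions, chosen to match credential field names used elsewhere in this spec (`tsig_key_secret`, `auth_value`).

```go
package main

import "strings"

// sensitiveKeyParts is an assumed list of substrings that mark a key as secret.
var sensitiveKeyParts = []string{"secret", "password", "token", "auth_value"}

// isSensitiveKey reports whether key contains any sensitive fragment,
// ignoring case.
func isSensitiveKey(key string) bool {
    lower := strings.ToLower(key)
    for _, part := range sensitiveKeyParts {
        if strings.Contains(lower, part) {
            return true
        }
    }
    return false
}

// redactDetails returns a copy of in with the values of sensitive keys
// replaced by a fixed marker. The input map is not modified.
func redactDetails(in map[string]any) map[string]any {
    out := make(map[string]any, len(in))
    for k, v := range in {
        if isSensitiveKey(k) {
            out[k] = "[REDACTED]"
            continue
        }
        out[k] = v
    }
    return out
}
```

Redacting by key name is simple but only as good as the list; a stricter design would allowlist the keys that may be logged and mask everything else.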

10. Implementation Phases

Phase 1: Manual Plugin (Week 1)

| Task | Hours | Owner |
|------|-------|-------|
| ManualProvider implementation | 4 | Backend |
| Manual challenge data model | 2 | Backend |
| Challenge verification endpoint | 3 | Backend |
| Polling endpoint (10s interval) | 2 | Backend |
| Manual challenge UI component | 6 | Frontend |
| Challenge cleanup scheduled task | 2 | Backend |
| Unit tests | 4 | QA |
| Integration tests | 3 | QA |
| i18n translation keys | 2 | Frontend |
| Documentation | 2 | Docs |
| Total | 32 | |
| With 20% buffer | 32 | |

Deliverables:

  • backend/pkg/dnsprovider/custom/manual.go
  • backend/internal/services/manual_challenge_service.go
  • frontend/src/components/ManualDNSChallenge.tsx
  • API endpoints for challenge lifecycle (including /poll)
  • Translation keys in frontend/src/locales/*/translation.json:
    • dnsProvider.manual.title
    • dnsProvider.manual.instructions
    • dnsProvider.manual.recordName
    • dnsProvider.manual.recordValue
    • dnsProvider.manual.copyButton
    • dnsProvider.manual.verifyButton
    • dnsProvider.manual.checkDnsButton
    • dnsProvider.manual.timeRemaining
    • dnsProvider.manual.status.pending
    • dnsProvider.manual.status.verified
    • dnsProvider.manual.status.expired
    • dnsProvider.manual.status.failed
    • dnsProvider.manual.errors.*
  • User guide: docs/features/manual-dns-challenge.md

Phase 2: RFC 2136 Plugin (Week 2)

| Task | Hours | Owner |
|------|-------|-------|
| RFC2136Provider implementation | 4 | Backend |
| TSIG credential validation | 3 | Backend |
| Caddy module integration research | 2 | Backend |
| Dockerfile update (xcaddy + rfc2136) | 2 | DevOps |
| RFC 2136 form UI | 4 | Frontend |
| i18n translation keys | 1 | Frontend |
| Unit tests | 3 | QA |
| Integration tests (with BIND container) | 4 | QA |
| Documentation + BIND setup guide | 3 | Docs |
| Total | 28 | |
| With 20% buffer | 28 | |

Deliverables:

  • backend/pkg/dnsprovider/custom/rfc2136.go
  • Caddy config generation for RFC 2136
  • Dockerfile modification:
    # Multi-stage build: Caddy with RFC 2136 module
    FROM caddy:2-builder AS caddy-builder
    RUN xcaddy build \
        --with github.com/caddy-dns/rfc2136
    
    # Copy custom Caddy binary to final image
    COPY --from=caddy-builder /usr/bin/caddy /usr/bin/caddy
    
  • frontend/src/components/RFC2136Form.tsx
  • Translation keys for RFC 2136 provider
  • User guide: docs/features/rfc2136-dns.md
  • BIND9 setup guide: docs/guides/bind9-acme-setup.md

Phase 3: Webhook Plugin (Week 3)

| Task | Hours | Owner |
|------|-------|-------|
| WebhookProvider implementation | 5 | Backend |
| HTTP client with retry logic | 3 | Backend |
| Rate limiting + circuit breaker | 3 | Backend |
| SSRF validation (use existing) | 1 | Backend |
| Webhook form UI | 4 | Frontend |
| i18n translation keys | 1 | Frontend |
| Unit tests | 3 | QA |
| Integration tests (mock webhook server) | 3 | QA |
| Security tests (SSRF) | 2 | QA |
| Example webhook implementations | 2 | Docs |
| Documentation | 2 | Docs |
| Total | 30 | |
| With 20% buffer | 30 | |

Deliverables:

  • backend/pkg/dnsprovider/custom/webhook.go
  • backend/internal/services/webhook_client.go
  • frontend/src/components/WebhookForm.tsx
  • Translation keys for Webhook provider
  • Example: examples/webhook-server/nodejs/
  • Example: examples/webhook-server/python/
  • User guide: docs/features/webhook-dns.md

Phase 4: Script Plugin (Week 4, Optional)

Task Hours Owner
ScriptProvider implementation 4 Backend
Secure execution sandbox 4 Backend
Security review 3 Security
Script form UI 3 Frontend
Unit tests 3 QA
Security tests 4 QA
Example scripts 2 Docs
Documentation 2 Docs
Total 25

Deliverables:

  • backend/pkg/dnsprovider/custom/script.go
  • backend/internal/services/script_executor.go
  • frontend/src/components/ScriptForm.tsx
  • Example: examples/scripts/nsupdate.sh
  • Example: examples/scripts/cloudns.sh
  • User guide: docs/features/script-dns.md
  • Security guide: docs/guides/script-plugin-security.md

11. Testing Strategy

11.1 Unit Tests

Each provider requires tests for:

  • Credential validation
  • Config generation
  • Error handling
  • Timeout behavior

// backend/pkg/dnsprovider/custom/webhook_test.go
func TestWebhookProvider_ValidateCredentials(t *testing.T) {
    tests := []struct {
        name    string
        creds   map[string]string
        wantErr bool
    }{
        {"valid with auth", map[string]string{"create_url": "https://...", "delete_url": "https://...", "auth_header": "X-Key", "auth_value": "secret"}, false},
        {"valid without auth", map[string]string{"create_url": "https://...", "delete_url": "https://..."}, false},
        {"missing create_url", map[string]string{"delete_url": "https://..."}, true},
        {"http not allowed", map[string]string{"create_url": "http://...", "delete_url": "http://..."}, true},
        {"internal IP blocked", map[string]string{"create_url": "https://192.168.1.1/dns", "delete_url": "https://192.168.1.1/dns"}, true},
    }
    // ...
}

11.2 Integration Tests

| Test Scenario | Components | Method |
|---------------|------------|--------|
| Manual challenge flow | Backend + Frontend | E2E with Playwright |
| RFC 2136 with BIND9 | Backend + BIND container | Docker Compose |
| Webhook with mock server | Backend + Mock HTTP | httptest |
| Script execution | Backend + Test scripts | Isolated container |

Manual Plugin E2E Scenarios (Playwright)

| Scenario | Description | Expected Result |
|----------|-------------|-----------------|
| Countdown timeout | User does not create DNS record | UI shows "Expired" after timeout, challenge marked expired |
| Copy buttons | User clicks "Copy" for record name/value | Values copied to clipboard, toast notification shown |
| DNS propagation success | User creates record, clicks "Verify" | After retries, status changes to "Verified" |
| DNS propagation failure | User creates wrong record | After max retries, shows "DNS record not found" |
| Cancel challenge | User clicks "Cancel Challenge" | Challenge marked as cancelled, UI returns to provider list |
| Refresh during challenge | User refreshes page during pending challenge | Challenge state persisted, countdown continues from correct time |

11.3 Security Tests

| Test | Tool | Target |
|------|------|--------|
| SSRF in webhook URLs | Custom test suite | WebhookProvider |
| Path traversal in scripts | Custom test suite | ScriptProvider |
| Credential leakage in logs | Log analysis | All providers |
| TSIG key handling | Memory dump analysis | RFC2136Provider |
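For the path traversal row, the custom test suite needs a predicate to exercise. The sketch below shows a purely lexical check with the kinds of inputs such a suite might cover. `isAllowedScriptPath` and `allowedScriptDir` are hypothetical names, the check assumes Unix-style paths, and it does not resolve symlinks, so it would complement rather than replace the checks in Section 9.3.

```go
package main

import (
    "path/filepath"
    "strings"
)

// allowedScriptDir mirrors the /scripts/ allowlist described in Section 9.3.
const allowedScriptDir = "/scripts"

// isAllowedScriptPath reports whether p is an absolute path that, after
// lexical cleaning ("..", ".", duplicate slashes), lies strictly inside
// allowedScriptDir. The trailing slash in the prefix rejects sibling
// directories such as /scripts-evil.
func isAllowedScriptPath(p string) bool {
    if !filepath.IsAbs(p) {
        return false
    }
    clean := filepath.Clean(p)
    return strings.HasPrefix(clean, allowedScriptDir+"/")
}
```

Inputs worth covering include `/scripts/../etc/passwd`, `/scripts-evil/x.sh`, the bare directory `/scripts`, and relative paths such as `scripts/dns.sh`; all four should be rejected.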

11.4 Coverage Requirements

  • Backend: ≥85% coverage
  • Frontend: ≥85% coverage
  • New provider code: ≥90% coverage

12. Documentation Requirements

12.1 User Documentation

| Document | Audience | Location |
|----------|----------|----------|
| Custom DNS Providers Overview | All users | docs/features/custom-dns-providers.md |
| Manual DNS Challenge Guide | Beginners | docs/features/manual-dns-challenge.md |
| RFC 2136 Setup Guide | Self-hosted DNS admins | docs/features/rfc2136-dns.md |
| Webhook Integration Guide | DevOps teams | docs/features/webhook-dns.md |
| Script Plugin Guide | Power users | docs/features/script-dns.md |

12.2 Technical Documentation

| Document | Audience | Location |
|----------|----------|----------|
| Custom Plugin Architecture | Contributors | docs/development/custom-plugin-architecture.md |
| Webhook API Specification | Integration devs | docs/api/webhook-dns-api.md |
| RFC 2136 Protocol Details | Network engineers | docs/technical/rfc2136-implementation.md |

12.3 Setup Guides

| Guide | Audience | Location |
|-------|----------|----------|
| BIND9 ACME Setup | Self-hosted users | docs/guides/bind9-acme-setup.md |
| PowerDNS ACME Setup | Self-hosted users | docs/guides/powerdns-acme-setup.md |
| Building Webhook Endpoints | Developers | docs/guides/webhook-development.md |

13. Estimated Effort

Summary by Phase

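| Phase | Description | Hours | Hours (with 20% buffer) | Calendar |
|-------|-------------|-------|-------------------------|----------|
| 1 | Manual Plugin | 27 | 32 | 1 week |
| 2 | RFC 2136 Plugin | 23 | 28 | 1 week |
| 3 | Webhook Plugin | 25 | 30 | 1 week |
| Total (Phases 1-3) | Core Features | 75 | 90 | 3 weeks |
| 4 | Script Plugin (Future) | 25 | 30 | 1 week |
| Total (All Phases) | Including Future | 100 | 120 | 4 weeks |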

Note: Phase 4 (Script Plugin) is conditional on community demand (>20 GitHub issues). See "Future Work" section.

Effort by Role

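| Role | Phase 1 | Phase 2 | Phase 3 | Phase 4* | Total |
|------|---------|---------|---------|----------|-------|
| Backend | 11h | 11h | 12h | 8h | 42h |
| Frontend | 8h | 5h | 5h | 3h | 21h |
| QA | 7h | 7h | 8h | 7h | 29h |
| Docs | 2h | 3h | 4h | 4h | 13h |
| DevOps | 0h | 2h | 0h | 0h | 2h |
| Security | 0h | 0h | 1h | 3h | 4h |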

*Phase 4 effort is conditional

MVP (Minimum Viable Product)

MVP = Phase 1 (Manual Plugin)

  • Time: 32 hours / 1 week (with buffer)
  • Unblocks: All users with unsupported DNS providers
  • Risk: Low

14. Decisions and Open Questions

Decisions Made

  1. Caddy Module Strategy for RFC 2136

    DECIDED: Option B — RFC 2136 module will be included in Charon's Caddy build.

    Rationale: Best user experience. Users should not need to rebuild Caddy themselves. The Dockerfile will be updated in Phase 2 to use xcaddy with the github.com/caddy-dns/rfc2136 module.

Must Decide Before Implementation

  1. Script Plugin Security Model

    • Should scripts run in a separate container/sandbox?
    • What environment variables should be available?
    • Should we allow network access from scripts?
    • Recommendation: No network by default, minimal env, document risks
  2. Manual Challenge Persistence

    • Store challenge details in database or session?
    • How long to retain completed challenges?
    • Recommendation: Database with 24-hour TTL cleanup (see Section 6.4)
  3. Webhook Retry Strategy

    • Exponential backoff vs. fixed interval?
    • Max retries before failure?
    • Recommendation: Exponential backoff (1s, 2s, 4s), max 3 retries

Nice to Decide

  1. UI Location for Custom Plugins

    • Same page as built-in providers?
    • Separate "Custom Integrations" section?
    • Recommendation: Same page, grouped by category
  2. Telemetry for Custom Plugins

    • Should we track usage of custom plugin types?
    • Privacy considerations?
    • Recommendation: Opt-in anonymous usage stats
  3. Plugin Marketplace (Future)

    • Community-contributed webhook templates?
    • Pre-configured RFC 2136 profiles?
    • Recommendation: Defer to Phase 5+

15. Appendix

B. External References

C. Example Webhook Payload

{
  "action": "create",
  "fqdn": "_acme-challenge.example.com",
  "domain": "example.com",
  "subdomain": "_acme-challenge",
  "value": "gZrH7wL9t3kM2nP4qX5yR8sT0uV1wZ2aB3cD4eF5gH6iJ7kL",
  "ttl": 300,
  "request_id": "550e8400-e29b-41d4-a716-446655440000",
  "timestamp": "2026-01-08T15:30:00Z",
  "charon_version": "1.2.0",
  "certificate_domains": ["*.example.com", "example.com"]
}
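
A webhook endpoint written in Go could decode this payload into a struct like the one below. The field names and JSON keys are taken from the example above; the type name `webhookPayload` and the helper `decodeWebhookPayload` are illustrative assumptions, not part of a published Charon API.

```go
package main

import (
    "encoding/json"
    "time"
)

// webhookPayload mirrors the example payload. The timestamp is RFC 3339,
// which time.Time decodes from JSON natively.
type webhookPayload struct {
    Action             string    `json:"action"`
    FQDN               string    `json:"fqdn"`
    Domain             string    `json:"domain"`
    Subdomain          string    `json:"subdomain"`
    Value              string    `json:"value"`
    TTL                int       `json:"ttl"`
    RequestID          string    `json:"request_id"`
    Timestamp          time.Time `json:"timestamp"`
    CharonVersion      string    `json:"charon_version"`
    CertificateDomains []string  `json:"certificate_domains"`
}

// decodeWebhookPayload parses a request body into a webhookPayload.
func decodeWebhookPayload(data []byte) (webhookPayload, error) {
    var p webhookPayload
    err := json.Unmarshal(data, &p)
    return p, err
}
```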

D. Example BIND9 TSIG Configuration

// /etc/bind/named.conf.local
key "acme-update-key" {
    algorithm hmac-sha256;
    secret "base64-encoded-secret-here==";
};

zone "example.com" {
    type master;
    file "/var/lib/bind/db.example.com";
    update-policy {
        grant acme-update-key name _acme-challenge.example.com. TXT;
    };
};
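
On the Charon side, the TSIG fields that correspond to this BIND9 key could be checked before saving the provider. The following is a minimal sketch of such a check, not the planned implementation: `validateTSIG` is a hypothetical function, and the algorithm allowlist is an assumption because the spec elides the actual option list.

```go
package main

import (
    "encoding/base64"
    "errors"
    "strings"
)

// allowedTSIGAlgorithms is an assumed allowlist; the spec names only
// HMAC-SHA256 (recommended) explicitly.
var allowedTSIGAlgorithms = map[string]bool{
    "hmac-sha256": true,
    "hmac-sha384": true,
    "hmac-sha512": true,
}

// validateTSIG checks that the key name is present, the secret is non-empty
// standard base64, and the algorithm is on the allowlist.
func validateTSIG(keyName, secret, algorithm string) error {
    if strings.TrimSpace(keyName) == "" {
        return errors.New("TSIG key name is required")
    }
    decoded, err := base64.StdEncoding.DecodeString(secret)
    if err != nil || len(decoded) == 0 {
        return errors.New("TSIG secret must be non-empty base64")
    }
    if !allowedTSIGAlgorithms[strings.ToLower(algorithm)] {
        return errors.New("unsupported TSIG algorithm")
    }
    return nil
}
```

The error messages deliberately omit the secret itself, in line with the log redaction requirement in Section 9.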

16. Revision History

| Version | Date | Author | Changes |
|---------|------|--------|---------|
| 1.0 | 2026-01-08 | Planning Agent | Initial specification |
| 1.1 | 2026-01-08 | Planning Agent | Supervisor review: addressed 13 issues (see below) |

17. Supervisor Review Summary

This specification was revised to address all 13 issues identified during Supervisor review:

Critical Issues (Fixed)

| # | Issue | Resolution |
|---|-------|------------|
| 1 | SSRF Duplication | Section 9.2 updated to reference existing security.ValidateExternalURL() in backend/internal/security/url_validator.go |
| 2 | Script Security Insufficient | Section 9.3 enhanced with rlimit enforcement, seccomp documentation, minimal PATH, and timeout command |
| 3 | Missing Caddy Integration Detail | Added Section 3.3.1-3.3.4 with sequence diagram, state machine, error handling, and communication protocol |

High Severity Issues (Fixed)

| # | Issue | Resolution |
|---|-------|------------|
| 4 | RFC 2136 Caddy Module | Section 4.3 updated with DECISION; Phase 2 includes Dockerfile deliverable |
| 5 | WebSocket vs Polling | Section 4.4 updated; chose polling (10s interval) with rationale; polling endpoint added to API |
| 6 | Webhook Rate Limiting | Section 4.1 updated with rate limits (10/min) and circuit breaker (5 failures → 5 min disable) |

Medium Severity Issues (Fixed)

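| # | Issue | Resolution |
|---|-------|------------|
| 7 | Phase 4 Scope Creep | Phase 4 moved to "Future Work" section with explicit Go/No-Go gate (>20 GitHub issues) |
| 8 | Missing Error Codes | Section 7.3 added with comprehensive error code table |
| 9 | Time Estimates Buffer | Section 13 updated: Phase 1→32h, Phase 2→28h, Phase 3→30h (all +20%) |
| 10 | Open Question #1 | Section 14 changed to "Decisions and Open Questions"; Option B confirmed as DECIDED |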
10 Open Question #1 Section 14 changed to "Decisions and Open Questions"; Option B confirmed as DECIDED

Low Severity Issues (Fixed)

| # | Issue | Resolution |
|---|-------|------------|
| 11 | i18n Keys | Phase 1 deliverables updated with translation keys for frontend/src/locales/*/translation.json |
| 12 | E2E Test Scenarios | Section 11.2 expanded with Manual Plugin E2E scenarios table |
| 13 | Cleanup Mechanism | Section 6.4 added with cron-based cleanup using existing robfig/cron/v3 pattern |

This document has completed Supervisor review and is ready for technical review and stakeholder approval.