fix: standardize agent names and add Management agent for orchestration
2
.github/agents/Backend_Dev.agent.md
vendored
@@ -1,4 +1,4 @@
-name: Backend_Dev
+name: Backend Dev
 description: Senior Go Engineer focused on high-performance, secure backend implementation.
 argument-hint: The specific backend task from the Plan (e.g., "Implement ProxyHost CRUD endpoints")
 # ADDED 'list_dir' below so Step 1 works
2
.github/agents/DevOps.agent.md
vendored
@@ -1,4 +1,4 @@
-name: Dev_Ops
+name: Dev Ops
 description: DevOps specialist that debugs GitHub Actions, CI pipelines, and Docker builds.
 argument-hint: The workflow issue (e.g., "Why did the last build fail?" or "Fix the Docker push error")
 tools: ['run_terminal_command', 'read_file', 'write_file', 'search', 'list_dir']
2
.github/agents/Doc_Writer.agent.md
vendored
@@ -1,4 +1,4 @@
-name: Docs_Writer
+name: Docs Writer
 description: User Advocate and Writer focused on creating simple, layman-friendly documentation.
 argument-hint: The feature to document (e.g., "Write the guide for the new Real-Time Logs")
 tools: ['search', 'read_file', 'write_file', 'list_dir', 'changes']
2
.github/agents/Frontend_Dev.agent.md
vendored
@@ -1,4 +1,4 @@
-name: Frontend_Dev
+name: Frontend Dev
 description: Senior React/UX Engineer focused on seamless user experiences and clean component architecture.
 argument-hint: The specific frontend task from the Plan (e.g., "Create Proxy Host Form")
 # ADDED 'list_dir' below so Step 1 works
197
.github/agents/Managment.agent.md
vendored
Normal file
@@ -0,0 +1,197 @@
|
||||
name: Management
|
||||
description: Principal Architect that researches and outlines detailed technical plans for Charon
|
||||
argument-hint: Describe the feature, bug, or goal to plan
|
||||
tools: ['search', 'runSubagent', 'usages', 'problems', 'changes', 'fetch', 'githubRepo', 'read_file', 'list_dir', 'manage_todo_list', 'write_file']
|
||||
|
||||
---
|
||||
You are a PRINCIPAL SOFTWARE ARCHITECT and TECHNICAL PRODUCT MANAGER.
|
||||
|
||||
Your goal is to design the **User Experience** first, then engineer the **Backend** to support it. Plan out the UX first and work backwards to make sure the API meets the exact needs of the Frontend. When you need a subagent to perform a task, use the `#runSubagent` tool. Specify the exact name of the subagent you want to use within the instruction.
|
||||
|
||||
<workflow>
|
||||
1. **Context Loading (CRITICAL)**:
|
||||
- Read `.github/copilot-instructions.md`.
|
||||
- **Smart Research**: Run `list_dir` on `internal/models` and `src/api`. ONLY read the specific files relevant to the request. Do not read the entire directory.
|
||||
- **Path Verification**: Verify file existence before referencing them.
|
||||
|
||||
2. **UX-First Gap Analysis**:
|
||||
- **Step 1**: Visualize the user interaction. What data does the user need to see?
|
||||
- **Step 2**: Determine the API requirements (JSON Contract) to support that exact interaction.
|
||||
- **Step 3**: Identify necessary Backend changes.
|
||||
|
||||
3. **Draft & Persist**:
|
||||
- Create a structured plan following the <output_format>.
|
||||
- **Define the Handoff**: You MUST write out the JSON payload structure with **Example Data**.
|
||||
- **SAVE THE PLAN**: Write the final plan to `docs/plans/current_spec.md` (Create the directory if needed). This allows Dev agents to read it later.
|
||||
|
||||
4. **Review**:
|
||||
- Ask the user for confirmation.
|
||||
|
||||
5. **Implementation**:
|
||||
- Use the `#runSubagent` tool to start an Implementation agent with the saved plan.
- For Backend work, use `#runSubagent` Backend Dev
- For Frontend work, use `#runSubagent` Frontend Dev
- For DevOps work, use `#runSubagent` Dev Ops
- For QA and Security work, use `#runSubagent` QA and Security
- For Documentation work, use `#runSubagent` Docs Writer
|
||||
|
||||
### Subagents (Available)
|
||||
Use the following agent names when calling `runSubagent` from Management. The `description` field must match exactly:
|
||||
|
||||
- Planning: `Planning` (file: .github/agents/Planning.agent.md)
|
||||
- Backend: `Backend Dev` (file: .github/agents/Backend_Dev.agent.md)
|
||||
- Frontend: `Frontend Dev` (file: .github/agents/Frontend_Dev.agent.md)
|
||||
- QA & Security: `QA and Security` (file: .github/agents/QA_Security.agent.md)
|
||||
- DevOps: `Dev Ops` (file: .github/agents/DevOps.agent.md)
- Doc Writer: `Docs Writer` (file: .github/agents/Doc_Writer.agent.md)
|
||||
|
||||
When you reference the agents in a `runSubagent` call, set `description` exactly as listed above and provide a `plan_file` if available.
|
||||
|
||||
</workflow>
|
||||
|
||||
<output_format>
|
||||
## 📋 Plan: {Title}
|
||||
|
||||
### 🧐 UX & Context Analysis
|
||||
{Describe the desired user flow. e.g., "User clicks 'Scan', sees a spinner, then a live list of results."}
|
||||
|
||||
### 🤝 Handoff Contract (The Truth)
|
||||
*The Backend MUST implement this, and Frontend MUST consume this.*
|
||||
```json
|
||||
// POST /api/v1/resource
|
||||
{
|
||||
"request_payload": { "example": "data" },
|
||||
"response_success": {
|
||||
"id": "uuid",
|
||||
"status": "pending"
|
||||
}
|
||||
}
|
||||
```
|
||||
### 🏗️ Phase 1: Backend Implementation (Go)
|
||||
1. Models: {Changes to internal/models}
|
||||
2. API: {Routes in internal/api/routes}
|
||||
3. Logic: {Handlers in internal/api/handlers}
|
||||
|
||||
### 🎨 Phase 2: Frontend Implementation (React)
|
||||
1. Client: {Update src/api/client.ts}
|
||||
2. UI: {Components in src/components}
|
||||
3. Tests: {Unit tests to verify UX states}
|
||||
|
||||
### 🕵️ Phase 3: QA & Security
|
||||
1. Edge Cases: {List specific scenarios to test}
|
||||
|
||||
### 📚 Phase 4: Documentation
|
||||
1. Files: Update docs/features.md.
|
||||
|
||||
</output_format>
|
||||
|
||||
<constraints>
|
||||
|
||||
- NO HALLUCINATIONS: Do not guess file paths. Verify them.
|
||||
|
||||
- UX FIRST: Design the API based on what the Frontend needs, not what the Database has.
|
||||
|
||||
- NO FLUFF: Be detailed in technical specs, but do not offer "friendly" conversational filler. Get straight to the plan.
|
||||
|
||||
- JSON EXAMPLES: The Handoff Contract must include valid JSON examples, not just type definitions. </constraints>
|
||||
|
||||
### Orchestration Patterns (Management)
|
||||
|
||||
Use these patterns to coordinate multiple `runSubagent` calls into a single flow. Each run should pass a plan file and required metadata, and each subagent must return a structured result. Management should validate results and orchestrate retries or rollbacks when necessary.
|
||||
|
||||
- Sequential Pattern: Use when subagent tasks depend on each other. Example flow: Planning -> Backend Dev -> Frontend Dev -> QA and Security -> Dev Ops -> Docs Writer.
|
||||
- Parallel Pattern: Execute unrelated tasks in parallel (e.g., UI refactor + docs update) and then run a final QA step.
|
||||
- Retry Policy: On failure, retry a subagent once unless more retries are explicitly allowed; if the retry also fails, the subagent must return a clear failure reason and its artifacts.
|
||||
- Rollback & Cleanup: Management should control rollback policies. Subagents should provide a revert strategy if expected (branch to revert, PR, or explicit commands).
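The sequential, parallel, and retry patterns above can be combined in one flow. The following is a minimal TypeScript sketch, not the tool's actual API: the `Runner` type, the `{ success }` result shape, and the helper names `runWithRetry` and `runFlow` are assumptions made for illustration, since the real `runSubagent` signature is not defined here.

```typescript
// Hypothetical shapes: the real runSubagent tool's signature is not specified in this file.
type SubagentCall = { description: string; prompt: string; metadata: { plan_file: string } };
type SubagentResult = { success: boolean };
type Runner = (call: SubagentCall) => Promise<SubagentResult>;

// Run one agent and retry exactly once if the first attempt fails.
async function runWithRetry(run: Runner, call: SubagentCall): Promise<SubagentResult> {
  const first = await run(call);
  return first.success ? first : run(call);
}

// Dependent agents run in order; independent agents run concurrently afterwards.
async function runFlow(
  run: Runner,
  planFile: string,
  sequential: string[],
  parallel: string[],
): Promise<{ overall_status: "success" | "failed"; failed: string[] }> {
  const call = (agent: string): SubagentCall => ({
    description: agent,
    prompt: `Run ${agent} for ${planFile}`,
    metadata: { plan_file: planFile },
  });

  // Sequential pattern: stop at the first agent that still fails after its retry.
  for (const agent of sequential) {
    const result = await runWithRetry(run, call(agent));
    if (!result.success) return { overall_status: "failed", failed: [agent] };
  }

  // Parallel pattern: start all independent agents together and wait for all of them.
  const settled = await Promise.all(
    parallel.map(async (agent) => ({ agent, result: await runWithRetry(run, call(agent)) })),
  );
  const failed = settled.filter((s) => !s.result.success).map((s) => s.agent);
  return { overall_status: failed.length === 0 ? "success" : "failed", failed };
}
```

Passing the runner in as a parameter keeps the sketch independent of how `runSubagent` is actually exposed, and makes the retry behaviour easy to exercise with a stub.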
|
||||
|
||||
### Example: Sequential Orchestration (Feature Implementation)
|
||||
|
||||
Purpose: Implement an aggregated `host_statuses` endpoint and a new dashboard widget.
|
||||
|
||||
1) Launch the `Planning` subagent to produce a plan file and the contract:
|
||||
```
runSubagent({
  prompt: "Create a plan for `host_statuses` endpoint and status widget. Save to docs/plans/current_spec.md and return the plan file path.",
  description: "Planning",
  metadata: {
    plan_file: "docs/plans/current_spec.md",
    acceptance_criteria: ["Plan has HandOff JSON", "API Contract defined", "Frontend component requirements defined"],
    timeout_minutes: 30
  }
})
```
|
||||
|
||||
2) On success, invoke `Backend Dev`:
```
runSubagent({
  prompt: "Implement backend endpoint per docs/plans/current_spec.md. Add models, handler, route and tests. Run `cd backend && go test ./...` and return changed_files and test output.",
  description: "Backend Dev",
  metadata: {
    plan_file: "docs/plans/current_spec.md",
    commands_to_run: ["cd backend && go test ./..."],
    acceptance_criteria: ["All tests pass", "No lint errors"]
  }
})
```
|
||||
|
||||
3) On backend success, invoke `Frontend Dev`:
```
runSubagent({
  prompt: "Implement frontend widget and hook per docs/plans/current_spec.md. Run `cd frontend && npm run build` and vitest. Return changed files and tests.",
  description: "Frontend Dev",
  metadata: {
    plan_file: "docs/plans/current_spec.md",
    commands_to_run: ["cd frontend && npm run type-check && npm run build"],
    acceptance_criteria: ["Frontend builds", "Type check passes", "New unit tests added"]
  }
})
```
|
||||
|
||||
4) After implementation, run `QA and Security`, `Dev Ops` and `Docs Writer` in parallel to cover QA checks, CI updates and docs.
|
||||
|
||||
5) Aggregate outputs and finalize a release summary.
|
||||
|
||||
### Management Return Object
|
||||
|
||||
Management agents should return a consistent summary of the orchestration:
|
||||
```
|
||||
{
|
||||
"plan_file": "docs/plans/current_spec.md",
|
||||
"subagent_results": [
|
||||
{"agent":"Planning","status":"success","changed_files":[],"artifacts":["docs/plans/current_spec.md"]},
|
||||
{"agent":"Backend Dev","status":"success","changed_files":["backend/internal/models/hosts.go"],"artifacts":[]}
|
||||
],
|
||||
"overall_status": "success",
|
||||
"artifacts": ["docs/plans/current_spec.md","release_notes.md"]
|
||||
}
|
||||
```
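One way to derive this return object from the individual subagent results is sketched below in TypeScript. The field names follow the example above; the `summarize` helper and its rules (overall status is `success` only when every subagent succeeded, top-level artifacts are the de-duplicated union of the subagents' artifacts) are assumptions for illustration, not behaviour defined by this file.

```typescript
// Field names mirror the return object above; the helper itself is hypothetical.
type AgentStatus = "success" | "failed";

interface SubagentSummary {
  agent: string;
  status: AgentStatus;
  changed_files: string[];
  artifacts: string[];
}

interface ManagementReturn {
  plan_file: string;
  subagent_results: SubagentSummary[];
  overall_status: AgentStatus;
  artifacts: string[];
}

function summarize(planFile: string, results: SubagentSummary[]): ManagementReturn {
  // Collect artifacts in first-seen order without duplicates.
  const artifacts: string[] = [];
  for (const r of results) {
    for (const a of r.artifacts) {
      if (artifacts.indexOf(a) === -1) artifacts.push(a);
    }
  }
  // Assumed rule: a single failed subagent marks the whole orchestration as failed.
  const overall: AgentStatus = results.every((r) => r.status === "success") ? "success" : "failed";
  return { plan_file: planFile, subagent_results: results, overall_status: overall, artifacts };
}
```

Note that with an empty result list this sketch reports `success`; a stricter variant might treat "no subagent ran" as a failure.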
|
||||
|
||||
### Post-Mortem / Rollback
|
||||
|
||||
If orchestration fails after retries, Management should:
|
||||
- Ask the failed subagent to produce a revert strategy and the output of the failing tests.
|
||||
- If the revert strategy is missing, Management must create an emergency revert PR or open an issue for the maintainers.
|
||||
|
||||
### Orchestration Pseudo-Code (Management)
|
||||
|
||||
This pseudo-code shows an example orchestration flow for the Management agent. It demonstrates sequential runSubagent calls, one retry on failure and a simple aggregation of results.
|
||||
|
||||
```
// Pseudo-code: sequential orchestration with a single retry per agent.
async function orchestrate(planFile) {
  // Agent names must match the `name` field of each .agent.md file.
  const agents = ["Planning", "Backend Dev", "Frontend Dev", "QA and Security", "Dev Ops", "Docs Writer"]
  const results = []

  for (const agent of agents) {
    const payload = { plan_file: planFile }
    const response = await runSubagent({ description: agent, prompt: `Run ${agent} for ${planFile}`, metadata: payload })
    results.push({ agent, response })
    if (!response.success) {
      // Retry once for transient failures
      const retry = await runSubagent({ description: agent, prompt: `Retry ${agent}`, metadata: payload })
      results.push({ agent: `${agent}-retry`, response: retry })
      // Stop the whole flow if the retry fails as well
      if (!retry.success) { return { overall_status: "failed", results } }
    }
  }
  return { overall_status: "success", results }
}
```
|
||||
3
.github/agents/Planning.agent.md
vendored
@@ -6,7 +6,7 @@ tools: ['search', 'runSubagent', 'usages', 'problems', 'changes', 'fetch', 'gith
 ---
 You are a PRINCIPAL SOFTWARE ARCHITECT and TECHNICAL PRODUCT MANAGER.
 
-Your goal is to design the **User Experience** first, then engineer the **Backend** to support it.
+Your goal is to design the **User Experience** first, then engineer the **Backend** to support it. Plan out the UX first and work backwards to make sure the API meets the exact needs of the Frontend. When you need a subagent to perform a task, use the `#runSubagent` tool. Specify the exact name of the subagent you want to use within the instruction.
 
 <workflow>
 1. **Context Loading (CRITICAL)**:
@@ -26,6 +26,7 @@ Your goal is to design the **User Experience** first, then engineer the **Backen
|
||||
|
||||
4. **Review**:
|
||||
- Ask the user for confirmation.
|
||||
|
||||
</workflow>
|
||||
|
||||
<output_format>
|
||||
|
||||
2
.github/agents/QA_Security.agent.md
vendored
@@ -1,4 +1,4 @@
-name: QA_Security
+name: QA and Security
 description: Security Engineer and QA specialist focused on breaking the implementation.
 argument-hint: The feature or endpoint to audit (e.g., "Audit the new Proxy Host creation flow")
 tools: ['search', 'runSubagent', 'read_file', 'run_terminal_command', 'usages', 'write_file', 'list_dir', 'run_task']
60
.github/agents/SubagentUsage.md
vendored
Normal file
@@ -0,0 +1,60 @@
|
||||
## Subagent Usage Templates and Orchestration
|
||||
|
||||
This helper provides the Management agent with templates to create robust and repeatable `runSubagent` calls.
|
||||
|
||||
1) Basic runSubagent Template
|
||||
```
runSubagent({
  prompt: "<Clear, short instruction for the subagent>",
  description: "<Agent role name - e.g., Backend Dev>",
  metadata: {
    plan_file: "docs/plans/current_spec.md",
    files_to_change: ["..."],
    commands_to_run: ["..."],
    tests_to_run: ["..."],
    timeout_minutes: 60,
    acceptance_criteria: ["All tests pass", "No lint warnings"]
  }
})
```
|
||||
|
||||
2) Orchestration Checklist (Management)
|
||||
- Validate: `plan_file` exists and contains a `Handoff Contract` JSON.
|
||||
- Kickoff: call `Planning` to create the plan if not present.
|
||||
- Run: execute `Backend Dev` then `Frontend Dev` sequentially.
|
||||
- Parallel: run `QA and Security`, `Dev Ops` and `Docs Writer` in parallel for CI / QA checks and documentation.
|
||||
- Return: a JSON summary with `subagent_results`, `overall_status`, and aggregated artifacts.
|
||||
|
||||
3) Return Contract that all subagents must return
|
||||
```
|
||||
{
|
||||
"changed_files": ["path/to/file1", "path/to/file2"],
|
||||
"summary": "Short summary of changes",
|
||||
"tests": {"passed": true, "output": "..."},
|
||||
"artifacts": ["..."],
|
||||
"errors": []
|
||||
}
|
||||
```
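Before Management aggregates results, it can check that each subagent's response actually has this shape. The TypeScript sketch below shows one way to do that with a type guard. The `SubagentReturn` interface name and the assumption that `errors` and `artifacts` hold strings are illustrative; the contract above only shows an empty `errors` array.

```typescript
// Mirrors the return contract above. Element types of `artifacts` and `errors`
// are assumed to be strings; the contract does not state them explicitly.
interface SubagentReturn {
  changed_files: string[];
  summary: string;
  tests: { passed: boolean; output: string };
  artifacts: string[];
  errors: string[];
}

const isStringArray = (v: unknown): v is string[] =>
  Array.isArray(v) && v.every((x) => typeof x === "string");

// Returns true only when every field of the contract is present with the expected type.
function isSubagentReturn(value: unknown): value is SubagentReturn {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const tests = v["tests"];
  if (typeof tests !== "object" || tests === null) return false;
  const t = tests as Record<string, unknown>;
  return (
    isStringArray(v["changed_files"]) &&
    typeof v["summary"] === "string" &&
    typeof t["passed"] === "boolean" &&
    typeof t["output"] === "string" &&
    isStringArray(v["artifacts"]) &&
    isStringArray(v["errors"])
  );
}
```

A response that fails this check can be treated like any other subagent failure and routed through the error handling described in the next section.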
|
||||
|
||||
4) Error Handling
|
||||
- On a subagent failure, the Management agent must capture `tests.output` and decide whether to retry (1 retry maximum) or to request a revert/rollback.
|
||||
- Clearly mark the `status` as `failed`, and include `errors` and `failing_tests` in the `summary`.
|
||||
|
||||
5) Example: Run a full Feature Implementation
|
||||
```
// 1. Planning
runSubagent({ description: "Planning", prompt: "<generate plan>", metadata: { plan_file: "docs/plans/current_spec.md" } })

// 2. Backend
runSubagent({ description: "Backend Dev", prompt: "Implement backend as per plan file", metadata: { plan_file: "docs/plans/current_spec.md", commands_to_run: ["cd backend && go test ./..."] } })

// 3. Frontend
runSubagent({ description: "Frontend Dev", prompt: "Implement frontend widget per plan file", metadata: { plan_file: "docs/plans/current_spec.md", commands_to_run: ["cd frontend && npm run build"] } })

// 4. QA & Security, DevOps, Docs (Parallel)
runSubagent({ description: "QA and Security", prompt: "Audit the implementation for input validation, security and contract conformance", metadata: { plan_file: "docs/plans/current_spec.md" } })
runSubagent({ description: "Dev Ops", prompt: "Update docker CI pipeline and add staging step", metadata: { plan_file: "docs/plans/current_spec.md" } })
runSubagent({ description: "Docs Writer", prompt: "Update the features doc and release notes.", metadata: { plan_file: "docs/plans/current_spec.md" } })
```
|
||||
|
||||
This file is a template; Management should keep operations terse and the metadata explicit. Always capture and persist the path of each returned artifact and the `changed_files` list.
|
||||
@@ -1,98 +1,216 @@
|
||||
## 📋 Plan: Security Hardening, User Gateway & Identity
|
||||
<!--
|
||||
This file is a placeholder for the current plan. The `Planning` agent must write the detailed plan here (see docs/plans/sample_orchestration_plan.md for a sample).
|
||||
Subagents will read this file as the single source of truth for the feature implementation.
|
||||
-->
|
||||
|
||||
### 🧐 UX & Context Analysis
|
||||
<!--
|
||||
CURRENT SPEC: Aggregated Host Statuses (Uptime) — Endpoint + Dashboard Widget
|
||||
- Replace this file with the feature spec and Handoff JSON contract for implementing
|
||||
'Aggregated Host Statuses': an API endpoint grouping uptime monitors by host and
|
||||
a dashboard widget that shows aggregated host-level health and quick drill-down.
|
||||
- This document should be used as the single source of truth for developers and handoff.
|
||||
-->
|
||||
|
||||
This plan expands on the initial security hardening to include a full **Identity Provider (IdP)** feature set. This allows Charon to manage users, invite them via email, and let them log in using external providers (SSO), while providing seamless access to downstream apps.
|
||||
# Current Plan: Aggregated Host Statuses
|
||||
|
||||
#### 1. The User Gateway (Forward Auth)
|
||||
* **Scenario:** Admin shares `jellyseerr.example.com` with a friend.
|
||||
* **Flow:**
|
||||
1. Friend visits `jellyseerr.example.com`.
|
||||
2. Redirected to Charon Login.
|
||||
3. Logs in via **Plex / Google / GitHub** OR Local Account.
|
||||
4. Charon verifies access.
|
||||
5. Charon redirects back to Jellyseerr, injecting `X-Forwarded-User: friend@email.com`.
|
||||
6. **Magic:** Jellyseerr (configured for header auth) sees the header and logs the friend in automatically. **No second login.**
|
||||
This feature adds a backend endpoint that returns aggregated health information for upstream hosts
|
||||
and a frontend Dashboard widget to display the aggregated view. The goal is to provide host-level
|
||||
health at-a-glance to help identify server-wide outages and quickly navigate to affected services.
|
||||
|
||||
#### 2. User Onboarding (SMTP & Invites)
|
||||
* **Problem:** Admin shouldn't set passwords manually.
|
||||
* **Solution:** Admin enters email -> Charon sends Invite Link -> User clicks link -> User sets Password & Name.
|
||||
## Summary
|
||||
- Endpoint: `GET /api/v1/uptime/hosts/aggregated` (authenticated)
|
||||
- Backend: Service method + handler + route + GORM query, small in-memory cache, server-side filters
|
||||
- Frontend: API client, custom React Query hook, `HostStatusesWidget` in Dashboard, demo/test pages
|
||||
- Acceptance: Auth respects accessible hosts, accurate counts, performance (fast aggregate queries)
|
||||
|
||||
#### 3. User-Centric Permissions (Allow/Block Lists)
|
||||
* **Concept:** Instead of managing groups, Admin manages permissions *per user*.
|
||||
* **UX:**
|
||||
* Go to **Users** -> Edit User -> **Permissions** Tab.
|
||||
* **Mode:** Toggle between **"Allow All (Blacklist)"** or **"Deny All (Whitelist)"**.
|
||||
* **Exceptions:** Multi-select list of Proxy Hosts.
|
||||
* *Example:* Set Mode to "Deny All", select "Jellyseerr". User can ONLY access Jellyseerr.
|
||||
* *Example:* Set Mode to "Allow All", select "Home Assistant". User can access everything EXCEPT Home Assistant.
|
||||
## HandOff JSON contract (Truth)
|
||||
Request: `GET /api/v1/uptime/hosts/aggregated`
|
||||
- Query Params (optional):
|
||||
- `status` (string): filter results by host status: up|down|pending|maintenance
|
||||
- `q` (string): search text (host or name)
|
||||
- `sort_by` (string): `monitor_count|down_count|avg_latency|last_check` (default: `down_count`)
|
||||
- `order` (string): `asc|desc` (default: `desc`)
|
||||
- `page` (int): pagination page (default 1)
|
||||
- `per_page` (int): items per page (default 50)
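The defaults listed above can be applied in one normalization step before the query runs. The sketch below is written in TypeScript for illustration only (the planned handler is Go); `normalizeQuery` is a hypothetical helper, and silently falling back to defaults on invalid input is an assumption, since the spec does not say whether bad values should instead be rejected.

```typescript
type HostStatus = "up" | "down" | "pending" | "maintenance";
type SortBy = "monitor_count" | "down_count" | "avg_latency" | "last_check";

interface AggregationQuery {
  status?: HostStatus;
  q?: string;
  sort_by: SortBy;
  order: "asc" | "desc";
  page: number;
  per_page: number;
}

const STATUSES = ["up", "down", "pending", "maintenance"];
const SORTS = ["monitor_count", "down_count", "avg_latency", "last_check"];

// Hypothetical helper: applies the documented defaults and ignores invalid values.
function normalizeQuery(raw: Record<string, string | undefined>): AggregationQuery {
  // Accept only positive whole numbers; anything else falls back to the default.
  const toInt = (v: string | undefined, fallback: number): number => {
    if (v === undefined || !/^\d+$/.test(v)) return fallback;
    const n = parseInt(v, 10);
    return n >= 1 ? n : fallback;
  };

  const status = raw["status"];
  const sortBy = raw["sort_by"];
  const q = (raw["q"] ?? "").trim();

  const query: AggregationQuery = {
    sort_by: sortBy !== undefined && SORTS.indexOf(sortBy) !== -1 ? (sortBy as SortBy) : "down_count",
    order: raw["order"] === "asc" ? "asc" : "desc",
    page: toInt(raw["page"], 1),
    per_page: toInt(raw["per_page"], 50),
  };
  if (status !== undefined && STATUSES.indexOf(status) !== -1) query.status = status as HostStatus;
  if (q !== "") query.q = q;
  return query;
}
```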
|
||||
|
||||
### 🤝 Handoff Contract (The Truth)
|
||||
|
||||
#### 1. Auth Verification (Internal API for Caddy)
|
||||
* **Endpoint:** `GET /api/auth/verify`
|
||||
* **Response Headers:**
|
||||
* `X-Forwarded-User`: The user's email or username.
|
||||
* `X-Forwarded-Groups`: (Future) User roles/groups.
|
||||
|
||||
#### 2. SMTP Configuration
|
||||
Response: 200 JSON
|
||||
 ```json
-// POST /api/settings/smtp
 {
-  "host": "smtp.gmail.com",
-  "port": 587,
-  "username": "admin@example.com",
-  "password": "app-password",
-  "from_address": "Charon <no-reply@example.com>",
-  "encryption": "starttls" // none, ssl, starttls
+  "aggregated_hosts": [
+    {
+      "id": "uuid",
+      "host": "10.0.0.12",
+      "name": "web-01",
+      "status": "down",
+      "monitor_count": 3,
+      "counts": { "up": 1, "down": 2, "pending": 0, "maintenance": 0 },
+      "avg_latency_ms": 257,
+      "last_check": "2025-12-05T09:54:54Z",
+      "last_status_change": "2025-12-05T09:53:44Z",
+      "affected_monitors": [
+        { "id": "mon-1", "name": "example-api", "status": "down", "last_check": "2025-12-05T09:54:54Z" },
+        { "id": "mon-2", "name": "webapp", "status": "down", "last_check": "2025-12-05T09:52:14Z" }
+      ],
+      "uptime_24h": 99.3
+    }
+  ],
+  "meta": { "page": 1, "per_page": 50, "total": 1 }
 }
 ```
|
||||
|
||||
#### 3. User Permissions
|
||||
Notes:
|
||||
- All timestamps are ISO 8601 UTC.
|
||||
- Field names use snake_case (server -> frontend contract per project guidelines).
|
||||
- Only accessible hosts are returned to the authenticated caller (utilize existing auth handlers).
|
||||
|
||||
## Backend Requirements
|
||||
1. Database
|
||||
- Ensure index on `uptime_monitors(uptime_host_id)`, `uptime_monitors(status)`, and `uptime_monitors(last_check)`.
|
||||
- No model changes required for `UptimeHost` or `UptimeMonitor` unless we want an `avg_latency` column cached (optional).
|
||||
|
||||
2. Service (in `internal/services/uptime_service.go`)
|
||||
- Add method: `GetAggregatedHostStatuses(filters AggregationFilter) ([]AggregatedHost, error)`.
|
||||
- Implementation detail:
|
||||
- Query should join `uptime_hosts` and `uptime_monitors` and run a `GROUP BY uptime_host_id`.
|
||||
- Use a SELECT that computes: monitor_count, up_count, down_count, pending_count, maintenance_count, avg_latency, last_check (MAX), last_status_change (MAX).
|
||||
- Provide a parameter to include a limited list of affected monitors (e.g. top N by last_check) and an optional `uptime_24h` calculation where a heartbeat history exists.
|
||||
- Return GORM structs matching the `AggregatedHost` DTO.
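To make the intended `GROUP BY uptime_host_id` result concrete, here is the same aggregation expressed over in-memory rows. This is a TypeScript illustration only, not the Go/GORM query the plan calls for; the `Monitor` row shape and the `aggregateByHost` name are assumptions, and the host-level `status` field is left out because the spec does not state how it is derived from the monitor counts.

```typescript
type Status = "up" | "down" | "pending" | "maintenance";

// Assumed row shape for illustration; the real columns live in the uptime_monitors table.
interface Monitor {
  host_id: string;
  status: Status;
  latency_ms: number;
  last_check: string; // ISO 8601 UTC
}

interface HostAggregate {
  host_id: string;
  monitor_count: number;
  counts: Record<Status, number>;
  avg_latency_ms: number;
  last_check: string;
}

function aggregateByHost(monitors: Monitor[]): HostAggregate[] {
  // Equivalent of GROUP BY uptime_host_id.
  const groups = new Map<string, Monitor[]>();
  for (const m of monitors) {
    const list = groups.get(m.host_id) ?? [];
    list.push(m);
    groups.set(m.host_id, list);
  }

  const out: HostAggregate[] = [];
  groups.forEach((list, hostId) => {
    const counts: Record<Status, number> = { up: 0, down: 0, pending: 0, maintenance: 0 };
    let latencySum = 0;
    let lastCheck = "";
    for (const m of list) {
      counts[m.status] += 1;
      latencySum += m.latency_ms;
      // MAX(last_check): same-format ISO 8601 UTC timestamps compare correctly as strings.
      if (m.last_check > lastCheck) lastCheck = m.last_check;
    }
    out.push({
      host_id: hostId,
      monitor_count: list.length,
      counts,
      avg_latency_ms: Math.round(latencySum / list.length),
      last_check: lastCheck,
    });
  });
  return out;
}
```

A host with zero monitors never appears in this in-memory version; in SQL that case depends on whether the join is inner or left, which is worth deciding explicitly given the "zero-monitor host" test case in the acceptance criteria.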
|
||||
|
||||
3. Handler (in `internal/api/handlers/uptime_handler.go`)
|
||||
- Add `func (h *UptimeHandler) AggregatedHosts(c *gin.Context)` that:
|
||||
- Binds query params; validates and normalizes them.
|
||||
- Calls `service.GetAggregatedHostStatuses(filters)`.
|
||||
- Filters the results using `authMiddleware` (maintain accessible hosts list or `authHandler.GetAccessibleHosts` logic).
|
||||
- Caches the result for `CHARON_UPTIME_AGGREGATION_TTL` (default 30s). Cache strategy: package global in `services` with simple `sync.Map` + TTL.
|
||||
- Produces a 200 JSON with the contract above.
|
||||
- Add unit tests and integration tests verifying results and auth scoping.
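The TTL cache mentioned in the handler steps can be sketched as follows. This TypeScript version only illustrates the expiry logic; the plan itself calls for a Go `sync.Map` in the `services` package, and the `TtlCache` class, its method names, and the injectable clock are assumptions made so the behaviour can be tested deterministically.

```typescript
// Minimal TTL cache sketch. `now` is injectable so expiry can be tested without waiting.
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns the cached value, or undefined if it is missing or has expired.
  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  // Drop everything, e.g. when a monitor status change should invalidate the aggregate.
  invalidate(): void {
    this.entries.clear();
  }
}
```

Because the response is scoped to the hosts each caller may access, a cache like this would presumably need a key that includes the caller and the normalized filters (or cache the unfiltered aggregate and filter per request); otherwise one user's result could be served to another.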
|
||||
|
||||
4. Routes
|
||||
- Register under protected group in `internal/api/routes/routes.go`:
|
||||
- `protected.GET("/uptime/hosts/aggregated", uptimeHandler.AggregatedHosts)`
|
||||
|
||||
5. Observability
|
||||
- Add a Prometheus counter/metric: `charon_uptime_aggregated_requests_total` (labels: status, cache_hit true/false).
|
||||
- Add logs for aggregation errors.
|
||||
|
||||
6. Security
|
||||
- Ensure only authenticated users can access aggregated endpoint.
|
||||
- Respect `authHandler.GetAccessibleHosts` (or similar) to filter hosts the user should see.
|
||||
|
||||
7. Tests
|
||||
- Unit tests for service logic calculating aggregates (mock DB / in-memory DB fixtures).
|
||||
- Handler integration tests using the testdb and router that verify JSON response structure, pagination, filters, and auth filtering.
|
||||
- Perf tests: basic benchmark to ensure aggregation query completes within acceptable time for 10k monitors (e.g. < 200ms unless run on dev env; document specifics).
|
||||
|
||||
## Frontend Requirements
|
||||
1. API client changes (`frontend/src/api/uptime.ts`)
|
||||
- Add `export const getAggregatedHosts = async (params?: AggregationQueryParams) => client.get<AggregatedHost[]>('/uptime/hosts/aggregated', { params }).then(r => r.data)`
|
||||
- Add new TypeScript types for `AggregatedHost`, `AggregatedHostCounts`, `AffectedMonitor`.
|
||||
|
||||
2. React Query Hook (`frontend/src/hooks/useAggregatedHosts.ts`)
|
||||
- `useAggregatedHosts` should accept params similar to query params (filters), and accept `enabled` flag.
|
||||
- Use TanStack Query with `refetchInterval: 30_000` and `staleTime: 30_000` to match backend TTL.
|
||||
|
||||
3. Dashboard Widget (`frontend/src/components/Dashboard/HostStatusesWidget.tsx`)
|
||||
- Shows high-level summary: total hosts, down_count, up_count, pending.
|
||||
- Clickable host rows navigate to the uptime or host detail page.
|
||||
- Visuals: small status badge, host name, counts, avg latency, last check time.
|
||||
- Accessible: all interactive elements keyboard and screen-reader navigable.
|
||||
- Fallback: if the aggregated endpoint is not found or returns 403, display a short explanatory message with a link to uptime page.
|
||||
|
||||
4. Dashboard Page Update (`frontend/src/pages/Dashboard.tsx`)
|
||||
- Add `HostStatusesWidget` to the Dashboard layout (prefer 2nd column near `UptimeWidget`).
|
||||
|
||||
5. Tests
|
||||
- Unit tests for `HostStatusesWidget` rendering different states.
|
||||
- Mock API responses for `useAggregatedHosts` using the existing test utilities.
|
||||
- Add Storybook story if used in repo (optional).
|
||||
|
||||
6. Styling
|
||||
- Keep styling consistent with `UptimeWidget` (dark-card, status badges, mini bars).
|
||||
|
||||
## Acceptance Criteria
|
||||
1. API
|
||||
- `GET /api/v1/uptime/hosts/aggregated` returns aggregated host objects in the correct format.
|
||||
- Query params `status`, `q`, `sort_by`, `order`, `page`, `per_page` work as expected.
|
||||
- The endpoint respects user-specific host access permissions.
|
||||
- Endpoint adheres to TTL caching; cache invalidation occurs after TTL or when underlying monitor status change triggers invalidation.
|
||||
|
||||
2. Backend Tests
|
||||
- Unit tests cover all aggregation branches and logic (e.g. zero-monitor host, mixed statuses, all down host).
|
||||
- Integration tests validate auth-scoped responses.
|
||||
|
||||
3. Frontend UI
|
||||
- Widget displays host-level counts and shows a list of top N hosts with status badges.
|
||||
- Clicking a host navigates to the uptime or host detail page.
|
||||
- Widget refreshes according to TTL and reacts to manual refreshes.
|
||||
- UI has automated tests covering rendering with typical API responses, filtering and pagination UI behavior.
|
||||
|
||||
4. Performance
|
||||
- Aggregation query responds within acceptable time for typical deployments (document target; e.g. < 200ms for 5k monitors), or we add a follow-up plan to add precomputation.
|
||||
|
||||
## Example API Contract (Sample Request + Response)
|
||||
Request:
|
||||
```http
|
||||
GET /api/v1/uptime/hosts/aggregated?sort_by=down_count&order=desc&page=1&per_page=20
|
||||
Authorization: Bearer <token>
|
||||
```
|
||||
|
||||
Response:
|
||||
 ```json
-// POST /api/users
 {
-  "email": "friend@example.com",
-  "role": "user",
-  "permission_mode": "deny_all", // or "allow_all"
-  "permitted_hosts": [1, 4, 5] // List of ProxyHost IDs to treat as exceptions
+  "aggregated_hosts": [
+    {
+      "id": "39b6f7c2-2a5c-47d7-9c9d-1d7f1977dabc",
+      "host": "10.0.10.12",
+      "name": "production-web-1",
+      "status": "down",
+      "monitor_count": 3,
+      "counts": {"up": 1, "down": 2, "pending": 0, "maintenance": 0},
+      "avg_latency_ms": 257,
+      "last_check": "2025-12-05T09:54:54Z",
+      "last_status_change": "2025-12-05T09:53:44Z",
+      "affected_monitors": [
+        {"id":"m-01","name":"api.example","status":"down","last_check":"2025-12-05T09:54:54Z","latency":105},
+        {"id":"m-02","name":"www.example","status":"down","last_check":"2025-12-05T09:52:14Z","latency":401}
+      ],
+      "uptime_24h": 98.77
+    }
+  ],
+  "meta": {"page":1,"per_page":20,"total":1}
 }
 ```
|
||||
|
||||
### 🏗️ Phase 1: Security Hardening (Quick Wins)
|
||||
1. **Secure Headers:** `Content-Security-Policy`, `Strict-Transport-Security`, `X-Frame-Options`.
|
||||
2. **Cookie Security:** `HttpOnly`, `Secure`, `SameSite=Strict`.
|
||||
## Error cases
|
||||
- 401 Unauthorized — Invalid or missing token.
|
||||
- 403 Forbidden — Caller lacks host access.
|
||||
- 500 Internal Server Error — DB / aggregation error.
|
||||
|
||||
### 🏗️ Phase 2: Backend Core (User & SMTP)
|
||||
1. **Models:**
|
||||
* `User`: Add `InviteToken`, `InviteExpires`, `PermissionMode` (string), `Permissions` (Many-to-Many with ProxyHost).
|
||||
* `ProxyHost`: Add `ForwardAuthEnabled` (bool).
|
||||
* `Setting`: Add keys for `smtp_host`, `smtp_port`, etc.
|
||||
2. **Logic:**
|
||||
* `internal/services/mail`: Implement SMTP sender.
|
||||
* `internal/api/handlers/user.go`: Add `InviteUser` handler and Permission logic.
|
||||
## Observability & Operational Notes
|
||||
- Metrics: `charon_uptime_aggregated_requests_total`, `charon_uptime_aggregated_cache_hits_total`.
|
||||
- Cache TTL: default 30s via `CHARON_UPTIME_AGGREGATION_TTL` env var.
|
||||
- Logging: Rate-limited errors and aggregation durations logged to the general logger.
|
||||
|
||||
### 🏗️ Phase 3: SSO Implementation
|
||||
1. **Library:** Use `github.com/markbates/goth` or `golang.org/x/oauth2`.
|
||||
2. **Models:** `SocialAccount` (UserID, Provider, ProviderID, Email).
|
||||
3. **Routes:**
|
||||
* `GET /auth/:provider`: Start OAuth flow.
|
||||
* `GET /auth/:provider/callback`: Handle return, create/link user, set session.
|
||||
## Follow-ups & Optional Enhancements
|
||||
1. Add an endpoint-level `since` parameter that returns delta/trend information (e.g. change in down_count in last 24 hours).
|
||||
2. Background precompute task (materialized aggregated table) for very large installations.
|
||||
3. Add a configuration to show `affected_monitors` collapsed/expanded per host for faster page loads.
|
||||
|
||||
### 🏗️ Phase 4: Forward Auth Integration
|
||||
1. **Caddy:** Configure `forward_auth` directive to point to Charon API.
|
||||
2. **Logic:** `VerifyAccess` handler:
|
||||
* Check if User is logged in.
|
||||
* Fetch User's `PermissionMode` and `Permissions`.
|
||||
* If `allow_all`: Grant access UNLESS host is in `Permissions`.
|
||||
* If `deny_all`: Deny access UNLESS host is in `Permissions`.
|
||||
## Short List of Files To Change
|
||||
- Backend:
|
||||
- backend/internal/services/uptime_service.go (add aggregation method)
|
||||
- backend/internal/api/handlers/uptime_handler.go (add handler method)
|
||||
- backend/internal/api/routes/routes.go (register new route)
|
||||
- backend/internal/services/uptime_service_test.go (add tests)
|
||||
- backend/internal/api/handlers/uptime_handler_test.go (add handler tests)
|
||||
- backend/internal/models/uptime.go / uptime_host.go (index recommendations or small schema updates if needed)
|
||||
|
||||
### 🎨 Phase 5: Frontend Implementation
|
||||
1. **Settings:** New "SMTP" and "SSO" tabs in Settings page.
|
||||
2. **User List:** "Invite User" button.
|
||||
3. **User Edit:** New "Permissions" tab with "Allow/Block" toggle and Host selector.
|
||||
4. **Login Page:** Add "Sign in with Google/Plex/GitHub" buttons.
|
||||
- Frontend:
|
||||
- frontend/src/api/uptime.ts (add `getAggregatedHosts`)
|
||||
- frontend/src/hooks/useAggregatedHosts.ts (new hook)
|
||||
- frontend/src/components/Dashboard/HostStatusesWidget.tsx (new widget)
|
||||
- frontend/src/pages/Dashboard.tsx (add widget)
|
||||
- frontend/src/components/__tests__/HostStatusesWidget.test.tsx (new tests)
|
||||
|
||||
### 📚 Phase 6: Documentation
|
||||
1. **SSO Guides:** How to get Client IDs from Google/GitHub.
|
||||
2. **Header Auth:** Guide on configuring Jellyseerr/Grafana to trust Charon.
|
||||
---
|
||||
If you want, I can now scaffold the backend service method + handler and the frontend API client and widget as a follow-up PR.
|
||||
|
||||
44
docs/plans/sample_orchestration_plan.md
Normal file
@@ -0,0 +1,44 @@
|
||||
<!--
|
||||
Sample Orchestration Plan used by the Management agent when invoking subagents.
|
||||
Keep this file small and precise. Subagents will read the file and act according to the Handoff Contract.
|
||||
-->
|
||||
|
||||
# Plan: Aggregated Host Statuses Endpoint + Dashboard Widget
|
||||
|
||||
## 1) Title
|
||||
Implement the `/api/v1/stats/host_statuses` backend endpoint and the `CharonStatusWidget` frontend component.
|
||||
|
||||
## 2) Overview
|
||||
This feature provides an aggregated view of the number of proxy hosts and the number of hosts that are up/down. The backend exposes an endpoint returning aggregated counts, and the frontend consumes the endpoint and presents a dashboard widget.
|
||||
|
||||
## 3) Handoff Contract (Example)
|
||||
**GET** /api/v1/stats/host_statuses
|
||||
|
||||
Response (200):
|
||||
```json
|
||||
{
|
||||
"total_proxy_hosts": 12,
|
||||
"hosts_up": 10,
|
||||
"hosts_down": 2
|
||||
}
|
||||
```
|
||||
|
||||
## 4) Backend Requirements
|
||||
- Add a new read-only route `GET /api/v1/stats/host_statuses` under `internal/api/handlers/`.
|
||||
- Implement the handler to use existing models/services and return the aggregated counts in JSON.
|
||||
- Add unit tests under `backend/internal/services` and the handler's folder.
|
||||
|
||||
## 5) Frontend Requirements
|
||||
- Add `frontend/src/components/CharonStatusWidget.tsx` to render the widget using the endpoint or existing monitors if no endpoint is present.
|
||||
- Add a hook and update the API client if necessary: `frontend/src/api/stats.ts` with `getHostStatuses()`.
|
||||
- Add unit tests: vitest for the component and the hook.
|
||||
|
||||
## 6) Acceptance Criteria
|
||||
- Backend: `go test ./...` passes.
|
||||
- Frontend: `npm run type-check` and `npm run build` pass.
|
||||
- All unit tests pass and new coverage for added code is included.
|
||||
|
||||
## 7) Artifacts
|
||||
- `docs/plans/current_spec.md` (the plan file)
|
||||
- `backend` changed files including handler and tests
|
||||
- `frontend` changed files including component and tests
|
||||