The Status() handler was only checking if the CrowdSec process was running, not if LAPI was actually responding. This caused the CrowdSecConfig page to always show "LAPI is initializing" even when LAPI was fully operational. Changes: - Backend: Add lapi_ready field to /admin/crowdsec/status response - Frontend: Add CrowdSecStatus TypeScript interface - Frontend: Update conditional logic to check lapi_ready not running - Frontend: Separate warnings for "initializing" vs "not running" - Tests: Add unit tests for Status handler LAPI check Fixes regression from crowdsec_lapi_error_diagnostic.md fixes.
19 KiB
CrowdSec LAPI Status Bug - Diagnostic & Fix Plan
Date: December 14, 2025 Issue: CrowdSecConfig page persistently shows "LAPI is initializing" even when LAPI is running Status: 🎯 ROOT CAUSE IDENTIFIED - Status endpoint checks process, not LAPI connectivity Priority: HIGH (Blocks Console Enrollment Feature) Previous Issue: crowdsec_lapi_error_diagnostic.md - Race condition fix introduced this regression
🎯 Key Findings
Critical Discovery
After implementing fixes from docs/plans/crowdsec_lapi_error_diagnostic.md, the CrowdSecConfig page now persistently displays:
"CrowdSec Local API is initializing... The CrowdSec process is running but the Local API (LAPI) is still starting up."
This message appears even when LAPI is actually running and reachable. The fix introduced a regression where the Status endpoint was not updated to match the new LAPI-aware Start endpoint.
Root Cause Chain
Start()handler was correctly updated to wait for LAPI and returnlapi_ready: true/false- BUT
Status()handler was NOT updated - still only checks process status - Frontend expects
runningto mean "LAPI responding" - Backend returns
running: truemeaning only "process running" - MISMATCH: Frontend needs
lapi_readyfield to determine actual LAPI status
Why This is a Regression
- The original fix added LAPI readiness check to
Start()handler ✅ - But forgot to add the same check to
Status()handler ❌ - Frontend now uses
statusCrowdsec()for polling LAPI status - This endpoint doesn't actually verify LAPI connectivity
Impact
- Console enrollment section always shows "initializing" warning
- Enroll button is disabled even when LAPI is working
- Users cannot complete console enrollment despite CrowdSec being functional
Executive Summary
The Start() handler was correctly updated to wait for LAPI readiness before returning (lines 201-236 in crowdsec_handler.go):
// Start() now waits for LAPI and returns lapi_ready: true/false
c.JSON(http.StatusOK, gin.H{
"status": "started",
"pid": pid,
"lapi_ready": true, // NEW: indicates LAPI is ready
})
However, the Status() handler was NOT updated and still only checks process status (lines 287-294):
func (h *CrowdsecHandler) Status(c *gin.Context) {
ctx := c.Request.Context()
running, pid, err := h.Executor.Status(ctx, h.DataDir) // Only checks PID!
if err != nil {
c.JSON(http.StatusInternalServerError, gin.H{"error": err.Error()})
return
}
c.JSON(http.StatusOK, gin.H{"running": running, "pid": pid}) // Missing lapi_ready!
}
Root Cause Analysis
The Executor's Status() Method
The DefaultCrowdsecExecutor.Status() in crowdsec_exec.go only checks:
- If PID file exists
- If process with that PID is running (via signal 0)
func (e *DefaultCrowdsecExecutor) Status(ctx context.Context, configDir string) (running bool, pid int, err error) {
b, err := os.ReadFile(e.pidFile(configDir))
if err != nil {
// Missing pid file is treated as not running
return false, 0, nil
}
// ... check if process is alive via signal 0 ...
return true, pid, nil
}
It does NOT check if LAPI HTTP endpoint is responding.
Frontend Expectation Mismatch
The frontend in CrowdSecConfig.tsx queries LAPI status:
const lapiStatusQuery = useQuery({
queryKey: ['crowdsec-lapi-status'],
queryFn: statusCrowdsec,
enabled: consoleEnrollmentEnabled && initialCheckComplete,
refetchInterval: 5000, // Poll every 5 seconds
retry: false,
})
And displays a warning based on running field (lines 207-231):
{lapiStatusQuery.data && !lapiStatusQuery.data.running && initialCheckComplete && (
<div className="..." data-testid="lapi-warning">
<p>CrowdSec Local API is initializing...</p>
</div>
)}
The Problem: The frontend checks lapiStatusQuery.data?.running expecting it to indicate LAPI connectivity. But the backend returns running: true which only means "process is running", not "LAPI is responding".
Evidence Chain
| Component | File | Line | Returns | Actually Checks |
|---|---|---|---|---|
| Backend Handler | crowdsec_handler.go | 287-294 | {running, pid} |
Process running via PID |
| Backend Executor | crowdsec_exec.go | 65-87 | (running, pid, err) |
PID file + signal 0 |
| Frontend API | crowdsec.ts | 18-21 | resp.data |
N/A (passthrough) |
| Frontend Query | CrowdSecConfig.tsx | 71-77 | lapiStatusQuery.data |
Checks .running field |
| Frontend UI | CrowdSecConfig.tsx | 207-231 | Shows warning | !running |
Bug: Frontend interprets running as "LAPI responding" but backend returns "process running".
Detailed Analysis: Why Warning Always Shows
Looking at the conditional again:
{lapiStatusQuery.data && !lapiStatusQuery.data.running && initialCheckComplete && (
This shows the warning when:
lapiStatusQuery.datais truthy ✓!lapiStatusQuery.data.runningis truthy (i.e.,runningis falsy)initialCheckCompleteis truthy ✓
Re-analyzing: If running: true, then !true = false, so warning should NOT show.
But user reports it DOES show!
Possible causes:
- Process not actually running: The
Status()endpoint returnsrunning: falsebecause CrowdSec process crashed or PID file is missing/stale - Different
runningfield: Frontend might be checking a different property - Query state issue: React Query might be returning stale data
Most Likely: Looking at the message being displayed:
"CrowdSec Local API is initializing..."
This message was designed for the case where process IS running but LAPI is NOT ready yet. But the current conditional shows it when running is false!
The Fix Needed: The conditional should check:
- Process running (
running: true) AND - LAPI not ready (
lapi_ready: false)
NOT just:
- Process not running (
running: false)
The Complete Fix
Files to Modify
- Backend: backend/internal/api/handlers/crowdsec_handler.go
- Frontend API: frontend/src/api/crowdsec.ts
- Frontend UI: frontend/src/pages/CrowdSecConfig.tsx
- Tests: backend/internal/api/handlers/crowdsec_handler_test.go
Change 1: Backend Status Handler
File: backend/internal/api/handlers/crowdsec_handler.go
Location: Lines 287-294
Before:
// Status returns simple running state.
func (h *CrowdsecHandler) Status(c *gin.Context) {
ctx := c.Request.Context()
running, pid, err := h.Executor.Status(ctx, h.DataDir)
if err != nil {
c.JSON(http.StatusInternalServerError, gin.H{"error": err.Error()})
return
}
c.JSON(http.StatusOK, gin.H{"running": running, "pid": pid})
}
After:
// Status returns running state including LAPI availability check.
func (h *CrowdsecHandler) Status(c *gin.Context) {
ctx := c.Request.Context()
running, pid, err := h.Executor.Status(ctx, h.DataDir)
if err != nil {
c.JSON(http.StatusInternalServerError, gin.H{"error": err.Error()})
return
}
// Check LAPI connectivity if process is running
lapiReady := false
if running {
args := []string{"lapi", "status"}
if _, err := os.Stat(filepath.Join(h.DataDir, "config.yaml")); err == nil {
args = append([]string{"-c", filepath.Join(h.DataDir, "config.yaml")}, args...)
}
checkCtx, cancel := context.WithTimeout(ctx, 2*time.Second)
_, checkErr := h.CmdExec.Execute(checkCtx, "cscli", args...)
cancel()
lapiReady = (checkErr == nil)
}
c.JSON(http.StatusOK, gin.H{
"running": running,
"pid": pid,
"lapi_ready": lapiReady,
})
}
Change 2: Frontend API Type
File: frontend/src/api/crowdsec.ts
Location: Lines 18-21
Before:
export async function statusCrowdsec() {
const resp = await client.get('/admin/crowdsec/status')
return resp.data
}
After:
export interface CrowdSecStatus {
running: boolean
pid: number
lapi_ready: boolean
}
export async function statusCrowdsec(): Promise<CrowdSecStatus> {
const resp = await client.get<CrowdSecStatus>('/admin/crowdsec/status')
return resp.data
}
Change 3: Frontend CrowdSecConfig Conditional Logic
File: frontend/src/pages/CrowdSecConfig.tsx
Location: Lines 207-231
Before:
{/* Warning when CrowdSec LAPI is not running */}
{lapiStatusQuery.data && !lapiStatusQuery.data.running && initialCheckComplete && (
<div className="flex items-start gap-3 p-4 bg-yellow-900/20 border border-yellow-700/50 rounded-lg" data-testid="lapi-warning">
<AlertTriangle className="w-5 h-5 text-yellow-400 flex-shrink-0 mt-0.5" />
<div className="flex-1">
<p className="text-sm text-yellow-200 font-medium mb-2">
CrowdSec Local API is initializing...
</p>
<p className="text-xs text-yellow-300 mb-3">
The CrowdSec process is running but the Local API (LAPI) is still starting up.
This typically takes 5-10 seconds after enabling CrowdSec.
{lapiStatusQuery.isRefetching && ' Checking again in 5 seconds...'}
</p>
<div className="flex gap-2">
<Button
variant="secondary"
size="sm"
onClick={() => lapiStatusQuery.refetch()}
disabled={lapiStatusQuery.isRefetching}
>
Check Now
</Button>
{!status?.crowdsec?.enabled && (
<Button
variant="secondary"
size="sm"
onClick={() => navigate('/security')}
>
Go to Security Dashboard
</Button>
)}
</div>
</div>
</div>
)}
After:
{/* Warning when CrowdSec process is running but LAPI is not ready */}
{lapiStatusQuery.data && lapiStatusQuery.data.running && !lapiStatusQuery.data.lapi_ready && initialCheckComplete && (
<div className="flex items-start gap-3 p-4 bg-yellow-900/20 border border-yellow-700/50 rounded-lg" data-testid="lapi-warning">
<AlertTriangle className="w-5 h-5 text-yellow-400 flex-shrink-0 mt-0.5" />
<div className="flex-1">
<p className="text-sm text-yellow-200 font-medium mb-2">
CrowdSec Local API is initializing...
</p>
<p className="text-xs text-yellow-300 mb-3">
The CrowdSec process is running but the Local API (LAPI) is still starting up.
This typically takes 5-10 seconds after enabling CrowdSec.
{lapiStatusQuery.isRefetching && ' Checking again in 5 seconds...'}
</p>
<div className="flex gap-2">
<Button
variant="secondary"
size="sm"
onClick={() => lapiStatusQuery.refetch()}
disabled={lapiStatusQuery.isRefetching}
>
Check Now
</Button>
</div>
</div>
</div>
)}
{/* Warning when CrowdSec is not running at all */}
{lapiStatusQuery.data && !lapiStatusQuery.data.running && initialCheckComplete && (
<div className="flex items-start gap-3 p-4 bg-red-900/20 border border-red-700/50 rounded-lg" data-testid="crowdsec-not-running-warning">
<AlertTriangle className="w-5 h-5 text-red-400 flex-shrink-0 mt-0.5" />
<div className="flex-1">
<p className="text-sm text-red-200 font-medium mb-2">
CrowdSec is not running
</p>
<p className="text-xs text-red-300 mb-3">
Please enable CrowdSec using the toggle switch in the Security dashboard before enrolling in the Console.
</p>
<Button
variant="secondary"
size="sm"
onClick={() => navigate('/security')}
>
Go to Security Dashboard
</Button>
</div>
</div>
)}
Change 4: Update Enrollment Button Disabled State
File: frontend/src/pages/CrowdSecConfig.tsx
Location: Lines 255-289 (Enroll, Rotate key, and Retry enrollment buttons)
Before:
disabled={isConsolePending || (lapiStatusQuery.data && !lapiStatusQuery.data.running) || !enrollmentToken.trim()}
After:
disabled={isConsolePending || (lapiStatusQuery.data && !lapiStatusQuery.data.lapi_ready) || !enrollmentToken.trim()}
Also update the title attributes:
Before:
title={
lapiStatusQuery.data && !lapiStatusQuery.data.running
? 'CrowdSec LAPI must be running to enroll'
: ...
}
After:
title={
lapiStatusQuery.data && !lapiStatusQuery.data.lapi_ready
? 'CrowdSec LAPI must be running to enroll'
: ...
}
Testing Steps
Unit Test: Backend Status Handler
Add test in backend/internal/api/handlers/crowdsec_handler_test.go:
func TestCrowdsecHandler_Status_IncludesLAPIReady(t *testing.T) {
mockExec := &fakeExec{running: true, pid: 1234}
mockCmdExec := &mockCommandExecutor{returnErr: nil} // cscli lapi status succeeds
handler := &CrowdsecHandler{
Executor: mockExec,
CmdExec: mockCmdExec,
DataDir: "/app/data",
}
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Request = httptest.NewRequest(http.MethodGet, "/admin/crowdsec/status", nil)
handler.Status(c)
assert.Equal(t, http.StatusOK, w.Code)
var response map[string]interface{}
json.Unmarshal(w.Body.Bytes(), &response)
assert.True(t, response["running"].(bool))
assert.Equal(t, float64(1234), response["pid"].(float64))
assert.True(t, response["lapi_ready"].(bool)) // NEW: Check lapi_ready is present and true
}
func TestCrowdsecHandler_Status_LAPINotReady(t *testing.T) {
mockExec := &fakeExec{running: true, pid: 1234}
mockCmdExec := &mockCommandExecutor{returnErr: errors.New("connection refused")} // cscli lapi status fails
handler := &CrowdsecHandler{
Executor: mockExec,
CmdExec: mockCmdExec,
DataDir: "/app/data",
}
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Request = httptest.NewRequest(http.MethodGet, "/admin/crowdsec/status", nil)
handler.Status(c)
assert.Equal(t, http.StatusOK, w.Code)
var response map[string]interface{}
json.Unmarshal(w.Body.Bytes(), &response)
assert.True(t, response["running"].(bool))
assert.Equal(t, float64(1234), response["pid"].(float64))
assert.False(t, response["lapi_ready"].(bool)) // LAPI not ready
}
func TestCrowdsecHandler_Status_ProcessNotRunning(t *testing.T) {
mockExec := &fakeExec{running: false, pid: 0}
mockCmdExec := &mockCommandExecutor{}
handler := &CrowdsecHandler{
Executor: mockExec,
CmdExec: mockCmdExec,
DataDir: "/app/data",
}
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Request = httptest.NewRequest(http.MethodGet, "/admin/crowdsec/status", nil)
handler.Status(c)
assert.Equal(t, http.StatusOK, w.Code)
var response map[string]interface{}
json.Unmarshal(w.Body.Bytes(), &response)
assert.False(t, response["running"].(bool))
assert.False(t, response["lapi_ready"].(bool)) // LAPI can't be ready if process not running
}
Manual Testing Procedure
-
Start Fresh:
docker compose down -v docker compose up -d -
Enable CrowdSec:
- Go to Security dashboard
- Toggle CrowdSec ON
- Wait for toast "CrowdSec started and LAPI is ready"
-
Navigate to Config:
- Click "Config" button
- Verify NO "initializing" warning shows
- Console enrollment section should be enabled
-
Verify API Response:
curl -s http://localhost:8080/api/v1/admin/crowdsec/status | jqExpected:
{ "running": true, "pid": 123, "lapi_ready": true } -
Test LAPI Down Scenario:
- SSH into container:
docker exec -it charon bash - Stop CrowdSec:
pkill -f crowdsec - Call API:
curl -s http://localhost:8080/api/v1/admin/crowdsec/status | jq - Expected:
{"running": false, "pid": 0, "lapi_ready": false} - Refresh CrowdSecConfig page
- Should show "CrowdSec is not running" error (red)
- SSH into container:
-
Test Restart Scenario:
- Re-enable CrowdSec via Security dashboard
- Immediately navigate to CrowdSecConfig
- Should show "initializing" briefly (yellow) then clear when
lapi_ready: true
Risk Assessment
| Change | Risk | Mitigation |
|---|---|---|
| Backend Status handler modification | Low | Status handler is read-only, adds 2s timeout check |
| LAPI check timeout (2s) | Low | Short timeout prevents blocking; async refresh handles retries |
| Frontend conditional logic change | Low | More precise state handling, clear error states |
| Type definition update | Low | TypeScript will catch any mismatches at compile time |
| Two separate warning states | Low | Better UX with distinct yellow (initializing) vs red (not running) |
Summary
Root Cause: The Status() endpoint was not updated when Start() was modified to check LAPI readiness. The frontend expects the status endpoint to indicate LAPI availability, but it only returns process status.
Fix: Add lapi_ready field to Status() response by checking cscli lapi status, update frontend to use this new field for the warning display logic.
Files Changed:
backend/internal/api/handlers/crowdsec_handler.go- Add LAPI check to Status()frontend/src/api/crowdsec.ts- Add TypeScript interface withlapi_readyfrontend/src/pages/CrowdSecConfig.tsx- Update conditional logic:- Yellow warning: process running, LAPI not ready
- Red warning: process not running
- No warning: process running AND LAPI ready
backend/internal/api/handlers/crowdsec_handler_test.go- Add unit tests
Estimated Time: 1-2 hours including testing
Commit Message:
fix: add LAPI readiness check to CrowdSec status endpoint
The Status() handler was only checking if the CrowdSec process was
running, not if LAPI was actually responding. This caused the
CrowdSecConfig page to always show "LAPI is initializing" even when
LAPI was fully operational.
Changes:
- Backend: Add `lapi_ready` field to /admin/crowdsec/status response
- Frontend: Add CrowdSecStatus TypeScript interface
- Frontend: Update conditional logic to check `lapi_ready` not `running`
- Frontend: Separate warnings for "initializing" vs "not running"
- Tests: Add unit tests for Status handler LAPI check
Fixes regression from crowdsec_lapi_error_diagnostic.md fixes.