Circuit Breaker — Worker On/Off Control
Circuit Breaker — Worker On/Off Control Date: 2026 04 27 Branch: feature/marketing agents Status: Approved for implementation Overview Add a circuit breaker to every worker agent — a flag that stops the runner from executing without stopping the systemd timer. Combined with a global kill switch that terminates all running processes immediately. Two mechanism...
Date: 2026-04-27 Branch: feature/marketing-agents Status: Approved for implementation
---
Add a circuit breaker to every worker agent — a flag that stops the runner from executing without stopping the systemd timer. Combined with a global kill switch that terminates all running processes immediately.
- Per-agent circuit breaker — graceful pause, takes effect on next timer tick
- Global kill switch — immediate process termination + all breakers open
---
Migration: supabase/migrations/034_circuit_breaker.sql
Add one column to agent_state:
``sql
ALTER TABLE agent_state
ADD COLUMN circuit_breaker TEXT NOT NULL DEFAULT 'closed'
CHECK (circuit_breaker IN ('open', 'closed'));
``
'closed' = normal (agent runs). 'open' = paused (agent skips). Default 'closed' — existing agents unaffected after migration.
No separate global flag. "All paused" = all rows have circuit_breaker = 'open'. Resume = patch rows back to 'closed'.
---
In agents/shared/runner.py, add _check_circuit_breaker(sb) called at the very top of _run() — before fetch_inbox, fetch_work_items, or any other logic.
- If
circuit_breaker = 'open': log"circuit breaker open — skipping", setagent_state.status = 'paused', return immediately - If Supabase is unreachable: fail to
'closed'(agents keep running — don't block the platform on a DB hiccup) - If no row exists for this agent: treat as
'closed'
``python
def _check_circuit_breaker(self, sb: Client) -> bool:
"""Returns True if OK to run (closed), False if paused (open)."""
try:
row = sb.table("agent_state").select("circuit_breaker") \
.eq("agent", self.AGENT_ID).limit(1).execute()
if row.data:
return row.data[0].get("circuit_breaker", "closed") == "closed"
except Exception as e:
log.warning("circuit_breaker check failed — defaulting to closed: %s", e)
return True
``
_run() calls this first:
``python
def _run(self, sb: Client, creds: dict) -> None:
if not self._check_circuit_breaker(sb):
log.info("%s circuit breaker open — skipping", self.AGENT_ID)
self._update_agent_state(sb, "paused")
return
# ... rest of existing _run logic
``
---
Add POST /kill-agents to vps/hcs-post-server.js.
Two-step execution:
1. Patch all rows in agent_state → circuit_breaker = 'open'
2. pkill -f "agents.shared.runner" — terminates all running runner processes
Response:
``json
{ "ok": true, "killed": true }
``
pkill exit code 0 = processes killed, 1 = nothing was running — both are valid. Either way, breakers are open and nothing restarts.
---
scripts/circuit-breaker.js — run with node --env-file=.env scripts/circuit-breaker.js.
```
node scripts/circuit-breaker.js --agent quinn --open
node scripts/circuit-breaker.js --agent quinn --close
node scripts/circuit-breaker.js --resume-all
node scripts/circuit-breaker.js --kill ```
--open/--close: patchagent_statevia Supabase (REDKEY_SUPABASE_URL+REDKEY_SUPABASE_SECRET_KEY)--kill: POST tohttp://87.99.154.64:3001/kill-agents--resume-all: patch allagent_staterows tocircuit_breaker = 'closed'- Missing
--agentwith--open/--close: print usage and exit 1
---
- Pause icon when
circuit_breaker = 'closed'(agent is running normally) - Play icon when
circuit_breaker = 'open'(agent is paused) - Clicking PATCHes
agent_state.circuit_breakerfor that agent via Supabase - When
open, show a small orange "paused" badge on the agent row
Two buttons in the cockpit header:
| Button | Color | Action |
|--------|-------|--------|
| Resume All | Green/neutral | Patches all agent_state.circuit_breaker = 'closed' |
| Kill All | Red | Confirmation dialog → POST /kill-agents |
Confirmation dialog text: *"This will terminate all running workers immediately. Continue?"*
On Kill All success: cockpit reflects all agents as paused (orange badges). Resume All clears them.
---