{
  "id": "2026-04-27-hcs-first-claim-path-61002d5e6c",
  "scope": "redkey",
  "source_of_truth": "repo",
  "source_path": "docs/specs/2026-04-27-hcs-first-claim-path.md",
  "source_kind": "markdown",
  "visibility": "internal",
  "renderer_id": "design_doc.dreamborn-forge.generated.v1",
  "design_system": "dreamborn-design-system:forge",
  "generated_at": "2026-05-09T13:00:55.682Z",
  "artifact_type": "design_doc",
  "schema_version": "design_doc.generated.v1",
  "title": "HCS-First Claim Path",
  "summary": "HCS-First Claim Path. Date: 2026-04-27. Problem. Two failure modes observed in production: 1. Agents do work without having claimed; the Supabase fallback in fetch_work_items queries agent_tasks directly for status IN ('created', 'assigned') with no claimed_by check. An agent can pull a task from Supabase and start executing without ever posting task.claim to ...",
  "format_source": "markdown",
  "sections": [
    {
      "title": "HCS-First Claim Path",
      "level": 1,
      "body": "**Date:** 2026-04-27\n\n---"
    },
    {
      "title": "Problem",
      "level": 2,
      "body": "Two failure modes were observed in production:\n\n1. **Agents do work without having claimed.** The Supabase fallback in `fetch_work_items` queries `agent_tasks` directly for `status IN ('created', 'assigned')` with no `claimed_by` check. An agent can pull a task from Supabase and start executing without ever posting `task.claim` to HCS, so the atomic claim race never runs and no on-chain record of the claim exists.\n\n2. **Stale tasks are executed repeatedly.** The same fallback re-serves rows that are complete on HCS but not yet reflected in Supabase (listener lag), or rows that were never dispatched via HCS at all. Multiple agents pick up the same row.\n\nRoot cause: **Supabase is acting as a second work queue**, parallel to HCS. Agents that miss the HCS `task.available` message fall through to Supabase and pick up work that bypasses the claim protocol entirely.\n\n---"
    },
    {
      "title": "Design Principle",
      "level": 2,
      "body": "HCS is the only work queue. Supabase is a read mirror for the cockpit. Never the other way around.\n\n```\nHCS role topic\n  task.available → mirror node (read)\n  → _find_available_tasks (pure HCS message scan)\n  → post task.claim (HCS write via claim-task.js)\n  → _verify_claim (mirror node read — lowest seq wins)\n  → fetch brief from HFS (brief_file_id from HCS message)\n  → execute\n  → post task.complete (HCS write)\n\nSupabase: written by listener. Read by cockpit. Not consulted during execution.\n```\n\nMirror node reads are reads of HCS — they are explicitly allowed and used throughout.\n\n---"
    },
    {
      "title": "agents/shared/hedera_queue.py",
      "level": 3,
      "body": "**Remove `_is_done_in_supabase`.** This function gates the claim on a Supabase read before posting to HCS. It was a band-aid for the old `CLAIM_TIMEOUT_SECONDS` stale-claim bug (now fixed). `_find_available_tasks` already answers \"is this task available?\" using HCS messages alone, so the Supabase check is redundant and adds a DB round-trip to every claim attempt.\n\n**Remove the `sb` parameter from `poll_and_claim`.** Supabase is no longer consulted in the claim path. The brief is fetched from HFS via `brief_file_id`; this is already implemented, with `_fetch_hfs` called after winning the claim."
    },
    {
      "title": "agents/shared/runner.py",
      "level": 3,
      "body": "**Remove `_enrich_from_agent_tasks`.** This method fetches the full `agent_tasks` row from Supabase after winning the claim and overwrites the work item. It makes Supabase the primary brief source and HFS the fallback, which is backwards. The brief lives on HFS (referenced by `brief_file_id` in the `task.available` HCS message). After this change, the brief comes from `_fetch_hfs` in `hedera_queue.py`.\n\n**Remove the Supabase fallback block.** The `# Fallback: agent_tasks table` block in `fetch_work_items` is removed entirely. If there is no `task.available` on the HCS role topic, there is no work. The fallback masked dispatch bugs (tasks that existed in Supabase but were never posted to HCS) and allowed unclaimed execution.\n\n**The CRM tasks path is preserved.** `_fetch_crm_tasks` uses a dedicated operational Supabase table, not `agent_tasks`. It is Arlo-specific and unrelated to the HCS role topic claim model, so it is not changed.\n\n---"
    },
    {
      "title": "What Stays",
      "level": 2,
      "body": "- `_fetch_messages` — mirror node REST read. Fine.\n- `_find_available_tasks` — pure HCS message scan. Fine.\n- `_post_claim` — HCS write via claim-task.js. Fine.\n- `_verify_claim` — mirror node read to confirm lowest claim seq. Fine.\n- `_fetch_hfs` — fetches brief from Hedera File Service after winning claim. Fine.\n- `limit=250` fix — stays.\n- Permanent claim semantics (no stale timeout) — stays.\n\n---"
    },
    {
      "title": "Brief Source After This Change",
      "level": 2,
      "body": "The brief arrives via the HCS `task.available` message → `brief_file_id` → `_fetch_hfs`. Every dispatch path (`dispatchTask()` and the existing scripts) uploads the brief to HFS and includes `brief_file_id` in the HCS message. This is already the pattern for all Phase 3 tasks.\n\nA task without `brief_file_id` triggers a logged warning and executes with an empty brief, which surfaces dispatch bugs instead of hiding them.\n\n---"
    },
    {
      "title": "Files Changed",
      "level": 2,
      "body": "| File | Change |\n|---|---|\n| `agents/shared/hedera_queue.py` | Remove `_is_done_in_supabase`, remove `sb` param from `poll_and_claim` |\n| `agents/shared/runner.py` | Remove `_enrich_from_agent_tasks`, remove Supabase fallback from `fetch_work_items` |"
    }
  ],
  "html_path": "artifacts/2026-04-27-hcs-first-claim-path-61002d5e6c.html",
  "json_path": "artifacts/2026-04-27-hcs-first-claim-path-61002d5e6c.json"
}