Outbound API Integrations — Required Safeguards

The full safeguard checklist for any project hitting an external API. Rate ceilings, cost accounting, circuit breaker, dry-run mode, payload bounds, secrets handling, observability — the list you wire up before the first real call.

Origin: COMSAT Haiku supervisor build Domain: engineering · agents · safety Status: canonical

Every time COMSAT (or any other Tier 0/1 project) starts hitting a remote service — Anthropic, Slack, Stripe, Notion, an MCP server, anything — these safeguards must be wired before the first real call. They are not optional.

This came from the COMSAT Haiku supervisor build: one outbound destination, one classifier, one polling loop. Easy to imagine that loop firing 1,000× because of a bug, a stuck workspace, or an mtime that doesn't change. So we encode this once and reuse it.

When to apply

Apply when any of the following becomes true in a project:

Code adds import anthropic / OpenAI / any external SDK
Code adds fetch(...) / axios(...) / curl ... to a non-loopback URL
Code adds an MCP server reference
A polling loop will trigger any of the above

The pattern applies even for "just a quick test" — once the call lands in the codebase, the loop is one mistake away from production.

The hard checklist

Tier promotion (the meta-safeguard)

Every project has a CLAUDE.md that states its safety tier (0–3). When you add an outbound dependency:

Bump the tier in CLAUDE.md.
Document the specific destination(s) added (e.g. "calls api.anthropic.com only").
List the data that crosses the wire (e.g. "truncated assistant turn, max 4000 chars; no customer data").
Note what would re-trigger another tier review (e.g. "adding any new outbound destination").

If the tier doesn't move with the code, you'll silently slip from sandbox → production-adjacent without a corresponding safety review.

Idempotency / change detection

Don't call when nothing has changed. Hash inputs (content sha256, mtime, version number) and skip duplicate work.
In-flight deduplication: a Set<id> of pending calls. Don't queue while one is already running for that key.
State the trigger condition explicitly. "Fires on transition X → Y" is good. "Fires every poll" is almost always a bug.

Rate ceilings

Hard hourly cap (default starting point: 200 calls/hour for a per-session classifier; tune to use case).
Hard daily $ ceiling (default: $1.00 — cheap to raise if needed, expensive to discover too late).
Both ceilings stored in config so the user can override without a code change.
When the ceiling trips, log it once and skip silently. Don't crash or retry.

Cost accounting

Track tokens (input + cached + output) and translate to USD per call.
Hard-code per-model pricing constants with a comment listing the model and cache date — re-baseline on model change.
Surface running totals to the UI (today, this hour, last call).
On any model swap, re-check the pricing constants — they DO move quietly between minor versions.

Circuit breaker

Track consecutive failures (network errors, 5xx, timeouts).
After N consecutive failures (default 5), trip the breaker — pause calls for a window (default 5 min).
Auto-close the breaker after the window. Don't require user intervention.
Surface the breaker state in the UI.

Per-call timeouts

Bound individual call duration. SDK defaults are often "no timeout" or absurdly long.
Use the SDK's request-timeout option, not a wrapping Promise.race. SDKs handle internal retries / streaming chunking that wrappers don't see.

Backoff on errors

Exponential backoff on 429 and 5xx. Most SDKs do this automatically — verify, don't assume.
Honor Retry-After headers when present.
Don't retry on 400 / 401 / 403 — those won't fix themselves.
Use typed exception classes, not error-message string matching.

Dry-run mode

Provide a dryRun: true config flag that logs what would be called without firing the request.
First-run defaults: dry-run ON for ~24h or for the first N transitions, until the loop's behavior is confirmed.
Distinct log prefix ([supervisor:dry-run], [stripe:dry-run], etc.) so production and dry-run never get confused.

Privacy and payload bounds

Truncate payloads to a hard cap before sending (e.g. 4 KB for classification). Never send unbounded user content.
Strip secrets from anything that could end up in a payload — **REDACTED** placeholders for API keys, env vars, file paths that contain tokens.
Document what crosses the wire in CLAUDE.md (see Tier promotion above).
If the SDK supports prompt caching, design the prompt prefix to be cacheable from day one — don't interpolate Date.now() or per-request UUIDs into the system prompt.

Secrets handling

API keys live in config files with mode 0600, OR an env var, OR a system keychain. Never in source.
When echoing config back to a UI / renderer, mask the key (********). Never round-trip the real key through IPC or HTTP responses.
Add the key file path to .gitignore.
When swapping keys, the in-memory client must be rebuilt. Stale clients with old keys cause confusing 401s.

Activity log

Every outbound call logged with timestamp, destination, cost, result class (ok / 4xx / 5xx / timeout).
Log written somewhere durable (file, structured logger, telemetry). Console-only doesn't survive a restart.
Don't log full payloads by default — too easy to leak secrets. Hashes / sizes only unless dry-run is on.

Opt-in / off-by-default for new projects

First-run default: outbound integration disabled. User must explicitly enable in settings.
If a key is found in env, prompt the user before using it for the first time.
Document the cost ballpark in the UI hint where the user enables the feature.

Heuristic / offline fallback

If the API is unavailable (no key, network down, breaker tripped), the project must still function.
Implement a regex / rule-based fallback that handles the obvious cases for free.
Surface which path the result came from in the UI ([haiku] vs [heuristic]).

Observability

Cost/call counters surfaced in the app UI, not just logs.
State of the breaker visible.
Last error visible (truncated, no secrets).
At minimum: today's spend, today's call count, breaker state.

Don't auto-retry side-effecting calls

Read calls (classify, search, fetch) are safe to retry.
Write calls (post a message, charge a card, send an email) are NOT safe to blind-retry. Add idempotency keys or explicit user confirmation.
The SDK's auto-retry counts as "blind" for write calls. Disable it on writes if the API doesn't support idempotency keys.

Concurrency

Cap parallel calls. A polling loop firing across 10 workspaces should not issue 10 concurrent calls — sequence them or limit fan-out.
If you have a polling-style loop, the inner work must be idempotent (see Idempotency above) so a slow call doesn't pile up duplicates.

Working example: COMSAT Haiku supervisor

The reference implementation in COMSAT hits all of the above:

Tier promoted from 0 → 1 in the project's CLAUDE.md when the supervisor was added.
mtime-equality check skips duplicate classification of the same assistant turn.
In-flight Set<workspaceId> prevents overlapping calls per session.
maxCallsPerHour (default 200) and dailyUsdCeiling (default $1.00), configurable.
Token-cost accounting with hard-coded Haiku 4.5 pricing constants.
Circuit breaker after 5 consecutive failures, 5-minute pause.
Truncates payloads to 4,000 chars.
API key masked in IPC responses (********).
Logs [supervisor] prefix; dry-run uses [supervisor:dry-run].
Heuristic fallback when key absent or breaker open.
Run stats exposed to renderer for visibility.

Use it as the template the next time you add an outbound integration. Copy the structure, swap the model and the classification logic, keep the safeguards.

How to use this file

When starting any task that adds an outbound dependency:

Read this file end to end.
Walk every checkbox. If you can't honor one, decide explicitly whether to skip or add it later — and document the decision.
Apply the same structure as the COMSAT supervisor (rate buckets, circuit breaker, cost accounting).
Update the destination project's CLAUDE.md with the tier change.
Reference this file in the project's commit message or PR description.

If a future Claude session adds an outbound call without going through this list, that's a regression — push back and route through this file.