Autonomous Agent Safety
Three mandatory prerequisites before shipping any autonomous agent: kill switch, deduplication, circuit breaker. Derived from a runaway-loop incident that burned about $5 in API calls and could only be stopped by manually deleting a webhook.
Before any autonomous agent feature ships, verify three things:
1. Kill switch
Hard stop mechanism — DB flag, admin endpoint, or similar. Checked between every action in the agent loop. Not a prompt instruction. Prompts are suggestions, not enforcement.
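A minimal sketch of the idea, assuming an in-memory flag stands in for the DB flag or admin endpoint. The names `KillSwitch` and `run_agent` are hypothetical, not from any existing codebase. The point it illustrates is that the check lives in the loop itself and runs before every action.

```python
from typing import Callable, Iterable, List


class KillSwitch:
    """In-memory stand-in (assumption) for a DB flag or admin endpoint."""

    def __init__(self) -> None:
        self._stopped = False

    def trip(self) -> None:
        # In a real system this would be set by an operator, e.g. via an admin endpoint.
        self._stopped = True

    def is_tripped(self) -> bool:
        return self._stopped


def run_agent(actions: Iterable[Callable[[], str]], switch: KillSwitch) -> List[str]:
    """Run actions in order, checking the kill switch before each one."""
    results: List[str] = []
    for action in actions:
        # Enforced in code between actions, not requested in a prompt.
        if switch.is_tripped():
            results.append("stopped by kill switch")
            break
        results.append(action())
    return results
```

With a real DB-backed flag, `is_tripped` would re-read the flag on every call rather than cache it, so that an operator's stop takes effect before the next action.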
2. Message deduplication
Any webhook-triggered agent must deduplicate incoming messages. Webhook providers (Telegram, Stripe, etc.) retry delivery when the handler takes too long to respond. Without dedup, one message becomes N parallel invocations, each running its own retry loop.
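One way to sketch deduplication, assuming each delivery carries a provider-supplied unique ID that stays the same across retries. `MessageDeduplicator` and `handle_webhook` are hypothetical names, and the in-memory dict is an assumption that only holds for a single process. With several workers, the same check would need a shared store with an atomic insert.

```python
import time
from typing import Callable, Dict, Optional


class MessageDeduplicator:
    """Remembers message IDs for a limited time (single-process sketch)."""

    def __init__(self, ttl_seconds: float = 3600.0) -> None:
        self._ttl = ttl_seconds
        self._seen: Dict[str, float] = {}

    def first_time(self, message_id: str, now: Optional[float] = None) -> bool:
        """Return True only the first time an ID is seen within the TTL."""
        now = time.monotonic() if now is None else now
        # Drop expired entries so the table does not grow without bound.
        self._seen = {k: t for k, t in self._seen.items() if now - t < self._ttl}
        if message_id in self._seen:
            return False
        self._seen[message_id] = now
        return True


def handle_webhook(
    dedup: MessageDeduplicator, message_id: str, start_agent: Callable[[], None]
) -> str:
    """Start the agent once per message ID; ignore provider retries."""
    if not dedup.first_time(message_id):
        return "duplicate ignored"
    start_agent()
    return "accepted"
```

Recording the ID before starting the agent matters here: a retry that arrives while the first invocation is still running is then rejected instead of spawning a second loop.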
3. Circuit breaker
After N failed attempts at the same action, stop and report. Never retry indefinitely. maxRounds is necessary but not sufficient — you need per-action failure counting, not just total rounds.
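Per-action failure counting could look roughly like this. The class name `ActionCircuitBreaker`, the exception `CircuitOpen`, and the default threshold of 2 are assumptions for illustration. The sketch counts consecutive failures per action name and refuses further attempts once the threshold is reached, independently of any overall round limit.

```python
from collections import defaultdict
from typing import Callable, Dict


class CircuitOpen(Exception):
    """Raised when an action has failed too many times in a row."""


class ActionCircuitBreaker:
    def __init__(self, max_failures: int = 2) -> None:
        self.max_failures = max_failures
        self._failures: Dict[str, int] = defaultdict(int)

    def call(self, action_name: str, fn: Callable[[], object]) -> object:
        """Run fn unless this action already hit its failure limit."""
        if self._failures[action_name] >= self.max_failures:
            # Stop and report instead of retrying indefinitely.
            raise CircuitOpen(
                f"{action_name} failed {self._failures[action_name]} times in a row"
            )
        try:
            result = fn()
        except Exception:
            self._failures[action_name] += 1
            raise
        # A success resets the consecutive-failure count for this action.
        self._failures[action_name] = 0
        return result
```

The agent loop would catch `CircuitOpen`, stop, and report to the user or operator. Failures of one action do not count against another, which is what a total-rounds cap alone cannot express.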
Bonus: timeout budget
Before shipping, calculate (number of actions) × (time per action) and compare it against the configured timeout. If the total exceeds the timeout, the feature will fail on every attempt and trigger cascading retries.
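The budget check is simple enough to encode as a pre-ship assertion. The sketch below uses hypothetical function names and takes its example numbers from the incident described in this document (12 actions at 10s each against a 90s timeout).

```python
def headroom_seconds(num_actions: int, seconds_per_action: float, timeout_seconds: float) -> float:
    """Time left over after all actions run; negative means a guaranteed timeout."""
    return timeout_seconds - num_actions * seconds_per_action


def fits_budget(num_actions: int, seconds_per_action: float, timeout_seconds: float) -> bool:
    """True only when the worst-case total is strictly under the timeout."""
    return headroom_seconds(num_actions, seconds_per_action, timeout_seconds) > 0


# Incident numbers: 12 fields x 10s/field against a 90s tool timeout.
print(headroom_seconds(12, 10, 90))  # -30
print(fits_budget(12, 10, 90))       # False
```

In practice the per-action time is an estimate, so it may be worth requiring some positive headroom rather than just a total below the timeout.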
Incident reference
SuperNova Stagehand form fill. 12 fields × 10s/field = 120s against a 90s tool timeout. Guaranteed failure → retry loop → Telegram webhook retry → cascading parallel loops for 10+ minutes. No kill switch, so the only fix was deleting the webhook manually. Cost: ~$5 API burn and 15 minutes of incident response.
Checklist (copy for PR review)
- Can I stop this agent mid-execution? How?
- Are incoming messages deduplicated?
- What happens after 2 consecutive failures of the same action?
- What's the timeout budget? (actions × time/action < timeout)
- What's the max concurrent invocations possible?