AI agents need execution memory to prevent duplicate side effects during retries, timeouts, and crashes. This article explores the Execution Guard Pattern - a durable receipt-based approach that ensures irreversible actions execute exactly once.
AI agents don’t just think — they execute real-world actions. Payments. Trades. Emails. API calls. And under retries, timeouts, or crashes… they can execute the same action twice. Not because the model was wrong — because the system has no memory of execution.

The hidden failure mode
A typical failure path looks like this:
agent decides to call tool → tool executes side effect → response is lost (timeout / crash / disconnect) → system retries → side effect executes again
Now you have:
- duplicate payments
- duplicate trades
- duplicate emails
- duplicate API mutations
Not because the decision was wrong — because the execution layer has no durable receipt.
Retries are correct — and still dangerous
Retries are necessary for reliability. But retries + irreversible side effects without a guard = replay risk. The system cannot confidently answer: "Did this action already happen?" So it does the only thing it can:
→ tries again
That’s fine for reads. It’s dangerous for writes.
The Execution Guard Pattern
The fix is not prompt engineering. It’s an execution boundary around side effects.
Pattern: decision → deterministic request_id → execution guard → if receipt exists → return prior result → else → execute once → store receipt
Instead of asking the model to "be careful," the system itself becomes replay-safe.
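The flow above can be sketched in a few lines. This is a minimal illustration, not a production implementation: the in-memory dict stands in for durable storage (the Postgres receipt table discussed below), and the names `execute_once` and `receipts` are illustrative, not from any library.

```python
# Minimal sketch of the Execution Guard Pattern.
# The dict is a stand-in for durable storage (e.g. Postgres).
receipts: dict[str, object] = {}

def execute_once(request_id: str, action):
    """Run `action` at most once per request_id; replays get the prior result."""
    if request_id in receipts:        # receipt exists -> return prior result
        return receipts[request_id]
    result = action()                 # execute the side effect once
    receipts[request_id] = result     # store the receipt
    return result
```

A retry that arrives with the same request_id hits the receipt and returns the stored result; the side effect never runs a second time.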
The four required properties
For this pattern to work, you need four things:
1) Deterministic request identity
Every logical action must map to the same request_id across retries. If the same payment, email, trade, or tool call is retried, it must resolve to the same identity.
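One way to get that stable identity (when the caller doesn't already supply an idempotency key) is to hash a canonical serialization of the action and its parameters. A sketch, with illustrative names:

```python
import hashlib
import json

def request_id(action: str, params: dict) -> str:
    """Derive a stable identity: the same logical action maps to the same id
    across retries, regardless of dict key order."""
    canonical = json.dumps({"action": action, "params": params}, sort_keys=True)
    return hashlib.sha256(canonical.encode()).hexdigest()
```

The key property: identical logical actions collapse to one id, while any change to the parameters produces a different one.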
2) Durable receipt storage
You need a place to persist what happened. Postgres works well for this because it gives you:
- durable writes
- transactional boundaries
- strong uniqueness guarantees
- queryable auditability
Without durable receipts, retries are guesswork.
3) Atomic claim → execute → complete boundary
The system needs a clear execution boundary: claim the operation → execute the side effect once → persist the result / receipt
That boundary is what prevents:
- concurrent replays
- duplicate workers
- race-condition duplicates
- "two consumers did the same thing" bugs
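The claim step can lean on the database's uniqueness guarantee: whichever worker inserts the receipt row first wins, and everyone else sees that it's taken. A sketch using SQLite as a stand-in for Postgres (table and column names are illustrative; in Postgres the equivalent is `INSERT ... ON CONFLICT DO NOTHING`):

```python
import sqlite3

# SQLite stands in for the durable store; schema is illustrative.
db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE receipts (
        request_id TEXT PRIMARY KEY,   -- uniqueness is the guard
        status     TEXT NOT NULL,      -- 'claimed' or 'done'
        result     TEXT
    )
""")

def claim(request_id: str) -> bool:
    """Atomically claim the operation; only one caller wins."""
    cur = db.execute(
        "INSERT OR IGNORE INTO receipts (request_id, status) VALUES (?, 'claimed')",
        (request_id,),
    )
    db.commit()
    return cur.rowcount == 1   # 1 = we claimed it, 0 = someone already did
```

Two workers racing on the same request_id resolve cleanly: one claim succeeds, the other returns False and backs off.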
4) Replay returns the prior result
If the same logical action comes in again, you should not execute it again. You should return the prior result.
That turns:
- retries
- redelivery
- replays
- uncertain completion
into:
- safe re-entry instead of duplicate side effects
What this is NOT
This is not:
- moderation
- prompt safety
- RBAC approval workflows
- hallucination prevention
It solves one thing: "Did this irreversible action already happen?"
That question shows up everywhere once agents or automations start calling real tools.
Where this matters most
This pattern matters anywhere your system causes real-world side effects:
- webhook handlers
- billing / payment flows
- async workers / queues
- workflow / automation systems
- AI agent tool calls
- external API mutations
- order / booking / ticket creation
- notifications and email sends
In other words: anything that should happen once, even if the system retries.
Why this keeps showing up
Modern systems are:
- distributed
- async
- retry-heavy
- failure-prone
- full of uncertain completion
So "exactly once" does not happen naturally. You have to build it explicitly.
And once you add:
- AI agents
- autonomous workflows
- tool-calling systems
…the need for an execution boundary gets even sharper. Because now a model can repeatedly decide to invoke something that has real-world consequences.
A practical implementation direction
In many systems, this can be implemented with:
- a Postgres-backed receipt table
- a stable operation / request ID
- a guard layer around side-effecting functions
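Putting those three pieces together, the guard layer can be a small wrapper around each side-effecting function. The sketch below uses SQLite as a stand-in for the Postgres receipt table and a simple check-then-insert for clarity; a concurrent deployment would use the atomic claim step described earlier instead. All names (`guarded`, `receipts`) are illustrative.

```python
import functools
import json
import sqlite3

# SQLite stands in for the Postgres-backed receipt table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE receipts (request_id TEXT PRIMARY KEY, result TEXT NOT NULL)"
)

def guarded(fn):
    """Wrap a side-effecting function: execute once per request_id,
    replay the stored receipt on every retry."""
    @functools.wraps(fn)
    def wrapper(request_id, *args, **kwargs):
        row = conn.execute(
            "SELECT result FROM receipts WHERE request_id = ?", (request_id,)
        ).fetchone()
        if row is not None:                  # receipt exists -> prior result
            return json.loads(row[0])
        result = fn(*args, **kwargs)         # execute the side effect once
        conn.execute(
            "INSERT INTO receipts (request_id, result) VALUES (?, ?)",
            (request_id, json.dumps(result)),
        )
        conn.commit()
        return result
    return wrapper
```

Wrapping a function is then one decorator: `@guarded` on `send_email`, and a redelivered request with the same request_id returns the stored receipt instead of sending twice.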
That turns unsafe retries into safe replays.
This doesn’t require rewriting your whole system. It usually means identifying the small set of functions that can cause irreversible effects and wrapping them with a durable execution boundary. That’s where the leverage is.
Closing thought
If an AI agent can call tools, it needs more than reasoning. It needs execution memory. Otherwise: retries will eventually execute something twice.
Execution Risk Audit
I’m currently looking at systems where retries, webhooks, workers, workflows, or AI agents can replay irreversible actions. If your system has paths where you can’t confidently answer: "Did this action already happen?" that’s exactly the kind of problem I’m focused on.
Especially interested in:
- duplicate webhook execution
- retry-safe billing flows
- workflow steps with uncertain completion
- AI agents calling side-effecting tools

