
Anyone who's built LLM-powered agents knows the frustration: you architect a perfect tool sequence, write pristine descriptions, and watch helplessly as your agent ignores instructions, skips critical steps, or hallucinates its way through workflows. This disobedience isn't just annoying—it breaks production systems. After hitting these walls while developing Pamba, a video generation agent, engineers uncovered seven unconventional tactics that finally forced their LLMs to behave.

Why Tool Calling Goes Rogue

The core problem isn't intelligence—it's attention management. LLMs process instructions in discrete chunks, often losing crucial context between tool calls. As one developer lamented:

"We supplied checkRequirements() assuming it would run before composeCreative(). Despite clear instructions, this almost never happened, and worsened as we added tools."

Traditional solutions like verbose system prompts often fail because LLMs deprioritize them once tool-calling activates. The breakthrough came from treating tools not just as functions, but as attention-directing mechanisms.
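
For a sense of the baseline that failed, here is a minimal reconstruction of the two-tool setup from the quote above, using the same @Tool annotation style as the snippets below (bodies elided), with the ordering expressed only in prose:

// Naive setup: the required order lives only in the system prompt
// ("always call checkRequirements before composeCreative"), which the
// agent routinely ignored once tool calling started.
@Tool("Check video requirements")
fun checkRequirements() { /* ... */ }

@Tool("Compose creative concept")
fun composeCreative() { /* ... */ }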

7 Tactics for Enforcing Order

1️⃣ Parameter Placeholders as Reminders

Embed unused parameters that explicitly flag conditions the agent might overlook:

@Tool("Generate a video")
fun generateVideo(
    @P("Whether user wants to overlay this video") // Purely for attention
    userWantsOverlay: Boolean,
    @P("Creative concept")
    concept: String
) { /* ... */ }

Why it works: Forces the LLM to acknowledge critical context before proceeding.

2️⃣ Response-Locked Halting

Prepend tool responses with explicit stop commands:

@Tool("Confirm credit usage")
fun confirmCreditUsage(): String {
    return """
    DO NOT MAKE MORE TOOL CALLS
    Return to user: 
    "This costs 5 credits. Proceed?"
    """
}

Why it works: Overrides autonomous momentum by embedding directives in the tool's output.

3️⃣ Step Numbering + Mandatory Tags

Bake sequencing into tool names:

@Tool("MANDATORY Step 3: Check video requirements")
fun checkRequirements() { /* ... */ }

Why it works: LLMs parse numbered lists more reliably than abstract dependencies.

4️⃣ Tool-Based Context Injection

Replace flaky system prompts with dedicated context tools:

@Tool("MANDATORY Step 1: Retrieve system capabilities")
fun getCapabilities(): SystemCapabilities { /* ... */ }

Why it works: Makes context retrieval an explicit, trackable action.
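
The article doesn't show what SystemCapabilities contains; here is a hypothetical sketch of the shape such a context tool might return:

// Hypothetical payload: the fields below are illustrative, not from the
// article. The point is that context arrives as a tool result the LLM
// explicitly requested, rather than as a system prompt it may deprioritize.
data class SystemCapabilities(
    val supportedAspectRatios: List<String>,
    val maxDurationSeconds: Int,
    val creditCostPerVideo: Int
)

@Tool("MANDATORY Step 1: Retrieve system capabilities")
fun getCapabilities() = SystemCapabilities(
    supportedAspectRatios = listOf("16:9", "9:16"),
    maxDurationSeconds = 60,
    creditCostPerVideo = 5
)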

5️⃣ Artificial Parameter Dependencies

Enforce order through required tokens:

@Tool("MANDATORY Step 4: Compose concept")
fun composeCreative(
    @P("Token from checkRequirements")
    requirementsToken: String // Blocks without prior step
) { /* ... */ }

Why it works: Creates hard technical dependencies instead of hoping for logical ones.
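
For the dependency to be real rather than decorative, the prerequisite tool has to issue the token and the dependent tool has to verify it. A minimal sketch under that assumption (the UUID-based token and the issuedTokens set are hypothetical details, not from the article):

import java.util.UUID

// Hypothetical bookkeeping: checkRequirements issues a token, and
// composeCreative refuses to run unless it receives one it recognizes.
private val issuedTokens = mutableSetOf<String>()

@Tool("MANDATORY Step 3: Check video requirements")
fun checkRequirements(): String {
    val token = UUID.randomUUID().toString()
    issuedTokens.add(token)
    return "Requirements OK. requirementsToken=$token"
}

@Tool("MANDATORY Step 4: Compose concept")
fun composeCreative(
    @P("Token from checkRequirements")
    requirementsToken: String
): String {
    if (requirementsToken !in issuedTokens)
        return "STOP: Call checkRequirements first and pass its requirementsToken."
    return "Concept composed." // actual composition elided
}

Returning a corrective string instead of throwing keeps the failure inside the conversation, so the agent can recover by calling the missing step.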

6️⃣ Execution Sanity Checks

Build guardrails into tool logic:

@Tool("Generate preview")
fun generatePreview(
    @P("User confirmed script?")
    userConfirmed: Boolean
): String {
    if (!userConfirmed)
        return "STOP: User hasn't confirmed. Ask first."
    return renderPreview() // otherwise proceed (renderPreview is a placeholder for the actual generation)
}

Why it works: Tools self-validate prerequisites and instruct recovery.

7️⃣ Explicit Conditional Tagging

Force condition acknowledgment in responses:

// In agent's output:
dogInScene = true
[Rest of generated content]

Why it works: Writing variables makes implicit reasoning explicit and actionable.
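
The article doesn't show how this tagging is elicited; one plausible approach, sketched here in the same Kotlin setting with hypothetical wording, is to bake the rule into the agent's instructions:

// Hypothetical instruction fragment appended to the agent's prompt;
// the exact wording is not from the article.
val conditionTaggingRule = """
    Before generating any content, write out each condition you are
    relying on as an explicit assignment, for example:
      dogInScene = true
      userWantsOverlay = false
    Then continue with the content itself.
""".trimIndent()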

The New Discipline of Agent Engineering

These patterns solve ~80% of tool-calling disobedience by treating LLMs not as reasoning engines but as context-amnesiac executors needing constant redirection. While future models may reduce these hacks, today they're essential for production systems. As the Pamba team discovered, agent reliability requires designing for the LLM's limitations—not just its capabilities. The era of hoping agents "just figure it out" is over; structured command-and-control is the new imperative.

Source: Tactics for Agent Obedience