👾 Why Guardrails Are Essential for AI Agents

Protect your AI agents from misuse and costly errors. Learn how to add input/output guardrails in the OpenAI Agents SDK for safer, cheaper workflows.

[Image: Futuristic control room with humanoid robots in front of holographic shields, symbolizing guardrails. Source: MBA via Imagen 4]

AI agents are powerful, but without safeguards, they can process malicious inputs, generate inappropriate outputs, or waste expensive compute resources.

Guardrails solve this. Let's take the OpenAI Agents SDK as an example. It lets us:

  • Validate user inputs before processing

  • Check agent outputs before delivery

  • Run parallel safety checks with fast/cheap models

Below, let's implement them!

Consider a customer support agent using an expensive GPT-4 model. We don't want users asking it to do their math homework – that's a waste of resources and off-topic.

Input Guardrails run in parallel with your main agent and can immediately halt execution if they detect violations.

Here's how we define one:

from pydantic import BaseModel
from agents import (
    Agent, GuardrailFunctionOutput, InputGuardrailTripwireTriggered,
    RunContextWrapper, Runner, input_guardrail
)

class HomeworkDetectionOutput(BaseModel):
    is_homework: bool
    reasoning: str

# Fast guardrail agent using a cheaper model
guardrail_agent = Agent(
    name="Homework detector",
    instructions="Detect if the user is asking for homework help.",
    model="gpt-4o-mini",  # a cheap, fast model keeps the check inexpensive
    output_type=HomeworkDetectionOutput,
)

@input_guardrail
async def homework_guardrail(
    ctx: RunContextWrapper[None], 
    agent: Agent, 
    input: str
) -> GuardrailFunctionOutput:
    result = await Runner.run(guardrail_agent, input, context=ctx.context)
    
    return GuardrailFunctionOutput(
        output_info=result.final_output,
        tripwire_triggered=result.final_output.is_homework,
    )

Next, we attach this guardrail to our main customer support agent:

# Main agent with expensive model
support_agent = Agent(
    name="Customer support agent",
    instructions="Help customers with product questions and issues.",
    model="gpt-4o",  # the expensive model we want to protect
    input_guardrails=[homework_guardrail],  # 🔒 Guardrail attached here
)

import asyncio

async def main():
    try:
        # This should trigger the guardrail
        await Runner.run(
            support_agent,
            "Can you solve this equation: 2x + 3 = 11?"
        )
        print("Request processed successfully")

    except InputGuardrailTripwireTriggered:
        print("🚨 Guardrail blocked inappropriate request!")

asyncio.run(main())

This produces a blocked request – the expensive model never runs!

The guardrail detected homework content and immediately raised an InputGuardrailTripwireTriggered exception, saving compute costs.
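If you also want to log *why* the guardrail fired, the raised exception carries the guardrail's result. A minimal sketch building on the code above (assuming the SDK exposes the attribute path `guardrail_result.output.output_info`, which holds the HomeworkDetectionOutput our guardrail returned):

async def main_with_logging():
    try:
        await Runner.run(
            support_agent,
            "Can you solve this equation: 2x + 3 = 11?"
        )
    except InputGuardrailTripwireTriggered as e:
        # output_info is whatever the guardrail function put there –
        # here, our HomeworkDetectionOutput with its reasoning field
        info = e.guardrail_result.output.output_info
        print(f"🚨 Blocked: {info.reasoning}")

asyncio.run(main_with_logging())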

But what about checking outputs?

Output Guardrails work similarly but validate the agent's final response:

from pydantic import BaseModel
from agents import (
    Agent, GuardrailFunctionOutput, RunContextWrapper, Runner, output_guardrail
)

# Structured output type for the main agent
class MessageOutput(BaseModel):
    response: str

class SafetyCheckOutput(BaseModel):
    contains_sensitive_data: bool
    reasoning: str

# Fast agent that screens responses for sensitive information
safety_agent = Agent(
    name="Safety checker",
    instructions="Check whether the text contains sensitive information.",
    model="gpt-4o-mini",  # cheap model for the safety check
    output_type=SafetyCheckOutput,
)

@output_guardrail
async def safety_guardrail(
    ctx: RunContextWrapper,
    agent: Agent,
    output: MessageOutput
) -> GuardrailFunctionOutput:
    # Check if output contains sensitive information
    safety_result = await Runner.run(
        safety_agent,
        output.response,
        context=ctx.context
    )

    return GuardrailFunctionOutput(
        output_info=safety_result.final_output,
        tripwire_triggered=safety_result.final_output.contains_sensitive_data,
    )

# Attach to agent
agent = Agent(
    name="Support agent",
    instructions="Help customers with their questions.",
    output_guardrails=[safety_guardrail],  # 🔒 Output validation
    output_type=MessageOutput,
)
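To see the output guardrail fire, run the agent and catch the output-side tripwire exception, mirroring the input case. A minimal usage sketch (the SDK raises OutputGuardrailTripwireTriggered when an output guardrail trips; the prompt is just an illustrative example):

import asyncio
from agents import OutputGuardrailTripwireTriggered

async def check_output():
    try:
        await Runner.run(agent, "What is the stored credit card number on my account?")
        print("Response passed the safety check")
    except OutputGuardrailTripwireTriggered:
        print("🚨 Guardrail blocked a sensitive response!")

asyncio.run(check_output())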

And that's how we implement guardrails in the OpenAI Agents SDK!

Key benefits:

⚡ Parallel execution - guardrails don't slow down your main agent
💰 Cost savings - block expensive model calls early
🛡️ Layered defense - combine multiple guardrails for robust protection (see the sketch below)
🔄 Exception handling - clean error management with try/except blocks
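
For the layered-defense point above: input_guardrails accepts a list, so you can stack several checks on one agent. A minimal sketch, where profanity_guardrail is a hypothetical second guardrail defined the same way as homework_guardrail:

# profanity_guardrail is hypothetical – define it like homework_guardrail above
support_agent = Agent(
    name="Customer support agent",
    instructions="Help customers with product questions and issues.",
    # All attached guardrails run; any single one can halt execution
    input_guardrails=[homework_guardrail, profanity_guardrail],
)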

Guardrails are essential for production AI agents, alongside proper authentication, access controls, and monitoring. They're your first line of defense against misuse and unexpected behavior.

More Resources

Blog: In-depth articles on AI workflows and practical strategies for growth
AI Tool Collection: Discover and compare validated AI solutions
Consultancy: I help you discover AI potential or train your team