Why Guardrails Are Essential for AI Agents
Protect your AI agents from misuse and costly errors. Learn how to add input/output guardrails in the OpenAI Agents SDK for safer, cheaper workflows.

Source: MBA via Imagen 4
AI agents are powerful, but without safeguards, they can process malicious inputs, generate inappropriate outputs, or waste expensive compute resources.
Guardrails solve this. Let's take the OpenAI Agents SDK as an example. It lets us:
Validate user inputs before processing
Check agent outputs before delivery
Run parallel safety checks with fast/cheap models
Below, let's implement them!
Consider a customer support agent using an expensive GPT-4 model. We don't want users asking it to do their math homework; that's a waste of resources and off-topic.
Input Guardrails run in parallel with your main agent and can immediately halt execution if they detect violations.
Here's how we define one:
from pydantic import BaseModel

from agents import (
    Agent, GuardrailFunctionOutput, InputGuardrailTripwireTriggered,
    RunContextWrapper, Runner, input_guardrail
)

class HomeworkDetectionOutput(BaseModel):
    is_homework: bool
    reasoning: str

# Fast guardrail agent using a cheaper model
guardrail_agent = Agent(
    name="Homework detector",
    instructions="Detect if the user is asking for homework help.",
    output_type=HomeworkDetectionOutput,
    model="gpt-4o-mini",  # example choice of a cheap, fast model for the check
)

@input_guardrail
async def homework_guardrail(
    ctx: RunContextWrapper[None],
    agent: Agent,
    input: str,
) -> GuardrailFunctionOutput:
    # Run the cheap detector on the raw user input
    result = await Runner.run(guardrail_agent, input, context=ctx.context)
    # Trip the wire (halting the main agent) when homework is detected
    return GuardrailFunctionOutput(
        output_info=result.final_output,
        tripwire_triggered=result.final_output.is_homework,
    )
Next, we attach this guardrail to our main customer support agent:
import asyncio

# Main agent with the expensive model
support_agent = Agent(
    name="Customer support agent",
    instructions="Help customers with product questions and issues.",
    input_guardrails=[homework_guardrail],  # Guardrail attached here
)

async def main():
    try:
        # This should trigger the guardrail
        await Runner.run(
            support_agent,
            "Can you solve this equation: 2x + 3 = 11?",
        )
        print("Request processed successfully")
    except InputGuardrailTripwireTriggered:
        print("Guardrail blocked inappropriate request!")

if __name__ == "__main__":
    asyncio.run(main())
This produces a blocked request: the expensive model never runs! The guardrail detected homework content and immediately raised an InputGuardrailTripwireTriggered exception, saving compute costs.
But what about checking outputs?
Output Guardrails work similarly but validate the agent's final response:
from agents import output_guardrail

# Supporting definitions referenced below (not shown in the original snippet)
class MessageOutput(BaseModel):
    response: str

class SafetyCheckOutput(BaseModel):
    contains_sensitive_data: bool
    reasoning: str

# Fast agent that screens responses for sensitive information
safety_agent = Agent(
    name="Safety checker",
    instructions="Check whether the response contains sensitive information.",
    output_type=SafetyCheckOutput,
)

@output_guardrail
async def safety_guardrail(
    ctx: RunContextWrapper,
    agent: Agent,
    output: MessageOutput,
) -> GuardrailFunctionOutput:
    # Check if the output contains sensitive information
    safety_result = await Runner.run(
        safety_agent,
        output.response,
        context=ctx.context,
    )
    return GuardrailFunctionOutput(
        output_info=safety_result.final_output,
        tripwire_triggered=safety_result.final_output.contains_sensitive_data,
    )

# Attach to agent
agent = Agent(
    name="Support agent",
    instructions="Help customers with their questions.",
    output_guardrails=[safety_guardrail],  # Output validation
    output_type=MessageOutput,
)
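Just like on the input side, a tripped output guardrail surfaces as an exception around Runner.run. Here's a minimal sketch, assuming the agent defined above and a hypothetical handle_request helper:

from agents import OutputGuardrailTripwireTriggered

async def handle_request(question: str) -> None:
    # Hypothetical helper wrapping the agent defined above
    try:
        result = await Runner.run(agent, question)
        print(result.final_output.response)
    except OutputGuardrailTripwireTriggered:
        print("Output guardrail blocked a response containing sensitive data!")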
And that's how we implement Guardrails in the OpenAI Agents SDK!
Key benefits:
Parallel execution - guardrails don't slow down your main agent
Cost savings - block expensive model calls early
Layered defense - combine multiple guardrails for robust protection (see the sketch below)
Exception handling - clean error management with try/except blocks
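To illustrate the layered-defense point, multiple guardrails can be stacked on a single agent simply by listing them. A minimal sketch, assuming a second hypothetical profanity_guardrail defined with @input_guardrail in the same way as homework_guardrail above:

# Hypothetical sketch: stacking guardrails on one agent.
# profanity_guardrail is assumed to be defined like homework_guardrail above.
hardened_agent = Agent(
    name="Hardened support agent",
    instructions="Help customers with product questions and issues.",
    input_guardrails=[homework_guardrail, profanity_guardrail],  # input checks run together
    output_guardrails=[safety_guardrail],  # response screened before delivery
    output_type=MessageOutput,
)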
Guardrails are essential for production AI agents, alongside proper authentication, access controls, and monitoring. They're your first line of defense against misuse and unexpected behavior.
More Resources
Blog: In-depth articles on AI workflows and practical strategies for growth
AI Tool Collection: Discover and compare validated AI solutions
Consultancy: I help you discover AI potential or train your team