Pydantic AI brings the same developer experience that made FastAPI the go-to Python web framework to the world of AI agents. Instead of wrestling with raw LLM responses and hoping strings match your expected schema, Pydantic AI gives you full type safety, automatic validation, and structured outputs — backed by the Pydantic library that already validates data in millions of production applications.
In this tutorial, you will build a Content Research Agent that searches the web, extracts key facts, and returns a fully typed research report — all in fewer than 200 lines of Python.
What You'll Build
By the end of this tutorial, you will have:
- A working Pydantic AI agent with tool use and structured output
- Dependency injection for reusable context (HTTP client, database, config)
- Streaming support for real-time responses
- Multi-turn conversation handling
- A production-ready content research agent
Prerequisites
Before you begin, make sure you have:
- Python 3.11 or later installed
- Basic familiarity with Python type hints and async/await
- An API key from OpenAI, Anthropic, or any supported provider
- pip or uv for package management
Why Pydantic AI?
Most LLM frameworks treat structured output as an afterthought — you get raw JSON strings and parse them yourself, hoping the model didn't hallucinate extra fields. Pydantic AI solves this at the framework level:
- Type-safe responses: Define a Pydantic model, get a validated instance back. No manual parsing.
- Automatic retry: If the LLM returns invalid JSON or wrong field types, Pydantic AI retries with the validation error fed back to the model.
- Tool-first design: Register Python functions as tools with a single decorator. Arguments are validated automatically.
- Provider-agnostic: Switch between OpenAI, Anthropic, Gemini, Mistral, Ollama, and dozens more by changing a single string.
- Logfire native: First-class observability with zero config if you use Pydantic's Logfire.
Step 1: Install Pydantic AI
Install with your preferred provider. The openai extra includes the OpenAI client:
pip install 'pydantic-ai[openai]'
# or for Anthropic
pip install 'pydantic-ai[anthropic]'
# or for everything
pip install 'pydantic-ai[all]'
Using uv (recommended):
uv add 'pydantic-ai[openai]'
Set your API key in the environment:
export OPENAI_API_KEY="sk-..."
# or
export ANTHROPIC_API_KEY="sk-ant-..."
Step 2: Your First Agent
The Agent class is the core primitive. You declare the model, instructions, and optionally an output_type:
from pydantic_ai import Agent
agent = Agent(
    'openai:gpt-4o',
    instructions='You are a helpful assistant. Be concise and accurate.',
)
result = agent.run_sync('What is the capital of Tunisia?')
print(result.output)  # e.g. "The capital of Tunisia is Tunis."
For async code (recommended in production), use await agent.run(...):
import asyncio
from pydantic_ai import Agent
agent = Agent('openai:gpt-4o', instructions='Be concise.')
async def main():
    result = await agent.run('Explain asyncio in one sentence.')
    print(result.output)
asyncio.run(main())
Switching providers is a one-line change:
# OpenAI
agent = Agent('openai:gpt-4o')
# Anthropic
agent = Agent('anthropic:claude-sonnet-4-6')
# Gemini
agent = Agent('google-gla:gemini-2.5-flash')
# Local Ollama
agent = Agent('ollama:llama3.2')
Step 3: Structured Outputs with Pydantic Models
This is where Pydantic AI shines. Pass any Pydantic model as output_type and the agent guarantees a validated instance back:
from pydantic import BaseModel
from pydantic_ai import Agent
class MovieReview(BaseModel):
    title: str
    year: int
    rating: float  # 0.0 to 10.0
    summary: str
    pros: list[str]
    cons: list[str]
    recommended: bool
agent = Agent(
    'openai:gpt-4o',
    output_type=MovieReview,
    instructions='You are a film critic. Return structured movie reviews.',
)
result = agent.run_sync('Review the movie Inception.')
review = result.output  # a validated MovieReview instance
print(f"{review.title} ({review.year}): {review.rating}/10")
print(f"Recommended: {review.recommended}")
print("Pros:", review.pros)
If the model returns an invalid response (wrong types, missing fields), Pydantic AI automatically sends the validation error back to the model and requests a corrected response, up to the retry limit you configure through the agent's retries setting.
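To make the retry behaviour concrete, here is a minimal standard-library sketch of a validate-and-retry loop. It is not Pydantic AI's implementation: `Review`, `parse_review`, `fake_model`, and `run_with_retries` are hypothetical names, and the "model" is a stub that only answers correctly once it has seen the validation error in its prompt.

```python
import json
from dataclasses import dataclass


@dataclass
class Review:
    title: str
    year: int


def parse_review(raw: str) -> Review:
    """Parse and type-check a JSON reply; raise ValueError with a readable message."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"invalid JSON: {exc.msg}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    if not isinstance(data.get("title"), str):
        raise ValueError("title must be a string")
    if not isinstance(data.get("year"), int):
        raise ValueError("year must be an integer")
    return Review(title=data["title"], year=data["year"])


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM: correct only after seeing the error."""
    if "year must be an integer" in prompt:
        return '{"title": "Inception", "year": 2010}'
    return '{"title": "Inception", "year": "twenty-ten"}'


def run_with_retries(prompt: str, retries: int = 1) -> Review:
    """Call the model; on a validation failure, re-ask with the error appended."""
    for attempt in range(retries + 1):
        raw = fake_model(prompt)
        try:
            return parse_review(raw)
        except ValueError as exc:
            if attempt == retries:
                raise  # out of retries: surface the validation error
            prompt = f"{prompt}\nYour last reply was rejected: {exc}. Please fix it."
    raise AssertionError("unreachable")


review = run_with_retries("Review the movie Inception.")
print(review)
```

The first reply fails the type check, the error text is appended to the prompt, and the second reply validates. With `retries=0` the same call raises instead, which is why a schema that models often get wrong may need a higher retry budget.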
Union outputs for flexible schemas
You can express "either this or that" output with a Union type:
from pydantic import BaseModel
from pydantic_ai import Agent
from typing import Union
class SuccessResponse(BaseModel):
    result: str
    confidence: float
class ErrorResponse(BaseModel):
    error: str
    reason: str
agent = Agent(
    'openai:gpt-4o',
    output_type=Union[SuccessResponse, ErrorResponse],
)
Step 4: Function Tools
Tools let the LLM call Python functions during a conversation. Register them with @agent.tool (with agent context) or @agent.tool_plain (without):
import asyncio
import httpx
from pydantic_ai import Agent
agent = Agent(
    'openai:gpt-4o',
    instructions='You are a weather assistant. Use tools to fetch real data.',
)
@agent.tool_plain
async def get_weather(city: str) -> str:
    """Fetch the current weather for a given city."""
    async with httpx.AsyncClient() as client:
        response = await client.get(
            f'https://wttr.in/{city}?format=3'
        )
        return response.text
async def main():
    result = await agent.run('What is the weather in Tunis right now?')
    print(result.output)
asyncio.run(main())
The docstring becomes the tool description sent to the LLM. Function parameters are automatically converted to a JSON schema for the model.
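As a rough illustration of that conversion, the sketch below builds a tool definition from a function's name, docstring, and type hints using only the standard library. `tool_schema` and `JSON_TYPES` are hypothetical names, and it handles only simple annotated parameters; the schema Pydantic AI actually generates is richer than this.

```python
import inspect
from typing import get_type_hints

# Minimal mapping from Python annotations to JSON Schema type names.
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def tool_schema(func) -> dict:
    """Build a tool definition from a function's name, docstring, and type hints."""
    hints = get_type_hints(func)
    hints.pop("return", None)  # the return type is not part of the arguments
    signature = inspect.signature(func)
    properties = {name: {"type": JSON_TYPES[tp]} for name, tp in hints.items()}
    # Parameters without a default value are required.
    required = [
        name
        for name, param in signature.parameters.items()
        if param.default is inspect.Parameter.empty
    ]
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }


def get_weather(city: str, units: str = "metric") -> str:
    """Fetch the current weather for a given city."""
    return f"Weather for {city} in {units} units"


schema = tool_schema(get_weather)
print(schema)
```

The model only ever sees the name, the description, and the parameter schema, so a vague docstring or an untyped parameter directly reduces how reliably the tool gets called.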
Tools with RunContext (dependency injection)
When tools need access to shared resources (database, HTTP client, config), use @agent.tool with RunContext:
from dataclasses import dataclass
import httpx
from pydantic_ai import Agent, RunContext
@dataclass
class ResearchDeps:
    http_client: httpx.AsyncClient
    api_key: str
    max_results: int = 5
agent = Agent(
    'anthropic:claude-sonnet-4-6',
    deps_type=ResearchDeps,
    instructions='You are a research assistant with web search capabilities.',
)
@agent.tool
async def search_web(ctx: RunContext[ResearchDeps], query: str) -> list[str]:
    """Search the web and return a list of relevant result snippets."""
    response = await ctx.deps.http_client.get(
        'https://api.search.example.com/search',
        params={'q': query, 'limit': ctx.deps.max_results},
        headers={'Authorization': f'Bearer {ctx.deps.api_key}'},
    )
    results = response.json()
    return [r['snippet'] for r in results.get('items', [])]
@agent.tool
async def fetch_page(ctx: RunContext[ResearchDeps], url: str) -> str:
    """Fetch the text content of a webpage."""
    response = await ctx.deps.http_client.get(url, follow_redirects=True)
    # Simplified here; strip HTML tags in production
    return response.text[:2000]
The RunContext carries your dependencies into every tool call without global state or hidden coupling.
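One practical payoff of this pattern is that a tool body can be exercised with a fake dependency and no network access. The sketch below shows the idea with the standard library only: `Ctx` is a hypothetical stand-in for RunContext, and `SearchClient`, `Deps`, and `FakeClient` are illustrative names rather than Pydantic AI types.

```python
import asyncio
from dataclasses import dataclass
from typing import Generic, Protocol, TypeVar


class SearchClient(Protocol):
    async def get(self, query: str) -> list[str]: ...


@dataclass
class Deps:
    client: SearchClient
    max_results: int = 5


DepsT = TypeVar("DepsT")


@dataclass
class Ctx(Generic[DepsT]):
    """Hypothetical stand-in for RunContext: a thin wrapper that carries deps."""
    deps: DepsT


async def search_web(ctx: Ctx[Deps], query: str) -> list[str]:
    """A tool body that reads everything it needs from ctx.deps."""
    results = await ctx.deps.client.get(query)
    return results[: ctx.deps.max_results]


class FakeClient:
    """Test double that returns canned results and records the queries it saw."""

    def __init__(self) -> None:
        self.queries: list[str] = []

    async def get(self, query: str) -> list[str]:
        self.queries.append(query)
        return [f"{query} result {i}" for i in range(10)]


fake = FakeClient()
results = asyncio.run(search_web(Ctx(Deps(client=fake, max_results=3)), "quantum"))
print(results)
```

Because the tool never reaches for a module-level client or config, swapping the real HTTP client for `FakeClient` needs no patching.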
Step 5: Running Agents with Dependencies
Pass dependencies at run time using the deps argument:
async def main():
    async with httpx.AsyncClient() as client:
        deps = ResearchDeps(
            http_client=client,
            api_key='your-search-api-key',
            max_results=3,
        )
        result = await agent.run(
            'Research the latest developments in quantum computing.',
            deps=deps,
        )
        print(result.output)
Step 6: Streaming Responses
For long-running agent tasks, stream the output as it arrives:
import asyncio
from pydantic_ai import Agent
agent = Agent('openai:gpt-4o', instructions='Write detailed technical content.')
async def stream_demo():
    async with agent.run_stream('Explain how TCP/IP works.') as stream:
        async for text_chunk in stream.stream_text():
            print(text_chunk, end='', flush=True)
        print()  # newline at end
        result = await stream.get_output()
asyncio.run(stream_demo())
Streaming structured outputs
You can also stream and validate a structured output incrementally:
from pydantic import BaseModel
from pydantic_ai import Agent
class ResearchReport(BaseModel):
    topic: str
    summary: str
    key_findings: list[str]
    sources: list[str]
    confidence_score: float
agent = Agent(
    'openai:gpt-4o',
    output_type=ResearchReport,
)
async def stream_structured():
    async with agent.run_stream('Research serverless computing trends.') as stream:
        # stream_text() only works for plain-text output. For a structured
        # output_type, iterate over stream_output(), which yields each
        # snapshot of the report that passes validation.
        async for partial in stream.stream_output():
            print(partial)
        # Get the validated structured result at completion
        report: ResearchReport = await stream.get_output()
    print(f"\nTopic: {report.topic}")
    print(f"Confidence: {report.confidence_score}")
Step 7: Multi-Turn Conversations
Each run result records the messages exchanged during that run. Pass them as message_history to continue a conversation:
import asyncio
from pydantic_ai import Agent
agent = Agent('openai:gpt-4o', instructions='You are a coding tutor.')
async def multi_turn():
    # First turn
    result1 = await agent.run('What is a Python decorator?')
    print('Turn 1:', result1.output)
    # Second turn: pass the previous messages as context
    result2 = await agent.run(
        'Show me a practical example of that.',
        message_history=result1.new_messages(),
    )
    print('Turn 2:', result2.output)
    # Third turn
    result3 = await agent.run(
        'How would I use this in a FastAPI application?',
        message_history=result2.all_messages(),
    )
    print('Turn 3:', result3.output)
asyncio.run(multi_turn())
result.new_messages() returns only the messages from that run; result.all_messages() returns the full history, including any messages passed in through message_history.
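The difference matters from the third turn onward. The toy model below tracks the bookkeeping with plain lists of strings instead of Pydantic AI message objects; `ToyResult` and `toy_run` are hypothetical names that only mimic the shape of the real methods.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyResult:
    """Hypothetical stand-in for a run result; messages are plain strings here."""
    history: list[str]   # messages passed in via message_history
    exchange: list[str]  # messages produced by this run

    def new_messages(self) -> list[str]:
        return list(self.exchange)

    def all_messages(self) -> list[str]:
        return self.history + self.exchange


def toy_run(prompt: str, message_history: Optional[list[str]] = None) -> ToyResult:
    """Pretend to run an agent: record the prompt and a canned reply."""
    return ToyResult(
        history=list(message_history or []),
        exchange=[f"user: {prompt}", f"model: reply to {prompt!r}"],
    )


r1 = toy_run("What is a decorator?")
r2 = toy_run("Show an example.", message_history=r1.all_messages())

# Passing all_messages() keeps every earlier turn in context.
r3 = toy_run("Use it in FastAPI.", message_history=r2.all_messages())
print(len(r3.all_messages()))

# Passing new_messages() from turn 2 silently drops turn 1.
lossy = toy_run("Use it in FastAPI.", message_history=r2.new_messages())
print(len(lossy.all_messages()))
```

After the first run the two methods return the same messages, which is why either works for the second turn. For later turns, all_messages() is the safer default unless you are deliberately trimming context.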
Step 8: Controlling Agent Behaviour
Limit how many model turns and tool calls an agent can make per run:
from pydantic_ai import Agent
from pydantic_ai.usage import UsageLimits
agent = Agent('openai:gpt-4o')
async def main():
    result = await agent.run(
        'Research and summarize 5 recent AI papers.',
        usage_limits=UsageLimits(
            request_limit=10,  # max LLM turns
            tool_calls_limit=20,  # max tool executions
        ),
    )
    print(result.output)
When the model calls multiple tools in a single response, Pydantic AI executes them concurrently using asyncio.create_task. For tools that must run sequentially, mark them:
# MyDeps is a placeholder for your own dependencies dataclass with a db attribute
@agent.tool(sequential=True)
async def write_to_db(ctx: RunContext[MyDeps], data: dict) -> str:
    """Write data to the database; must not run concurrently."""
    await ctx.deps.db.insert(data)
    return 'written'
Step 9: Full Example (Content Research Agent)
Putting it all together: a production-ready content research agent that accepts a topic, searches for information, and returns a typed ResearchReport.
import asyncio
import httpx
from dataclasses import dataclass
from pydantic import BaseModel, Field
from pydantic_ai import Agent, RunContext
# --- Data models ---
@dataclass
class Deps:
    client: httpx.AsyncClient
    brave_api_key: str | None = None  # optional; falls back to mock results
class ArticleSummary(BaseModel):
    title: str
    url: str
    key_points: list[str] = Field(min_length=1, max_length=5)
class ResearchReport(BaseModel):
    topic: str
    executive_summary: str = Field(min_length=50)
    articles: list[ArticleSummary] = Field(min_length=1, max_length=5)
    key_trends: list[str] = Field(min_length=3, max_length=10)
    recommended_actions: list[str] = Field(min_length=1, max_length=5)
    confidence_score: float = Field(ge=0.0, le=1.0)
# --- Agent definition ---
research_agent = Agent(
    'anthropic:claude-sonnet-4-6',
    deps_type=Deps,
    output_type=ResearchReport,
    instructions="""
You are an expert research analyst. When given a topic:
1. Use the search tool to find recent information (2-3 searches)
2. Use fetch_article to get full content from the most relevant results
3. Synthesize findings into a structured ResearchReport
Be thorough but concise. Focus on actionable insights.
""",
)
# --- Tools ---
@research_agent.tool
async def search(ctx: RunContext[Deps], query: str) -> list[dict]:
    """Search the web for recent information on a topic."""
    if ctx.deps.brave_api_key:
        resp = await ctx.deps.client.get(
            'https://api.search.brave.com/res/v1/web/search',
            headers={'Accept': 'application/json', 'X-Subscription-Token': ctx.deps.brave_api_key},
            params={'q': query, 'count': 5, 'freshness': 'pw'},
        )
        resp.raise_for_status()
        data = resp.json()
        return [
            {'title': r['title'], 'url': r['url'], 'description': r.get('description', '')}
            for r in data.get('web', {}).get('results', [])
        ]
    # Mock response for demo purposes
    return [
        {'title': f'Article about {query}', 'url': f'https://example.com/{query.replace(" ", "-")}', 'description': f'Comprehensive coverage of {query}'},
    ]
@research_agent.tool
async def fetch_article(ctx: RunContext[Deps], url: str) -> str:
    """Fetch and return the text content of an article."""
    try:
        resp = await ctx.deps.client.get(url, timeout=10.0, follow_redirects=True)
        resp.raise_for_status()
        # In production, use a proper HTML parser (beautifulsoup4, trafilatura)
        text = resp.text
        # Return first 3000 chars to stay within context limits
        return text[:3000]
    except Exception as e:
        return f'Could not fetch article: {e}'
# --- Runner ---
async def research(topic: str) -> ResearchReport:
    async with httpx.AsyncClient(timeout=30.0) as client:
        deps = Deps(client=client)
        result = await research_agent.run(
            f'Research this topic in depth: {topic}',
            deps=deps,
        )
        return result.output
async def main():
    report = await research('AI agent frameworks in Python 2026')
    print(f'Topic: {report.topic}')
    print(f'Confidence: {report.confidence_score:.0%}')
    print(f'\nExecutive Summary:\n{report.executive_summary}')
    print('\nKey Trends:')
    for trend in report.key_trends:
        print(f' - {trend}')
    print(f'\nArticles reviewed: {len(report.articles)}')
    print('\nRecommended Actions:')
    for action in report.recommended_actions:
        print(f' - {action}')
if __name__ == '__main__':
    asyncio.run(main())
Step 10: Multi-Model Fallback
For MENA markets where latency or service availability varies, configure a fallback chain using multiple providers:
from pydantic_ai import Agent
from pydantic_ai.models.fallback import FallbackModel
# Try Claude first, fall back to GPT-4o, then Gemini
fallback_model = FallbackModel(
    'anthropic:claude-sonnet-4-6',
    'openai:gpt-4o',
    'google-gla:gemini-2.5-flash',
)
agent = Agent(fallback_model, output_type=ResearchReport)
Step 11: Observability with Logfire
Pydantic AI integrates natively with Logfire for production monitoring:
pip install logfire
logfire auth
import logfire
from pydantic_ai import Agent
logfire.configure()
logfire.instrument_pydantic_ai() # auto-instruments all agents
agent = Agent('openai:gpt-4o')
result = agent.run_sync('Hello!')
# Every model call, tool execution, and validation is traced automatically
You can view traces in the Logfire dashboard, including token usage, latency, tool call sequences, and validation failures.
Testing Your Agent
Pydantic AI provides a TestModel and FunctionModel for unit testing without calling a real LLM:
import pytest
from pydantic_ai import Agent
from pydantic_ai.models.test import TestModel
def test_research_agent_structure():
    # TestModel calls every registered tool by default; call_tools=[] turns
    # that off, so passing client=None is safe here
    with research_agent.override(model=TestModel(call_tools=[])):
        result = research_agent.run_sync(
            'Test topic',
            deps=Deps(client=None),
        )
    assert isinstance(result.output, ResearchReport)
For integration tests, use FunctionModel to inject controlled responses:
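from pydantic_ai.messages import ModelMessage, ModelResponse, ToolCallPart
from pydantic_ai.models.function import AgentInfo, FunctionModel
def my_model(messages: list[ModelMessage], info: AgentInfo) -> ModelResponse:
    # A FunctionModel callback must return a ModelResponse. Structured output
    # is delivered as a call to the agent's output tool.
    report = {
        'topic': 'Test',
        'executive_summary': 'A' * 50,
        # At least one article is needed to satisfy min_length=1
        'articles': [{'title': 'Test article', 'url': 'https://example.com', 'key_points': ['point1']}],
        'key_trends': ['trend1', 'trend2', 'trend3'],
        'recommended_actions': ['action1'],
        'confidence_score': 0.9,
    }
    return ModelResponse(parts=[ToolCallPart(tool_name=info.output_tools[0].name, args=report)])
with research_agent.override(model=FunctionModel(my_model)):
    result = research_agent.run_sync('test', deps=Deps(client=None))
assert result.output.confidence_score == 0.9
Troubleshooting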
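ValidationError loops: If the model keeps returning invalid data, add stricter instructions or simplify your Pydantic model. Validation retries are capped by the agent's retries setting, so raise it for complex schemas; request_limit in UsageLimits separately caps the total number of model requests per run.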
Tool not being called: Make sure the docstring clearly describes when the tool should be used. The LLM decides when to call tools based on the description.
Provider errors: Pydantic AI raises ModelHTTPError for API failures. Wrap agent.run() in a try/except and implement exponential backoff for production use.
Async in sync code: Use agent.run_sync() sparingly — in web frameworks (FastAPI, Django async views), always use await agent.run().
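For the provider-error case, a retry wrapper with exponential backoff might look like the sketch below. It uses only the standard library: `TransientError` is a hypothetical stand-in for the exception you choose to retry on (for example ModelHTTPError), `flaky_run` stands in for a call to agent.run(), and `with_backoff` with its default delays is an assumption to tune for your provider.

```python
import asyncio
import random


class TransientError(Exception):
    """Hypothetical stand-in for a provider failure such as ModelHTTPError."""


async def with_backoff(call, *, attempts: int = 4, base_delay: float = 0.5, max_delay: float = 8.0):
    """Await call(), retrying on TransientError with exponentially growing delays."""
    for attempt in range(attempts):
        try:
            return await call()
        except TransientError:
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller handle the failure
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter: sleep a random fraction of the delay so that many
            # clients do not retry at the same instant.
            await asyncio.sleep(random.uniform(0, delay))


calls = {"count": 0}


async def flaky_run() -> str:
    """Fails twice, then succeeds; stands in for a call to agent.run()."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientError("503 from provider")
    return "ok"


result = asyncio.run(with_backoff(flaky_run, base_delay=0.01))
print(result, calls["count"])
```

In real code, retry only on errors that are plausibly transient (rate limits, 5xx responses) and let authentication or validation failures surface immediately.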
Next Steps
- Explore multi-agent workflows where one agent delegates to specialist sub-agents
- Add Logfire tracing for production observability
- Integrate Pydantic AI with FastAPI for type-safe LLM API endpoints
- Use model-graded evals to test agent quality at scale
- Try streaming with structured output for interactive research tools
Conclusion
Pydantic AI removes the boilerplate that makes LLM applications fragile and hard to maintain. By treating validation, tool use, and provider abstraction as first-class citizens, it lets you focus on what your agent should do — not on parsing strings and praying the LLM followed your schema. Whether you are building a research assistant, a document processor, or a customer support bot, Pydantic AI gives you the confidence to ship type-safe AI agents to production.