Structured Output from LLMs: The Complete Guide¶
Structured output generation has become the cornerstone of reliable LLM applications. Instead of parsing unpredictable text responses, modern developers demand consistent, type-safe data structures that integrate seamlessly with their applications.
This comprehensive guide covers everything you need to know about generating structured outputs from Large Language Models, from basic concepts to advanced implementation patterns across all major providers.
What is Structured Output?¶
Structured output refers to LLM responses that conform to predefined data schemas instead of returning free-form text. Think JSON objects, validated data models, or typed responses that your application can reliably process.
The Problem with Unstructured Text¶
# Traditional approach - fragile and unreliable
response = openai.chat.completions.create(
model="gpt-4",
messages=[{"role": "user", "content": "Extract name and age from: John is 25"}]
)
# You get: "The person's name is John and they are 25 years old."
# Now what? String parsing? Regex? Hope and pray?
The Structured Output Solution¶
import instructor
from pydantic import BaseModel
from openai import OpenAI
class Person(BaseModel):
name: str
age: int
client = instructor.from_openai(OpenAI())
person = client.chat.completions.create(
model="gpt-4",
response_model=Person,
messages=[{"role": "user", "content": "Extract: John is 25"}]
)
print(person.name) # "John"
print(person.age) # 25
print(type(person)) # <class '__main__.Person'>
Why Structured Outputs Matter¶
1. Type Safety and Reliability¶
Structured outputs eliminate the guesswork. Your IDE provides autocomplete, your tests can validate schemas, and runtime errors become compile-time catches.
2. Integration with Existing Systems¶
APIs expect JSON. Databases need structured data. UI components require predictable objects. Structured outputs make LLMs compatible with your entire stack.
3. Validation and Error Handling¶
With schemas, you can validate outputs, retry on failures, and ensure data quality automatically.
4. Performance and Latency¶
Many providers optimize structured output generation, leading to faster response times and lower token usage.
Core Technologies Behind Structured Outputs¶
Function Calling / Tool Use¶
The foundation of structured outputs is function calling (also called "tool use" by some providers). Instead of generating free text, the LLM "calls" a predefined function with structured parameters.
# The LLM sees this function signature
def extract_person(name: str, age: int) -> Person:
"""Extract person information from text"""
return Person(name=name, age=age)
# And generates a structured call:
# extract_person(name="John", age=25)
JSON Schema¶
Under the hood, your Pydantic models are converted to JSON Schema that constrains the LLM's output format:
{
"type": "object",
"properties": {
"name": {"type": "string"},
"age": {"type": "integer"}
},
"required": ["name", "age"]
}
Native Structured Output Support¶
Some providers (like OpenAI's GPT-4o) have native structured output modes that guarantee valid JSON conforming to your schema.
Provider-Specific Implementation¶
OpenAI¶
OpenAI offers multiple approaches for structured outputs:
1. Function Calling (Legacy)¶
2. Native Structured Outputs (Recommended)¶
Key Features: - Guaranteed valid JSON - 100% schema adherence - Optimized performance - Native support in GPT-4o and GPT-4o-mini
Anthropic Claude¶
Anthropic uses tool calling for structured outputs:
import instructor
from anthropic import Anthropic
client = instructor.from_anthropic(Anthropic())
response = client.messages.create(
model="claude-3-5-sonnet-20241022",
response_model=Person,
messages=[{"role": "user", "content": "Extract: Sarah is 30"}]
)
Key Features: - Advanced reasoning capabilities - Large context windows - Strong performance on complex schemas
Google Gemini¶
Google's Gemini supports both REST API and SDK implementations:
import instructor
import google.generativeai as genai
client = instructor.from_gemini(
genai.GenerativeModel("gemini-1.5-flash-latest")
)
Key Features: - Free tier with generous limits - Multimodal capabilities - Fast inference times
Other Providers¶
Instructor supports 20+ providers including: - Cohere: Strong multilingual support - Mistral: European alternative with good performance - Groq: Ultra-fast inference speeds - Ollama: Local model deployment - Together AI: Open source model hosting
For complete provider documentation, see our integrations guide.
Essential Patterns and Best Practices¶
1. Define Clear, Descriptive Models¶
from typing import List, Optional
from enum import Enum
class Priority(str, Enum):
LOW = "low"
MEDIUM = "medium"
HIGH = "high"
class Task(BaseModel):
"""A task extracted from text with priority and metadata."""
title: str
description: Optional[str] = None
priority: Priority = Priority.MEDIUM
estimated_hours: Optional[float] = None
tags: List[str] = []
2. Use Validation for Data Quality¶
from pydantic import Field, validator
class Email(BaseModel):
address: str = Field(..., description="Valid email address")
subject: str
confidence: float = Field(..., ge=0.0, le=1.0)
@validator('address')
def validate_email(cls, v):
if '@' not in v:
raise ValueError('Invalid email format')
return v.lower()
3. Handle Lists and Complex Structures¶
class Contact(BaseModel):
name: str
phone: Optional[str] = None
email: Optional[str] = None
class ExtractedContacts(BaseModel):
"""Multiple contacts extracted from a document."""
contacts: List[Contact]
source_confidence: float
extraction_notes: Optional[str] = None
# Extract multiple items at once
contacts = client.chat.completions.create(
model="gpt-4",
response_model=ExtractedContacts,
messages=[{
"role": "user",
"content": "Extract all contacts from this business card: ..."
}]
)
4. Implement Chain of Thought¶
class ReasonedAnalysis(BaseModel):
"""Analysis with explicit reasoning chain."""
reasoning: str = Field(..., description="Step-by-step reasoning")
conclusion: str
confidence: float = Field(..., ge=0.0, le=1.0)
# The model will show its work
analysis = client.chat.completions.create(
model="gpt-4",
response_model=ReasonedAnalysis,
messages=[{
"role": "user",
"content": "Analyze this financial data and explain your reasoning: ..."
}]
)
print(f"Reasoning: {analysis.reasoning}")
print(f"Conclusion: {analysis.conclusion}")
5. Error Handling and Retries¶
import instructor
from tenacity import retry, stop_after_attempt, wait_exponential
@retry(
stop=stop_after_attempt(3),
wait=wait_exponential(multiplier=1, min=4, max=10)
)
def extract_with_retry(text: str) -> Person:
return client.chat.completions.create(
model="gpt-4",
response_model=Person,
messages=[{"role": "user", "content": f"Extract: {text}"}],
max_retries=2
)
Advanced Use Cases¶
Multimodal Structured Extraction¶
Extract structured data from images:
class TableData(BaseModel):
headers: List[str]
rows: List[List[str]]
table_title: Optional[str] = None
# Extract table from image
table = client.chat.completions.create(
model="gpt-4-vision-preview",
response_model=TableData,
messages=[{
"role": "user",
"content": [
{"type": "text", "text": "Extract the table data"},
{"type": "image_url", "image_url": {"url": image_url}}
]
}]
)
Streaming Structured Outputs¶
Get partial results as they're generated:
from instructor import Partial
for partial_person in client.chat.completions.create_partial(
model="gpt-4",
response_model=Person,
messages=[{"role": "user", "content": "Extract: John is 25"}]
):
print(f"Name so far: {partial_person.name}")
print(f"Age so far: {partial_person.age}")
Parallel Processing¶
Extract multiple entities simultaneously:
from instructor.dsl import Parallel
class Analysis(BaseModel):
sentiment: str
topics: List[str]
summary: str
parallel_analysis = client.chat.completions.create(
model="gpt-4",
response_model=Parallel[Analysis],
messages=[{"role": "user", "content": "Analyze these reviews: ..."}]
)
Common Pitfalls and Solutions¶
1. Over-Complex Schemas¶
Problem: Deeply nested objects that confuse the model.
Solution: Keep schemas flat and use composition:
# Instead of deeply nested
class BadAddress(BaseModel):
location: dict # Avoid this
# Use clear structure
class Address(BaseModel):
street: str
city: str
country: str
class Person(BaseModel):
name: str
address: Address
2. Missing Field Descriptions¶
Problem: Models fail because the LLM doesn't understand field purpose.
Solution: Add descriptive field documentation:
class BetterTask(BaseModel):
title: str = Field(..., description="Brief, actionable task title")
priority: Priority = Field(
default=Priority.MEDIUM,
description="Task urgency: low, medium, or high"
)
3. Ignoring Validation Errors¶
Problem: Silent failures lead to poor data quality.
Solution: Implement proper error handling:
try:
result = client.chat.completions.create(
model="gpt-4",
response_model=Person,
messages=[{"role": "user", "content": text}]
)
except instructor.ValidationError as e:
print(f"Validation failed: {e}")
# Handle gracefully or retry
Performance Optimization¶
1. Choose the Right Model¶
- GPT-4o: Best balance of speed and accuracy
- GPT-3.5-turbo: Fastest for simple schemas
- Claude-3.5-Sonnet: Best for complex reasoning
- Gemini-1.5-Flash: Free tier, good performance
2. Optimize Schema Design¶
# Efficient schema
class EfficientPerson(BaseModel):
name: str
age: int
# Inefficient schema (too many optional fields)
class InefficientPerson(BaseModel):
full_name: Optional[str]
first_name: Optional[str]
last_name: Optional[str]
age_in_years: Optional[int]
birth_year: Optional[int]
# ... 20 more optional fields
3. Use Appropriate Modes¶
# For guaranteed accuracy
client = instructor.from_openai(OpenAI(), mode=instructor.Mode.JSON)
# For faster responses
client = instructor.from_openai(OpenAI(), mode=instructor.Mode.TOOLS)
Migration Guide¶
From LangChain¶
# LangChain approach
from langchain.output_parsers import PydanticOutputParser
from langchain.prompts import PromptTemplate
parser = PydanticOutputParser(pydantic_object=Person)
prompt = PromptTemplate(
template="Extract person info: {text}\n{format_instructions}",
input_variables=["text"],
partial_variables={"format_instructions": parser.get_format_instructions()}
)
# Instructor approach (simpler)
person = client.chat.completions.create(
model="gpt-4",
response_model=Person,
messages=[{"role": "user", "content": f"Extract: {text}"}]
)
From Manual JSON Parsing¶
# Manual approach (fragile)
response = openai.chat.completions.create(
model="gpt-4",
messages=[{"role": "user", "content": "Return JSON: ..."}]
)
try:
data = json.loads(response.choices[0].message.content)
person = Person(**data) # Hope it works
except (json.JSONDecodeError, ValidationError):
# Handle errors manually
pass
# Instructor approach (reliable)
person = client.chat.completions.create(
model="gpt-4",
response_model=Person,
messages=[{"role": "user", "content": "Extract: ..."}]
)
# Guaranteed to be a valid Person object
Future of Structured Outputs¶
Emerging Trends¶
- Native Provider Support: More providers implementing built-in structured output modes
- Multi-Modal Integration: Structured extraction from video, audio, and documents
- Real-time Streaming: Partial structured outputs for UI updates
- Agent Integration: Structured outputs as the foundation for reliable AI agents
What's Next for Instructor¶
- Enhanced streaming capabilities
- Multi-provider parallel processing
- Advanced validation frameworks
- Visual schema builders
Getting Started Today¶
1. Install Instructor¶
2. Choose Your Provider¶
# OpenAI (recommended)
import instructor
from openai import OpenAI
client = instructor.from_openai(OpenAI())
# Anthropic
from anthropic import Anthropic
client = instructor.from_anthropic(Anthropic())
# Gemini (free tier)
import google.generativeai as genai
client = instructor.from_gemini(genai.GenerativeModel("gemini-1.5-flash"))
3. Define Your First Model¶
from pydantic import BaseModel
class ProductReview(BaseModel):
product_name: str
rating: int # 1-5 stars
review_text: str
would_recommend: bool
4. Extract Structured Data¶
review = client.chat.completions.create(
model="gpt-4",
response_model=ProductReview,
messages=[{
"role": "user",
"content": "Extract review: 'This phone is amazing! 5 stars, definitely recommend'"
}]
)
print(review.product_name) # "phone"
print(review.rating) # 5
print(review.would_recommend) # True
Conclusion¶
Structured outputs represent a fundamental shift in how we build with LLMs. By moving beyond text parsing to type-safe, validated data structures, we can build more reliable, maintainable, and powerful applications.
Whether you're building RAG systems, AI agents, or data extraction pipelines, structured outputs provide the foundation for production-ready LLM applications.
The future of AI development is structured, typed, and reliable. Start building with structured outputs today.
Related Concepts¶
- Models and Response Models - Deep dive into Pydantic model design
- Validation and Error Handling - Comprehensive validation strategies
- Provider Integrations - All supported LLM providers
- Streaming Support - Real-time structured outputs
See Also¶
- Instructor vs LangChain: When to Use What - Detailed comparison guide
- Build Type-Safe AI Apps with Instructor + Pydantic - Practical implementation guide
- 10 Instructor Patterns That Save Hours - Advanced patterns and techniques
If you enjoy this content or want to try out instructor
, please check out the GitHub repository and give us a star!