Structured output: reliable JSON and validation
How to get fixed-format outputs the rest of the system can safely consume.
You wired up the model, the response came back — and now you need to parse it. If the model returned free text, you're in improvisation territory: fragile regex, line splits, hope. Structured output exists to eliminate that improvisation and make the model behave like an API that returns predictable, validatable, system-consumable JSON.
Why free text breaks systems
When the model responds in natural language, it's optimizing for human readability — not for parsers. The same information can appear as "$42.00", "42 dollars", or "forty-two dollars" depending on the model's mood, temperature, or version. Fine for a human reading it. A nightmare for a downstream system trying to extract a number.
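To make the fragility concrete, here is a minimal sketch. The regex and the sample sentences are illustrative assumptions, not taken from a real system: a pattern written against one phrasing silently misses the others, while the same value delivered in a fixed JSON field is read directly.

```python
import json
import re
from typing import Optional

# Hypothetical parser written against the first phrasing only: "$42.00".
PRICE_RE = re.compile(r"\$(\d+(?:\.\d+)?)")


def parse_price_from_text(text: str) -> Optional[float]:
    """Return the first dollar amount found, or None if the pattern misses."""
    match = PRICE_RE.search(text)
    return float(match.group(1)) if match else None


def parse_price_from_json(payload: str) -> float:
    """Read the amount from a fixed field instead of guessing at phrasing."""
    return float(json.loads(payload)["amount"])


free_text_variants = [
    "The total is $42.00.",
    "The total is 42 dollars.",
    "The total is forty-two dollars.",
]
# Only the first variant matches; the other two come back as None.
text_results = [parse_price_from_text(v) for v in free_text_variants]
json_result = parse_price_from_json('{"amount": 42.0, "currency": "USD"}')
```

The failure is silent: the regex returns `None` without raising, which is exactly the kind of gap that surfaces much later in a downstream system.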
The problem scales when you chain calls. In lesson 07 we saw tool calling: the model needs to decide which tool to call and with which arguments. If the arguments arrive as free text, the tool doesn't know what to do with them. The schema is the contract between the model and the code that consumes its output.
The practical rule is simple: if another part of the system will consume the output programmatically, require structure. If the output is for direct display to the end user, free text may be acceptable. In agent systems, extraction pipelines, classification, routing — you're almost always in the first case.
Structured output pipeline: from prompt to application
Flow showing how a schema enters the request, the model produces constrained JSON, and the application validates before use.
- Application · system code
- JSON Schema · (Pydantic / Zod / dict)
- Validation · parse + assert
- Fallback · retry / default
- Model API · (Bedrock / OpenAI / etc.)
- Constrained decoding · or schema instruction
- JSON response · (may contain errors)
- Tool / DB · / next step
How to request structured output — and why to validate anyway
There are three main approaches, in increasing order of reliability:
1. Prompt instruction. You write "Respond ONLY with valid JSON following this schema: {...}". Works reasonably well with large models and low temperature, but is fragile: the model may add text before the JSON, use wrong quotes, or omit required fields.
2. JSON mode / response_format. Providers like OpenAI and Amazon Bedrock (via Converse API) allow passing a parameter that forces the model to emit valid JSON. This eliminates extra text, but doesn't guarantee the JSON follows your specific schema — only that it's parseable JSON.
3. Structured outputs with explicit schema. The most robust approach: you pass the full JSON Schema in the request and the provider uses constrained decoding (or equivalent) to ensure the output respects the structure field by field. OpenAI response_format: {type: "json_schema"}, Anthropic tool-use with schema, Bedrock tool use — all follow this pattern.
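The gap between "parseable JSON" and "JSON that follows your schema" can be sketched as follows. The three output strings are invented to illustrate one typical failure per approach, and `EXPECTED_FIELDS` is a simplified stand-in for a real JSON Schema, so treat this as an illustration rather than a measurement of any provider.

```python
import json

# Hypothetical outputs, one per approach described above.
prompt_only_output = 'Sure! Here is the JSON:\n{"number": "INV-1", "total": 42.0}'
json_mode_output = '{"invoice_no": "INV-1", "total": "42 dollars"}'
schema_output = '{"number": "INV-1", "total": 42.0}'

# Simplified stand-in for a JSON Schema: required field name -> expected type.
EXPECTED_FIELDS = {"number": str, "total": float}


def check(raw: str) -> str:
    """Classify a raw model output as 'unparseable', 'wrong_shape', or 'ok'."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return "unparseable"  # extra text around the JSON, bad quotes, etc.
    if not isinstance(data, dict):
        return "wrong_shape"
    for field, expected_type in EXPECTED_FIELDS.items():
        # A missing field yields None, which fails the type check as well.
        if not isinstance(data.get(field), expected_type):
            return "wrong_shape"
    return "ok"


results = {
    "prompt_only": check(prompt_only_output),
    "json_mode": check(json_mode_output),
    "explicit_schema": check(schema_output),
}
```

The middle case is the one that JSON mode alone cannot rule out: the output parses, but the field names and types are not the ones your code expects.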
Even with option 3, always validate in your code. The model may fill a field with the wrong type, return null where you expected a string, or the provider may have a bug. Use Pydantic (Python), Zod (TypeScript), or equivalent. If validation fails, you have two exits: retry with the error message included in context, or degrade to a safe default value. Never let an unvalidated object enter the downstream system.
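The two exits (retry with the error in context, or degrade to a default) can be sketched as a small loop. This is a minimal sketch under stated assumptions: `validate_invoice` is a hand-rolled stand-in for a Pydantic or Zod validator, `call_model` is a hypothetical callable that takes a prompt and returns raw text, and `fake_model` exists only so the example runs without a provider.

```python
import json
from typing import Callable, Optional


def validate_invoice(data: object) -> dict:
    """Tiny stand-in for a schema validator: raise ValueError on bad data."""
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    if not isinstance(data.get("number"), str):
        raise ValueError("field 'number' must be a string")
    total = data.get("total")
    if isinstance(total, bool) or not isinstance(total, (int, float)):
        raise ValueError("field 'total' must be a number")
    return data


def extract_with_retry(
    call_model: Callable[[str], str],
    prompt: str,
    max_retries: int = 2,
    default: Optional[dict] = None,
) -> Optional[dict]:
    """Call the model, validate, and retry with the error appended to the prompt."""
    current_prompt = prompt
    for _ in range(max_retries + 1):
        raw = call_model(current_prompt)
        try:
            return validate_invoice(json.loads(raw))
        except ValueError as err:  # json.JSONDecodeError is a ValueError subclass
            current_prompt = (
                f"{prompt}\n\nYour previous answer was rejected: {err}. "
                "Return corrected JSON only."
            )
    return default  # degrade to a safe default instead of passing bad data on


# Fake model for demonstration: wrong on the first call, correct once it sees the error.
def fake_model(prompt: str) -> str:
    if "rejected" in prompt:
        return '{"number": "INV-1", "total": 42.0}'
    return '{"number": "INV-1", "total": "42 dollars"}'


invoice = extract_with_retry(fake_model, "Extract the invoice fields.")
```

The important property is that the function has only two possible results: a validated object or the explicit default. Nothing unvalidated can leak through.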
Practical example: extracting data from an invoice
1. Define the schema with Pydantic

```python
from typing import List

from pydantic import BaseModel


class InvoiceItem(BaseModel):
    description: str
    quantity: int
    unit_price: float


class Invoice(BaseModel):
    number: str
    issue_date: str  # ISO 8601
    supplier: str
    items: List[InvoiceItem]
    total: float
```

2. Pass the schema in the request and validate the response

```python
from pydantic import ValidationError


def extract_invoice(text: str, client, model_id: str) -> Invoice | None:
    # Generate the JSON Schema from the Pydantic model defined in step 1.
    schema = Invoice.model_json_schema()
    response = client.converse(
        modelId=model_id,
        messages=[{"role": "user", "content": [{"text": text}]}],
        toolConfig={
            "tools": [{
                "toolSpec": {
                    "name": "extract_invoice",
                    "description": "Extracts structured fields from an invoice.",
                    "inputSchema": {"json": schema},
                }
            }],
            # Force the model to call the extraction tool.
            "toolChoice": {"tool": {"name": "extract_invoice"}},
        },
    )
    # Find the tool call rather than assuming it is the first content block.
    content = response["output"]["message"]["content"]
    raw = next((b["toolUse"]["input"] for b in content if "toolUse" in b), None)
    try:
        return Invoice.model_validate(raw)
    except ValidationError:
        # Assumed handling (the return type allows None): signal failure so
        # the caller can retry or fall back to a default.
        return None
```
In practice, I use tool calling as the primary structured output mechanism — even when there's no real tool to be called. You define a 'tool' whose sole purpose is to receive the fields you want to extract, force the model to call it with toolChoice, and receive the arguments already parsed. It's the most portable pattern across providers (Bedrock, Anthropic, OpenAI all support it), eliminates extra text, and gives you the schema in the same place where you define the logic. The cost is a bit more boilerplate — worth every line.
The connection to tool calling and the general principle
In lesson 07 you saw that tool calling uses JSON Schema to describe each tool's arguments. Structured output and tool calling are the same mechanism seen from different angles: in both cases, you're passing a schema to the model and expecting it to produce JSON that respects that contract.
The difference is semantic: tool calling implies the model is deciding to invoke an external action; structured output implies you want to extract information in a fixed format. In implementation, many providers use exactly the same endpoint and parameters for both cases — which is why the example above uses toolConfig in Bedrock even for pure extraction.
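To illustrate how close the provider shapes are, the sketch below wraps one schema in the forced-tool parameters for three APIs. The Bedrock shape is the one from the example above; the Anthropic and OpenAI field names reflect my understanding of their public APIs and are assumptions to verify against current documentation. `INVOICE_SCHEMA` is a trimmed, hand-written version of the invoice schema, and the three function names are hypothetical.

```python
# JSON Schema for the fields to extract (trimmed version of the invoice example).
INVOICE_SCHEMA = {
    "type": "object",
    "properties": {
        "number": {"type": "string"},
        "total": {"type": "number"},
    },
    "required": ["number", "total"],
}
TOOL_NAME = "extract_invoice"
DESCRIPTION = "Extracts structured fields from an invoice."


def bedrock_tool_config(schema: dict) -> dict:
    """Bedrock Converse: value for the `toolConfig` argument."""
    return {
        "tools": [{"toolSpec": {"name": TOOL_NAME, "description": DESCRIPTION,
                                "inputSchema": {"json": schema}}}],
        "toolChoice": {"tool": {"name": TOOL_NAME}},
    }


def anthropic_tool_params(schema: dict) -> dict:
    """Anthropic Messages API (assumed shape): schema under `input_schema`."""
    return {
        "tools": [{"name": TOOL_NAME, "description": DESCRIPTION,
                   "input_schema": schema}],
        "tool_choice": {"type": "tool", "name": TOOL_NAME},
    }


def openai_tool_params(schema: dict) -> dict:
    """OpenAI Chat Completions (assumed shape): schema under `function.parameters`."""
    return {
        "tools": [{"type": "function",
                   "function": {"name": TOOL_NAME, "description": DESCRIPTION,
                                "parameters": schema}}],
        "tool_choice": {"type": "function", "function": {"name": TOOL_NAME}},
    }
```

In all three cases the schema itself is passed unchanged; only the envelope around it differs, which is what makes the pattern portable.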
The unifying principle: schema is the contract, validation is the guarantee. The model is a probabilistic collaborator — it tries to follow the contract, but can make mistakes. Your code is the gatekeeper that decides what enters the system. Never transfer that responsibility to the model.
One important design detail: keep your schemas simple. Schemas with many optional fields, nested anyOf, or recursive structures increase the chance of the model making mistakes. If you need something complex, break it into multiple calls with smaller schemas — it's more reliable and easier to debug.
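Breaking a complex extraction into smaller calls might look like the sketch below. The field names follow the invoice example; `extract` is a hypothetical callable `(schema, text) -> validated dict` (for instance, a wrapper around a forced tool call), and `fake_extract` is a stub so the example runs without a provider.

```python
from typing import Callable

# Two small schemas instead of one large nested one.
HEADER_SCHEMA = {
    "type": "object",
    "properties": {
        "number": {"type": "string"},
        "issue_date": {"type": "string"},
        "supplier": {"type": "string"},
        "total": {"type": "number"},
    },
    "required": ["number", "issue_date", "supplier", "total"],
}
ITEMS_SCHEMA = {
    "type": "object",
    "properties": {
        "items": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "description": {"type": "string"},
                    "quantity": {"type": "integer"},
                    "unit_price": {"type": "number"},
                },
                "required": ["description", "quantity", "unit_price"],
            },
        }
    },
    "required": ["items"],
}


def extract_in_two_calls(text: str, extract: Callable[[dict, str], dict]) -> dict:
    """Run one extraction per schema and merge the two results."""
    header = extract(HEADER_SCHEMA, text)
    items = extract(ITEMS_SCHEMA, text)
    return {**header, **items}


# Stub extractor for demonstration only: returns canned data per schema.
def fake_extract(schema: dict, text: str) -> dict:
    if "items" in schema["properties"]:
        return {"items": [{"description": "Widget", "quantity": 2, "unit_price": 21.0}]}
    return {"number": "INV-1", "issue_date": "2024-03-31",
            "supplier": "Acme", "total": 42.0}


invoice = extract_in_two_calls("raw invoice text", fake_extract)
```

Each call can now fail, be retried, and be debugged on its own, at the cost of one extra request.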
Key points from this lesson
Approaches for structured output
| Approach | Reliability | Portability | When to use |
|---|---|---|---|
| Prompt instruction | Low | High (any model) | Quick prototyping, models without schema support |
| JSON mode (`response_format`) | Medium (valid JSON, schema not guaranteed) | Medium (OpenAI, some others) | When you only need parseable JSON |
| Explicit schema / Tool use | High (constrained decoding or equivalent) | High (Bedrock, OpenAI, Anthropic) | Production, pipelines, agents (recommended default) |
Frequently asked questions
Can the model refuse to generate JSON if the content is sensitive?
Yes. Guardrails and content filters operate independently of constrained decoding, so a refused request typically surfaces as an error, an empty response, or a plain-text refusal rather than as malformed JSON. Treat this as a separate case in your fallback logic. We'll cover guardrails in detail in lesson 10.
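One way to keep that case separate is to classify the response before validating it. The sketch below assumes a Bedrock Converse-shaped response dict; the two `stopReason` values reflect my understanding of that API and should be checked against its reference, and `classify_response` is a hypothetical helper name.

```python
from typing import Optional, Tuple

# stopReason values assumed to mean the content was blocked rather than answered.
BLOCKED_STOP_REASONS = {"guardrail_intervened", "content_filtered"}


def classify_response(response: dict, tool_name: str) -> Tuple[str, Optional[dict]]:
    """Return ("ok", arguments), ("blocked", None), or ("no_tool_call", None)."""
    if response.get("stopReason") in BLOCKED_STOP_REASONS:
        return "blocked", None
    content = response.get("output", {}).get("message", {}).get("content", [])
    for block in content:
        tool_use = block.get("toolUse")
        if tool_use and tool_use.get("name") == tool_name:
            return "ok", tool_use.get("input")
    # The model answered in plain text (for example, a refusal) instead of calling the tool.
    return "no_tool_call", None


# Hand-written sample responses, one per outcome.
ok_response = {
    "stopReason": "tool_use",
    "output": {"message": {"content": [
        {"toolUse": {"name": "extract_invoice", "input": {"number": "INV-1"}}},
    ]}},
}
blocked_response = {"stopReason": "guardrail_intervened",
                    "output": {"message": {"content": []}}}
refusal_response = {
    "stopReason": "end_turn",
    "output": {"message": {"content": [{"text": "I can't help with that."}]}},
}

outcomes = [
    classify_response(r, "extract_invoice")[0]
    for r in (ok_response, blocked_response, refusal_response)
]
```

Only the `"ok"` branch should go on to schema validation and retries; retrying a blocked request with the same input is unlikely to help.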
Should I include JSON examples in the prompt even when using an explicit schema?
Sometimes it helps, especially for fields with non-obvious semantics (e.g., expected date format, units of measure). But it doesn't replace the schema — examples guide content, schema guarantees structure. Use both when the domain is ambiguous.
How many times should I retry before giving up?
In most cases, 1-2 retries with the validation error in context are sufficient. If after 2 attempts the model still gets the schema wrong, the problem is likely the schema itself (too complex) or the chosen model (too weak for the task). Don't retry infinitely — set a limit and degrade safely.
Quick check
1. Why validate the model's structured output?