Who is Fernando F. Azevedo?

Fernando F. Azevedo is a Senior Solutions Architect at Banco Itaú with 16+ years of experience across AWS, event-driven architecture, DevSecOps, Data Mesh, AI and financial systems.

What technical topics does Fernando work with?

Fernando works with AWS, Kubernetes, Kafka, Data Mesh, Amazon Bedrock, RAG, DevSecOps, observability, financial systems and architecture communication using C4, ADRs and trade-off analysis.

Is Fernando available for professional conversations?

Fernando is currently building at Banco Itaú and is open to thoughtful conversations about architecture, cloud, AI, engineering leadership, community, podcasts and technical collaboration.

The AI Architect Track

Module 2 · From model to application · Lesson 08/22

Structured output: reliable JSON and validation

How to get fixed-format outputs the rest of the system can safely consume.

5 min read

You wired up the model, the response came back — and now you need to parse it. If the model returned free text, you're in improvisation territory: fragile regex, line splits, hope. Structured output exists to eliminate that improvisation and make the model behave like an API that returns predictable, validatable, system-consumable JSON.

Why free text breaks systems

When the model responds in natural language, it's optimizing for human readability — not for parsers. The same information can appear as "$42.00", "42 dollars", or "forty-two dollars" depending on the model's mood, temperature, or version. Fine for a human reading it. A nightmare for a downstream system trying to extract a number.
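To make the fragility concrete, here is a small sketch of the kind of regex extraction that free text pushes you toward. The function name, the pattern, and the sample sentences are illustrative assumptions, not taken from any library or from a real model response.

```python
import re
from typing import Optional

# Illustrative only: a typical "good enough" pattern for pulling a price
# out of free text. It matches an optional dollar sign followed by digits.
PRICE_PATTERN = re.compile(r"\$?(\d+(?:\.\d+)?)")


def extract_price(text: str) -> Optional[float]:
    """Return the first number found in the text, or None if there is none."""
    match = PRICE_PATTERN.search(text)
    return float(match.group(1)) if match else None


# Three phrasings of the same fact, as a model might produce them.
answers = [
    "The total is $42.00",
    "The total is 42 dollars",
    "The total is forty-two dollars",
]
results = [extract_price(answer) for answer in answers]
```

The first two phrasings parse to the same number, while the third silently yields `None`. Nothing in the parser signals that the model simply chose different wording.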

The problem scales when you chain calls. In lesson 07 we saw tool calling: the model needs to decide which tool to call and with which arguments. If the arguments arrive as free text, the tool doesn't know what to do with them. Schema is the contract between the model and the code that consumes it.

The practical rule is simple: if another part of the system will consume the output programmatically, require structure. If the output is for direct display to the end user, free text may be acceptable. In agent systems, extraction pipelines, classification, routing — you're almost always in the first case.

Structured output pipeline: from prompt to application

Flow showing how a schema enters the request, the model produces constrained JSON, and the application validates before use.

📝 Application: client side

Application · system code
JSON Schema · (Pydantic / Zod / dict)
Validation · parse + assert
Fallback · retry / default

🤖 Provider: inference

Model API · (Bedrock / OpenAI / etc.)
Constrained decoding · or schema instruction
JSON response · (may contain errors)

⚙️ Downstream system

Tool / DB · / next step

How to request structured output — and why to validate anyway

There are three main approaches, in increasing order of reliability:

1. Prompt instruction. You write "Respond ONLY with valid JSON following this schema: {...}". Works reasonably well with large models and low temperature, but is fragile — the model may add text before the JSON, use wrong quotes, or omit optional fields.

2. JSON mode / response_format. Some providers, such as OpenAI with response_format: {type: "json_object"}, accept a parameter that forces the model to emit syntactically valid JSON. This eliminates extra text, but it only guarantees parseable JSON, not JSON that follows your specific schema.

3. Structured outputs with explicit schema. The most robust approach: you pass the full JSON Schema in the request and the provider enforces it, either through constrained decoding or through a model trained to follow tool schemas, so the output respects the structure field by field. OpenAI's response_format: {type: "json_schema"}, Anthropic tool use with an input schema, and Bedrock tool use all follow this pattern. How strict the guarantee is varies by provider and mode.

Even with option 3, always validate in your code. The model may fill a field with the wrong type or return null where you expected a string, and the provider itself may have a bug. Use Pydantic (Python), Zod (TypeScript), or equivalent. If validation fails, you have two exits: retry with the error message included in context, or degrade to a safe default value. Never let an unvalidated object enter the downstream system.
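A minimal sketch of the gap between "parseable JSON" and "JSON that matches your contract". It uses only the standard library so it runs anywhere. The `validate_invoice_header` helper and its field list are hypothetical, and in a real project a Pydantic or Zod model would replace the hand-written checks.

```python
import json
from typing import List

# Hypothetical contract: field name -> Python types accepted for that field.
REQUIRED_FIELDS = {
    "number": (str,),
    "supplier": (str,),
    "total": (int, float),
}


def validate_invoice_header(payload: str) -> List[str]:
    """Return a list of validation errors; an empty list means the payload is valid."""
    try:
        data = json.loads(payload)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value must be an object"]

    errors = []
    for field, allowed in REQUIRED_FIELDS.items():
        if field not in data:
            errors.append(f"{field}: missing")
            continue
        value = data[field]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, allowed):
            errors.append(f"{field}: unexpected type {type(value).__name__}")
    return errors
```

A payload such as `{"number": "INV-1", "supplier": null, "total": "42.00"}` passes JSON mode, since it parses cleanly, yet it fails this check on two fields. That is exactly the class of error that must be stopped before the downstream system sees it.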

Practical example: extracting data from an invoice

1. Define the schema with Pydantic

```python
from typing import List

from pydantic import BaseModel


class InvoiceItem(BaseModel):
    description: str
    quantity: int
    unit_price: float


class Invoice(BaseModel):
    number: str
    issue_date: str  # ISO 8601
    supplier: str
    items: List[InvoiceItem]
    total: float
```

2. Pass the schema in the request and validate the response

```python
from pydantic import ValidationError


def extract_invoice(text: str, client, model_id: str) -> Invoice | None:
    # Derive the JSON Schema from the Pydantic model so both stay in sync.
    schema = Invoice.model_json_schema()
    response = client.converse(
        modelId=model_id,
        messages=[{"role": "user", "content": [{"text": text}]}],
        toolConfig={
            "tools": [{
                "toolSpec": {
                    "name": "extract_invoice",
                    "description": "Extracts structured fields from an invoice.",
                    "inputSchema": {"json": schema},
                }
            }],
            # Force the model to call the extraction tool.
            "toolChoice": {"tool": {"name": "extract_invoice"}},
        },
    )
    # The tool arguments arrive already parsed as a dict.
    raw = response["output"]["message"]["content"][0]["toolUse"]["input"]
    try:
        return Invoice.model_validate(raw)
    except ValidationError:
        # Never pass an unvalidated object downstream; the caller decides
        # whether to retry or fall back to a default.
        return None
```

In practice: tool use as a shortcut for structured output

Senior Solutions Architect

In practice, I use tool calling as the primary structured output mechanism — even when there's no real tool to be called. You define a 'tool' whose sole purpose is to receive the fields you want to extract, force the model to call it with toolChoice, and receive the arguments already parsed. It's the most portable pattern across providers (Bedrock, Anthropic, OpenAI all support it), eliminates extra text, and gives you the schema in the same place where you define the logic. The cost is a bit more boilerplate — worth every line.

The connection to tool calling and the general principle

In lesson 07 you saw that tool calling uses JSON Schema to describe each tool's arguments. Structured output and tool calling are the same mechanism seen from different angles: in both cases, you're passing a schema to the model and expecting it to produce JSON that respects that contract.

The difference is semantic: tool calling implies the model is deciding to invoke an external action; structured output implies you want to extract information in a fixed format. In implementation, many providers use exactly the same endpoint and parameters for both cases — which is why the example above uses toolConfig in Bedrock even for pure extraction.
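As a sketch of how little changes between providers, the helpers below wrap one JSON Schema in the tool-definition shapes used by Bedrock Converse, the Anthropic Messages API, and OpenAI Chat Completions. The helper names and the schema fields are hypothetical, and the shapes reflect my understanding of each API's documented format, so check the current provider documentation before relying on them.

```python
# A hand-written JSON Schema for a tiny extraction task (hypothetical fields).
SCHEMA = {
    "type": "object",
    "properties": {
        "supplier": {"type": "string"},
        "total": {"type": "number"},
    },
    "required": ["supplier", "total"],
}

NAME = "extract_invoice"
DESCRIPTION = "Extracts structured fields from an invoice."


def bedrock_tool(name: str, description: str, schema: dict) -> dict:
    """Entry for the `tools` list inside the Bedrock Converse `toolConfig`."""
    return {
        "toolSpec": {
            "name": name,
            "description": description,
            "inputSchema": {"json": schema},
        }
    }


def anthropic_tool(name: str, description: str, schema: dict) -> dict:
    """Entry for the `tools` parameter of the Anthropic Messages API."""
    return {"name": name, "description": description, "input_schema": schema}


def openai_tool(name: str, description: str, schema: dict) -> dict:
    """Entry for the `tools` parameter of the OpenAI Chat Completions API."""
    return {
        "type": "function",
        "function": {"name": name, "description": description, "parameters": schema},
    }


# The same schema object travels unchanged inside all three wrappers.
tools = {
    "bedrock": bedrock_tool(NAME, DESCRIPTION, SCHEMA),
    "anthropic": anthropic_tool(NAME, DESCRIPTION, SCHEMA),
    "openai": openai_tool(NAME, DESCRIPTION, SCHEMA),
}
```

Only the envelope differs. The schema, which is the actual contract, is defined once and reused.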

The unifying principle: schema is the contract, validation is the guarantee. The model is a probabilistic collaborator — it tries to follow the contract, but can make mistakes. Your code is the gatekeeper that decides what enters the system. Never transfer that responsibility to the model.

One important design detail: keep your schemas simple. Schemas with many optional fields, nested anyOf, or recursive structures increase the chance of the model making mistakes. If you need something complex, break it into multiple calls with smaller schemas — it's more reliable and easier to debug.
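One possible way to split a complex extraction, sketched under the assumption that you already have some function that runs a single schema-constrained call. Here that function is the hypothetical `extract` parameter, which takes a schema and the source text and returns a validated dict. The two schemas and their fields are illustrative.

```python
from typing import Callable

# Two small, flat schemas instead of one deeply nested one (hypothetical fields).
HEADER_SCHEMA = {
    "type": "object",
    "properties": {
        "number": {"type": "string"},
        "supplier": {"type": "string"},
    },
    "required": ["number", "supplier"],
}

ITEMS_SCHEMA = {
    "type": "object",
    "properties": {
        "items": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "description": {"type": "string"},
                    "quantity": {"type": "integer"},
                },
                "required": ["description", "quantity"],
            },
        }
    },
    "required": ["items"],
}


def extract_in_two_calls(extract: Callable[[dict, str], dict], text: str) -> dict:
    """Run one model call per schema and merge the validated results."""
    header = extract(HEADER_SCHEMA, text)  # call 1: scalar header fields
    items = extract(ITEMS_SCHEMA, text)    # call 2: line items only
    return {**header, **items}
```

Each call can now fail, be retried, and be debugged on its own, at the cost of one extra round trip to the model.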

Key points from this lesson

Free text breaks downstream systems — use structured output whenever another part of the code will consume the response programmatically.

The most reliable approach is to pass the explicit JSON Schema in the request (via structured outputs or tool use), not just instruct in the prompt.

Always validate the output with Pydantic, Zod, or equivalent — the model can get the schema wrong even with constrained decoding.

Validation failure has two exits: retry with the error in context, or safe degradation with a default value. Never propagate an unvalidated object.

Tool calling and structured output use the same schema mechanism — you can use tool use as a shortcut for structured extraction even without a real tool.

Keep schemas simple: fewer optional fields, no excessive nesting. Complex schemas increase the model's error rate.

Approaches for structured output

| Approach | Reliability | Portability | When to use |
| --- | --- | --- | --- |
| Prompt instruction | Low | High (any model) | Quick prototyping, models without schema support |
| JSON mode (`response_format`) | Medium (valid JSON, schema not guaranteed) | Medium (OpenAI, some others) | When you only need parseable JSON |
| Explicit schema / Tool use | High (constrained decoding or equivalent) | High (Bedrock, OpenAI, Anthropic) | Production, pipelines, agents (recommended default) |

Frequently asked questions

Can the model refuse to generate JSON if the content is sensitive?

Yes. Guardrails and content filters can block a request or a response independently of the schema mechanism. When that happens, you typically receive an error, an empty response, or an explicit refusal signal rather than malformed JSON. Treat this as a separate case in your fallback logic. We'll cover guardrails in detail in lesson 10.
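A minimal sketch of treating a blocked response as its own case, assuming a response dict shaped like the Bedrock Converse API output, with a top-level `stopReason` and content blocks under `output.message.content`. The function name and the four category labels are hypothetical. The stop reason strings are the ones I understand the Converse API to document, so verify them against the current reference.

```python
def classify_converse_response(response: dict) -> str:
    """Return 'blocked', 'truncated', 'tool_input', or 'unexpected'."""
    stop_reason = response.get("stopReason")

    # A guardrail or content filter stopped the response: do not try to parse.
    if stop_reason in ("guardrail_intervened", "content_filtered"):
        return "blocked"

    # The model ran out of tokens, so any JSON is likely incomplete.
    if stop_reason == "max_tokens":
        return "truncated"

    content = response.get("output", {}).get("message", {}).get("content", [])
    if any("toolUse" in block for block in content):
        return "tool_input"

    # Plain text or an empty message where tool input was expected.
    return "unexpected"
```

Only the `tool_input` case goes on to schema validation. The other three go straight to the fallback path, and a blocked response is usually not worth retrying with the same input.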

Should I include JSON examples in the prompt even when using an explicit schema?

Sometimes it helps, especially for fields with non-obvious semantics (e.g., expected date format, units of measure). But it doesn't replace the schema — examples guide content, schema guarantees structure. Use both when the domain is ambiguous.
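A small sketch of combining both: the schema carries `description` and `examples` annotations (standard JSON Schema keywords), and one worked example is appended to the prompt text. The field names, the sample values, and the prompt wording are illustrative assumptions, and some providers' strict modes accept only a subset of JSON Schema keywords, so check what yours supports.

```python
import json

# Schema with annotations that explain the non-obvious fields.
INVOICE_SUMMARY_SCHEMA = {
    "type": "object",
    "properties": {
        "issue_date": {
            "type": "string",
            "description": "Invoice issue date in ISO 8601 format (YYYY-MM-DD).",
            "examples": ["2024-03-15"],
        },
        "total": {
            "type": "number",
            "description": "Grand total as a decimal number, without currency symbol.",
        },
    },
    "required": ["issue_date", "total"],
}

# One worked example to place in the prompt next to the instructions.
FEW_SHOT_EXAMPLE = {"issue_date": "2024-03-15", "total": 42.0}

prompt_hint = "Example of a valid output:\n" + json.dumps(FEW_SHOT_EXAMPLE)
```

The schema still defines the structure. The descriptions and the example only reduce ambiguity about what goes into each field.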

How many times should I retry before giving up?

In most cases, 1-2 retries with the validation error in context are sufficient. If after 2 attempts the model still gets the schema wrong, the problem is likely the schema itself (too complex) or the chosen model (too weak for the task). Don't retry infinitely — set a limit and degrade safely.
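The bounded retry described above could look like the sketch below. `call_model` and `validate` are hypothetical callables supplied by the caller: the first takes a prompt and returns the raw model text, the second returns an error message or `None` when the object is acceptable. The retry limit and the wording of the correction prompt are assumptions to adapt.

```python
import json
from typing import Any, Callable, Optional

MAX_RETRIES = 2  # retries allowed after the first attempt


def extract_with_retry(
    call_model: Callable[[str], str],
    validate: Callable[[Any], Optional[str]],
    prompt: str,
    default: Optional[dict] = None,
) -> Optional[dict]:
    """Call the model, validate the result, and retry with the error in context."""
    current_prompt = prompt
    for _ in range(1 + MAX_RETRIES):
        raw = call_model(current_prompt)
        try:
            data = json.loads(raw)
            error = validate(data)
        except json.JSONDecodeError as exc:
            data, error = None, f"invalid JSON: {exc.msg}"
        if error is None:
            return data
        # Feed the validation error back so the model can correct itself.
        current_prompt = (
            f"{prompt}\n\nYour previous answer was rejected: {error}\n"
            "Return corrected JSON only."
        )
    # Limit reached: degrade safely instead of looping forever.
    return default
```

Passing the validator and the model call in as parameters keeps the loop independent of any provider SDK and easy to test with fakes.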

Quiz

Quick check

1. Why validate the model's structured output?

References

Amazon Bedrock: Converse API with tool use
OpenAI: Structured outputs
Pydantic: Data validation with Python
Zod: TypeScript-first schema validation
JSON Schema: Specification
