Who is Fernando F. Azevedo?

Fernando F. Azevedo is a Senior Solutions Architect at Banco Itaú with 16+ years of experience across AWS, event-driven architecture, DevSecOps, Data Mesh, AI and financial systems.

What technical topics does Fernando work with?

Fernando works with AWS, Kubernetes, Kafka, Data Mesh, Amazon Bedrock, RAG, DevSecOps, observability, financial systems and architecture communication using C4, ADRs and trade-off analysis.

Is Fernando available for professional conversations?

Fernando is currently building at Banco Itaú and is open to thoughtful conversations about architecture, cloud, AI, engineering leadership, community, podcasts and technical collaboration.


Playbook: From Prompt to Pipeline — the 5 Stages of a Reliable Agent

Oct 5, 2025 10 min AI-assisted

An agent is not born reliable — it is built stage by stage, from a simple prompt to an operable system with observability, guardrails, and audit trails. This playbook maps each stage, what you gain, what you risk, and when to stop climbing. The bottleneck is not the prompt; it is the system around it.

Everyone starts with a prompt. Few end up with an agent that runs in production without blowing the budget, looping forever, or taking actions nobody authorized. The problem is not the model — it is the absence of structure around it.

What you will be able to decide after reading this

Identify which of the 5 stages your current system is at

Know exactly what to add to move up one stage — and what you risk by doing so

Understand why Stage 4 (Verifier) is the watershed between 'looks right' and 'is right'

Recognize the anti-patterns that kill agents in production before you commit them

Map the infrastructure components required at each stage in the AWS/Bedrock context

Base references for this playbook

ReAct pattern origin: Yao et al., 2022 — Princeton / Google Brain
Agent design reference: Anthropic Engineering — Building Effective Agents (2024)
Runtime platform: Amazon Bedrock AgentCore (GA 2025)
Application domain: Agnostic — pattern applicable to any LLM with tool use
Playbook scope: Stages 1–5: from simple prompt to operable pipeline
Estimated cost of skipping Stage 5: Uncontrolled loops can generate hundreds of API calls in minutes (estimate based on public cases)

The mental model that unlocks everything: an agent is a system, not a call

Most teams approach the LLM with a REST API mindset: you send an input, you get an output, done. That works for Stage 1. The problem is that this mindset persists when the system gains tools, loops, and autonomy — and that is when accidents happen.

The conceptual shift is simple: the moment the LLM can act in the world — call an API, write to a database, send an email — you no longer have a language model. You have an agent. And agents are distributed systems with state, side effects, and non-deterministic failures.

That changes everything. A distributed system needs:

Execution limits (stop rules, token budgets, max iterations)
Observability (traces, spans, cost metrics per run)
Privilege control (the agent can only do what it needs to do — not everything it is capable of doing)
External verification (the fact that the LLM said the answer is correct does not mean it is)
Audit trails (who authorized, what was executed, what was the output)

Yao et al.'s ReAct paper (2022) formalized the Observe→Reason→Act loop as the core of an agent. What the paper does not solve — and no paper solves — is what happens when that loop runs in production with real data, real permissions, and real users. That is the engineering work that Stages 3, 4, and 5 cover.

Anthropic's Building Effective Agents guide makes the same point: look for the simplest solution that works, and add agentic complexity only when the task needs flexibility or multi-step reasoning that a fixed workflow cannot deliver. Translation: do not climb the ladder out of technical vanity. Climb because the problem demands it.

The 5 Stages — what you add, what you gain, what you risk

Stage 1 — Simple Prompt
What it is: A single LLM call. Input → Output. No tools, no loop, no memory.
What you add: A well-structured prompt (system prompt, examples, clear format instruction).
What you gain: Maximum iteration speed. Predictable cost. Zero side effects. Ideal for classification, summarization, entity extraction, draft generation.
What you risk: Nothing beyond a bad response. The risk is quality, not system integrity.
When to stop here: If the problem fits in one call with enough context, stop here. Seriously. Most enterprise use cases fit this stage with careful prompt engineering.
Testability: Direct unit test. A fixed input produces an output that is evaluated by a deterministic criterion or by another LLM as judge.
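The testability point can be sketched as a plain unit test. This is a minimal sketch, not a definitive implementation: `call_llm` is a hypothetical stand-in for whatever model client you use, and it is replaced here by a stub so the test runs offline. The label set and the prompt are illustrative assumptions.

```python
from typing import Callable

ALLOWED_LABELS = {"billing", "technical", "other"}  # hypothetical label set

SYSTEM_PROMPT = (
    "Classify the support ticket into exactly one label: "
    "billing, technical, or other. Reply with the label only."
)


def classify(ticket: str, call_llm: Callable[[str, str], str]) -> str:
    """Stage 1: one call, no tools, no loop. Normalizes and checks the reply."""
    raw = call_llm(SYSTEM_PROMPT, ticket)
    label = raw.strip().lower()
    if label not in ALLOWED_LABELS:
        raise ValueError(f"unexpected label: {raw!r}")
    return label


def stub_llm(system: str, user: str) -> str:
    """Stand-in for a real model client (assumption); returns a canned reply."""
    return " Billing\n" if "invoice" in user.lower() else "other"


def test_classify_fixed_input() -> None:
    # Fixed input, deterministic criterion: the label must match exactly.
    assert classify("My invoice is wrong", stub_llm) == "billing"
    assert classify("Hello there", stub_llm) == "other"


test_classify_fixed_input()
```

Passing the client in as a parameter keeps the prompt logic testable without network access; the same test can later run against the real model as a regression check.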
Stage 2 — Prompt + Tools (Tool Use)
What it is: The LLM can call external functions (APIs, searches, calculators, databases). This is where the agent is born.
What you add: Tool definitions (JSON schemas), tool executor, tool call error handling.
What you gain: The model now acts in the world. Real-time queries, code execution, integration with external systems.
What you risk: Privilege escalation. The agent has the permissions of the identity executing the tools. If that identity has write access in production, so does the agent. The least-privilege principle is mandatory here, not optional.
Practical rule: Each tool should have a separate IAM role with minimum scope. In Bedrock AgentCore, use resource-based policies per tool. Never pass admin credentials to the tool executor.
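The three additions (tool definitions, executor, error handling) can be sketched as follows. This is an illustrative sketch under stated assumptions: the `Tool` type, the `get_balance` tool, and the `allow_writes` flag are hypothetical names, and the application-level write check complements the IAM scoping described above rather than replacing it.

```python
import json
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Tool:
    name: str
    description: str
    input_schema: dict[str, Any]   # JSON-schema-style definition shown to the model
    handler: Callable[..., Any]
    writes: bool = False           # marks tools that modify state


def get_balance(account_id: str) -> dict[str, Any]:
    """Hypothetical read-only tool; a real one would call an API."""
    return {"account_id": account_id, "balance": 120.50}


TOOLS = {
    "get_balance": Tool(
        name="get_balance",
        description="Return the current balance for an account.",
        input_schema={
            "type": "object",
            "properties": {"account_id": {"type": "string"}},
            "required": ["account_id"],
        },
        handler=get_balance,
    )
}


def execute_tool_call(name: str, arguments_json: str, allow_writes: bool = False) -> dict[str, Any]:
    """Run one tool call and always return a result the model can read.

    Errors are returned as data instead of raised, so a malformed call
    from the model does not crash the executor.
    """
    tool = TOOLS.get(name)
    if tool is None:
        return {"ok": False, "error": f"unknown tool: {name}"}
    if tool.writes and not allow_writes:
        return {"ok": False, "error": f"tool {name} is not permitted for this agent"}
    try:
        args = json.loads(arguments_json)
    except json.JSONDecodeError as exc:
        return {"ok": False, "error": f"invalid JSON arguments: {exc.msg}"}
    if not isinstance(args, dict):
        return {"ok": False, "error": "arguments must be a JSON object"}
    missing = [k for k in tool.input_schema.get("required", []) if k not in args]
    if missing:
        return {"ok": False, "error": f"missing required arguments: {missing}"}
    try:
        return {"ok": True, "result": tool.handler(**args)}
    except Exception as exc:  # tool failures become observations, not crashes
        return {"ok": False, "error": f"{type(exc).__name__}: {exc}"}
```

Returning errors as structured results gives the model a chance to correct its own call on the next turn, and it gives you one place to log every tool invocation.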
Stage 3 — Loop (ReAct: Observe → Reason → Act → Repeat)
What it is: The agent iterates. It observes the result of an action, reasons about the next step, and acts again. This is the ReAct pattern from Yao et al. (2022).
What you add: Execution loop, context management between iterations, stop criterion (stop rule).
What you gain: Ability to solve multi-step problems. The agent can correct intermediate errors, explore alternative paths, decompose complex tasks.
What you risk: Loss of predictability. Without explicit stop rules, the agent can loop indefinitely. Without a token budget, cost scales non-linearly. Without an iteration limit, a simple task can turn into a 200-API-call race.
Mandatory stop rules: max_iterations, the maximum number of loop cycles (I recommend starting with 10), plus the max_tokens_total budget and the timeout that the anti-patterns section also requires before the loop is enabled.
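A loop controller that enforces stop rules might look like the sketch below. The names `max_iterations`, `max_tokens_total`, and timeout come from this playbook; the default budget values, the `StepResult` and `RunOutcome` types, and the stub step are assumptions for illustration. Note that the timeout is checked between steps, so it does not interrupt a step that is already running.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class StopRules:
    max_iterations: int = 10          # starting value suggested in the text
    max_tokens_total: int = 20_000    # hypothetical budget
    timeout_seconds: float = 60.0     # hypothetical wall-clock limit


@dataclass
class StepResult:
    tokens_used: int
    final_answer: Optional[str] = None  # set when the agent decides it is done


@dataclass
class RunOutcome:
    stop_reason: str
    iterations: int
    tokens_total: int
    answer: Optional[str] = None


def run_loop(step: Callable[[int], StepResult], rules: StopRules) -> RunOutcome:
    """Drive the loop and stop on the first rule that trips."""
    started = time.monotonic()
    tokens_total = 0
    iterations = 0
    while True:
        if iterations >= rules.max_iterations:
            return RunOutcome("max_iterations", iterations, tokens_total)
        if tokens_total >= rules.max_tokens_total:
            return RunOutcome("max_tokens_total", iterations, tokens_total)
        if time.monotonic() - started >= rules.timeout_seconds:
            return RunOutcome("timeout", iterations, tokens_total)
        result = step(iterations)
        iterations += 1
        tokens_total += result.tokens_used
        if result.final_answer is not None:
            return RunOutcome("completed", iterations, tokens_total, result.final_answer)


def never_finishes(i: int) -> StepResult:
    """Stub step that never produces a final answer."""
    return StepResult(tokens_used=500)


outcome = run_loop(never_finishes, StopRules(max_iterations=10))
print(outcome.stop_reason, outcome.iterations, outcome.tokens_total)
# prints: max_iterations 10 5000
```

Recording the stop reason in the outcome matters as much as stopping: a run that ended on `max_iterations` is a different operational signal from one that ended on `completed`.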
Stage 4 — Verifier (External Verification)
What it is: A separate component (another LLM call, a deterministic function, or a rule set) that validates the agent's output before it is delivered or before an irreversible action is executed.
What you add: Independent verifier, explicit validation criteria, approval gate for high-impact actions.
What you gain: The separation between 'the agent believes it is correct' and 'the output passed verifiable criteria'. This is the watershed for production. An agent without a verifier is a system that trusts itself, and LLMs are notoriously confident even when wrong.
What you risk: Additional latency. Additional cost (if the verifier is another LLM). Maintenance complexity for validation criteria.
Stage 5 — Pipeline (Operable System)
What it is: The agent becomes a system you operate. It has a trigger (event, schedule, webhook), full observability, guardrails, cost budget, audit trail, and incident runbook.
What you add: Trigger mechanism, distributed tracing (spans per iteration and per tool call), cost tracking per run, input/output guardrails, immutable audit log, anomaly alerts, runbook.
What you gain: A system you can debug, monitor, bill, audit, and scale. Without this, you have a prototype in production, not a product.
What you risk: Significant operational complexity. A poorly instrumented pipeline is worse than no pipeline: it gives you a false sense of control.
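Two of the Stage 5 additions, spans per iteration and tool call and cost tracking per run, can be sketched in memory as below. This is a minimal sketch: the `RunTrace` and `Span` types are hypothetical, the prices are placeholder values, and a real pipeline would export these spans to a tracing backend instead of keeping them in a list.

```python
import time
import uuid
from contextlib import contextmanager
from dataclasses import dataclass, field

# Hypothetical prices per 1,000 tokens; replace with your model's real pricing.
PRICE_PER_1K_INPUT = 0.003
PRICE_PER_1K_OUTPUT = 0.015


@dataclass
class Span:
    name: str
    duration_ms: float = 0.0
    input_tokens: int = 0
    output_tokens: int = 0


@dataclass
class RunTrace:
    run_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    spans: list[Span] = field(default_factory=list)

    @contextmanager
    def span(self, name: str):
        """Record one span per iteration or tool call, including its duration."""
        current = Span(name)
        started = time.perf_counter()
        try:
            yield current
        finally:
            current.duration_ms = (time.perf_counter() - started) * 1000
            self.spans.append(current)

    def cost_usd(self) -> float:
        """Aggregate token cost for the whole run."""
        total_in = sum(s.input_tokens for s in self.spans)
        total_out = sum(s.output_tokens for s in self.spans)
        return total_in / 1000 * PRICE_PER_1K_INPUT + total_out / 1000 * PRICE_PER_1K_OUTPUT


trace = RunTrace()
with trace.span("iteration-1/llm") as s:
    s.input_tokens, s.output_tokens = 1200, 300
with trace.span("iteration-1/tool:get_balance"):
    pass  # tool calls cost time here, not tokens
print(trace.run_id, len(trace.spans), round(trace.cost_usd(), 4))
```

Keying everything on a single `run_id` is the design choice that makes the later steps possible: billing, alerting, and the audit log can all join on it.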

Why Stage 4 is the most underestimated

Teams that reach Stage 3 are usually excited. The loop works. The agent solves multi-step problems. The demos are impressive. And then they jump straight to Stage 5 — instrumentation, pipeline, deploy.

The problem: they skipped Stage 4.

Without an external verifier, the system trusts the LLM's self-assessment. And LLMs are optimistic by design — they were trained to produce coherent and confident responses. A model that says 'I verified and the result is correct' is not equivalent to a system that independently verified.

Anthropic documents this explicitly: in agentic pipelines, errors accumulate. An error at step 3 of a 10-step process can propagate and amplify in subsequent steps. The verifier breaks that propagation.

In practice, the simplest verifier that works is a deterministic function: 'does the output have all required fields? Are the values within expected ranges? Is the JSON valid?' That is not glamorous, but it catches most production errors.
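The three questions above (required fields, value ranges, valid JSON) translate almost directly into code. The sketch below is one possible shape, assuming a hypothetical payment-like output; the field names, range, and currency list are illustrative and not taken from any real system.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Verdict:
    passed: bool
    errors: list[str] = field(default_factory=list)


# Hypothetical criteria, written down before the agent is built.
REQUIRED_FIELDS = {"account_id": str, "amount": (int, float), "currency": str}
AMOUNT_RANGE = (0.01, 10_000.00)
ALLOWED_CURRENCIES = {"BRL", "USD"}


def verify_output(raw: str) -> Verdict:
    """Deterministic verifier: valid JSON, required fields, expected ranges."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return Verdict(False, [f"invalid JSON: {exc.msg}"])
    if not isinstance(data, dict):
        return Verdict(False, ["output must be a JSON object"])

    errors: list[str] = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in data:
            errors.append(f"missing field: {name}")
        elif isinstance(data[name], bool) or not isinstance(data[name], expected):
            errors.append(f"wrong type for {name}")

    amount = data.get("amount")
    if isinstance(amount, (int, float)) and not isinstance(amount, bool):
        low, high = AMOUNT_RANGE
        if not low <= amount <= high:
            errors.append(f"amount out of range: {amount}")

    currency = data.get("currency")
    if isinstance(currency, str) and currency not in ALLOWED_CURRENCIES:
        errors.append(f"unsupported currency: {currency}")

    return Verdict(not errors, errors)
```

The verifier collects every failure instead of stopping at the first one, which makes its output more useful both as feedback to the agent and as a log entry.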

For higher-risk cases — financial actions, external communications, data modifications — the verifier needs to be a human gate or a second model with an adversarial prompt ('find the errors in this response'). The cost of a second LLM call is trivial compared to the cost of an incorrect action in production.
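One way to make that escalation explicit is a small routing function that decides which gate a proposed action must pass. This is a hedged sketch of one possible policy: the `ProposedAction` type, the action kinds, and the threshold value are assumptions, and the adversarial prompt simply wraps the 'find the errors in this response' instruction quoted above.

```python
from dataclasses import dataclass
from enum import Enum


class Gate(Enum):
    DETERMINISTIC_ONLY = "deterministic_only"
    ADVERSARIAL_JUDGE = "adversarial_judge"
    HUMAN_APPROVAL = "human_approval"


@dataclass(frozen=True)
class ProposedAction:
    kind: str            # e.g. "read", "send_email", "transfer" (illustrative)
    reversible: bool
    amount: float = 0.0


HUMAN_THRESHOLD = 1_000.00  # hypothetical; set by governance, not by the agent


def choose_gate(action: ProposedAction) -> Gate:
    """Route an action to the cheapest gate that matches its risk."""
    if action.kind == "read":
        return Gate.DETERMINISTIC_ONLY
    if not action.reversible or action.amount >= HUMAN_THRESHOLD:
        return Gate.HUMAN_APPROVAL
    return Gate.ADVERSARIAL_JUDGE


def build_judge_prompt(task: str, response: str) -> str:
    """Prompt for a second model whose only job is to look for mistakes."""
    return (
        "You are reviewing another system's answer. "
        "Find the errors in this response. "
        "List each error, or reply NO_ERRORS_FOUND if you find none.\n\n"
        f"Task:\n{task}\n\nResponse under review:\n{response}\n"
    )


print(choose_gate(ProposedAction("transfer", reversible=False, amount=50.0)).value)
# prints: human_approval
```

Keeping the routing deterministic means the agent cannot talk its way past a gate; only the action's declared properties decide which check applies.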

A heuristic I use: if you cannot write the verifier criteria before building the agent, you do not understand the problem well enough to automate it.

Maturity matrix by stage

Stage	Autonomy	Predictability	Cost per run	Main risk	Add to move up
1 (Prompt)	None	High	Fixed and predictable	Output quality	Tools + schemas
2 (Prompt + Tools)	Low (1 cycle)	High	Low + tool cost	Privilege escalation	Loop + stop rules
3 (Loop, ReAct)	Medium (multi-step)	Medium	Variable (explosion risk)	Infinite loop, uncontrolled cost	External verifier
4 (Verifier)	Medium-high	High (with gates)	Medium + verifier cost	False positive in verifier	Observability + guardrails + audit
5 (Pipeline)	High (operable)	High (with alerts)	High (infra + observability)	Operational complexity, false sense of control	None (top of the ladder)

The Ladder: from Prompt to Pipeline

Each stage adds components around the LLM. The model itself does not change — the system around it is what evolves. Read bottom to top: each layer presupposes the previous one.

🟦 Stage 5: Operable Pipeline

Trigger · EventBridge / SQS
Orchestrator · Step Functions
Guardrails · Bedrock Guardrails
Audit Log · CloudTrail + S3 Object Lock
Alerts · CloudWatch Alarms

🟩 Stage 4: Verifier

Verifier · Deterministic / LLM-judge / Human gate
Approval Gate · Irreversible actions

🟨 Stage 3: ReAct Loop

Loop Controller · max_iter / timeout / budget
Context Manager · Memory between iterations

🟧 Stage 2: Tool Use

Tool Executor · IAM least-privilege
Tools · API / DB / Search / Code

🟥 Stage 1: LLM Core

LLM · Bedrock / Claude / etc.
Prompt · System + User + Examples

👤 User

User / System · Caller

What Bedrock AgentCore solves — and what it does not

Amazon Bedrock AgentCore (GA 2025) is AWS's bet to solve the infrastructure of Stages 2 through 5 in a managed way. It is worth understanding what it delivers and where you still need to do the work.

What AgentCore solves:

Session and memory management: persistent context between iterations without you managing DynamoDB underneath
Tool execution runtime: managed executor with retry, timeout, and automatic tool call logging
Bedrock Guardrails integration: native input/output filtering, including PII detection and prohibited topics
Basic observability: execution traces integrated with CloudWatch, with agentId, sessionId, and spans per tool call
Access control: IAM integration to define which tools the agent can call

What AgentCore does NOT solve — and you need to build:

External verifier (Stage 4): AgentCore does not have a native independent verification component. You implement this as a Lambda or Step Functions state before returning output
Approval gates for irreversible actions: requires manual integration with SNS/SES for human notification or Step Functions with .waitForTaskToken
Granular cost tracking per run: AgentCore logs tokens, but aggregating cost per business runId requires custom instrumentation
Business stop rules: max_iterations is configurable, but stop logic based on business criteria (e.g., 'stop if the result is already good enough') is your responsibility
Immutable audit log: CloudTrail captures API calls, but a business audit log with the agent's reasoning requires explicit writing to S3 with Object Lock
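For the approval gate item, the Step Functions callback pattern can be sketched as a state machine definition built in Python. This is an illustrative sketch, not a deployable template: the Lambda function names, account ID, region, and timeout are placeholders. The `.waitForTaskToken` resource suffix and the `$$.Task.Token` context path are the parts that carry the pattern; the notifier function (hypothetical) would alert a human, whose decision is later reported back with the SendTaskSuccess or SendTaskFailure API.

```python
import json

# Placeholder ARNs (assumptions); substitute your own account, region, and functions.
NOTIFY_FUNCTION = "arn:aws:lambda:us-east-1:123456789012:function:notify-approver"
EXECUTE_FUNCTION = "arn:aws:lambda:us-east-1:123456789012:function:execute-action"

APPROVAL_STATE_MACHINE = {
    "Comment": "Sketch: pause before an irreversible action until a human responds.",
    "StartAt": "RequestApproval",
    "States": {
        "RequestApproval": {
            "Type": "Task",
            # The suffix makes the state wait until the task token is returned.
            "Resource": "arn:aws:states:::lambda:invoke.waitForTaskToken",
            "Parameters": {
                "FunctionName": NOTIFY_FUNCTION,
                "Payload": {
                    "proposedAction.$": "$.proposedAction",
                    "taskToken.$": "$$.Task.Token",
                },
            },
            "TimeoutSeconds": 3600,  # hypothetical: no answer within an hour means no action
            "Catch": [{"ErrorEquals": ["States.ALL"], "Next": "Rejected"}],
            "Next": "ExecuteAction",
        },
        "ExecuteAction": {
            "Type": "Task",
            "Resource": EXECUTE_FUNCTION,
            "End": True,
        },
        "Rejected": {
            "Type": "Fail",
            "Error": "ApprovalNotGranted",
            "Cause": "The action was rejected or the approval timed out.",
        },
    },
}

definition_json = json.dumps(APPROVAL_STATE_MACHINE, indent=2)
print(definition_json)
```

The default path here is refusal: a timeout or any failure in the approval step routes to `Rejected`, so silence from the approver never results in the action being executed.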

The practical conclusion: AgentCore significantly reduces the work to reach Stage 3. It does not replace the engineering of Stages 4 and 5. Use it as an accelerator, not a complete solution.

Anti-patterns that kill agents in production

1. Moving to Stage 3 without stop rules. The loop works in the demo because the demo has 3 iterations. In production, with real data and edge cases, the agent can iterate 50 times before you notice. Define max_iterations, max_tokens_total, and timeout before enabling the loop.
2. Giving the agent the developer's permissions. The tool executor runs with the credentials of whoever configured it. If you tested with your admin IAM role, the agent has admin access. Create a dedicated role with minimum scope before the first deploy.
3. Skipping Stage 4 and going straight to 5. Instrumentation does not replace verification. You can have perfect traces of an agent that is confidently producing incorrect outputs.
4. Using the same agent for read and write. Separate query agents (read-only) from action agents (write). This limits the blast radius of unexpected behavior.
5. Not having an incident runbook. When the agent gets stuck in production at 2am, you need to know how to stop the execution, how to revert actions, and who to notify. Document this before go-live.
6. Trusting the LLM's own 'I verified'. Self-verification in LLMs is notoriously unreliable. The model that generated the output should not be the only one evaluating it.

Rule of thumb

'If you cannot describe the stop criterion and the success criterion before building, you are not ready for the next stage.' Stop criterion = when the agent should stop (stop rules, budget, timeout). Success criterion = how you know the output is correct (verifier criteria). Without both written explicitly, you are building in the dark.

My perspective — what I actually do in practice

Senior Solutions Architect

After working with financial-grade systems where an incorrect output has real consequences, I developed a clear stance: I treat agents as high-risk systems from Stage 2, not from the moment something goes wrong. In practice, this means:

I never move to Stage 3 without stop rules documented and tested. This is not bureaucracy; it is the difference between a $0.50 run and a $500 run. I have seen loops that consumed monthly budgets in hours because someone assumed 'the model will stop when it is done'.

I implement the verifier before enabling write actions. Even a simple verifier (JSON schema validation, a required-field checklist) needs to exist before any tool that modifies state. The cost of a second LLM call for verification is trivial; the cost of reverting an incorrect action in production is not.

For financial systems or sensitive data, Stage 4 mandatorily includes a human-in-the-loop for actions above a threshold. No matter how good the agent is, there are decisions that require a human signature. This is not a technical limitation; it is governance.

My criterion for recommending Bedrock AgentCore: if you need to reach Stage 3 quickly and have a small team, AgentCore is worth the lock-in. It solves the plumbing of session, memory, and tool execution. But if you have regulatory audit requirements or need cross-cloud portability, build the runtime yourself on primitives (Lambda + DynamoDB + SQS); you will have more control over what goes into the audit log.

The insight that most changes the conversation with stakeholders: showing the stage ladder as a maturity map, not a feature list. When the CTO sees that 'reliable agent' is Stage 5 and that skipping stages has concrete costs, the conversation about timeline and resources becomes much more honest.

Well-Architected lenses by stage

Security

Stage 2 mandates least-privilege per tool. Stage 5 requires immutable audit log and input/output guardrails. Never share credentials between tools with different scopes.

Reliability

Stop rules (Stage 3) and verifier (Stage 4) are the primary reliability mechanisms. Without them, the system has no way to detect or contain cascading failures.

Performance efficiency

Each stage adds latency. Measure P95 latency per stage. The LLM-judge verifier can add 2-5s — evaluate whether the use case tolerates this.
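Measuring P95 per stage needs nothing beyond the standard library. The sketch below assumes you already collect latency samples in milliseconds per component; the component names and sample values are made up for illustration.

```python
import statistics


def p95(samples_ms: list[float]) -> float:
    """95th percentile of the observed samples (inclusive method)."""
    if len(samples_ms) < 2:
        raise ValueError("need at least two samples")
    # quantiles(n=100) returns 99 cut points; index 94 is the 95th percentile.
    return statistics.quantiles(samples_ms, n=100, method="inclusive")[94]


# Hypothetical samples per component, in milliseconds.
latencies_ms = {
    "llm_call": [820.0, 910.0, 1040.0, 990.0, 2300.0, 870.0],
    "tool_call": [45.0, 60.0, 52.0, 300.0, 48.0, 55.0],
    "llm_judge_verifier": [2100.0, 2600.0, 3400.0, 4800.0, 2900.0, 3100.0],
}

for component, samples in latencies_ms.items():
    print(f"{component}: p95 = {p95(samples):.0f} ms")
```

P95 rather than the mean is the useful number here because agent latency is dominated by occasional slow calls, and those are what users and timeouts actually experience.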

Sustainability

Unnecessary loops consume energy and generate cost without value. Well-defined stop rules and success criteria reduce wasted iterations.

Conclusion

The 5-stage ladder is not a mandatory linear progression; it is a decision map. Most enterprise problems do not need to reach Stage 5. What every problem needs is for you to know which stage you are at and why.

Staying at Stage 1 when the problem demands Stage 3 is not the most expensive mistake. The most expensive mistake is moving to Stage 3 without Stage 3's stop rules, or to Stage 5 without Stage 4's verifier. You accumulate autonomy without accumulating control, and then the system does not fail predictably; it fails surprisingly.

The bottleneck is not the prompt. It never was. The bottleneck is the system around it, and you either build that system stage by stage, with intention, or you do not build it at all.

References

Anthropic Engineering: Building Effective Agents
Yao et al.: ReAct: Synergizing Reasoning and Acting in Language Models (2022)
AWS: Amazon Bedrock AgentCore

#agents #llm #genai #aws-bedrock #react #pipeline #observability #guardrails

Written with AI assistance from the public case and my architect's reading.

