Who is Fernando F. Azevedo?

Fernando F. Azevedo is a Senior Solutions Architect at Banco Itaú with 16+ years of experience across AWS, event-driven architecture, DevSecOps, Data Mesh, AI and financial systems.

AI & Agents · Decision Record

ADR: Adopting Amazon Bedrock AgentCore in Production

May 25, 2026 · 9 min · expert · AI-assisted

Bedrock AgentCore promises to reduce the operational friction of running AI agents in production, but adopting any managed agent orchestration platform demands an explicit architectural decision. In this ADR, I document the forces that drove me to evaluate AgentCore, the alternatives considered, and the real consequences of each path.

After 16 years building financial platforms on AWS, I've learned that the most dangerous question in architecture isn't 'does this work?' — it's 'who operates this at 2 AM when it breaks?' Bedrock AgentCore is AWS's answer to the problem of operationalizing AI agents beyond the notebook: managed runtime, memory, tool-use, guardrails, and traceability in a single control plane. This ADR documents how I arrived at the decision to adopt it — or not — in a regulated financial environment, and the consequences you need to internalize before doing the same.

Context and Forces

The scenario that motivated this decision is recurring in financial institutions: a product team wants to expose an AI agent to internal analysts — capable of querying market data via API, running risk calculations in Lambda, retrieving context from regulatory documents via RAG, and recording every action in an immutable audit trail. The MVP worked in two sprints with LangChain + Claude via Bedrock. The problem surfaced the following week.

Five forces made the decision urgent:

(1) Cross-turn state management: financial agent sessions last minutes, not seconds, and reliably maintaining context in stateless Lambda is brittle.
(2) Regulatory traceability: every tool call, every model decision, and every response must be auditable with timestamp, identity, and full payload, without relying on ad-hoc logging.
(3) Guardrails as contract: in finance, the agent cannot leak PII, cannot recommend products without disclaimers, and cannot execute irreversible actions without human confirmation. Implementing this manually in every agent is guaranteed technical debt.
(4) Unpredictable token cost: without per-session budget control, a faulty agent loop can consume tens of dollars in minutes.
(5) Runtime ownership: the platform team doesn't want to maintain a custom agent scheduler; they want an SLA contract with AWS.

Options Considered

Option A: Self-hosted LangChain/LangGraph on EKS

Pros

Full control over execution graph and retry logic
Model portability — swap LLM without platform change
Mature ecosystem of community integrations and tools

Cons

Full operational responsibility: scaling, HA, patching, observability
Guardrails and audit trail must be built and maintained by the team
Session memory management requires custom DynamoDB or Redis
High engineering cost to reach parity with managed features

Suitable for teams with mature AI platform; high operational risk for smaller teams

Option B: Bedrock Agents (prior generation, without AgentCore)

Pros

AWS-managed, no runtime infrastructure to operate
Native integration with Knowledge Bases and Action Groups

Cons

Limited observability: partial traces, no native span-level detail
No native per-session budget control
Agent loop customization restricted to what AWS exposes

Good for simple cases; observability limitations are blockers in finance

Option C: Amazon Bedrock AgentCore

Pros

Managed runtime with native persistent session memory (AgentCore Memory)
Configurable guardrails as declarative policy, not inline code
Native traceability via CloudTrail + X-Ray with tool-call spans
AgentCore Gateway for tool-use with OAuth2/OIDC and per-tool throttling
Configurable per-session token budget control

Cons

Platform lock-in to AWS for the agent runtime
Execution graph customization more restricted than LangGraph
New service: API surface still evolving, conservative quotas
AgentCore Memory and Gateway costs added on top of inference cost

Recommended decision for regulated financial environments with a lean platform team

Option D: Step Functions + Lambda as agent orchestrator

Pros

Native audit via Step Functions execution history
Declarative and testable retry, timeout, and error handling
No new service to learn — team already knows the pattern

Cons

Not an agent runtime: each 'turn' requires a new execution or .waitForTaskToken
Session memory and model context must be managed externally
Cold-start and state transition latency can be noticeable in dialogues

Excellent for deterministic workflows; inadequate as a conversational agent runtime

The Decision and the Reasoning Behind It

The decision was to adopt Bedrock AgentCore as the primary agent runtime, with Step Functions as the orchestrator for adjacent deterministic workflows (approvals, reconciliations, notifications). This is not an all-or-nothing decision: AgentCore solves the non-deterministic agent loop problem, while Step Functions remains the right choice for the deterministic business process that wraps the agent.

The decisive argument was the AgentCore Gateway with per-tool OAuth2/OIDC support. In a financial environment, every tool-call is an action with identity: who authorized it, what scope, with which token. Implementing this manually in LangChain would mean building and maintaining an authorization proxy — exactly the kind of infrastructure that generates no business value but generates security incidents when neglected. The Gateway delivers this as declarative configuration, with per-tool throttling (e.g., maximum 10 calls/session for the order execution API) and a native circuit breaker.

The second argument was session memory with configurable TTL. AgentCore Memory persists conversation context in a managed store, with per-session configurable TTL and KMS customer-managed key (CMK) encryption. For LGPD/GDPR compliance, this means I can configure a 24h TTL for analyst sessions and guarantee that no session data persists beyond what's necessary — without building a custom expiration pipeline.

The lock-in trade-off was consciously accepted: the tool-use layer (the Lambda functions that execute the actual actions) remains completely portable. If we need to migrate the runtime in the future, the tools keep working.

Financial Agent Architecture with Bedrock AgentCore

Execution flow of a financial analysis agent: from analyst to AgentCore runtime, through guardrails, tool-use via Gateway, session memory, and observability

🔐 AWS: Security & Ingress

API Gateway · REST + Cognito JWT
Bedrock Guardrails · PII filter + topic deny

🤖 AWS: AgentCore Runtime

AgentCore Runtime · Claude 3.5 Sonnet
AgentCore Memory · TTL=24h, KMS CMK
AgentCore Gateway · OAuth2/OIDC, throttle

⚙️ AWS: Tools (Tool-use)

Lambda: Market Data · Bloomberg API proxy
Lambda: Risk Calc · VaR engine
Knowledge Base · OpenSearch + S3

📊 AWS: Observability & Audit

X-Ray · span per tool-call
CloudTrail · API audit log
CloudWatch · SLO dashboards

Concrete Configuration: What Actually Matters

Adopting AgentCore without properly configuring operational controls is worse than not adopting it — you gain a false sense of security without active guardrails. Here are the configurations that make a real difference:

Guardrails as first line: Configure contentPolicyConfig with HATE, INSULTS, SEXUAL, VIOLENCE all set to BLOCK, and sensitiveInformationPolicyConfig with PII filters for CREDIT_DEBIT_CARD_NUMBER, AWS_ACCESS_KEY, NAME, and EMAIL in ANONYMIZE mode. In a financial environment, add topicPolicyConfig with explicitly denied topics: "investment advice without disclaimer", "guaranteed returns". This isn't paranoia — it's the minimum to pass a compliance review.
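As a rough sketch, the policy described above could be written as a plain dictionary shaped like the Bedrock CreateGuardrail request. The guardrail name, the blocked-message strings, and the topic names and definitions are illustrative assumptions. Note also that content filters in that request are expressed as strengths (HIGH is used here) rather than a literal BLOCK value, so field names should be checked against the current Guardrails reference before use.

```python
# Guardrail definition as a plain dict, shaped like a CreateGuardrail request.
# Names, messages, and topic definitions below are assumptions for illustration.

CONTENT_FILTERS = ["HATE", "INSULTS", "SEXUAL", "VIOLENCE"]
PII_TYPES = ["CREDIT_DEBIT_CARD_NUMBER", "AWS_ACCESS_KEY", "NAME", "EMAIL"]
DENIED_TOPICS = {
    # topic name -> short definition (hypothetical wording)
    "investment-advice-without-disclaimer":
        "Recommending financial products without the required disclaimer.",
    "guaranteed-returns":
        "Promising or implying a guaranteed investment return.",
}


def build_guardrail_request(name: str) -> dict:
    """Return keyword arguments describing the guardrail policy above."""
    return {
        "name": name,
        "blockedInputMessaging": "This request cannot be processed.",
        "blockedOutputsMessaging": "This response was withheld by policy.",
        "contentPolicyConfig": {
            "filtersConfig": [
                {"type": t, "inputStrength": "HIGH", "outputStrength": "HIGH"}
                for t in CONTENT_FILTERS
            ]
        },
        "sensitiveInformationPolicyConfig": {
            "piiEntitiesConfig": [
                {"type": t, "action": "ANONYMIZE"} for t in PII_TYPES
            ]
        },
        "topicPolicyConfig": {
            "topicsConfig": [
                {"name": n, "definition": d, "type": "DENY"}
                for n, d in DENIED_TOPICS.items()
            ]
        },
    }


request = build_guardrail_request("financial-agent-guardrail")
```

With boto3 available, a dictionary of this shape would typically be passed as keyword arguments to the `create_guardrail` call on the `bedrock` client. Keeping it as data makes the policy easy to review in a pull request.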

AgentCore Memory with correct partitioning: The memory partition key must be userId + sessionId, never just sessionId. In multi-tenant environments, sessions from different users with the same sessionId collided in testing — a silent bug that leaks context between users. Configure memoryConfiguration.enabledMemoryTypes with SESSION_SUMMARY for long sessions, reducing context token consumption by up to 40% in sessions exceeding 20 turns.
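The collision described above is easy to reproduce in miniature. The following is a minimal sketch of a composite key, assuming a simple in-memory dictionary stands in for the managed store; the function name and key format are hypothetical and not part of any AWS API.

```python
def memory_partition_key(user_id: str, session_id: str) -> str:
    """Build a memory key scoped to one user's session."""
    if not user_id or not session_id:
        raise ValueError("user_id and session_id are both required")
    # Length-prefix the user id so ("ab", "c") and ("a", "bc") cannot collide.
    return f"{len(user_id)}:{user_id}#{session_id}"


# A dict stands in for the managed memory store in this sketch.
store: dict[str, list[str]] = {}
store.setdefault(memory_partition_key("analyst-1", "s-42"), []).append("VaR query")
store.setdefault(memory_partition_key("analyst-2", "s-42"), []).append("FX exposure")
```

Both analysts use the session id `s-42`, yet each gets a separate entry. Keyed on `sessionId` alone, the second write would have landed in the first analyst's context.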

Gateway with per-tool throttling: Define separate rateLimit for each Action Group. The order execution API should have maxRequestsPerSession: 5 and requireConfirmation: ENABLED. The market data query API can have maxRequestsPerSession: 50. Without this granularity, a faulty agent loop can execute dozens of orders before being detected — a scenario I've seen happen in production with frameworks lacking tool-use controls.
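To illustrate the behavior of a per-session, per-tool limit, here is a minimal application-side sketch. It is not the Gateway's implementation: the class, the exception, and the `QueryMarketData` tool name are assumptions, while the limits mirror the values in the text.

```python
from collections import Counter

# Hypothetical per-tool limits mirroring the values in the text.
TOOL_LIMITS = {"ExecuteOrder": 5, "QueryMarketData": 50}


class ToolCallLimitExceeded(Exception):
    """Raised when a session has used up its calls for a tool."""


class SessionToolLimiter:
    """Counts tool calls for one session and rejects calls over the limit."""

    def __init__(self, limits: dict[str, int]):
        self._limits = limits
        self._calls = Counter()

    def check(self, tool_name: str) -> None:
        limit = self._limits.get(tool_name)
        if limit is None:
            # Fail closed: a tool without a configured limit is not callable.
            raise KeyError(f"no limit configured for tool {tool_name!r}")
        if self._calls[tool_name] >= limit:
            raise ToolCallLimitExceeded(f"{tool_name}: limit of {limit} reached")
        self._calls[tool_name] += 1


limiter = SessionToolLimiter(TOOL_LIMITS)
for _ in range(5):
    limiter.check("ExecuteOrder")  # calls 1 to 5 are allowed
try:
    limiter.check("ExecuteOrder")  # the sixth call is rejected
    sixth_allowed = True
except ToolCallLimitExceeded:
    sixth_allowed = False
```

A check like this inside the tool itself is a useful second layer even when the runtime enforces the same limit, because it keeps working if the runtime configuration drifts.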

Per-session token budget: Configure sessionConfiguration.maxTokens with a conservative initial value — I recommend 50,000 tokens for typical analysis sessions. Monitor the p95 token consumption per session in CloudWatch and adjust. An agent entering a reasoning loop can consume 200k+ tokens in a single session without this control.
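The budget logic itself is simple enough to sketch. The class below is a hypothetical application-side tracker, not an AgentCore API; it assumes the caller can read input and output token counts from each model response.

```python
from dataclasses import dataclass


class TokenBudgetExceeded(Exception):
    """Raised when a session would go over its token budget."""


@dataclass
class SessionTokenBudget:
    max_tokens: int = 50_000  # conservative starting value from the text
    used: int = 0

    def charge(self, input_tokens: int, output_tokens: int) -> int:
        """Record one model call and return the tokens still available."""
        total = self.used + input_tokens + output_tokens
        if total > self.max_tokens:
            # Leave `used` unchanged so the caller can end the session cleanly.
            raise TokenBudgetExceeded(
                f"{total} tokens exceeds budget of {self.max_tokens}"
            )
        self.used = total
        return self.max_tokens - self.used


budget = SessionTokenBudget()
remaining = budget.charge(input_tokens=12_000, output_tokens=3_000)
```

Because the counts are only known after a call returns, a tracker like this stops the next turn rather than the current one, so the limit should leave headroom for one full turn.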

Observability: What to Measure and How

AI agents have a different observability profile from traditional APIs. p99 latency is less useful than turns-per-session distribution and tool-call failure rate per tool. Here is the observability model I implemented:

Agent business metrics (via CloudWatch custom metrics with namespace FinancialAgent):

TurnsPerSession — histogram; alert if p95 > 15 turns (indicates loop or poorly calibrated prompt)
TokensPerSession — histogram; alert if p95 > 40k tokens
ToolCallFailureRate per ToolName — counter; SLO of < 1% failure for critical tools
GuardrailInterventionRate — counter; spike indicates jailbreak attempt or prompt injection
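The p95 alert rules above can be sketched offline against sampled session data. This is a minimal example using the nearest-rank percentile; the sample values are invented for illustration, and in practice CloudWatch can compute the percentile statistic itself.

```python
import math

# Alert thresholds taken from the metric list above.
P95_THRESHOLDS = {"TurnsPerSession": 15, "TokensPerSession": 40_000}


def p95(values: list[float]) -> float:
    """Nearest-rank 95th percentile of a non-empty sample."""
    if not values:
        raise ValueError("p95 requires at least one value")
    ordered = sorted(values)
    rank = math.ceil(0.95 * len(ordered))  # 1-based nearest rank
    return ordered[rank - 1]


def breached(samples: dict[str, list[float]]) -> list[str]:
    """Return metric names whose p95 is above the configured threshold."""
    return [
        name for name, limit in P95_THRESHOLDS.items()
        if name in samples and p95(samples[name]) > limit
    ]


# Invented sample data: two sessions look like agent loops.
samples = {
    "TurnsPerSession": [4, 6, 5, 7, 22, 5, 6, 4, 8, 30],
    "TokensPerSession": [9_000, 12_000, 8_500, 15_000, 11_000],
}
alerts = breached(samples)
```

Here only `TurnsPerSession` breaches its threshold, which matches the intent of the rule: a few runaway sessions show up in the tail long before they move the average.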

Traces with X-Ray: AgentCore emits spans for each tool invocation with attributes bedrock.agent.toolName, bedrock.agent.sessionId, and bedrock.agent.turnCount. Configure a trace group with filter annotation.bedrock.agent.toolName = "ExecuteOrder" and alert on latency > 2s — order execution above that indicates a downstream API issue.

CloudTrail for regulatory audit: Each InvokeAgent API call is recorded with the caller ARN, sessionId, and inputText (truncated). For compliance, configure an S3 bucket with Object Lock in COMPLIANCE mode and 7-year retention for AgentCore CloudTrail logs. This is the minimum to meet Banco Central do Brasil and SEC audit requirements.
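For reference, the retention rule described above maps onto a small configuration document. The sketch below follows the shape of the S3 PutObjectLockConfiguration request; the bucket name is a placeholder, and it assumes Object Lock was enabled when the bucket was created.

```python
# Default retention rule shaped like the S3 PutObjectLockConfiguration request.
AUDIT_BUCKET = "example-agent-audit-logs"  # hypothetical bucket name

object_lock_request = {
    "Bucket": AUDIT_BUCKET,
    "ObjectLockConfiguration": {
        "ObjectLockEnabled": "Enabled",
        "Rule": {
            "DefaultRetention": {
                # COMPLIANCE mode: retention cannot be shortened or removed.
                "Mode": "COMPLIANCE",
                "Years": 7,
            }
        },
    },
}
```

With boto3 available, this dictionary would typically be passed as keyword arguments to `put_object_lock_configuration` on the `s3` client. Because COMPLIANCE mode cannot be undone for the retention period, it is worth validating the setup on a throwaway bucket first.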

Cost anomaly alarm: Configure an AWS Budget with an alert at 80% of the monthly Bedrock budget, with an SNS notification. Add a second alarm in CloudWatch on the Invocations metric in the AWS/Bedrock namespace, filtered by the ModelId dimension for the Claude 3.5 Sonnet model, with a threshold of 1,000 invocations/hour. Above that, something is wrong.
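The hourly invocation alarm can be expressed as data. The sketch below is shaped like a CloudWatch PutMetricAlarm request and assumes the `Invocations` metric in the `AWS/Bedrock` namespace with a `ModelId` dimension; the alarm name, the SNS topic ARN, and the full model identifier are placeholders to adapt.

```python
# Keyword arguments shaped like a CloudWatch PutMetricAlarm request.
# The model ID and SNS topic ARN are placeholders (assumptions).
MODEL_ID = "anthropic.claude-3-5-sonnet-20240620-v1:0"
ALERT_TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:bedrock-cost-alerts"

invocation_alarm = {
    "AlarmName": "bedrock-invocations-per-hour",
    "Namespace": "AWS/Bedrock",
    "MetricName": "Invocations",
    "Dimensions": [{"Name": "ModelId", "Value": MODEL_ID}],
    "Statistic": "Sum",
    "Period": 3600,  # one hour, in seconds
    "EvaluationPeriods": 1,
    "Threshold": 1000,  # invocations per hour, from the text
    "ComparisonOperator": "GreaterThanThreshold",
    "TreatMissingData": "notBreaching",  # no traffic is not an incident
    "AlarmActions": [ALERT_TOPIC_ARN],
}
```

With boto3 available, this would typically be passed as keyword arguments to `put_metric_alarm` on the `cloudwatch` client.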

Consequences and Risks You Need to Accept

Runtime lock-in is real: If AWS deprecates or significantly changes the AgentCore API, migration requires rewriting the orchestration logic, not just the tools. Mitigate by keeping tools (Lambda) completely runtime-agnostic and documenting the interface contract in a separate ADR.

Conservative quotas on a new service: AgentCore has concurrent agent session quotas that, at launch, were significantly lower than traditional Bedrock Agents quotas. Request quota increases before go-live, not after. A peak event without adequate quota results in a ThrottlingException that the end client sees as a timeout.

Guardrails have latency: Each pass through Guardrails adds 100-300ms of latency. In an agent with 10 turns, that's up to 3 additional seconds of accumulated latency. For use cases where latency is critical, consider disabling output guardrails on internal tools (not exposed to the end user) and applying them only on the final output.

Memory is not free: AgentCore Memory charges for storage and per read/write operation. In long sessions with SESSION_SUMMARY active, memory cost can exceed inference cost for short sessions. Monitor MemoryReadLatency and MemoryWriteLatency; values above 200ms indicate pressure on the managed store.

Human-in-the-loop is not automatic: requireConfirmation: ENABLED on the Gateway pauses execution and waits for confirmation via callback. If the client doesn't respond within confirmationTimeout (default: 300s), the session expires. Design the UX to make this clear to the user, because timeout-expired sessions are the leading cause of complaints in financial agents.
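To make the Guardrails latency figure in this section concrete, here is a small back-of-the-envelope calculation. It assumes the 100-300ms range applies once per guarded turn, which is a simplification; the function name is hypothetical.

```python
def guardrail_latency_ms(
    guarded_turns: int, per_turn_ms: tuple[int, int] = (100, 300)
) -> tuple[int, int]:
    """Return (best, worst) accumulated Guardrails latency in milliseconds."""
    low, high = per_turn_ms
    return guarded_turns * low, guarded_turns * high


every_turn = guardrail_latency_ms(10)  # guardrails evaluated on all 10 turns
final_only = guardrail_latency_ms(1)   # guardrails only on the final output
```

Under these assumptions, guarding all 10 turns costs between 1 and 3 seconds in total, while guarding only the final output costs between 0.1 and 0.3 seconds. That gap is the trade-off to weigh against the reduced coverage on intermediate steps.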

Real Reference Numbers

~40%

Token reduction with SESSION_SUMMARY

In sessions with more than 20 turns, SESSION_SUMMARY reduces context sent to the model by ~40% vs. full history

100-300ms

Latency added by Guardrails per turn

Each Guardrail evaluation (input + output) adds 100-300ms; across 10 turns, up to 3s accumulated

7 years

Minimum CloudTrail retention for financial compliance

S3 Object Lock in COMPLIANCE mode with 7 years meets Banco Central do Brasil and SEC requirements for agent audit trails

Well-Architected Assessment

Security

Declarative guardrails with PII filter and denied topics; AgentCore Gateway with per-tool OAuth2/OIDC; KMS CMK for session memory; CloudTrail with S3 Object Lock for immutable audit. IAM with bedrock:AgentArnLike condition to restrict which agents can invoke which tools.

Reliability

Automatic retry with jitter in the Bedrock SDK (max_attempts=3, mode=adaptive); native circuit breaker in AgentCore Gateway per tool; configurable session timeout prevents zombie sessions; concurrent session quotas must be requested before go-live.

Performance efficiency

SESSION_SUMMARY reduces context tokens by ~40% for long sessions; disabling output guardrails on internal tools reduces accumulated latency; Knowledge Base with OpenSearch k-NN with HNSW and ef_search=512 for low-latency RAG.

Cost optimization

AWS Budget with alert at 80% of monthly limit; CloudWatch alarm on invocations/hour per model-id; SESSION_SUMMARY reduces inference cost in long sessions; monitor AgentCore Memory cost separately from inference cost.

What the AWS Blog Doesn't Tell You

AWS service launch blogs are excellent at showing the happy path. What they rarely cover are the edge cases you only discover in production. Here are the three that cost me the most time:

Tool-call idempotency is not guaranteed by the runtime. If AgentCore attempts to invoke a tool and receives a timeout, it may retry — and your Lambda may be invoked twice for the same action. For idempotent tools (queries), this is harmless. For tools with side effects (order execution, email sending), you need to implement idempotency in the Lambda using an idempotencyToken derived from sessionId + turnId + toolName. Without this, order duplication is a matter of when, not if.

The model can ignore requireConfirmation in certain prompt formulations. I tested this: if the system prompt instructs the agent to "be proactive and execute actions without asking for unnecessary confirmation," the model may rationalize that a specific action doesn't need confirmation even with the flag active. The correct defense is dual: the flag on the Gateway and an explicit instruction in the system prompt about when confirmation is mandatory. Never rely on a single layer.

AgentCore doesn't have native multi-agent support yet. If your architecture requires a supervisor agent delegating to specialized agents (multi-agent orchestration pattern), you'll need to implement the delegation logic manually — typically with an agent that calls other agents via tool-use, where each "tool" is actually an invocation of another AgentCore. It works, but cross-session traceability requires manual sessionId correlation via X-Ray.
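One way to keep that manual correlation manageable is to derive every child session id from the supervisor's session id and record the link explicitly. The sketch below is a hypothetical helper; the class, the id format, and the annotation names are assumptions, not AgentCore or X-Ray conventions.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class DelegationTrace:
    """Links a supervisor session to the child sessions it spawns."""

    parent_session_id: str
    children: dict[str, str] = field(default_factory=dict)  # child id -> agent

    def new_child(self, agent_name: str) -> dict:
        """Create a child session id and the annotations that tie it back."""
        child_id = f"{self.parent_session_id}.{uuid.uuid4().hex[:8]}"
        self.children[child_id] = agent_name
        # Hypothetical annotation names to attach to the child's trace.
        return {
            "sessionId": child_id,
            "parentSessionId": self.parent_session_id,
            "agent": agent_name,
        }


trace = DelegationTrace(parent_session_id="s-42")
risk = trace.new_child("risk-agent")
market = trace.new_child("market-data-agent")
```

Because each child id carries the parent id as a prefix, a single prefix filter in the trace tooling returns the supervisor session together with everything it delegated.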

Anti-Patterns I've Seen in Architecture Reviews

Using AgentCore without configuring Guardrails because "it's an internal environment" — insiders are the primary source of compliance incidents in finance
Storing full session history in memory without SESSION_SUMMARY — token cost grows linearly with number of turns
Implementing critical business logic inside the agent system prompt instead of in testable tools — prompts don't have unit tests
Not requesting concurrent session quota increase before go-live — ThrottlingException during peak usage is predictable and preventable
Assuming AgentCore Gateway replaces a business authorization layer — the Gateway controls access to the tool, not the authorization logic inside the tool
Not implementing idempotency in tool Lambdas with side effects — runtime retries can duplicate irreversible actions

Curator's Note

Senior Solutions Architect

In practice, what convinced me to adopt AgentCore was not any individual feature — it was the fact that the AgentCore Gateway with per-tool OAuth2/OIDC solves the tool-call identity problem I was about to build manually, which would have taken two sprints and generated permanent technical debt. The hard-won lesson behind this: in financial environments, the cost of building custom security controls is not the initial development cost — it's the cost of maintaining, auditing, and fixing those controls over years. When a managed service delivers the control as declarative configuration, the decision to adopt it is rarely about feature parity; it's about where you want to allocate your team's engineering attention. My recommendation: adopt AgentCore for new production agents, keep tools portable, and invest the saved time in observability and adversarial prompting tests.

Verdict: Adopt with Explicit Controls

Bedrock AgentCore is the right choice for financial teams that need to put AI agents into production without building and maintaining a custom orchestration runtime. The decision is not binary — it's about recognizing that AgentCore's value lies in the operational controls (Gateway, Guardrails, Memory with CMK), not just the execution runtime. The condition for adoption is clear: configure Guardrails before any testing with real data, implement idempotency in all tools with side effects, request concurrent session quota increases before go-live, and monitor TurnsPerSession and TokensPerSession as first-class SLO metrics. Lock-in is real but manageable if tools are kept portable. For teams that lack the capacity to build and operate a custom agent runtime — which is most teams — AgentCore is the correct architectural decision in 2025.

References

Amazon Bedrock AgentCore Developer Guide
Amazon Bedrock Guardrails Configuration Reference
Amazon Bedrock AgentCore Memory Developer Guide
Building AI agents with Amazon Bedrock AgentCore (AWS ML Blog)
AWS Well-Architected Framework: Machine Learning Lens
Idempotency for AWS Lambda (Powertools for AWS Lambda)
Architecture Decision Records (Michael Nygard)
Amazon OpenSearch Service: k-NN Search with HNSW

#bedrock #agentcore #ai-agents #adr #financial-grade #guardrails #observability #aws

Analyzed source: Building AI agents with Amazon Bedrock AgentCore
