Who is Fernando F. Azevedo?

Fernando F. Azevedo is a Senior Solutions Architect at Banco Itaú with 16+ years of experience across AWS, event-driven architecture, DevSecOps, Data Mesh, AI and financial systems.

What technical topics does Fernando work with?

Fernando works with AWS, Kubernetes, Kafka, Data Mesh, Amazon Bedrock, RAG, DevSecOps, observability, financial systems and architecture communication using C4, ADRs and trade-off analysis.

Is Fernando available for professional conversations?

Fernando is currently building at Banco Itaú and is open to thoughtful conversations about architecture, cloud, AI, engineering leadership, community, podcasts and technical collaboration.

PlaybookIA / AWS

Playbook: Which AWS AI Service to Use — The Decision Tree

Feb 10, 2026 9 min AI-assisted

Listen to study

generated on play

Generated only on first play

On demand

0:000:00

Speed

The MP3 is saved to S3 after the first play.

Bedrock, SageMaker, Amazon Q, AgentCore, and self-hosted GPU solve different problems — but hype pushes everyone toward Bedrock by default. This playbook delivers a decision tree, a trade-off matrix, and rules of thumb so you choose by the problem, not the trend.

The wrong question is 'which AWS AI service is best?' The right question is 'what problem am I solving?' Five services, five distinct contexts — and the wrong choice costs months of rework, surprise bills, or a system that never needed to be built from scratch.

What you'll be able to decide after this playbook

If you need to train or serve a custom model → SageMaker AI

If you want to consume a ready-made LLM/embedding via API → Bedrock

If you're building a production agent (tools, memory, gateway, observability) → Bedrock AgentCore

If the use case is internal productivity over corporate data or AWS consoles → Amazon Q

If volume, data sovereignty, or per-token cost at scale are hard constraints → self-hosted EKS + GPU

How to avoid the three most common anti-patterns I see in real projects

Quick Reference — Services in Scope

Services covered: Amazon Bedrock, SageMaker AI, Amazon Q, Bedrock AgentCore, Self-hosted (EKS/EC2 GPU)
Domain: Generative AI and ML on AWS
Bedrock pricing model: Pay-per-token (on-demand) or Provisioned Throughput (reserved capacity)
SageMaker pricing model: Per instance-hour (training + inference) + S3/EBS storage
Amazon Q Business pricing model: Per user/month (Lite ~$3, Pro ~$20 — verify current pricing)
Bedrock AgentCore: GA 2025; managed runtime for agents with tools, memory, gateway, and native observability
Self-hosted GPU: EC2 p4d/p5/g5 or EKS + Karpenter; high fixed cost, full control

The mental model that unlocks everything: abstraction layers vs. control layers

Think of the five services as a single axis: on one end, maximum abstraction and delivery speed; on the other, maximum control and cost efficiency at scale. No point on the axis is better than another — each is optimal for a maturity stage and a set of constraints.

Amazon Q sits at the abstraction extreme: you don't write AI code, don't manage models, don't think about tokens. You connect data sources, configure permissions, and deliver productivity. It's the right service when the problem is access to corporate knowledge, not building an AI system.

Amazon Bedrock is the next step down: you consume foundation models via API (Anthropic Claude, Meta Llama, Mistral, Amazon Titan, Cohere, Stability AI, and others) without managing inference infrastructure. The contract is simple — you send tokens, receive tokens, pay per token. It's the correct starting point for most LLM production use cases.

Bedrock AgentCore is the runtime layer for when plain Bedrock isn't enough: when your agent needs persistent memory across sessions, tool orchestration with retry and timeout, a security gateway for model calls, and structured observability (traces, spans, latency metrics per step). Without AgentCore, you build all of this by hand — and you invariably build it wrong the first time.

SageMaker AI is where you go when the model you need doesn't exist in the Bedrock catalog, or when you need fine-tuning with your proprietary data, or when managed inference latency doesn't meet your SLA. It's a complete ML platform — experiments, pipelines, feature store, model registry, inference endpoints. Powerful, but expensive in engineering time.

Self-hosted EKS + GPU is the control extreme: you operate the cluster, manage CUDA drivers, configure GPU node autoscaling, monitor VRAM utilization. Cost per token can be a fraction of Bedrock at high volumes, but operational cost and availability risk are entirely yours. It makes sense when you have a specific model not in Bedrock, regulatory constraints preventing data from leaving your VPC, or volume so high that pay-per-token becomes prohibitive.

Why the 'let's go Bedrock' default is a dangerous shortcut

Bedrock is excellent — but it became the 'React' of the AWS AI world: everyone uses it regardless of the problem. I see three recurring error patterns in real projects:

Error 1 — Bedrock for internal productivity that Q already solves. A team spends weeks building a custom RAG over internal documentation, integrating with Kendra or OpenSearch, managing chunking, embeddings, and retrieval. Amazon Q Business does exactly this with native connectors for S3, SharePoint, Confluence, Salesforce, and 40+ other sources — with permission control based on IAM and Active Directory groups. The build cost isn't justified when the product already exists.

Error 2 — Plain Bedrock for complex agents. You start with a simple agent on Bedrock — one tool, one prompt, it works. Then you add session memory (you implement in DynamoDB), then retry on failing tools (you implement in the application), then observability (you instrument with X-Ray manually), then rate limiting on the gateway (you implement in API Gateway). In six months you've hand-built AgentCore, with bugs the managed service already fixed. AgentCore exists to prevent exactly this plumbing accumulation.

Error 3 — Self-hosted before validating volume. Teams that estimate Bedrock cost at scale and conclude 'it'll get expensive' jump straight to EKS + GPU without having validated the product. The operational cost of a production GPU cluster — with high availability, model updates, VRAM monitoring, spot interruption management — is real and ongoing. The rule: validate the product on Bedrock, migrate to self-hosted when the monthly Bedrock bill justifies the operational investment.

When SageMaker is non-negotiable

SageMaker is frequently underestimated in GenAI discussions because recent focus is on foundation LLMs. But there are four scenarios where it's the only correct answer:

1. Fine-tuning with sensitive proprietary data. If you need to adapt a model with data that cannot leave your AWS account — medical, financial, legal data — SageMaker offers training within your VPC, with encryption at rest and in transit, without data flowing to third-party infrastructure. Bedrock offers fine-tuning for some models, but with less control over the execution environment.

2. Models not in the Bedrock catalog. If your use case requires a specific model — a specialized domain LLM, a custom computer vision model, a time-series forecasting model — SageMaker is where you train and serve. JumpStart accelerates the starting point with pre-trained models from HuggingFace and other repositories.

3. Traditional ML pipelines with structured features. For XGBoost models, neural networks on tabular data, recommendation models — SageMaker Pipelines, Feature Store, and Model Monitor form a cohesive platform that Bedrock simply doesn't cover.

4. Inference latency with aggressive SLA. A dedicated SageMaker endpoint (non-serverless) offers predictable P99 latency because you control the instance. Bedrock on-demand has latency variability that may be unacceptable for cases like real-time scoring of financial transactions. Bedrock Provisioned Throughput mitigates this, but at a fixed cost that often makes SageMaker competitive.

The SageMaker trap is operational overhead: you manage endpoints, monitor drift, update models, configure auto-scaling. For teams without mature MLOps, this operational cost is systematically underestimated.

Head-to-Head Comparison: 5 Services Across 6 Dimensions

	Dimension	Bedrock	AgentCore	SageMaker AI	Amazon Q
Best for	Consuming LLMs/embeddings via API; fast prototype to production	Production agents with tools, memory, gateway, and observability	Training/fine-tuning custom models; traditional ML; inference with aggressive SLA	Internal productivity: Q&A on corporate docs, code assistant, AWS assistant	High volume with prohibitive per-token cost; full data sovereignty; models not available in Bedrock
Ops effort	Low — no infrastructure to manage	Low-medium — managed runtime, but you configure tools and memory	High — endpoints, scaling, drift monitoring, model updates	Very low — connector and permission setup; no AI code	Very high — GPU cluster, CUDA, autoscaling, availability, security
Cost model	Pay-per-token (variable) or Provisioned Throughput (fixed per hour)	Pay-per-use runtime + cost of underlying Bedrock model	Per instance-hour (training + inference) + storage	Per user/month (predictable, SaaS-like)	Fixed GPU instance cost (high) + ops; better TCO at very high volume
Model control	Low — you use the model as-is; fine-tuning available for some models	Low — control over orchestration, not the model	High — you train, version, monitor, and replace the model	None — model managed by AWS	Total — you choose, operate, and update any model
When NOT to use	When you need a complex agent (use AgentCore); when Q already solves it; when volume makes per-token cost prohibitive	For simple LLM calls without orchestration; for internal productivity (use Q)	For consuming foundation models without customization (use Bedrock); for internal productivity (use Q)	For custom AI systems facing external customers; when you need control over the model or orchestration flow	Before validating the product and volume; when the team lacks capacity to operate GPU infrastructure
Recommended team maturity	Any team with AWS and API experience	Team with distributed systems and agent design experience	Team with MLOps or willingness to build that capability	Any team; ideal for teams without ML engineers	Senior team with Kubernetes, GPU ops, and cluster security experience

Decision Matrix: Pros, Cons, and Verdict per Service

Amazon Bedrock

Pros

Wide model catalog (Claude, Llama, Mistral, Titan, Cohere, Stability AI)
Zero infrastructure to manage; automatic scaling
Native guardrails, VPC endpoints, integrated CloudTrail
Correct starting point for 80% of LLM use cases

Cons

Per-token cost scales linearly — no automatic economies of scale
Variable latency on on-demand; Provisioned Throughput has high fixed cost
Limited model control; fine-tuning available for few models

Correct default for most projects. Start here.

Bedrock AgentCore

Pros

Managed runtime for agents: tools, memory, gateway, observability out-of-the-box
Eliminates plumbing you'd build manually (and incorrectly) with plain Bedrock
Native integration with Bedrock ecosystem (models, guardrails, Knowledge Bases)

Cons

Relatively new service (GA 2025) — fewer documented production use cases
Adds runtime cost on top of Bedrock model cost
Abstraction may limit very specific orchestration customizations

Use when your agent has more than one tool or needs memory. Don't reinvent AgentCore.

Amazon SageMaker AI

Pros

Complete ML platform: training, feature store, pipelines, model registry, inference
Full control over the model and execution environment
Predictable inference latency with dedicated endpoints
Fine-tuning with data that stays entirely in your account

Cons

High operational overhead — you manage endpoints, scaling, drift, updates
Steep learning curve for teams without MLOps
Instance cost runs even when there's no traffic (dedicated endpoints)

Non-negotiable for custom models and traditional ML. Don't use to consume foundation LLMs.

Amazon Q

Pros

Zero AI code — native connectors for 40+ corporate data sources
Access control based on IAM and AD groups — user only sees what they have permission for
Amazon Q Developer: code assistant integrated into IDE and AWS console
Predictable cost per user/month — easy to justify to HR/finance

Cons

No AI flow customization — you accept the product as-is
Not suitable for AI systems facing external customers
Dependency on available connectors — very custom sources require development

Vastly underused. If the problem is internal productivity, Q probably already solves it.

Decision Tree: Which AWS AI Service to Use

Each node is a qualification question. Follow the edges to the leaf — the recommended service. Questions are ordered by specificity: from most restrictive (custom training) to most general (LLM consumption).

🚦 Entrada / Entry

Novo caso · de uso de IA · New AI use case

❓ Perguntas de Qualificação / Qualification Questions

Q1: Precisa treinar · ou fine-tunar · modelo próprio? · Need to train/fine-tune · custom model?
Q2: Caso de uso é · produtividade interna · (docs, código, console)? · Internal productivity · use case?
Q3: Agente com tools, · memória ou · orquestração complexa? · Agent with tools, · memory, or complex · orchestration?
Q4: Volume alto + · soberania de dados · ou custo por token · proibitivo? · High volume + data · sovereignty or · prohibitive token cost?

✅ Serviços Recomendados / Recommended Services

SageMaker AI · Treinar · Servir · MLOps · Train · Serve · MLOps
Amazon Q · Business / Developer · Produtividade interna · Internal productivity
Bedrock AgentCore · Runtime de agentes · Agent runtime
Self-hosted · EKS + GPU · Controle total · Full control
Amazon Bedrock · API de LLM/Embeddings · LLM/Embeddings API · (default correto · correct default)

Qualification Checklist: 7 Questions Before Choosing the Service

1
1. Does the model I need exist in the Bedrock catalog?
Check at aws.amazon.com/bedrock. If it doesn't exist and you can't train an alternative, go to SageMaker or self-hosted. If it exists, continue.
2
2. Is the use case internal or external?
Internal (employees, devs, operations): evaluate Amazon Q before building anything. External (customers, partners, public APIs): Q doesn't apply — continue qualification.
3
3. Do I need fine-tuning with data that can't leave my account?
If yes: SageMaker is the safest option. Bedrock offers fine-tuning for some models, but with less environment control. Document the sovereignty requirement before deciding.
4
4. Does the system have multi-tool orchestration or persistent memory?
If yes: evaluate Bedrock AgentCore before building custom orchestration. Test whether AgentCore's abstractions meet your case — in most cases, they do.
5
5. What is the estimated token volume per month?
Calculate the monthly cost on Bedrock with the target model (aws.amazon.com/bedrock/pricing). If the cost is acceptable, use Bedrock. If prohibitive, compare with self-hosted TCO (GPU instance + ops + eng). Only migrate if the delta justifies it.
6
6. Are there regulatory data sovereignty constraints (LGPD, GDPR, financial sector)?
Bedrock operates within your AWS region and doesn't use your data to train models (by default). For harder constraints (data that cannot leave the VPC under any circumstances), self-hosted or SageMaker with VPC endpoint are the options.
7
7. Does the team have the capacity to operate the chosen service?
Be honest. Self-hosted without a senior platform team is guaranteed technical debt. SageMaker without MLOps is an orphaned endpoint in 6 months. Choose the service the team can operate with excellence, not the one that sounds most sophisticated.

Anti-patterns I repeatedly see in production

1. 'Let's go Bedrock' as an unreflective default. Bedrock is the correct starting point for LLMs — but not for internal productivity (use Q), not for complex agents without AgentCore (you'll reinvent the runtime), and not for traditional ML (use SageMaker). The question isn't 'Bedrock or not?' — it's 'what problem am I solving?' 2. Building custom RAG when Q already solves it. I've seen teams spend 6+ weeks building ingestion, chunking, embedding, and retrieval pipelines for internal documentation — when Amazon Q Business with an S3 or SharePoint connector would have delivered in days. Before building RAG, answer: 'Does Q Business cover this case?' Most of the time, it does. 3. Self-hosted before validating volume and product. Teams that estimate Bedrock cost at scale and jump to EKS + GPU without having validated that the product has traction. The cost of operating a GPU cluster in production — HA, updates, security, CUDA — is systematically underestimated. Validate the product on Bedrock. Migrate when the monthly bill justifies the operational investment. 4. Ignoring AgentCore and building orchestration by hand. Session memory in DynamoDB, tool retry in Lambda, rate limiting in API Gateway, manual traces in X-Ray — you're hand-building AgentCore, with bugs the managed service already fixed. If your agent has more than one tool, evaluate AgentCore before writing plumbing code.

Rule of Thumb

Start on Bedrock. Move up to AgentCore when it becomes an agent. Go to SageMaker or self-hosted only when cost or control justify it with real numbers. And before anything else: if the problem is internal productivity, ask whether Q already solves it — in most cases, it does.

My Senior Take

Senior Solutions Architect

After 16 years building production systems — including financial platforms where cost, latency, and data sovereignty are hard constraints — what surprises me most in the generative AI space is how quickly teams jump to the most complex solution without qualifying the problem. Amazon Q Business is the most underused service I know on AWS today. Most companies have an internal knowledge access problem — scattered documentation, outdated wikis, slow onboarding — and the instinctive answer is 'let's build a RAG with Bedrock'. Q solves this with native connectors, granular permission control, and zero AI code. The opportunity cost of not evaluating Q before building is real. At the other extreme, I see teams reaching for self-hosted before having 1000 active users. The argument is always 'per-token cost will scale'. Yes, it will — but the cost of operating a GPU cluster with high availability, model updates, cluster security, and spot interruption management also scales, and it's a fixed cost you pay regardless of traffic. Do the break-even calculation honestly before committing the team to GPU infrastructure operations. Bedrock AgentCore is the addition that most changes the equation for teams building agents in 2025. Before it, the choice was between plain Bedrock (you build the runtime) and external frameworks like LangChain or LlamaIndex (you manage dependencies and versions). AgentCore solves the managed runtime with native observability — and for financial or regulated systems, the traceability of each agent step is not optional. My practical recommendation: use the decision tree in this playbook as a checklist in every AI project inception meeting. The five qualification questions take less than 10 minutes and prevent months of rework.

Verdict

There is no universally correct AWS AI service — there is the right service for the right problem, operated by the right team. The decision tree is simple: internal productivity goes to Q; LLM via API goes to Bedrock; agent with orchestration goes to AgentCore; custom model goes to SageMaker; extreme volume with sovereignty goes to self-hosted. The mistake isn't choosing the wrong service out of naivety — it's choosing by hype, without asking the five qualification questions. Choose by the problem. The right service is the one your team can operate with excellence and that solves the user's problem at the lowest total cost — not the most sophisticated, not the newest, not the one that appeared at the last re:Invent.

References

AWS — Amazon Bedrock AWS — Amazon SageMaker AI AWS — Amazon Q AWS — Amazon Bedrock AgentCore Amazon Bedrock — Pricing

#aws#bedrock#sagemaker#amazon-q#agentcore#genai#decision-tree#architecture

Case sources

AWS — Amazon Bedrock AWS — Amazon SageMaker AI AWS — Amazon Q AWS — Amazon Bedrock AgentCore Amazon Bedrock — Pricing

Liked this study? Get the next one.

Post-mortems, ADRs and architecture deep dives in your inbox — the way an architect reads them.

No spam · unsubscribe anytime

Written with AI assistance from the public case and my architect's reading.

Ask Fernando about this

Get a focused answer about this study from my AI assistant, grounded in my work.

Join the conversation

Verify your email to join in — you'll also get the newsletter. No password.

PlaybookIA / AWS

Playbook: Which AWS AI Service to Use — The Decision Tree

Feb 10, 2026 9 min AI-assisted

Listen to study

generated on play

Generated only on first play

On demand

0:000:00

Speed

The MP3 is saved to S3 after the first play.

What you'll be able to decide after this playbook

If you need to train or serve a custom model → SageMaker AI

If you want to consume a ready-made LLM/embedding via API → Bedrock

If you're building a production agent (tools, memory, gateway, observability) → Bedrock AgentCore

If the use case is internal productivity over corporate data or AWS consoles → Amazon Q

If volume, data sovereignty, or per-token cost at scale are hard constraints → self-hosted EKS + GPU

How to avoid the three most common anti-patterns I see in real projects

Quick Reference — Services in Scope

Services covered: Amazon Bedrock, SageMaker AI, Amazon Q, Bedrock AgentCore, Self-hosted (EKS/EC2 GPU)
Domain: Generative AI and ML on AWS
Bedrock pricing model: Pay-per-token (on-demand) or Provisioned Throughput (reserved capacity)
SageMaker pricing model: Per instance-hour (training + inference) + S3/EBS storage
Amazon Q Business pricing model: Per user/month (Lite ~$3, Pro ~$20 — verify current pricing)
Bedrock AgentCore: GA 2025; managed runtime for agents with tools, memory, gateway, and native observability
Self-hosted GPU: EC2 p4d/p5/g5 or EKS + Karpenter; high fixed cost, full control

The mental model that unlocks everything: abstraction layers vs. control layers

Why the 'let's go Bedrock' default is a dangerous shortcut

Bedrock is excellent — but it became the 'React' of the AWS AI world: everyone uses it regardless of the problem. I see three recurring error patterns in real projects:

When SageMaker is non-negotiable

SageMaker is frequently underestimated in GenAI discussions because recent focus is on foundation LLMs. But there are four scenarios where it's the only correct answer:

Head-to-Head Comparison: 5 Services Across 6 Dimensions

	Dimension	Bedrock	AgentCore	SageMaker AI	Amazon Q
Best for	Consuming LLMs/embeddings via API; fast prototype to production	Production agents with tools, memory, gateway, and observability	Training/fine-tuning custom models; traditional ML; inference with aggressive SLA	Internal productivity: Q&A on corporate docs, code assistant, AWS assistant	High volume with prohibitive per-token cost; full data sovereignty; models not available in Bedrock
Ops effort	Low — no infrastructure to manage	Low-medium — managed runtime, but you configure tools and memory	High — endpoints, scaling, drift monitoring, model updates	Very low — connector and permission setup; no AI code	Very high — GPU cluster, CUDA, autoscaling, availability, security
Cost model	Pay-per-token (variable) or Provisioned Throughput (fixed per hour)	Pay-per-use runtime + cost of underlying Bedrock model	Per instance-hour (training + inference) + storage	Per user/month (predictable, SaaS-like)	Fixed GPU instance cost (high) + ops; better TCO at very high volume
Model control	Low — you use the model as-is; fine-tuning available for some models	Low — control over orchestration, not the model	High — you train, version, monitor, and replace the model	None — model managed by AWS	Total — you choose, operate, and update any model
When NOT to use	When you need a complex agent (use AgentCore); when Q already solves it; when volume makes per-token cost prohibitive	For simple LLM calls without orchestration; for internal productivity (use Q)	For consuming foundation models without customization (use Bedrock); for internal productivity (use Q)	For custom AI systems facing external customers; when you need control over the model or orchestration flow	Before validating the product and volume; when the team lacks capacity to operate GPU infrastructure
Recommended team maturity	Any team with AWS and API experience	Team with distributed systems and agent design experience	Team with MLOps or willingness to build that capability	Any team; ideal for teams without ML engineers	Senior team with Kubernetes, GPU ops, and cluster security experience

Decision Matrix: Pros, Cons, and Verdict per Service

Amazon Bedrock

Pros

Wide model catalog (Claude, Llama, Mistral, Titan, Cohere, Stability AI)
Zero infrastructure to manage; automatic scaling
Native guardrails, VPC endpoints, integrated CloudTrail
Correct starting point for 80% of LLM use cases

Cons

Per-token cost scales linearly — no automatic economies of scale
Variable latency on on-demand; Provisioned Throughput has high fixed cost
Limited model control; fine-tuning available for few models

Correct default for most projects. Start here.

Bedrock AgentCore

Pros

Managed runtime for agents: tools, memory, gateway, observability out-of-the-box
Eliminates plumbing you'd build manually (and incorrectly) with plain Bedrock
Native integration with Bedrock ecosystem (models, guardrails, Knowledge Bases)

Cons

Relatively new service (GA 2025) — fewer documented production use cases
Adds runtime cost on top of Bedrock model cost
Abstraction may limit very specific orchestration customizations

Use when your agent has more than one tool or needs memory. Don't reinvent AgentCore.

Amazon SageMaker AI

Pros

Complete ML platform: training, feature store, pipelines, model registry, inference
Full control over the model and execution environment
Predictable inference latency with dedicated endpoints
Fine-tuning with data that stays entirely in your account

Cons

High operational overhead — you manage endpoints, scaling, drift, updates
Steep learning curve for teams without MLOps
Instance cost runs even when there's no traffic (dedicated endpoints)

Non-negotiable for custom models and traditional ML. Don't use to consume foundation LLMs.

Amazon Q

Pros

Zero AI code — native connectors for 40+ corporate data sources
Access control based on IAM and AD groups — user only sees what they have permission for
Amazon Q Developer: code assistant integrated into IDE and AWS console
Predictable cost per user/month — easy to justify to HR/finance

Cons

No AI flow customization — you accept the product as-is
Not suitable for AI systems facing external customers
Dependency on available connectors — very custom sources require development

Vastly underused. If the problem is internal productivity, Q probably already solves it.

Decision Tree: Which AWS AI Service to Use

🚦 Entrada / Entry

Novo caso · de uso de IA · New AI use case

❓ Perguntas de Qualificação / Qualification Questions

Q1: Precisa treinar · ou fine-tunar · modelo próprio? · Need to train/fine-tune · custom model?
Q2: Caso de uso é · produtividade interna · (docs, código, console)? · Internal productivity · use case?
Q3: Agente com tools, · memória ou · orquestração complexa? · Agent with tools, · memory, or complex · orchestration?
Q4: Volume alto + · soberania de dados · ou custo por token · proibitivo? · High volume + data · sovereignty or · prohibitive token cost?

✅ Serviços Recomendados / Recommended Services

SageMaker AI · Treinar · Servir · MLOps · Train · Serve · MLOps
Amazon Q · Business / Developer · Produtividade interna · Internal productivity
Bedrock AgentCore · Runtime de agentes · Agent runtime
Self-hosted · EKS + GPU · Controle total · Full control
Amazon Bedrock · API de LLM/Embeddings · LLM/Embeddings API · (default correto · correct default)

Qualification Checklist: 7 Questions Before Choosing the Service

1
1. Does the model I need exist in the Bedrock catalog?
Check at aws.amazon.com/bedrock. If it doesn't exist and you can't train an alternative, go to SageMaker or self-hosted. If it exists, continue.
2
2. Is the use case internal or external?
Internal (employees, devs, operations): evaluate Amazon Q before building anything. External (customers, partners, public APIs): Q doesn't apply — continue qualification.
3
3. Do I need fine-tuning with data that can't leave my account?
If yes: SageMaker is the safest option. Bedrock offers fine-tuning for some models, but with less environment control. Document the sovereignty requirement before deciding.
4
4. Does the system have multi-tool orchestration or persistent memory?
If yes: evaluate Bedrock AgentCore before building custom orchestration. Test whether AgentCore's abstractions meet your case — in most cases, they do.
5
5. What is the estimated token volume per month?
Calculate the monthly cost on Bedrock with the target model (aws.amazon.com/bedrock/pricing). If the cost is acceptable, use Bedrock. If prohibitive, compare with self-hosted TCO (GPU instance + ops + eng). Only migrate if the delta justifies it.
6
6. Are there regulatory data sovereignty constraints (LGPD, GDPR, financial sector)?
Bedrock operates within your AWS region and doesn't use your data to train models (by default). For harder constraints (data that cannot leave the VPC under any circumstances), self-hosted or SageMaker with VPC endpoint are the options.
7
7. Does the team have the capacity to operate the chosen service?
Be honest. Self-hosted without a senior platform team is guaranteed technical debt. SageMaker without MLOps is an orphaned endpoint in 6 months. Choose the service the team can operate with excellence, not the one that sounds most sophisticated.

Anti-patterns I repeatedly see in production

Rule of Thumb

My Senior Take

Senior Solutions Architect

Verdict

References

AWS — Amazon Bedrock AWS — Amazon SageMaker AI AWS — Amazon Q AWS — Amazon Bedrock AgentCore Amazon Bedrock — Pricing

#aws#bedrock#sagemaker#amazon-q#agentcore#genai#decision-tree#architecture

Case sources

AWS — Amazon Bedrock AWS — Amazon SageMaker AI AWS — Amazon Q AWS — Amazon Bedrock AgentCore Amazon Bedrock — Pricing

Liked this study? Get the next one.

Post-mortems, ADRs and architecture deep dives in your inbox — the way an architect reads them.

No spam · unsubscribe anytime

Written with AI assistance from the public case and my architect's reading.

Ask Fernando about this

Get a focused answer about this study from my AI assistant, grounded in my work.

Join the conversation

Verify your email to join in — you'll also get the newsletter. No password.

Listen to study

What you'll be able to decide after this playbook

Quick Reference — Services in Scope

The mental model that unlocks everything: abstraction layers vs. control layers

Why the 'let's go Bedrock' default is a dangerous shortcut

When SageMaker is non-negotiable

Head-to-Head Comparison: 5 Services Across 6 Dimensions

Decision Matrix: Pros, Cons, and Verdict per Service

Amazon Bedrock

Bedrock AgentCore

Amazon SageMaker AI

Amazon Q

Decision Tree: Which AWS AI Service to Use

Qualification Checklist: 7 Questions Before Choosing the Service

1. Does the model I need exist in the Bedrock catalog?

2. Is the use case internal or external?

3. Do I need fine-tuning with data that can't leave my account?

4. Does the system have multi-tool orchestration or persistent memory?

5. What is the estimated token volume per month?

6. Are there regulatory data sovereignty constraints (LGPD, GDPR, financial sector)?

7. Does the team have the capacity to operate the chosen service?

Anti-patterns I repeatedly see in production

Rule of Thumb

Verdict

References

Ask Fernando about this

Join the conversation

Listen to study

What you'll be able to decide after this playbook

Quick Reference — Services in Scope

The mental model that unlocks everything: abstraction layers vs. control layers

Why the 'let's go Bedrock' default is a dangerous shortcut

When SageMaker is non-negotiable

Head-to-Head Comparison: 5 Services Across 6 Dimensions

Decision Matrix: Pros, Cons, and Verdict per Service

Amazon Bedrock

Bedrock AgentCore

Amazon SageMaker AI

Amazon Q

Decision Tree: Which AWS AI Service to Use

Qualification Checklist: 7 Questions Before Choosing the Service

1. Does the model I need exist in the Bedrock catalog?

2. Is the use case internal or external?

3. Do I need fine-tuning with data that can't leave my account?

4. Does the system have multi-tool orchestration or persistent memory?

5. What is the estimated token volume per month?

6. Are there regulatory data sovereignty constraints (LGPD, GDPR, financial sector)?

7. Does the team have the capacity to operate the chosen service?

Anti-patterns I repeatedly see in production

Rule of Thumb

Verdict

References

Ask Fernando about this

Join the conversation