Who is Fernando F. Azevedo?

Fernando F. Azevedo is a Senior Solutions Architect at Banco Itaú with 16+ years of experience across AWS, event-driven architecture, DevSecOps, Data Mesh, AI and financial systems.

What technical topics does Fernando work with?

Fernando works with AWS, Kubernetes, Kafka, Data Mesh, Amazon Bedrock, RAG, DevSecOps, observability, financial systems and architecture communication using C4, ADRs and trade-off analysis.

Is Fernando available for professional conversations?

Fernando is currently building at Banco Itaú and is open to thoughtful conversations about architecture, cloud, AI, engineering leadership, community, podcasts and technical collaboration.


Playbook: Vector Store on AWS — OpenSearch Serverless vs Aurora pgvector vs S3 Vectors

Jul 25, 2025 8 min AI-assisted

Picking the wrong vector store is where your RAG bill explodes, or where p99 latency becomes unacceptable in production. This playbook maps the three main AWS paths (OpenSearch Serverless, Aurora pgvector, S3 Vectors) across three real axes: required latency, cost model, and operational burden. Leave with a defensible decision, not the tutorial default.

Every RAG tutorial points to a vector store. No tutorial tells you what happens when traffic disappears at 3 AM and you're still paying for idle OCUs, or when your HNSW index grows without tuning and latency silently doubles. The vector store choice is a cost, latency, and operations decision — not a feature list comparison. This playbook gives you the three decision axes and a clear path for each workload profile.

What you'll be able to decide after this playbook

Which vector store to choose given your required p99 latency, vector volume, and access profile (hot vs archive)

When OpenSearch Serverless idle cost is justified — and when it kills your budget

If you already have Aurora PostgreSQL, whether it's worth migrating to a dedicated store or not

The exact workload profile S3 Vectors was built for (and what it is not)

The three anti-patterns that surface in production and how to avoid them before deploy

Quick Reference — the three stores in numbers

OpenSearch Serverless, minimum cost (OCU): 0.24 USD/OCU-hour; ~$345/month at rest corresponds to a 2-OCU floor (us-east-1, estimate); a floor of 2 indexing OCUs + 2 search OCUs would be ~$691/month
Aurora pgvector, supported dimensions: up to 16,000 dimensions stored per vector; HNSW and IVFFlat indexes cover up to 2,000 dimensions for the vector type (up to 4,000 with halfvec, pgvector ≥ 0.7)
S3 Vectors, pricing model: charged per GB stored + per query request; no idle compute cost (GA 2025)
Native integration with Bedrock Knowledge Bases: OpenSearch Serverless: yes (native). Aurora pgvector: yes (via RDS Data API). S3 Vectors: yes (GA 2025)
Hybrid search (vector + lexical): OpenSearch Serverless: yes (BM25 + kNN native). Aurora pgvector: manual (pg_trgm + pgvector). S3 Vectors: not native
Operational model: OpenSearch Serverless: fully managed. Aurora pgvector: managed (RDS), but index tuning is yours. S3 Vectors: fully managed

The mental model that unlocks the decision

Think of vector stores the way you think about databases: there is no best — there is the right one for the access profile. The classic mistake is treating the choice as a features decision ("which supports more dimensions?") when in practice it is a decision across three orthogonal axes:

Axis 1 — Required latency (p99): What is your application's SLA? An interactive chatbot tolerates ~200ms p99 on vector search. A batch reranking pipeline can tolerate seconds. An autonomous agent making multiple searches per turn needs consistent latency, not just a low median. p99 is the number that matters, not p50.

Axis 2: Cost model (idle vs volume). OpenSearch Serverless charges per OCU-hour regardless of load: you pay the floor, not usage. Aurora pgvector charges for the RDS instance (which you probably already have). S3 Vectors charges for storage and per query: near-zero cost at rest, growing with query volume. If your load is highly variable or seasonal, the cost model dominates the decision.

Axis 3 — Operational burden (managed vs you tune): OpenSearch Serverless abstracts sharding, replication, and scaling. S3 Vectors is object + query, no visible infrastructure. Aurora pgvector gives you full control — and full responsibility: you need to choose between HNSW and IVFFlat, set m and ef_construction, monitor index bloat, and understand that a poorly timed VACUUM can degrade latency in production.

Most architecture mistakes I see come from ignoring Axis 2 (idle cost) or Axis 3 (index tuning). Tutorials use OpenSearch Serverless because it's the lowest-friction path for a demo. Production is different.
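To make Axis 1 concrete, here is a minimal sketch of why p99 and p50 tell different stories, and why an agent that runs several searches per turn feels the tail more often. The latency samples are made up, the function names are illustrative, and the per-turn estimate assumes searches are independent, which real systems only approximate.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of values at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


def slow_turn_probability(searches_per_turn, tail_fraction=0.01):
    """Chance that at least one search in a turn lands in the slow tail.

    Assumes independent searches (a simplification).
    """
    return 1 - (1 - tail_fraction) ** searches_per_turn


# Hypothetical measurements: 98 fast searches and 2 slow ones, in milliseconds.
latencies_ms = [20] * 98 + [900] * 2

p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)
print(f"p50={p50}ms p99={p99}ms")
print(f"turns with a slow search (5 searches/turn): {slow_turn_probability(5):.1%}")
```

With these made-up samples the median looks healthy while the p99 is 45 times higher, and with five searches per turn roughly one turn in twenty touches the tail.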

Head-to-head comparison — the three AWS vector stores

Criterion	OpenSearch Serverless	Aurora pgvector	S3 Vectors
Best for	Agentic RAG, hybrid search, rich filters, variable load with ms SLA	You already have Postgres; vectors alongside relational data; joins with business metadata	Huge corpus (billions of vectors), sporadic access, storage cost dominant
Typical p99 latency (ANN search)	10–80ms (warmed load, adequate OCUs)	5–150ms (depends on index, instance, and vacuum)	50–500ms+ (archive profile; cold start possible)
Cost model	OCU-hour (minimum ~$345/month idle); auto-scales up	RDS instance + storage; you already pay if the DB exists	GB stored + per query; near-zero idle cost
Hybrid search (vector + lexical)	Native (BM25 + kNN, normalization pipeline)	Manual (pg_trgm + pgvector; you write the SQL)	Not natively available
Metadata filters	Rich (pre-filter and post-filter, nested fields)	Full via SQL (native WHERE clause)	Supported by bucket/prefix; advanced filters limited
Operational burden	Low (fully managed; no exposed index tuning)	Medium-high (HNSW/IVFFlat tuning, vacuum, bloat, instance sizing)	Low (fully managed; object + query model)
Bedrock Knowledge Bases integration	Native and mature (default option in tutorials)	Yes (via RDS Data API)	Yes (GA 2025)
When NOT to use	Low/sporadic load where minimum cost isn't justified; tight budget in dev/staging	No existing Postgres; team with no index tuning experience; volume > 100M vectors without sharding	ms SLA for interactive production; hybrid search required; complex metadata filters

OpenSearch Serverless: the power and the trap of the cost floor

OpenSearch Serverless with the vector engine is genuinely the most capable store for production RAG with complex requirements. Native hybrid search (BM25 + kNN with normalization pipeline), pre and post filters, direct Bedrock Knowledge Bases integration, and automatic OCU scaling are real differentiators — not marketing.
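For readers who have not combined lexical and vector scores before, the sketch below shows the general idea behind a normalization step: BM25 scores and kNN similarity scores live on different scales, so each list is rescaled to 0..1 before a weighted average. This is an illustration of the technique only, not OpenSearch's implementation; the function names, the weight, and the sample scores are all assumptions.

```python
def min_max(scores):
    """Rescale a {doc_id: score} mapping to the 0..1 range."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def hybrid_rank(lexical, vector, vector_weight=0.5):
    """Rank documents by a weighted mean of normalized lexical and vector scores.

    A document missing from one list contributes 0 for that list.
    """
    lex_n, vec_n = min_max(lexical), min_max(vector)
    combined = {
        doc: (1 - vector_weight) * lex_n.get(doc, 0.0) + vector_weight * vec_n.get(doc, 0.0)
        for doc in set(lex_n) | set(vec_n)
    }
    # Highest combined score first; ties broken by document id for determinism.
    return sorted(combined, key=lambda doc: (-combined[doc], doc))


# Hypothetical scores: BM25 is unbounded, cosine similarity is at most 1.
bm25_scores = {"a": 12.0, "b": 3.0, "c": 7.5}
knn_scores = {"b": 0.91, "c": 0.88, "d": 0.50}

print(hybrid_rank(bm25_scores, knn_scores))
```

Document "c" wins here because it scores reasonably well in both lists, which is the behavior hybrid search is meant to deliver and the part you would have to write and maintain yourself on a store without native support.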

The problem is the cost model. A vector search collection carries a minimum OCU allocation that bills whether or not you query it. At $0.24/OCU-hour, ~$345/month at rest corresponds to 2 OCUs running all month (2 × 0.24 × 720 ≈ $345.60); a floor of 2 indexing OCUs plus 2 search OCUs would be roughly double that, about $691/month. Check which minimum applies to your collection settings before budgeting. In development, staging, or applications with low and variable load, that floor is unacceptable. I've seen teams running three environments (dev, staging, prod) with OpenSearch Serverless and paying ~$1,000/month before processing a single real vector.
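The floor is simple arithmetic, and it is worth recomputing for your own OCU count and number of environments. A minimal sketch, assuming a 720-hour month and the $0.24/OCU-hour price quoted above; the function name is illustrative.

```python
HOURS_PER_MONTH = 720  # assumption: 30-day month


def monthly_floor_usd(ocus, usd_per_ocu_hour=0.24, environments=1):
    """Idle cost of keeping `ocus` OCUs provisioned all month in each environment."""
    return round(ocus * usd_per_ocu_hour * HOURS_PER_MONTH * environments, 2)


print(monthly_floor_usd(2))                  # one environment, 2-OCU floor
print(monthly_floor_usd(4))                  # one environment, 2 indexing + 2 search OCUs
print(monthly_floor_usd(2, environments=3))  # dev + staging + prod
```

Presenting this number as a fixed line item, separately from any usage-driven cost, makes the budget conversation much easier.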

Automatic scaling is real but asymmetric: it scales up fast when load increases and scales down slowly. Load spikes can provision additional OCUs that take hours to scale back down. Monitor the SearchOCU and IndexingOCU metrics in CloudWatch and configure billing alerts before going to production.
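One way to act on this is an alarm on OCU consumption. The sketch below only builds the keyword arguments for CloudWatch's PutMetricAlarm call so it runs without AWS credentials. The namespace `AWS/AOSS`, the metric name, and the `ClientId` dimension are my assumptions about how OpenSearch Serverless publishes account-level OCU metrics; confirm them in the CloudWatch console before relying on this. The account ID and SNS topic ARN are placeholders.

```python
def ocu_alarm_params(metric_name, threshold_ocus, account_id, sns_topic_arn):
    """Build PutMetricAlarm arguments for an OCU metric (names are assumptions, see above)."""
    return {
        "AlarmName": f"aoss-{metric_name}-above-{threshold_ocus}",
        "Namespace": "AWS/AOSS",                                    # assumed namespace
        "MetricName": metric_name,                                  # e.g. "SearchOCU"
        "Dimensions": [{"Name": "ClientId", "Value": account_id}],  # assumed dimension
        "Statistic": "Maximum",
        "Period": 300,              # evaluate 5-minute windows
        "EvaluationPeriods": 3,     # fire after 15 minutes above the threshold
        "Threshold": float(threshold_ocus),
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],
    }


params = ocu_alarm_params(
    metric_name="SearchOCU",
    threshold_ocus=2,
    account_id="123456789012",  # placeholder account ID
    sns_topic_arn="arn:aws:sns:us-east-1:123456789012:billing-alerts",  # placeholder
)
print(params["AlarmName"])
```

With boto3 installed and credentials configured, the dictionary can be passed as `boto3.client("cloudwatch").put_metric_alarm(**params)`.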

For the right profile — agentic RAG with multiple searches per turn, hybrid search, rich filters, reasonably constant load above the floor — it's the correct choice and the cost is justified. For everything else, evaluate the other two paths first.

Aurora pgvector: the power of Postgres — and the tuning nobody does

If you already operate Aurora PostgreSQL, adding pgvector is the lowest-friction decision possible: one extension, one column type, one index. You keep ACID transactions, joins with relational data, row-level security access control, and the entire toolchain your team already knows. For cases where vectors need to live alongside relational metadata — versioned documents, per-user permissions, joins with business tables — this co-location eliminates an entire class of consistency problems.

What most teams ignore is index tuning. pgvector supports two types:

IVFFlat: divides the vector space into lists (lists), searches the closest ones (probes). Faster to build, less precise. Requires the index to be built after data is loaded (otherwise lists are unbalanced). Critical parameter: ivfflat.probes — increasing it improves recall but hurts latency.

HNSW: hierarchical navigable small world graph. Better recall and latency, but consumes more memory and takes longer to build. Critical parameters: m (connections per node, default 16) and ef_construction (candidate queue size during build, default 64). For production with ms SLA, HNSW is the right choice, but you need to size shared_buffers so the index fits in memory at query time and maintenance_work_mem so the index build does not spill to disk.
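The two index types above map to two `CREATE INDEX` statements. The sketch below generates them as strings so the parameters are explicit and reviewable; the table name `documents`, the column name `embedding`, and the cosine operator class are hypothetical choices for illustration, and the helper names are mine. Run the resulting SQL with whatever Postgres client you already use.

```python
import re

_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def _check_ident(name):
    """Reject anything that is not a plain SQL identifier."""
    if not _IDENT.fullmatch(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return name


def hnsw_index_sql(table, column, m=16, ef_construction=64, opclass="vector_cosine_ops"):
    """HNSW index with the build parameters spelled out (pgvector defaults shown)."""
    return (
        f"CREATE INDEX ON {_check_ident(table)} USING hnsw "
        f"({_check_ident(column)} {_check_ident(opclass)}) "
        f"WITH (m = {int(m)}, ef_construction = {int(ef_construction)});"
    )


def ivfflat_index_sql(table, column, lists, opclass="vector_cosine_ops"):
    """IVFFlat index; build it after the table is loaded so the lists are balanced."""
    return (
        f"CREATE INDEX ON {_check_ident(table)} USING ivfflat "
        f"({_check_ident(column)} {_check_ident(opclass)}) "
        f"WITH (lists = {int(lists)});"
    )


print(hnsw_index_sql("documents", "embedding"))
print(ivfflat_index_sql("documents", "embedding", lists=1000))
# Query-time knobs are session settings, for example:
print("SET hnsw.ef_search = 100;")
print("SET ivfflat.probes = 10;")
```

Keeping these statements in version control next to the chosen parameter values doubles as documentation of the tuning decision.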

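The silent problem is index bloat: frequent inserts, updates, and deletes leave dead entries in the HNSW index. Periodic REINDEX INDEX CONCURRENTLY is necessary under continuous write loads. And a poorly configured autovacuum, while it does not block reads, can compete with search queries for I/O and CPU at critical moments, because vacuuming a large HNSW index is slow.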

Practical rule: if you don't have someone on the team who knows what ef_search is and when to adjust it, think twice before choosing pgvector for a p99 < 100ms SLA in production.
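Knowing when to adjust ef_search in practice means measuring recall against exact search and picking the smallest value that meets both the recall target and the latency budget. A minimal sketch of that loop follows; the measurements are invented for illustration, and in a real run they would come from executing the same queries with and without the index.

```python
def recall_at_k(approx_ids, exact_ids, k):
    """Fraction of the exact top-k results that the approximate search also returned."""
    if k <= 0:
        raise ValueError("k must be positive")
    exact_top = set(exact_ids[:k])
    if not exact_top:
        return 0.0
    hits = sum(1 for doc_id in approx_ids[:k] if doc_id in exact_top)
    return hits / len(exact_top)


def pick_ef_search(measurements, min_recall, max_p99_ms):
    """Smallest ef_search meeting both targets, or None if no setting does.

    `measurements` maps ef_search -> (mean recall@k, p99 latency in ms).
    """
    for ef_search in sorted(measurements):
        recall, p99_ms = measurements[ef_search]
        if recall >= min_recall and p99_ms <= max_p99_ms:
            return ef_search
    return None


# One query: ANN returned 4 of the 5 true nearest neighbours.
print(recall_at_k([7, 3, 9, 1, 4], [7, 3, 1, 4, 8], k=5))

# Hypothetical sweep results, not benchmarks of any real system.
sweep = {40: (0.90, 12.0), 100: (0.96, 21.0), 200: (0.985, 44.0)}
print(pick_ef_search(sweep, min_recall=0.95, max_p99_ms=100.0))
```

If `pick_ef_search` returns None, no query-time setting satisfies both constraints, and the conversation moves to build parameters, instance size, or a different store.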

Decision Matrix — which vector store for your case

OpenSearch Serverless (Vector Engine)

Pros

Native hybrid search (BM25 + kNN) without extra code
Rich metadata filters (pre/post-filter) native
Automatic OCU scaling; no exposed index tuning
Native and mature integration with Bedrock Knowledge Bases
Ideal for agentic RAG with multiple searches per turn

Cons

Minimum cost ~$345/month even without queries (2-OCU floor at $0.24/OCU-hour; a 2+2 OCU floor would be ~$691/month)
Asymmetric scale-down: scales up fast, down slowly
No access to underlying index; limited recall debugging
Prohibitive cost for dev/staging if not shared

USE when: production RAG with hybrid search, complex filters, constant load above cost floor. DON'T USE when: sporadic load, tight budget, or you only need simple similarity.

Aurora PostgreSQL + pgvector

Pros

Zero additional cost if Aurora already exists in the stack
Native joins with relational data (metadata, permissions, versions)
ACID, row-level security, familiar Postgres toolchain
Full index control (HNSW vs IVFFlat, parameters)
Stores up to 16,000 dimensions per vector; HNSW/IVFFlat indexes cover up to 2,000 (up to 4,000 with halfvec, pgvector ≥ 0.7)

Cons

Index tuning is your responsibility (HNSW params, vacuum, bloat)
No native hybrid search; pg_trgm is a workaround, not a solution
Vector scalability limited by RDS instance (no automatic sharding)
Index bloat under continuous write loads requires active maintenance

USE when: Aurora already exists, vectors need relational joins, team has Postgres tuning experience. DON'T USE when: you need real hybrid search, volume > 50-100M vectors, or team lacks index expertise.

S3 Vectors

Pros

Near-zero idle cost (pays per GB + per query, not per compute)
Scales to billions of vectors without infrastructure management
Fully managed; simple object + query model
Integration with Bedrock Knowledge Bases (GA 2025)
Ideal for batch pipelines, archive corpora, historical embeddings

Cons

Higher p99 latency; not suitable for ms SLA in interactive production
No native hybrid search
Advanced metadata filters limited compared to OpenSearch
Newer product (GA 2025); tooling ecosystem still maturing

USE when: huge volume, sporadic access, storage cost is the driver, batch pipeline or offline RAG. DON'T USE when: ms latency is a requirement, hybrid search is needed, or the application is real-time interactive.

How to decide: 5 questions in order

Step 1: What is your required p99 latency?
If p99 < 200ms in interactive production → eliminate S3 Vectors. If p99 can be seconds (batch, offline pipeline) → S3 Vectors is a strong candidate. Test: define the SLA before choosing the store, not after.
Step 2: Do you already have Aurora PostgreSQL in production?
If yes → evaluate pgvector first. Calculate existing instance cost vs OpenSearch Serverless minimum cost. If vectors need joins with relational data → pgvector is strongly favored. If no Postgres → pgvector is not the lowest-friction path.
Step 3: Do you need hybrid search (vector + lexical)?
If yes → OpenSearch Serverless is the only one of the three with real native support. pgvector + pg_trgm works but is a workaround you'll maintain. S3 Vectors doesn't support it. Validate: test recall with real queries before assuming pure vector search is sufficient.
Step 4: What is the load profile (constant vs sporadic)?
Calculate: (hours per month with real load) × (OCUs in use) × (OCU-hour price), and compare the result with the monthly minimum cost you pay regardless. If there is real load less than 50% of the time → OpenSearch Serverless idle cost probably isn't justified. Highly seasonal load → S3 Vectors or pgvector (if it already exists) are more cost-efficient.
Step 5: Does your team have index tuning capability?
If choosing pgvector: define who owns HNSW params, vacuum schedule, and bloat monitoring before deploy. If nobody on the team knows what ef_construction is → either invest in training or choose OpenSearch Serverless. No middle ground: a poorly tuned pgvector index in production is a latency time bomb.
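The five questions can be written down as a small function, which is a handy way to force a team to answer each one explicitly. This is one possible encoding, not the only reading of the steps: it lets a hard hybrid-search requirement override an existing Postgres, and its last branch follows the decision tree's "idle cost unacceptable" path. The field names and the 0.5 threshold representation are illustrative.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    interactive_p99_under_200ms: bool  # Step 1
    has_aurora_postgres: bool          # Step 2
    needs_hybrid_search: bool          # Step 3
    load_fraction: float               # Step 4: share of the month with real load, 0.0 to 1.0
    team_can_tune_indexes: bool        # Step 5


def recommend(w):
    """Return a starting recommendation; benchmark before committing to it."""
    if not w.interactive_p99_under_200ms:
        return "S3 Vectors"             # batch or offline: latency in seconds is acceptable
    if w.needs_hybrid_search:
        return "OpenSearch Serverless"  # only native hybrid option of the three
    if w.has_aurora_postgres and w.team_can_tune_indexes:
        return "Aurora pgvector"        # existing instance plus someone who owns the index
    if w.load_fraction >= 0.5:
        return "OpenSearch Serverless"  # steady load justifies the OCU floor
    # Sporadic load, no tunable Postgres: idle cost dominates.
    # Latency must be benchmarked for the specific case before going interactive.
    return "S3 Vectors"


chatbot = Workload(True, True, False, 0.2, True)
archive = Workload(False, False, False, 0.05, False)
agent = Workload(True, False, True, 0.8, False)

for name, workload in [("chatbot", chatbot), ("archive", archive), ("agent", agent)]:
    print(name, "->", recommend(workload))
```

The output is a starting point for the architecture discussion, and the order of the checks is itself a decision worth recording in an ADR.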

Decision tree — vector store on AWS

Decision flow across three axes: p99 latency, existing Postgres, and load/cost profile. Each decision node leads to a recommended store or an additional qualification.

🎯 Entry

I need a vector store for RAG on AWS

⏱️ Axis 1: Latency

p99 < 200ms in interactive production?
S3 Vectors · ✓ Batch / Archive · ✓ Minimum cost · ✓ Billions of vectors

🗄️ Axis 2: Existing Postgres

Aurora PostgreSQL already in the stack?
Aurora pgvector · ✓ Relational joins · ✓ Zero additional cost · ⚠️ Tuning needed

🔍 Axis 3: Hybrid search + Cost

Hybrid search or rich filters?
Constant load > 50% of the time, or critical SLA?

✅ Recommendations

OpenSearch Serverless · ✓ Native hybrid · ✓ Rich filters · ✓ Agentic RAG · ⚠️ ~$345/month minimum
Aurora pgvector · ✓ Postgres already in place · ✓ Cost-efficient · ⚠️ Tuning mandatory
OpenSearch Serverless · (load justifies the cost)
S3 Vectors · (idle cost unacceptable)

Anti-patterns that surface in production

1. Choosing by tutorial, not by load profile. OpenSearch Serverless is the default in all Bedrock Knowledge Bases examples. That doesn't mean it's the right choice for you. If your load is sporadic or you're in dev/staging, the minimum cost will appear on your bill every month without delivering proportional value.

2. Ignoring OpenSearch Serverless idle cost. Automatic scaling goes up fast and comes down slowly. A load spike at 2 PM can keep OCUs provisioned until 5 PM. Without billing alerts and without monitoring the search and indexing OCU metrics, you discover the problem at the end of the month. Configure aws cloudwatch put-metric-alarm for OCU count before go-live.

3. Deploying pgvector without index tuning. The default pgvector behavior without an index is exact search (sequential scan): it works perfectly in development with 10,000 vectors and explodes in production with 10 million. Without an explicit CREATE INDEX ... USING hnsw and without configuring hnsw.ef_search at runtime, you'll get second-level latency where you expected milliseconds. And the worst part: it will work in tests and only fail under real load.

Rule of thumb

If you're paying for compute, demand the p99. If you're paying for storage, accept the latency. OpenSearch Serverless charges for compute (OCU-hour) — demand ms latency in return. S3 Vectors charges for storage and query — accept higher latency as the cost trade-off. Aurora pgvector charges for the instance you already have — incremental cost is low, but operational cost (tuning) is high. Map what you're paying for and what you're getting in return.
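The rule of thumb has a break-even point hiding in it: the query volume at which a pay-per-query store costs as much as a fixed compute floor. A minimal sketch follows. Every price in the example is a placeholder chosen for round numbers, not an AWS price; substitute the figures from the current pricing pages before drawing conclusions.

```python
def breakeven_queries_per_month(fixed_monthly_usd, storage_gb,
                                usd_per_gb_month, usd_per_1k_queries):
    """Monthly query count at which storage + per-query billing equals a fixed floor.

    Returns 0 if storage alone already costs more than the fixed floor.
    """
    if usd_per_1k_queries <= 0:
        raise ValueError("usd_per_1k_queries must be positive")
    remaining = fixed_monthly_usd - storage_gb * usd_per_gb_month
    if remaining <= 0:
        return 0
    return round(remaining / usd_per_1k_queries * 1000)


# Placeholder inputs (NOT real prices): a $345.60 fixed floor versus
# 100 GB stored at $0.05/GB-month and $0.02 per 1,000 queries.
queries = breakeven_queries_per_month(
    fixed_monthly_usd=345.60,
    storage_gb=100,
    usd_per_gb_month=0.05,
    usd_per_1k_queries=0.02,
)
print(f"break-even at about {queries:,} queries/month")
```

Below the break-even volume you are paying the floor for capacity you do not use; above it, the fixed floor is the cheaper way to buy the same queries, before latency is even considered.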

My perspective — what I actually do in practice

Senior Solutions Architect

In most projects I architect, the vector store decision is determined before any benchmark: I look at the existing stack first. If there's Aurora PostgreSQL with reasonable load, I start with pgvector: the extension is available, incremental cost is near zero, and the team already knows how to operate Postgres. I document the chosen HNSW parameters, create an index maintenance runbook, and monitor pg_stat_user_indexes for bloat. This solves 60% of cases.

For the other 40% (when there's no Postgres, when hybrid search is a real, not aspirational, requirement, or when the RAG is agentic with multiple searches per turn) I use OpenSearch Serverless. But never without first calculating the monthly minimum cost and presenting it to the client/stakeholder as a fixed infrastructure cost, not a variable cost.

S3 Vectors I reserve for large-scale embedding pipelines (historical corpus ingestion, legacy document embeddings) where access is sporadic and volume is large. It's a new product and I wouldn't yet put it as the primary store for an interactive production RAG without extensive latency benchmarks for the specific case.

What I never do: choose the store by tutorial without going through the five decision steps. The RAG bill explodes exactly there.

Verdict

There is no universally right vector store; there is the right one for your profile. OpenSearch Serverless is the most capable for complex RAG, but you pay the floor every month regardless of usage. Aurora pgvector is the most cost-efficient choice if Postgres already exists, but index tuning is your problem. S3 Vectors is the right choice for volume and at-rest cost, but not for ms p99. Decide by the three axes (latency, cost, operations), not by the tutorial's feature list. The RAG bill explodes when you choose by the lowest-friction demo path and discover the real cost model in production.



Written with AI assistance from the public case and my architect's reading.

