aws-ai-reference-architectures

Six AWS AI reference architectures — diagrams, IaC skeletons, and Well-Architected analysis.

MIT license · Updated May 16, 2026
git clone https://github.com/fernandofatech/aws-ai-reference-architectures.git

aws-ai-reference-architectures is a bilingual portfolio of six reference architectures for AI workloads on AWS, each with a Mermaid diagram, justified architectural decisions, cost analysis at three scales, a Well-Architected review, and a working Terraform skeleton.

Why this repository exists

Most publicly available AWS AI examples fall into one of two extremes: toy notebooks with no IaC or security, or 200-page enterprise white papers that nobody finishes reading. This repository deliberately occupies the middle ground.

Each architecture answers a fixed set of questions: what is the problem, which services and why, what are the three to five decisions that actually matter (with rationale and alternatives), what does it cost at S/M/L sizes, what does the Well-Architected Framework surface, and when not to use this pattern.

The format is consistent across all six architectures, so you can compare patterns side by side or pull a specific section (for example, the cost table or the MADR-formatted decisions) without reading everything. The IaC is intentionally skeletal: it shows resources and wiring, and does not try to be a generic Terraform module of the kind that no real team uses as-is.

What is included

Six patterns covered: RAG with Bedrock + OpenSearch, multi-agent orchestration, streaming inference, event-driven AI processing, fine-tuning pipeline with SageMaker/MLflow, and a secure agentic system with Guardrails.
Architectural decisions in MADR format — each decision includes context, considered options, rationale, and consequences, ready to copy into your own ADR.
Cost analysis at three scales — tables with explicit assumptions for S, M, and L sizes, useful in budget conversations with stakeholders.
Well-Architected review per architecture covering all six pillars — a complement to, not a replacement for, the formal AWS Well-Architected Tool.
Full CI pipeline with CodeQL, Trivy, Gitleaks, and dependency review, plus automatic deployment to GitHub Pages (docs) and Vercel (landing).
Bilingual site (PT/EN) published at ai-architectures.moretes.com, built with plain HTML/CSS/JS and no framework dependencies.

How the repository is organized

Each architecture is a self-contained folder with documentation, diagram, and IaC. Two publishing targets are maintained in parallel: GitHub Pages, deployed by a GitHub Actions workflow, and Vercel, deployed through its Git integration.

📁 Repo
  • architectures/ · 01–06
  • docs/ · MkDocs Material
  • frontend/ · Static landing
  • .github/workflows/ · CI + security
☁️ Publish
  • GitHub Pages · Docs site
  • Vercel · Landing site
🔒 Security
  • CodeQL · SAST
  • Trivy · FS scan
  • Gitleaks · Secret scan

The six architectures in detail

01 — RAG with Bedrock + OpenSearch covers the retrieval-augmented generation pattern for internal knowledge bases. It is the most common entry point for teams that want Q&A over corporate documentation without fine-tuning.

02 — Multi-agent orchestration uses Bedrock Agents combined with Step Functions for long-running workflows that need durable state across steps — the case where a single LLM call is not enough.

03 — Streaming AI inference shows how to wire API Gateway, Lambda, and Bedrock's native streaming for token-level chat UIs without polling.

04 — Event-driven AI processing is the async pattern: EventBridge + SQS + Lambda + Bedrock for classification, enrichment, and content moderation at scale, decoupled from the producer.

05 — Fine-tuning pipeline covers the full cycle with SageMaker, S3, and MLflow — data preparation, training, experiment tracking, and model promotion. Useful when a generic foundation model does not reach the required quality bar.

06 — Secure agentic system is the most complex pattern: Bedrock Agents with Guardrails, VPC isolation, and multi-tenant controls. It is the starting point for anyone who needs to put an agent in production with real security and compliance constraints.

How to install and use locally

  1. Clone the repository

    Run git clone https://github.com/fernandofatech/aws-ai-reference-architectures.git and enter the folder with cd aws-ai-reference-architectures.

  2. Read an architecture directly in the terminal or editor

    Each architecture is self-documented. Open, for example, architectures/01-rag-bedrock-opensearch/README.md in your editor, or page through it with less. Reading requires no runtime dependencies.

  3. Serve the documentation site locally (optional)

    Install the Python dependencies with pip install mkdocs-material and run mkdocs serve from the repository root. The site will be available at http://127.0.0.1:8000.

  4. Serve the catalog frontend locally (optional)

    Enter cd frontend and run any static HTTP server, for example python3 -m http.server 8080. There is no build step because the frontend is plain HTML/CSS/JS.

  5. Inspect and adapt the Terraform skeleton

    Each architecture folder contains a terraform/ subdirectory with .tf files declaring the main resources. Run terraform init followed by terraform validate inside any of them to check syntax and internal consistency; init on its own only prepares the working directory (providers, modules, backend). Adjust names, tags, IAM policies, and remote state before any terraform apply.

  6. Run security checks locally

    Install Trivy and Gitleaks, then run trivy fs . and gitleaks detect --source . from the root to replicate the checks the CI runs on every push.
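The two scanner commands from the last step can be wrapped so that both always run and the overall result fails if either one does. This is a minimal sketch, not the repository's CI definition: run_security_checks, TRIVY, and GITLEAKS are hypothetical names, and the --exit-code 1 flag is an assumption added because Trivy otherwise exits 0 even when it reports findings.

```shell
# Run both scanners and report a single pass/fail status.
# TRIVY and GITLEAKS are hypothetical overrides for the binary paths.
run_security_checks() {
  sc_root="${1:-.}"
  sc_status=0
  # --exit-code 1 makes Trivy return non-zero when it finds issues (assumed CI behavior).
  "${TRIVY:-trivy}" fs --exit-code 1 "$sc_root" || sc_status=1
  # Gitleaks already exits non-zero when it detects a secret.
  "${GITLEAKS:-gitleaks}" detect --source "$sc_root" || sc_status=1
  return "$sc_status"
}
```

From the repository root, run_security_checks . && echo "security checks passed" gives a quick local signal before pushing.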

Full flow: clone, read, and serve the docs site
# Clone
git clone https://github.com/fernandofatech/aws-ai-reference-architectures.git
cd aws-ai-reference-architectures

# Read an architecture (no dependencies needed)
less architectures/01-rag-bedrock-opensearch/README.md

# Serve the MkDocs documentation site locally
# (mkdocs serve runs in the foreground; stop it with Ctrl+C before continuing)
pip install mkdocs-material
mkdocs serve
# → open http://127.0.0.1:8000

# Serve the static catalog frontend (also runs in the foreground)
cd frontend && python3 -m http.server 8080
# → open http://127.0.0.1:8080

# Validate a Terraform skeleton
# (-backend=false skips backend setup, so no AWS credentials are needed)
cd ../architectures/04-event-driven-ai-processing/terraform
terraform init -backend=false
terraform validate
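
The validation step above covers one architecture. To check all six in one pass, a loop along these lines may help. It is a sketch under assumptions: validate_skeletons and the TERRAFORM override are hypothetical names, and the layout architectures/<name>/terraform is taken from the description in this guide rather than verified against the repository.

```shell
# Validate every Terraform skeleton under a root folder.
# TERRAFORM is a hypothetical override for the binary path.
validate_skeletons() {
  vs_root="${1:-architectures}"
  vs_status=0
  for vs_dir in "$vs_root"/*/terraform; do
    [ -d "$vs_dir" ] || continue
    echo "==> $vs_dir"
    # -backend=false skips remote state setup; -input=false avoids prompts.
    if ! "${TERRAFORM:-terraform}" -chdir="$vs_dir" init -backend=false -input=false >/dev/null; then
      vs_status=1
      continue
    fi
    "${TERRAFORM:-terraform}" -chdir="$vs_dir" validate || vs_status=1
  done
  return "$vs_status"
}
```

Run validate_skeletons architectures from the repository root; a non-zero exit status means at least one skeleton failed to initialise or validate.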

The IaC is a skeleton, not a module

The Terraform files show the correct resources and wiring for each pattern, but they are not ready for terraform apply in production. Each team needs to adapt: resource names, mandatory tags, IAM policies with real least privilege, remote state configuration (S3 + DynamoDB), and integration with the existing CI/CD pipeline. Using the skeleton directly in production without these adaptations is a risk.
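
As an illustration of the remote state adaptation, the sketch below initialises one skeleton against an S3 backend with DynamoDB locking through partial backend configuration. Everything specific here is an assumption: it presumes you have added an empty backend "s3" {} block to the skeleton, and init_remote_state, TERRAFORM, TF_STATE_BUCKET, TF_LOCK_TABLE, and the example bucket and table names are hypothetical placeholders to replace with your own.

```shell
# Initialise a skeleton with S3 remote state and DynamoDB locking.
# Assumes the skeleton declares an empty `backend "s3" {}` block.
# All bucket, table, and region defaults are placeholders, not real resources.
init_remote_state() {
  irs_dir="$1"
  # Use the architecture folder name as the state key prefix.
  irs_name=$(basename "$(dirname "$irs_dir")")
  "${TERRAFORM:-terraform}" -chdir="$irs_dir" init \
    -backend-config="bucket=${TF_STATE_BUCKET:-example-tfstate-bucket}" \
    -backend-config="key=${irs_name}/terraform.tfstate" \
    -backend-config="region=${AWS_REGION:-us-east-1}" \
    -backend-config="dynamodb_table=${TF_LOCK_TABLE:-example-tf-locks}"
}
```

A call such as init_remote_state architectures/01-rag-bedrock-opensearch/terraform keeps one state file per architecture, which is one reasonable choice among several.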

How the repository works internally

The repository has three content layers that are kept separate by design.

The architecture layer lives in architectures/ and is the core of the project. Each subfolder contains a structured README.md with the fixed sections described above, a diagram.mmd file with the Mermaid diagram, and a terraform/ directory with the IaC skeleton. All technical documentation lives here — the rest of the repository is publishing infrastructure.
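
The per-folder layout described above (README.md, diagram.mmd, terraform/) is easy to check mechanically. The following is a small sketch, assuming that layout holds for every subfolder; check_layout is a hypothetical helper, not something shipped in the repository.

```shell
# Report architecture folders that are missing an expected artifact.
# Returns non-zero if anything is missing.
check_layout() {
  cl_root="${1:-architectures}"
  cl_status=0
  for cl_dir in "$cl_root"/*/; do
    [ -d "$cl_dir" ] || continue
    cl_dir="${cl_dir%/}"   # drop the trailing slash left by the glob
    [ -f "$cl_dir/README.md" ]   || { echo "missing: $cl_dir/README.md"; cl_status=1; }
    [ -f "$cl_dir/diagram.mmd" ] || { echo "missing: $cl_dir/diagram.mmd"; cl_status=1; }
    [ -d "$cl_dir/terraform" ]   || { echo "missing: $cl_dir/terraform/"; cl_status=1; }
  done
  return "$cl_status"
}
```

Running check_layout architectures after adding or renaming a pattern confirms that the new folder follows the same structure as the others.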

The documentation layer in docs/ uses MkDocs Material to generate a navigable site published to GitHub Pages. The docs.yml workflow runs a strict build (fails on warnings) and deploys automatically on every push to main.
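
To reproduce the strict build locally before pushing, something like the following should be close. It is a sketch: the exact options used in docs.yml are not shown in this guide, so mkdocs build --strict is an assumption based on the "fails on warnings" description, and docs_strict_build and MKDOCS are hypothetical names.

```shell
# Build the docs in strict mode, where any warning fails the build.
# MKDOCS is a hypothetical override for the binary path.
docs_strict_build() {
  "${MKDOCS:-mkdocs}" build --strict "$@"
}
```

Run docs_strict_build from the repository root after pip install mkdocs-material; a broken internal link or missing nav entry then fails locally instead of in CI.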

The frontend layer in frontend/ is a static catalog with no framework dependencies, connected to Vercel via Git integration. Automatic previews are generated for every pull request; production deploy happens on merge to main.

The CI pipeline covers four distinct concerns: frontend quality (lint + audit), documentation quality (strict build), security (CodeQL + Trivy + Gitleaks + dependency review), and maintenance (Dependabot for Actions and frontend dependencies). The workflows are independent — a security scan failure does not block the docs deploy, and vice versa.

Frequently asked questions

Can I use the Terraform code directly in a client project?

Yes, with adaptations. The skeleton shows the correct resources and wiring between them, but you will need to adjust IAM policies, resource names, tags, remote state, and pipeline integration before any deploy. Treat it as a starting point, not a ready-made module.

Why six architectures and not more?

Six patterns cover most AI workloads I encounter in practice. More patterns without the same depth of analysis do not add value. If your use case does not fit any of the six, open an issue with the context.

Does the Well-Architected review replace the AWS Well-Architected Tool?

No. The review here highlights the most relevant findings for each specific pattern and serves as preparation or a complement. The formal AWS Well-Architected Tool should be used before any production launch.

Does the site work without JavaScript?

The catalog frontend uses JavaScript for navigation features, but the main content remains accessible without it. The documentation site (MkDocs Material) requires JavaScript for full search and navigation.

How to use in existing design reviews

For reviews of systems already in production (brownfield), go directly to the Well-Architected section of the architecture closest to your current pattern and compare point by point with what you have. The trade-offs sections are also useful for justifying or questioning past decisions in a technical review context.

Who this repository is for

This repository is useful for solutions architects and senior engineers who need a structured starting point for AI designs on AWS — not a tutorial, not a white paper, but something you can open in a design meeting and use directly. It is equally useful for anyone evaluating my work as an architect: every decision is justified, every trade-off is explicit, and the CI pipeline reflects the practices I apply in real projects. If you are getting started with AI on AWS and want to understand the patterns before writing code, start with architecture 01 or 04. If you already have a system in production and want to identify gaps, go to the Well-Architected section of the corresponding architecture.

References

Guide generated with AI from the repository and its README. · Source