The path · 8
- 1 · Advanced · 11 min · TPU Developer Hub: A Technical Review of a High-Performance AI Platform
  The TPU Developer Hub consolidates tooling, documentation, and high-performance ML stacks around Google Cloud TPUs. For architects operating in financial-grade environments with latency, cost, and governance demands, understanding where this platform delivers real value, and where it imposes operational friction, is essential before any migration commitment.
- 2 · Advanced · 8 min · Agent Evaluation as an Engineering Discipline
  AI agent evaluation has moved beyond ad hoc prompt engineering into a full engineering discipline with versioned datasets, automated quality gates, and regression traceability. Bedrock AgentCore materializes that shift by bringing managed infrastructure to the agent testing lifecycle. For financial-grade systems architects, this changes the contract between ML teams and platform engineering.
- 3 · Advanced · 12 min · Document Automation with Bedrock: A Modernization Journey
  Legacy document extraction pipelines in financial environments accumulate silent technical debt: brittle OCR, manual rules, and absent traceability. In this article, I narrate the modernization journey to Bedrock Data Automation, covering architecture decisions, managed risks, and what genuinely changes in operations. The analysis is grounded in real patterns from critical financial systems, not lab demos.
- 4 · Expert · 7 min · GPT-5 vs Claude vs Nova on Bedrock: A Production Governance Bake-off
  With GPT-5.5 and Codex landing on Amazon Bedrock, platform teams now face a genuine choice between three frontier model families within the same control plane. This analysis compares GPT-5.5, Claude 3.7 Sonnet, and Amazon Nova Pro through the lens of teams shipping AI into regulated production environments.
- 5 · Expert · 9 min · ADR: Adopting Amazon Bedrock AgentCore in Production
  Bedrock AgentCore promises to reduce the operational friction of running AI agents in production, but adopting any managed agent orchestration platform demands an explicit architectural decision. In this ADR, I document the forces that drove me to evaluate AgentCore, the alternatives considered, and the real consequences of each path.
- 6 · Expert · 9 min · EC2 G7e: Architecture Decision for Generative Video Inference
  EC2 G7e instances arrive with NVIDIA L40S GPUs and promise to redefine cost-per-frame for generative video inference workloads. In this architecture decision record, I evaluate the forces that make this choice non-trivial, the failure patterns I have seen in production, and the configuration I would adopt in a financial-grade environment.
- 7 · Expert · 8 min · Contract Intelligence on AWS: Field-Notes Architecture
  Building contract intelligence with generative AI goes far beyond wiring an LLM to PDFs. This article documents the architectural patterns, operational gotchas, and design decisions that separate an impressive PoC from a reliable system in financial-grade production.
- 8 · Expert · 8 min · Agentic RAG on AWS: Architecture Bake-Off for Financial-Grade Platforms
  Agentic RAG has moved from lab experiment to platform requirement in financial environments that demand auditability, cost control, and predictable latency. In this article I compare four concrete architectural approaches on AWS, with real trade-offs, plausible numbers, and an unambiguous recommendation.
Deep-dive studies
- teardown · Teardown: Resilient Network Graphs and the Next-Generation AI Network
  An in-depth architectural analysis of the resilient graph-based data center networks AWS is building to support AI workloads at scale, covering topology, congestion control, energy efficiency, and the trade-offs that define the next generation of cloud infrastructure.
- adr · ADR: AWS Transform & AI Agents vs Traditional Modernization Factory
  This ADR evaluates the decision to adopt AWS Transform (with AI agents for .NET, Mainframe, VMware, and custom code) versus a traditional human-engineering modernization factory, or a hybrid approach. The analysis covers regression risk, test coverage, code ownership, security, total cost, and change governance in an enterprise-scale modernization program.
- design-doc · Design Doc: Continuous Evaluation Suite for Agents with Bedrock AgentCore
  LLM agents in production silently degrade as models, tools, and prompts evolve; without a continuous evaluation discipline, regressions reach users before they are detected. This document proposes a complete offline and online evaluation architecture using Amazon Bedrock AgentCore, with versioned datasets, CI/CD quality gates, runtime signals, and systematic adversarial testing.
- design-doc · Design Doc: LLM Observability, from GPU Utilization to Response Quality
  This document proposes an end-to-end observability architecture for LLM inference platforms running on Amazon SageMaker AI and Amazon Bedrock, covering everything from hardware metrics (GPU utilization, memory) to semantic response quality, behavioral drift, and per-tenant cost. The design integrates CloudWatch, Amazon Managed Grafana, prompt-level tracing, and automated regression alarms, with clear separation of concerns across collection, storage, evaluation, and alerting layers.
- adr · ADR: Cognito Multi-Region for Resilient Authentication
  This ADR examines when and how to adopt multi-region User Pool replication in Amazon Cognito to reduce authentication downtime on identity platforms with high-availability requirements. It covers regional failover, customer-managed KMS keys, user synchronization, session and token impact, custom domains, and customer experience, with explicit reasoning on operational and cost trade-offs.
- adr · ADR: OpenSearch Serverless vs Dedicated Vector Database for Agentic RAG
  This ADR evaluates vector search infrastructure options for a multi-tenant agentic RAG platform on AWS, comparing OpenSearch Serverless, dedicated vector databases (Pinecone, pgvector), and a self-managed hybrid search layer. The decision weighs cost, p99 latency, permission-based filtering, incremental ingestion, and native Bedrock Knowledge Bases integration.