Who is Fernando F. Azevedo?

Fernando F. Azevedo is a Senior Solutions Architect at Banco Itaú with 16+ years of experience across AWS, event-driven architecture, DevSecOps, Data Mesh, AI and financial systems.

AI & Agents · Trend Briefing

Graviton4 R8g: Global Expansion and What Changes for Memory-Intensive Architectures

Jun 27, 2026 · 8 min · advanced · AI-assisted

fernando.moretes.com

AWS has expanded EC2 R8g Graviton4 instances to five new regions — Thailand, New Zealand, South Africa, Italy, and Canada West — completing a global rollout that reshapes the cost-performance equation for memory-intensive workloads outside major hubs. With up to 1.5 TB RAM, 48xlarge sizing, and 40% database gains over Graviton3, this is not an incremental upgrade. It is a platform maturity signal that changes architecture decisions in regions where instance choice was previously constrained.

On June 26, 2026, AWS announced the availability of EC2 R8g instances — powered by the Graviton4 processor — in five new regions: Asia Pacific (Thailand), Asia Pacific (New Zealand), Africa (Cape Town), Europe (Milan), and Canada West (Calgary). Combined with the March 2026 expansion to UAE, Mexico, and Zurich, R8g now covers a set of regions that, 18 months ago, operated exclusively on x86 or Graviton3 families. For architects designing financial systems, data platforms, and distributed cache infrastructure outside traditional hubs like us-east-1 or eu-west-1, this changes the equation — not because of chip hype, but because of the combination of maximum capacity (1.5 TB, 48xlarge), network throughput (50 Gbps), and the ARM64 ecosystem maturity that arrived late to these regions.

R8g vs R7g: What the Numbers Actually Mean

40% gain on relational databases: R8g vs R7g (Graviton4 vs Graviton3), official AWS benchmark.
45% gain on large Java applications: relevant for financial middleware and JVM-based rules engines.
More vCPU and memory vs R7g: up to 48xlarge with 1.5 TB RAM, enabling real node consolidation.
50 Gbps enhanced networking bandwidth: critical for synchronous replication and distributed cache traffic.

The Real Signal: Platform Maturity in Peripheral Regions

The R8g expansion is not a typical launch announcement. It is a platform maturity indicator for regions that historically receive new instance types 12 to 24 months behind us-east-1. The fact that Cape Town, Milan, Calgary, and Bangkok are receiving R8g in June 2026 — less than two years after the initial Graviton4 launch — signals that AWS is accelerating global capacity parity.

For architects designing systems in regulated regions, this has direct implications. Many financial and healthcare workloads requiring data residency in specific jurisdictions — such as LGPD in Brazil, POPIA in South Africa, or PDPA in Thailand — were until now confined to instance families with lower memory headroom or inferior cost efficiency. The arrival of R8g in these regions means the argument of 'staying in us-east-1 for capacity reasons' loses traction as an architectural rationale.

The rollout pattern itself is informative: R8gd (with local NVMe) expanded to additional regions in March 2026, and the standard R8g followed in waves. This suggests AWS is building physical capacity in these regions before enabling the types — which implies that Spot and Reserved Instance inventory will take several months to mature. Architects planning migrations to these regions should factor this RI availability timeline into their long-term cost model.

Financial Workloads and the Case for Node Consolidation with R8g

In enterprise-grade financial environments, the highest operational cost workloads are rarely compute-intensive — they are memory-intensive: OLTP databases with large buffer pools (PostgreSQL shared_buffers, Oracle SGA), distributed cache engines (Redis/Valkey, Memcached), and streaming platforms with Kafka brokers that hold large data volumes in memory for short-term retention.

The r8g.48xlarge with 1.5 TB RAM changes the consolidation calculus concretely. Consider an Amazon MSK (Kafka) cluster with r7g.4xlarge brokers (128 GB each): to hold 500 GB of in-memory retention across the cluster, you would need four nodes with minimal margin (4 × 128 GB = 512 GB). With r8g.12xlarge (384 GB), two nodes cover the same data set (2 × 384 GB = 768 GB), reducing coordination overhead, per-node licensing cost (in software that charges per host), and the number of nodes that can fail, although a replication factor of 3 still requires at least three brokers. The 40% database throughput gain can also translate into lower P99 latency for ACID transactions under the same load, which in payment systems can be the difference between meeting a 50 ms and an 80 ms SLO at the 99th percentile. Measure this on your own workload rather than assuming it.
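The consolidation arithmetic above can be sketched in a few lines. This is a minimal illustration, assuming the 500 GB figure is the cluster-wide hot data set and that all broker RAM is usable for it (in practice the JVM heap and the OS take a share, which is what the hypothetical `usable_fraction` parameter is for). The function name is illustrative, not part of any AWS or Kafka tooling.

```python
import math


def brokers_needed(hot_data_gb: float, broker_ram_gb: float,
                   usable_fraction: float = 1.0) -> int:
    """Smallest broker count whose combined usable RAM covers the hot data set."""
    usable_gb = broker_ram_gb * usable_fraction
    return math.ceil(hot_data_gb / usable_gb)


# Figures taken from the example in the text.
r7g_nodes = brokers_needed(500, 128)   # r7g.4xlarge, 128 GB per broker
r8g_nodes = brokers_needed(500, 384)   # r8g.12xlarge, 384 GB per broker

# Headroom left once the hot data set is resident, in GB.
r7g_headroom = r7g_nodes * 128 - 500
r8g_headroom = r8g_nodes * 384 - 500

print(r7g_nodes, r7g_headroom)
print(r8g_nodes, r8g_headroom)
```

Lowering `usable_fraction` shows how quickly the four-node r7g layout runs out of margin. Note that the node count this produces is a memory floor only: replication factor and availability-zone placement may require more brokers than the arithmetic suggests.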

There is an important trade-off here that must be named: aggressive consolidation increases the blast radius of failures. An r8g.48xlarge that fails takes 1.5 TB of state with it. The correct architectural response is not to avoid large instances, but to ensure that replication and failover design is proportional: Multi-AZ with synchronous replicas, RTO tested regularly with Game Days, and monitoring of EBS degradation metrics (VolumeQueueLength on all volumes, BurstBalance on burstable volume types such as gp2) as early warning signals.

Reference Architecture: Financial Data Platform with R8g in a Regulated Region

High-memory workload flow in a regulated region (e.g. af-south-1 / Cape Town) using R8g as the compute backbone, with ingestion, processing, cache, and observability layers.

🌐 Ingestion Layer

API Gateway · REST/HTTP
NLB · TCP/TLS

⚙️ Streaming & Processing

MSK Kafka · r8g.4xlarge brokers · 128GB x3
Flink on EKS · r8g.2xlarge nodes · stream processing

🧠 Memory-Intensive Compute

RDS PostgreSQL · r8g.8xlarge · 256GB RAM
ElastiCache Valkey · r8g.4xlarge · 128GB cache
App Tier EKS · r8g.4xlarge · JVM heap 96GB

🔒 Security & Compliance

KMS CMK · AES-256 at-rest
Security Groups · Zero Trust egress

📊 Observability

CloudWatch · VolumeQueueLength · BurstBalance
OpenTelemetry · Collector sidecar · P99 latency

What Changes for Architects with R8g Expansion

Data residency without capacity sacrifice: Regions like Cape Town (POPIA) and Bangkok (PDPA) now have access to instances with up to 1.5 TB RAM — removing the justification for using us-east-1 as a proxy for regulated workloads.

ARM64 migration with lower risk: The Java/JVM ecosystem on ARM64 is mature (OpenJDK 21 LTS, Spring Boot 3.x, Quarkus). The 45% gain on large Java apps is verifiable with internal JMH benchmarks before migrating production.

EBS io2 Block Express as a natural complement: With 40 Gbps EBS bandwidth, R8g supports io2 Block Express volumes with up to 256,000 IOPS per volume — relevant for databases that exceed the buffer pool and need deterministic I/O.

Bare metal for ultra-low latency workloads: The two R8g bare metal sizes eliminate hypervisor overhead — critical for trading engines and order matching systems that measure latency in microseconds.

RI and Spot inventory maturing: New regions have initially limited Reserved Instance inventory. Planning with On-Demand for the first 3-6 months and migrating to 1-year no-upfront RI as inventory stabilizes is the prudent approach.

Nitro System as security baseline: Nitro's virtualization offload to dedicated hardware reduces side-channel attack surface (Spectre/Meltdown mitigations are more efficient) and enables Nitro Enclaves for sensitive data processing in isolated memory.

Migration Strategy: From x86 to Graviton4 in Financial Environments

Migrating financial workloads to ARM64 has a different risk profile than conventional web applications. The primary failure vector is not application code — most modern frameworks compile natively to ARM64 without modification. The real risk lies in three layers: native libraries (JNI, BLAS, LAPACK for quantitative models), database drivers with x86-specific SIMD optimizations, and observability tooling that still has agents with x86-only binaries.

The approach I recommend is a three-phase validation pipeline. In the first phase, use the AWS Graviton Fast Start program and the Porting Advisor for Graviton to perform a static dependency scan. The Porting Advisor identifies libraries with native x86 code and suggests ARM64 alternatives. In the second phase, run realistic (not synthetic) load benchmarks on r8g with anonymized production data. For databases, this means pgbench with custom scripts built from the real schema and the query distribution in pg_stat_statements (or AWR, on Oracle). For Kafka, it means reproducing the production profile with kafka-producer-perf-test using the same message size and compression ratio. In the third phase, use canary deployment with feature flags: route 5% of traffic to R8g nodes and compare P50/P95/P99 side by side in the same CloudWatch dashboard, with custom metrics segmented by instance type.
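The side-by-side percentile comparison in the third phase can be sketched offline before wiring it into a dashboard. This is a minimal example, assuming latency samples in milliseconds have already been exported per instance family. The function names and the 5% regression tolerance are illustrative assumptions, not values from the article's source.

```python
import statistics


def percentiles(samples_ms):
    """Return P50/P95/P99 using linear interpolation over the observed samples."""
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}


def compare_canary(baseline_ms, canary_ms, max_regression=0.05):
    """Compare canary latency to baseline; flag any percentile that regresses
    by more than max_regression (a hypothetical tolerance, expressed as a fraction)."""
    base = percentiles(baseline_ms)
    cand = percentiles(canary_ms)
    report = {}
    for name in base:
        delta = (cand[name] - base[name]) / base[name]
        report[name] = {
            "baseline_ms": base[name],
            "canary_ms": cand[name],
            "delta": delta,
            "ok": delta <= max_regression,
        }
    return report


# Synthetic placeholder data, only to show the call shape.
x86_samples = [float(v) for v in range(1, 101)]
r8g_samples = [v * 0.8 for v in x86_samples]
for name, row in compare_canary(x86_samples, r8g_samples).items():
    print(name, row)
```

Comparing relative deltas per percentile, rather than averages, keeps tail regressions visible even when the median improves.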

An operational detail that is frequently overlooked: the Linux scheduler behaves differently on ARM64 for NUMA workloads on very large instances. For r8g.24xlarge and above, it is worth explicitly validating NUMA affinity with numactl and monitoring numa_miss in /proc/vmstat as an indicator of memory locality degradation.
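As a rough sketch of the numa_miss check, the snippet below parses the `numa_hit` and `numa_miss` counters in the format used by /proc/vmstat and computes a miss ratio. The function name is hypothetical and the sample text is made up for illustration. On a real host the counters are cumulative since boot, so sampling twice and comparing the difference is more informative than a single reading.

```python
def numa_miss_ratio(vmstat_text: str) -> float:
    """Fraction of page allocations that landed on a non-preferred NUMA node."""
    counters = {}
    for line in vmstat_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in ("numa_hit", "numa_miss"):
            counters[parts[0]] = int(parts[1])
    hit = counters.get("numa_hit", 0)
    miss = counters.get("numa_miss", 0)
    total = hit + miss
    return miss / total if total else 0.0


# Illustrative sample in /proc/vmstat format (values are invented).
sample = "nr_free_pages 12345\nnuma_hit 900\nnuma_miss 100\nnuma_foreign 100\n"
print(numa_miss_ratio(sample))

# On a Linux host, the same function can be fed the live file:
#   with open("/proc/vmstat") as fh:
#       print(numa_miss_ratio(fh.read()))
```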

Observability and Security: What Changes with Graviton4 in Production

Adopting R8g in production requires specific adjustments to the observability strategy. The first point is that the CloudWatch CPU metric (CPUUtilization) does not adequately capture the behavior of memory-bound workloads: a database with a 200 GB buffer pool may show 30% CPU while completely bottlenecked on memory I/O. The correct signals for R8g are: for RDS, FreeableMemory with an alarm when it drops below 10% of total RAM; for ElastiCache, Evictions and DatabaseMemoryUsagePercentage as memory pressure indicators (CurrConnections reflects client load, not memory); for EKS, container_memory_working_set_bytes from the kubelet's cAdvisor endpoint, not container_memory_usage_bytes, which includes page cache.
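The RDS alarm described above can be expressed as a parameter set for CloudWatch's PutMetricAlarm API. The sketch below only builds the parameters: FreeableMemory is reported in bytes, so the 10% threshold has to be converted. The alarm name, period, and evaluation count are illustrative assumptions, and the instance identifier is hypothetical.

```python
GIB = 1024 ** 3


def freeable_memory_alarm(db_instance_id: str, total_ram_gib: int,
                          min_free_fraction: float = 0.10) -> dict:
    """Build PutMetricAlarm parameters that fire when FreeableMemory (bytes)
    stays below min_free_fraction of the instance's total RAM."""
    return {
        "AlarmName": f"{db_instance_id}-freeable-memory-low",  # naming is an assumption
        "Namespace": "AWS/RDS",
        "MetricName": "FreeableMemory",
        "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": db_instance_id}],
        "Statistic": "Average",
        "Period": 300,            # seconds; illustrative choice
        "EvaluationPeriods": 3,   # illustrative choice
        "Threshold": total_ram_gib * GIB * min_free_fraction,
        "ComparisonOperator": "LessThanThreshold",
    }


# Hypothetical instance on r8g.8xlarge (256 GiB of RAM).
params = freeable_memory_alarm("payments-pg-primary", 256)
print(params["Threshold"])

# With boto3 installed and credentials configured, the dict can be passed as:
#   boto3.client("cloudwatch").put_metric_alarm(**params)
```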

On the security side, the Nitro System brings concrete benefits that deserve explicit configuration. AWS Nitro Enclaves are available on R8g instances and allow creating isolated execution environments with dedicated memory — relevant for card data processing (PCI DSS) or cryptographic keys in software HSMs. To enable this, the EnclaveOptions.Enabled=true parameter must be set in the launch template, and enclave size is allocated from the instance's total memory (typically 25-50% for sensitive processing workloads).

From an IAM and access control perspective, the practice I recommend for R8g instances in financial environments is to attach IAM conditions based on aws:RequestedRegion to the instance profile role, so that instance credentials cannot be used to call AWS APIs in any region other than the regulated one. This is a defense-in-depth measure that limits what exfiltrated credentials can reach. Combined with VPC endpoints for all consumed AWS services (S3, KMS, SSM, CloudWatch), you keep sensitive data traffic off the internet gateway and reduce the network attack surface.
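A region-confinement policy along these lines might look like the following sketch, which generates the JSON document in Python. It uses a Deny statement with the aws:RequestedRegion condition key. The list of exempted global-service actions is an assumption that has to be adapted per account, since some global services are served from a single region and would otherwise be blocked.

```python
import json


def region_lock_policy(allowed_region: str,
                       exempt_actions=("iam:*", "sts:*", "route53:*",
                                       "cloudfront:*", "support:*")) -> dict:
    """Deny every API call that targets a region other than allowed_region,
    except for a (hypothetical, adjustable) set of global-service actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyOutsideRegulatedRegion",
                "Effect": "Deny",
                "NotAction": list(exempt_actions),
                "Resource": "*",
                "Condition": {
                    "StringNotEquals": {"aws:RequestedRegion": allowed_region}
                },
            }
        ],
    }


# af-south-1 (Cape Town) is the example region used in the reference architecture.
policy = region_lock_policy("af-south-1")
print(json.dumps(policy, indent=2))
```

An explicit Deny is used rather than a conditional Allow because a Deny cannot be overridden by another policy attached to the same role.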

The Case for R8g Bare Metal in Trading Engines

The two R8g bare metal sizes (r8g.metal-24xl and r8g.metal-48xl) completely eliminate Nitro hypervisor overhead, which is already minimal but measurable at P99.9 latency. For order matching systems and trading engines targeting sub-500 microsecond latency, bare metal on Graviton4 offers a more deterministic latency profile than equivalent virtualized instances. The additional cost of bare metal over the equivalent virtualized instance is small, an estimated 5-8% at most and worth confirming against current regional pricing, which makes it a favorable trade-off when the latency SLO is contractual.

Common Anti-Patterns in R8g Adoption

Migrating to r8g.48xlarge without redesigning failover: Larger instances require proportional HA strategies. Keeping the same HA design from smaller instances with a 1.5 TB node is a recipe for unacceptable RTO.
Assuming all ARM64 container images are available: Multi-arch Docker images are not universal. Validating the manifest of all production images with docker manifest inspect before migrating avoids runtime surprises.
Using CPU metrics as a health proxy for memory-bound workloads: On high-memory instances, low CPU does not mean a healthy system. Monitoring FreeableMemory, swap usage, and NUMA miss rates is mandatory.
Purchasing Reserved Instances immediately in new regions: RI inventory in newly expanded regions may be limited. Wait 90-120 days and use Cost Explorer's usage data and RI recommendations to validate the commitment before putting capital into 3-year RIs.
Ignoring the Porting Advisor and assuming full compatibility: Especially for workloads with native dependencies (C extensions in Python, JNI in Java, BLAS libraries for ML), the Porting Advisor is a non-optional step.
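The multi-arch image check from the list above can be automated by parsing the JSON that `docker manifest inspect <image>` prints. This is a minimal sketch: the function name is hypothetical and the sample document is a trimmed, invented manifest list. For a single-architecture image the command returns a plain image manifest with no `manifests` array, so the function reports False and the image needs a closer look.

```python
import json


def has_platform(manifest_json: str, arch: str = "arm64",
                 os_name: str = "linux") -> bool:
    """True if a manifest list advertises an image for the given OS/architecture."""
    doc = json.loads(manifest_json)
    for entry in doc.get("manifests", []):
        platform = entry.get("platform", {})
        if platform.get("architecture") == arch and platform.get("os") == os_name:
            return True
    return False


# Trimmed, invented example of a manifest list (digests omitted).
sample = json.dumps({
    "schemaVersion": 2,
    "manifests": [
        {"platform": {"architecture": "amd64", "os": "linux"}},
        {"platform": {"architecture": "arm64", "os": "linux"}},
    ],
})
print(has_platform(sample))
```

Running this over every image referenced in production manifests gives a list of images that need an ARM64 build or a replacement before any workload is scheduled on R8g nodes.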

R8g vs R7g vs R6i: Instance Decision for Financial Workloads

Criterion	R8g (Graviton4)	R7g (Graviton3)	R6i (Intel Ice Lake)
Maximum memory	1.5 TB (48xlarge)	512 GB (16xlarge)	1 TB (32xlarge)
Relational DB gain	+40% vs Graviton3	Baseline	~-10% vs R8g (estimated)
x86 compatibility	Requires ARM64 validation	Requires ARM64 validation	Drop-in replacement
Relative cost (estimated)	~20-25% lower than R6i	~30-35% lower than R6i	Reference
Bare metal available	Yes (2 sizes)	Yes	Yes
Availability in new regions	Active expansion (Jun 2026)	Broad availability	Broad availability

R8g Through the AWS Well-Architected Lens

Security

Nitro System with virtualization offload to dedicated hardware reduces side-channel attack surface. Enable Nitro Enclaves for sensitive data processing (PCI, PII). Use VPC endpoints to eliminate sensitive traffic through the IGW. IAM conditions with aws:RequestedRegion for credential confinement to the regulated region.

Reliability

Large instances increase blast radius — compensate with mandatory Multi-AZ, synchronous replicas, and regular Game Days to test real RTO. Monitor BurstBalance and VolumeQueueLength on EBS as early degradation signals. Define P99 latency SLOs with CloudWatch alerts before migrating production.

Performance efficiency

Validate NUMA affinity for 24xlarge and above with numactl. Use io2 Block Express for database workloads that exceed the buffer pool. Run realistic benchmarks with anonymized production data before migrating, and do not rely solely on AWS synthetic benchmarks.

Sustainability

Graviton4 offers better energy efficiency per computational operation vs. equivalent x86. Node consolidation with R8g reduces total instance count, lowering energy consumption and carbon footprint — aligned with corporate sustainability goals and ESG reporting.

Curator's Note

Senior Solutions Architect

In my experience with financial workload migrations to Graviton, the biggest risk is not technical — it is organizational: teams that validate ARM64 only in staging with synthetic load and discover native library compatibility issues in production at 2 AM. What works is treating the migration as an engineering project with three mandatory artifacts: a Porting Advisor report with all risks mapped, a realistic load benchmark with documented P99, and an ADR with the explicit blast radius vs. cost savings trade-off. The R8g expansion to regions like Cape Town and Bangkok is genuinely strategic for architectures that need data residency with top-tier capacity — but the value only materializes with validation discipline, not migration optimism.

Verdict: Adopt with Discipline, Not Haste

The EC2 R8g Graviton4 expansion to regions like Cape Town, Bangkok, Milan, and Calgary is a platform maturity signal that changes real architectural decisions. For financial workloads with data residency constraints, R8g eliminates the 'insufficient capacity in regulated regions' argument. For any high-memory workload (OLTP databases, distributed caches, Kafka brokers), the 40% database throughput gain and 45% Java gain over Graviton3 are verifiable and materially relevant. My recommendation: begin immediate evaluation for Java and PostgreSQL workloads with the Porting Advisor and realistic benchmarks. Plan canary migration with 5-10% of traffic before committing full production. Wait 90-120 days before purchasing Reserved Instances in newly expanded regions to ensure inventory stability. Document the decision with an ADR that explicitly includes the blast radius vs. cost savings trade-off. R8g is not hype: it is the natural evolution of an ARM64 platform that has matured enough to be the default choice for new high-memory infrastructure projects on AWS.

References

Amazon EC2 R8g Instances: AWS What's New (Jun 26, 2026)
Amazon EC2 R8g Instances: Instance Types Page
Amazon EC2 R8g Instances: Previous Regional Expansion (Mar 6, 2026)
Amazon EC2 R8gd Instances: Additional Regions (Mar 26, 2026)
AWS Graviton Fast Start Program
Porting Advisor for Graviton: GitHub
AWS Nitro System
AWS Nitro Enclaves

Analyzed source: Amazon EC2 R8g instances now available in additional regions
