Skarp

Chapter 18 of 26

High-Performing Network Architectures: VPC, Load Balancing, and CloudFront

Network bottlenecks can derail even the best compute and storage plans; design VPC layouts, load balancers, and CloudFront distributions that keep latency low and throughput high.

27 min read

Big Picture: Networking and the Performance Efficiency Pillar

Why Networking Performance Matters

Even perfectly tuned EC2 and RDS can feel slow if packets take long, congested paths. Here you will design VPCs, load balancers, and CloudFront to keep latency low and throughput high.

Well-Architected Context

The AWS Well-Architected Framework defines six pillars: Operational Excellence, Security, Reliability, Performance Efficiency, Cost Optimization, and Sustainability. This chapter focuses on Performance Efficiency.

Performance Efficiency in Networking

In networking, performance efficiency means short, direct paths, avoiding unnecessary hops, offloading work like TLS and caching, and using global infrastructure to get closer to users.

Exam-Relevant Skills

For the Solutions Architect – Associate exam, you must reason about subnets, route tables, ALB vs NLB, and when CloudFront actually reduces latency and transfer cost.

High-Performing VPC Design: Subnets, AZs, and Routing

VPC Building Blocks

A VPC is your isolated AWS network. Key parts: subnets per AZ, route tables, internet gateway for public access, NAT gateway for outbound-only internet from private subnets.

High-Performance Layout

Use at least two public subnets (for ALBs and NATs) and two private subnets (for app and DB) across AZs. This improves resilience and spreads traffic.
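
As a rough sketch of this layout, the following Python snippet carves a VPC range into two public and two private subnets, one pair per AZ. The CIDR 10.0.0.0/16, the /20 subnet size, and the AZ names are illustrative assumptions, not values from this chapter.

```python
import ipaddress

def plan_subnets(vpc_cidr, azs, new_prefix=20):
    """Assign one public and one private subnet per AZ from the VPC range."""
    blocks = ipaddress.ip_network(vpc_cidr).subnets(new_prefix=new_prefix)
    plan = []
    for tier in ("public", "private"):
        for az in azs:
            plan.append({"tier": tier, "az": az, "cidr": str(next(blocks))})
    return plan

# Assumed example values: a /16 VPC spread over two AZs.
layout = plan_subnets("10.0.0.0/16", ["us-east-1a", "us-east-1b"])
for subnet in layout:
    print(subnet)
```

Each AZ ends up with one subnet of each tier, so an ALB or NAT gateway in the public tier always has a private counterpart in the same AZ.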

Routing for Performance

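Use VPC endpoints to keep AWS service traffic on the AWS backbone. Deploy one NAT gateway per AZ and route each private subnet to the NAT gateway in its own AZ; this avoids cross-AZ hops and data transfer charges, and keeps an outage in one AZ from cutting off the others.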
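
The per-AZ NAT pattern can be sketched as a small mapping: one private route table per AZ, each with a default route to the NAT gateway in that same AZ. The NAT gateway IDs and AZ names below are hypothetical placeholders.

```python
def private_route_tables(nat_by_az):
    """Build one private route table per AZ whose default route
    points at the NAT gateway living in that same AZ."""
    return {
        az: [{"destination": "0.0.0.0/0", "target": nat_id}]
        for az, nat_id in nat_by_az.items()
    }

# Hypothetical NAT gateway IDs, one per AZ.
nats = {"us-east-1a": "nat-aaa111", "us-east-1b": "nat-bbb222"}
tables = private_route_tables(nats)
for az, routes in tables.items():
    print(az, routes)
```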

Inter-VPC Traffic

Use VPC peering or AWS PrivateLink for inter-VPC connectivity instead of hairpinning via the internet, reducing latency and avoiding extra hops.

Reducing Latency with VPC Endpoints and Private Connectivity

Types of VPC Endpoints

Gateway endpoints support S3 and DynamoDB via route table entries. Interface endpoints (PrivateLink) create ENIs with private IPs for many AWS and some SaaS services.

Performance Benefits

Endpoints keep traffic on the AWS backbone and bypass the NAT and internet gateways. That removes a hop, frees NAT capacity for traffic that really needs the internet, and makes latency more predictable.

Typical Exam Scenario

If private EC2 instances hit S3 heavily and NAT is a bottleneck, the best fix is usually to add a VPC gateway endpoint for S3, not to move instances to public subnets.
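
To see why the endpoint also wins on cost, here is a back-of-the-envelope calculation. The per-GB NAT data processing price and the traffic volume are assumptions for illustration only; check current pricing for your region. Gateway endpoints themselves carry no additional charge.

```python
def monthly_nat_processing_cost(gb_per_month, price_per_gb=0.045):
    """Data processing charge for traffic that passes through a NAT gateway.
    price_per_gb is an assumed example rate, not an authoritative price."""
    return gb_per_month * price_per_gb

s3_traffic_gb = 10_000  # assumed monthly S3 traffic from private instances
via_nat = monthly_nat_processing_cost(s3_traffic_gb)
via_gateway_endpoint = 0.0  # gateway endpoints add no data processing charge
print(f"NAT: ${via_nat:.2f}  gateway endpoint: ${via_gateway_endpoint:.2f}")
```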

Regional Scope

Endpoints are created per VPC per region. Gateway endpoints attach to route tables, interface endpoints attach to subnets as ENIs with private IP addresses.
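
A gateway endpoint works because route tables pick the most specific matching route. The sketch below models that longest-prefix selection. Real gateway endpoint routes use a prefix list ID as the destination; the single documentation-range CIDR here is a made-up stand-in, and all target IDs are hypothetical.

```python
import ipaddress

def pick_route(routes, dest_ip):
    """Return the target of the most specific route that contains dest_ip."""
    ip = ipaddress.ip_address(dest_ip)
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in routes
        if ip in ipaddress.ip_network(cidr)
    ]
    return max(matches, key=lambda m: m[0].prefixlen)[1]

# Hypothetical private route table.
routes = [
    ("10.0.0.0/16", "local"),
    ("0.0.0.0/0", "nat-aaa111"),
    ("198.51.100.0/24", "vpce-s3"),  # stand-in for the S3 prefix list
]
print(pick_route(routes, "198.51.100.25"))  # S3-bound traffic uses the endpoint
print(pick_route(routes, "203.0.113.9"))    # other internet traffic uses the NAT
```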

Elastic Load Balancing Options and Performance Trade-offs

ALB vs NLB vs GWLB

ALB is Layer 7 for HTTP/HTTPS and APIs. NLB is Layer 4 for ultra-low latency TCP/UDP/TLS. GWLB is for inserting virtual appliances like firewalls into traffic flows.

When to Use ALB

Use ALB for most web apps and APIs: it handles HTTP features, host and path routing, WebSockets, and integrates with WAF while scaling automatically.

When to Use NLB

Use NLB for extreme throughput, very low latency, or non-HTTP protocols. It supports static IPs and is ideal for gaming, IoT, or preserving client IP.
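
The selection logic above can be condensed into a rule-of-thumb function. This is a deliberately simplified sketch: the parameter names are invented for illustration, and real designs weigh more factors than these three.

```python
def choose_load_balancer(protocol, needs_static_ip=False, inline_appliance=False):
    """Pick an ELB type from a few coarse requirements (rule of thumb only)."""
    if inline_appliance:
        return "GWLB"  # traffic must pass through firewalls or similar appliances
    if protocol.upper() in {"HTTP", "HTTPS", "WEBSOCKET", "GRPC"} and not needs_static_ip:
        return "ALB"   # Layer 7 features such as host and path routing
    return "NLB"       # Layer 4: TCP/UDP/TLS, static IPs, very low latency

print(choose_load_balancer("HTTPS"))
print(choose_load_balancer("UDP"))
print(choose_load_balancer("TCP", inline_appliance=True))
```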

Cross-Zone Load Balancing

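Cross-zone load balancing lets each load balancer node send requests to targets in every enabled AZ, which evens out load when AZs hold different numbers of targets. ALB has it on by default with no inter-AZ data charge; NLB and GWLB have it off by default, and enabling it incurs inter-AZ data transfer charges.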
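
A small worked example shows the effect on an uneven fleet. It assumes one load balancer node per AZ, clients spread evenly across those nodes, and a made-up fleet of 2 targets in one AZ and 8 in the other.

```python
from fractions import Fraction

def per_target_share(targets_per_az, cross_zone):
    """Share of total traffic each target in an AZ receives, assuming
    clients are spread evenly across one load balancer node per AZ."""
    total_targets = sum(targets_per_az.values())
    shares = {}
    for az, count in targets_per_az.items():
        if cross_zone:
            shares[az] = Fraction(1, total_targets)
        else:
            shares[az] = Fraction(1, len(targets_per_az)) / count
    return shares

fleet = {"az-a": 2, "az-b": 8}  # assumed uneven fleet
print(per_target_share(fleet, cross_zone=False))
print(per_target_share(fleet, cross_zone=True))
```

Without cross-zone balancing, each target in the small AZ carries four times the load of a target in the large AZ; with it, every target receives an equal share.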

Designing a High-Performance Load Balancing Layer

Scenario Overview

You have a multi-AZ web app: HTTPS users, private app servers, RDS, and WebSockets for notifications. You must design a high-performance load balancing layer.

Choosing the Load Balancer

Traffic is HTTP/HTTPS with WebSockets and path-based routing. An ALB in at least two public subnets is the best choice, offloading TLS from app servers.

Target Groups and Health Checks

Create a target group for private EC2 instances. Use a light health check path so unhealthy instances are removed fast without wasting resources.
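
How fast an unhealthy instance is removed depends mainly on the check interval and the unhealthy threshold. The estimate below is a simplification that ignores check timeouts and deregistration delay, and the interval and threshold values are example assumptions.

```python
def seconds_to_mark_unhealthy(interval_s, unhealthy_threshold):
    """Approximate time for consecutive failed checks to take a target out."""
    return interval_s * unhealthy_threshold

relaxed = seconds_to_mark_unhealthy(interval_s=30, unhealthy_threshold=2)
tight = seconds_to_mark_unhealthy(interval_s=10, unhealthy_threshold=2)
print(f"30 s interval: ~{relaxed} s, 10 s interval: ~{tight} s")
```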

Cross-AZ and Scaling

Keep cross-zone load balancing on (the ALB default) and tie the target group to an Auto Scaling group so new instances register automatically and load stays balanced.

CloudFront Fundamentals: Edge Locations, Caching, and Origins

What CloudFront Is

CloudFront is AWS’s CDN. It uses edge locations and Regional Edge Caches to serve content closer to users, reducing latency and offloading origins.

Core Components

Key parts: edge locations, Regional Edge Caches, origins such as S3 or ALB, and distributions that define behaviors, caching, and TLS settings.

How Caching Helps

Caching shortens network distance, offloads your origin, and uses optimized protocols like HTTP/2 or HTTP/3 between users and the edge.
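
The benefit can be estimated from the cache hit ratio. In this sketch a hit costs only the user-to-edge latency, while a miss adds the edge-to-origin round trip. All latency figures and request counts are invented for illustration.

```python
def expected_latency_ms(hit_ratio, edge_ms, origin_ms):
    """Average latency when hits are served at the edge and
    misses continue on to the origin."""
    return hit_ratio * edge_ms + (1 - hit_ratio) * (edge_ms + origin_ms)

def origin_requests(total_requests, hit_ratio):
    """Requests that still reach the origin after caching."""
    return round(total_requests * (1 - hit_ratio))

# Assumed figures: 20 ms to the edge, 180 ms more to reach the origin.
for ratio in (0.0, 0.5, 0.9):
    avg = expected_latency_ms(ratio, edge_ms=20, origin_ms=180)
    print(f"hit ratio {ratio:.0%}: ~{avg:.0f} ms, "
          f"{origin_requests(1_000_000, ratio)} origin requests per million")
```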

Static vs Dynamic Content

Static files get long TTLs and high cache hit rates. Dynamic content can still go through CloudFront with selective caching or as a fast reverse proxy.

Tuning CloudFront for Performance: TTLs, Cache Keys, and Origins

TTL Basics

CloudFront uses minimum, default, and maximum TTLs to control how long objects stay cached. Longer TTLs boost performance but slow down content updates.
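
The interaction between the three TTLs and an origin's Cache-Control max-age can be modeled roughly as follows: with no caching header the default TTL applies, otherwise the origin value is clamped between the minimum and maximum. This is a simplified model that ignores s-maxage, Expires, and no-cache/no-store directives; the function name is hypothetical.

```python
from typing import Optional

def effective_ttl(min_ttl: int, default_ttl: int, max_ttl: int,
                  origin_max_age: Optional[int]) -> int:
    """Seconds an object stays cached under this simplified model."""
    if origin_max_age is None:
        return default_ttl  # origin sent no caching header
    return max(min_ttl, min(origin_max_age, max_ttl))

print(effective_ttl(60, 600, 3600, None))    # default applies
print(effective_ttl(60, 600, 3600, 10))      # raised to the minimum
print(effective_ttl(60, 600, 3600, 86400))   # capped at the maximum
```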

Designing the Cache Key

The cache key decides when two requests are treated as the same. Include only necessary query strings, headers, and cookies to maximize cache hits.
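
To illustrate, the sketch below builds a simplified cache key from the path plus an allow-list of query parameters. It models the idea only and is not CloudFront's actual key format; the URLs and parameter names are made up.

```python
from urllib.parse import urlsplit, parse_qsl, urlencode

def cache_key(url, allowed_params):
    """Path plus only the allow-listed query parameters."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k in allowed_params]
    return parts.path + ("?" + urlencode(kept) if kept else "")

# Two requests that differ only in a tracking parameter.
a = cache_key("https://example.com/img/shoe.jpg?size=large&utm_source=mail", {"size"})
b = cache_key("https://example.com/img/shoe.jpg?utm_source=ad&size=large", {"size"})
print(a, a == b)
```

Because the tracking parameter is left out of the key, both requests map to one cached object instead of two.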

Choosing Origins

Place S3 origins in appropriate regions and ensure ALB and EC2 origins are multi-AZ so they can handle CloudFront’s traffic reliably.

Common Exam Trap

CloudFront plus ALB but still high latency for static files often means TTLs are too low or the cache policy varies on too many request attributes.

Design Exercise: Putting VPC, ELB, and CloudFront Together

Work through this thought exercise to connect all the pieces.

Scenario

You are designing a global e-commerce site. Requirements:

  • Users from North America, Europe, and Asia
  • Product images and static assets must load very quickly
  • Checkout API must be secure and low-latency
  • Application servers and databases should not be directly accessible from the internet

Mentally sketch the architecture and answer these guiding questions:

  1. VPC layout
  • How many AZs will you use in your primary region?
  • Which components go into public subnets vs private subnets?
  • Where do NAT gateways sit, and how many do you deploy?
  2. Load balancing
  • Which load balancer type do you put in front of the application servers? Why?
  • Do you enable cross-zone load balancing?
  3. CloudFront
  • What is your origin for static content (images, CSS, JS)?
  • What is your origin for dynamic API calls (checkout, cart)?
  • How would you set TTLs for static assets vs dynamic API responses?
  4. Optimization choices
  • Where can you use VPC endpoints to improve performance?
  • How does CloudFront reduce load on your ALB and EC2 instances?

Pause and answer in your own words. Then compare your answers with this high-level reference design:

  • Multi-AZ VPC with public subnets for ALB and NAT, private subnets for EC2 and RDS
  • CloudFront distribution with S3 (static) and ALB (dynamic) origins
  • Long TTLs for static assets, short TTLs or no caching for dynamic API
  • VPC gateway endpoint for S3 to keep internal S3 traffic off the NAT gateway.
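
The CloudFront part of this reference design can be sketched as an ordered list of behaviors that route each path to an origin with its own TTL. The path patterns, origin names, and TTL values are hypothetical, and Python's fnmatch is only an approximation of CloudFront's wildcard matching.

```python
from fnmatch import fnmatchcase

# Hypothetical behaviors, evaluated in order; the default "*" comes last.
BEHAVIORS = [
    {"pattern": "/static/*", "origin": "s3-assets", "default_ttl": 86400},
    {"pattern": "/api/*", "origin": "alb-app", "default_ttl": 0},
    {"pattern": "*", "origin": "alb-app", "default_ttl": 60},
]

def match_behavior(path):
    """Return the first behavior whose pattern matches the request path."""
    for behavior in BEHAVIORS:
        if fnmatchcase(path, behavior["pattern"]):
            return behavior
    raise LookupError(path)

for path in ("/static/img/shoe.jpg", "/api/checkout", "/index.html"):
    chosen = match_behavior(path)
    print(path, "->", chosen["origin"], chosen["default_ttl"])
```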

Quick Check: VPC and Endpoints

Test your understanding of high-performing VPC design and endpoints.

Your private EC2 instances in two AZs frequently access S3 and DynamoDB and are experiencing higher latency and NAT gateway data processing costs. From a performance and cost perspective, what is the MOST appropriate change?

  A. Move the EC2 instances into public subnets with public IPs so they can access S3 and DynamoDB directly
  B. Increase the size and number of NAT gateways and add an additional internet gateway for redundancy
  C. Create VPC gateway endpoints for S3 and DynamoDB and update the private subnet route tables to use them
  D. Attach an additional security group to the NAT gateway allowing all outbound traffic

Answer: C) Create VPC gateway endpoints for S3 and DynamoDB and update the private subnet route tables to use them

Gateway endpoints for S3 and DynamoDB keep traffic on the AWS backbone, reduce NAT usage, lower latency, and cut NAT data processing costs. Moving instances to public subnets weakens security. Adding NAT gateways leaves the unnecessary hop through NAT and the IGW in place, and NAT gateways do not support security groups at all.

Quick Check: Load Balancing and CloudFront

Test your understanding of ELB choices and CloudFront behavior.

A startup runs a public REST API over HTTPS. Latency is acceptable now, but they expect rapid growth and want to maintain performance while adding features like path-based routing and WebSockets. They also plan to serve static content globally with low latency. Which combination is MOST appropriate?

  A. Place a Network Load Balancer in front of the API servers and use S3 static website hosting without CloudFront
  B. Use an Application Load Balancer for the API and put CloudFront in front of the ALB and an S3 bucket for static assets
  C. Expose EC2 instances directly with Elastic IPs and use CloudFront only for static content from S3
  D. Use a Gateway Load Balancer in front of the API servers and CloudFront only for dynamic API calls

Answer: B) Use an Application Load Balancer for the API and put CloudFront in front of the ALB and an S3 bucket for static assets

An ALB supports HTTPS, path-based routing, and WebSockets for the API. CloudFront in front of the ALB and S3 provides global caching and protocol optimizations for both static and dynamic content. NLB is not needed for typical HTTP APIs, and exposing EC2 directly or using GWLB here is not best practice.

Key Term Flashcards: Networking for Performance

Use these flashcards to reinforce core networking performance concepts before moving on.

Performance efficiency pillar
The performance efficiency pillar focuses on the efficient use of computing resources to meet requirements and maintain that efficiency as demand changes and technologies evolve.
Public subnet
A subnet associated with a route table that has a route to an internet gateway, allowing resources in the subnet to communicate directly with the internet.
Private subnet
A subnet that does not have a direct route to an internet gateway. Instances typically reach the internet via a NAT gateway or NAT instance in a public subnet.
VPC gateway endpoint
A VPC endpoint type that uses route table entries to provide private connectivity to S3 or DynamoDB without requiring an internet gateway or NAT gateway.
VPC interface endpoint (PrivateLink)
A VPC endpoint type that creates elastic network interfaces with private IPs in your subnets to privately connect to supported AWS or SaaS services.
Application Load Balancer (ALB)
A Layer 7 load balancer optimized for HTTP/HTTPS and gRPC that supports advanced routing, TLS termination, and features like WebSockets and WAF integration.
Network Load Balancer (NLB)
A Layer 4 load balancer designed for ultra-low latency and high throughput for TCP, UDP, and TLS traffic, supporting static IPs per AZ.
CloudFront edge location
A globally distributed point of presence where CloudFront caches and serves content to users, reducing latency by shortening network distance.
CloudFront TTL
The time to live that controls how long an object stays in CloudFront caches before it is considered stale and potentially revalidated with the origin.
Cross-zone load balancing
An ELB feature that distributes traffic evenly across all registered targets in all enabled AZs, rather than only within the AZ of the load balancer node.

Key Terms

VPC
A Virtual Private Cloud is an isolated virtual network in AWS where you can launch resources such as EC2 instances, with full control over IP addressing, subnets, and routing.
Subnet
A segment of a VPC's IP address range that resides in a single Availability Zone and can be designated as public or private via its route table.
Cache key
The set of request attributes (such as path, query string, headers, cookies) that CloudFront uses to uniquely identify a cached object.
NAT gateway
A managed Network Address Translation service that enables instances in a private subnet to connect to the internet or other AWS services, but prevents the internet from initiating a connection with those instances.
Route table
A set of rules (routes) that determine where network traffic from your subnets or gateways is directed.
VPC endpoint
A service that enables private connections between your VPC and supported AWS services or VPC endpoint services without requiring an internet gateway, NAT device, VPN connection, or AWS Direct Connect.
Edge location
A site that CloudFront uses to cache and deliver content to users with lower latency.
Internet gateway
A horizontally scaled, redundant, and highly available VPC component that allows communication between your VPC and the internet.
Amazon CloudFront
A fast content delivery network service that securely delivers data, videos, applications, and APIs to customers globally with low latency and high transfer speeds.
TTL (Time to live)
The amount of time that a DNS record or cached object is allowed to be stored before it should be discarded or refreshed.
Network Load Balancer
An ELB type that operates at the transport layer (Layer 4), handling millions of requests per second with ultra-low latency for TCP, UDP, and TLS traffic.
Application Load Balancer
An ELB type that operates at the application layer (Layer 7), providing advanced request routing for HTTP, HTTPS, and gRPC traffic.
Elastic Load Balancing (ELB)
A managed service that automatically distributes incoming application traffic across multiple targets, such as EC2 instances, containers, and IP addresses, in one or more Availability Zones.
