Skarp

Chapter 15 of 26

High-Performance Networking and Content Delivery with Amazon VPC and CloudFront

Latency, throughput, and global reach all come down to networking and content delivery. This module connects Amazon VPC designs with Amazon CloudFront and Route 53 to optimize user experience worldwide.

27 min read

Module Overview: Networking, Latency, and the Exam Context

Where This Module Fits

You will connect Amazon VPC, CloudFront, and Route 53 to design low-latency, high-throughput architectures that serve users globally and show up heavily on the Solutions Architect – Associate exam.

Well-Architected Context

This module is mainly about the performance efficiency pillar: "The performance efficiency pillar focuses on the efficient use of computing resources to meet requirements and maintain that efficiency as demand changes and technologies evolve."

What You Will Do

You will design high-performing VPC networks, understand EC2 and load balancer bandwidth, configure CloudFront with S3/ALB/custom origins, tune caching, and use Route 53 latency-based routing for global performance.

High-Performance VPC Design: Subnets, AZs, and Routing

Multi-AZ and Subnets

Design your VPC across at least two AZs. Use public subnets for load balancers and NAT gateways, and private subnets for app servers and databases. Spreading each tier across AZs improves availability, and keeping a request path inside one AZ where possible avoids extra cross-AZ latency.

Routing and NAT Bottlenecks

Each private subnet should route internet-bound traffic to a NAT gateway in its own AZ. This avoids cross-AZ hops and reduces the chance that a single NAT gateway becomes a throughput bottleneck.
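The per-AZ routing rule can be checked mechanically. The sketch below is a plain-Python model rather than an AWS API call; the subnet names, AZ names, and NAT gateway labels are hypothetical.

```python
# Hypothetical inventory: the AZ of each private subnet and NAT gateway.
SUBNET_AZ = {
    "private-a": "us-east-1a",
    "private-b": "us-east-1b",
}
NAT_AZ = {
    "nat-a": "us-east-1a",
    "nat-b": "us-east-1b",
}


def cross_az_routes(default_routes):
    """Return subnets whose 0.0.0.0/0 route targets a NAT gateway in another AZ."""
    return sorted(
        subnet
        for subnet, nat in default_routes.items()
        if SUBNET_AZ[subnet] != NAT_AZ[nat]
    )


# Each private subnet uses the NAT gateway in its own AZ: no cross-AZ hops.
good = {"private-a": "nat-a", "private-b": "nat-b"}
# Both subnets share one NAT gateway: private-b crosses AZs and shares the bottleneck.
shared = {"private-a": "nat-a", "private-b": "nat-a"}

print(cross_az_routes(good))    # []
print(cross_az_routes(shared))  # ['private-b']
```

An empty result means every private subnet egresses through a NAT gateway in its own AZ.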

Subnet Size and Scaling

High-throughput workloads often scale out. If subnets are too small, you can run out of IP addresses and block scaling. AWS reserves five addresses in every subnet, so a /24 provides 251 usable addresses. In exam scenarios, /24 subnets are a safe, scalable default.
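To see how prefix length translates into room for scaling, here is a minimal sketch using Python's standard `ipaddress` module. It subtracts the five addresses AWS reserves in each subnet (the first four and the last one); the CIDR blocks are illustrative only.

```python
import ipaddress

AWS_RESERVED_PER_SUBNET = 5  # first four addresses plus the last one


def usable_ips(cidr):
    """Addresses left for ENIs after AWS's five reserved addresses."""
    return ipaddress.ip_network(cidr).num_addresses - AWS_RESERVED_PER_SUBNET


for cidr in ("10.0.1.0/28", "10.0.1.0/24", "10.0.0.0/20"):
    print(cidr, usable_ips(cidr))
# 10.0.1.0/28 11
# 10.0.1.0/24 251
# 10.0.0.0/20 4091
```

Remember that load balancer nodes, NAT gateways, and other managed resources also consume addresses from the subnets they live in, so the headroom for instances is somewhat smaller than these figures.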

Bandwidth and Throughput: EC2, ENIs, and Load Balancers

EC2 Network Limits

Each EC2 instance has a maximum network bandwidth based on its type and size. ENA-enabled instances offer higher throughput, but a single ENI still caps how much traffic it can handle.

Load Balancers and Scaling

ALB and NLB scale automatically with traffic. The usual bottleneck is not the load balancer but the backend EC2 instances, which may run out of CPU or network bandwidth.
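Because the backend fleet is usually the limit, a quick capacity estimate helps. The sketch below is back-of-the-envelope arithmetic under stated assumptions: the peak traffic, per-instance bandwidth, and 70% headroom target are hypothetical numbers, not published limits for any instance type.

```python
import math


def instances_needed(peak_gbps, per_instance_gbps, headroom=0.7):
    """Instances required so each runs at or below `headroom` of its bandwidth."""
    usable_per_instance = per_instance_gbps * headroom
    return math.ceil(peak_gbps / usable_per_instance)


# Hypothetical: 40 Gbps at peak, instances rated for 10 Gbps each.
print(instances_needed(peak_gbps=40, per_instance_gbps=10))  # 6
```

Running the same estimate with CPU or connection counts in place of bandwidth shows which resource runs out first.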

Exam Clues for Bottlenecks

If a question mentions timeouts or high latency at peak times, look for under-sized EC2 instances, single-AZ deployments, or NAT gateways handling too much outbound traffic.

CloudFront Fundamentals: Edge Locations, POPs, and Origins

What CloudFront Is

CloudFront is AWS's CDN. Users connect to nearby edge locations, which cache content so that requests do not always have to travel to your origin in a single Region.

Key Components

A CloudFront distribution defines origins (S3, ALB/EC2, custom), cache behaviors, and security settings. Edge locations and regional edge caches store content close to users.

Why It Is Fast

CloudFront reduces latency by serving from nearby caches, offloading origin traffic, and using AWS's global network with modern protocols like HTTP/2 and HTTP/3.

Designing a CloudFront Distribution with S3 and ALB Origins

Scenario Overview

You have an e-commerce site in us-east-1 with static assets in S3 and dynamic content behind an ALB. Users are global and you need lower latency and less origin load.

Two Origins, One Distribution

Configure CloudFront with two origins: S3 for `/static/` and `/images/`, and the ALB for dynamic paths like `/api/` and `/checkout/`. Use path-based cache behaviors.
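The path-based selection can be sketched as an ordered list of patterns where the first match wins and a default behavior catches everything else. This is an approximation in plain Python: the origin labels are hypothetical, sending the default behavior to the ALB is an assumption, and `fnmatchcase` also understands `[...]` character classes, which are not part of CloudFront path patterns.

```python
from fnmatch import fnmatchcase

# Ordered (path pattern, origin) pairs; origin names are hypothetical labels.
BEHAVIORS = [
    ("/static/*", "s3-assets"),
    ("/images/*", "s3-assets"),
    ("/api/*", "alb-app"),
    ("/checkout/*", "alb-app"),
]
DEFAULT_ORIGIN = "alb-app"  # assumed target of the default (*) behavior


def origin_for(path):
    """Return the origin of the first behavior whose pattern matches `path`."""
    for pattern, origin in BEHAVIORS:
        if fnmatchcase(path, pattern):
            return origin
    return DEFAULT_ORIGIN


print(origin_for("/static/css/site.css"))  # s3-assets
print(origin_for("/api/cart"))             # alb-app
print(origin_for("/"))                     # alb-app
```

Because order matters, more specific patterns should be listed before broader ones.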

Security and DNS

Enable HTTPS with an ACM certificate in us-east-1, set viewer policy to redirect HTTP to HTTPS, and map your custom domain to CloudFront using a Route 53 alias record.
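As an illustration of the DNS step, the sketch below builds the change batch for an alias A record that points a custom domain at a distribution. It only constructs a dictionary and makes no AWS call; the record name and distribution domain are placeholders. `Z2FDTNDATAQYW2` is the fixed hosted zone ID that Route 53 uses for CloudFront alias targets.

```python
CLOUDFRONT_ALIAS_ZONE_ID = "Z2FDTNDATAQYW2"  # fixed value for CloudFront alias targets


def alias_change_batch(record_name, distribution_domain):
    """Build an UPSERT change batch for an A alias record to a distribution."""
    return {
        "Changes": [
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": record_name,
                    "Type": "A",
                    "AliasTarget": {
                        "HostedZoneId": CLOUDFRONT_ALIAS_ZONE_ID,
                        "DNSName": distribution_domain,
                        "EvaluateTargetHealth": False,
                    },
                },
            }
        ]
    }


# Placeholder names for illustration only.
batch = alias_change_batch("www.example.com", "d111111abcdef8.cloudfront.net")
print(batch["Changes"][0]["ResourceRecordSet"]["AliasTarget"]["DNSName"])
```

A dictionary of this shape is what the Route 53 `ChangeResourceRecordSets` API expects as its `ChangeBatch` parameter, for example through boto3's `change_resource_record_sets`.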

CloudFront Caching, TTLs, and Dynamic Content

Cache Keys and TTLs

CloudFront caches objects based on URL plus selected headers, cookies, and query strings. TTLs control how long objects stay in cache before CloudFront checks the origin again.
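A simplified model makes the cache key idea concrete. The function below is a sketch, not CloudFront's actual key format: it combines the path with only the query strings and headers you choose to include, and it leaves cookies out for brevity. The URLs and header values are made up.

```python
from urllib.parse import parse_qsl, urlsplit


def cache_key(url, headers, key_headers=(), key_query=()):
    """Build a simplified cache key from the path plus selected inputs."""
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query))
    lowered = {name.lower(): value for name, value in headers.items()}
    return (
        parts.path,
        tuple((q, query.get(q)) for q in sorted(key_query)),
        tuple((h.lower(), lowered.get(h.lower())) for h in sorted(key_headers)),
    )


url_a = "https://www.example.com/api/items?page=2&utm=x"
url_b = "https://www.example.com/api/items?page=2&utm=y"

# Only `page` is in the key, so the tracking parameter and language are ignored.
same = cache_key(url_a, {"Accept-Language": "en"}, key_query=["page"]) == \
    cache_key(url_b, {"Accept-Language": "de"}, key_query=["page"])
print(same)  # True

# Adding Accept-Language to the key splits the cache by language.
split = cache_key(url_a, {"Accept-Language": "en"}, ["Accept-Language"], ["page"]) != \
    cache_key(url_b, {"Accept-Language": "de"}, ["Accept-Language"], ["page"])
print(split)  # True
```

Every value added to the key fragments the cache further, so include only inputs that actually change the response.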

Static vs Dynamic

Static files can use long TTLs and versioned filenames. Dynamic or personalized content usually has short TTLs or is not cached, and often relies on custom cache and origin request policies.

Invalidation and Exam Trap

When content must change before TTL expiry, use CloudFront invalidations or new object versions. For urgent removal of sensitive data, invalidation is the key exam answer.
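For the invalidation route, the request has a small, fixed shape. The sketch below only builds the parameters and does not call AWS; the distribution ID and path are placeholders, and using a timestamp as the caller reference is one possible choice rather than a requirement.

```python
import time


def invalidation_request(distribution_id, paths):
    """Build parameters for a CloudFront invalidation of the given paths."""
    for path in paths:
        if not path.startswith("/"):
            raise ValueError(f"invalidation paths must start with '/': {path!r}")
    return {
        "DistributionId": distribution_id,
        "InvalidationBatch": {
            "Paths": {"Quantity": len(paths), "Items": list(paths)},
            # Any string that is unique per request works as a caller reference.
            "CallerReference": f"inv-{int(time.time())}",
        },
    }


# Placeholder distribution ID and path for illustration only.
request = invalidation_request("E2EXAMPLE123", ["/images/leaked-report.png"])
print(request["InvalidationBatch"]["Paths"])
```

These keyword arguments match what boto3's CloudFront `create_invalidation` call accepts. For routine deploys, versioned filenames avoid the need to invalidate at all.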

Thought Exercise: Tuning TTLs for Different Paths

Imagine you are designing CloudFront cache behaviors for a news website with these URL patterns:

  1. `/assets/*` – CSS, JS, logos, fonts. Deployed with versioned filenames (for example, `main.v42.css`).
  2. `/images/*` – Article images that rarely change after publication.
  3. `/api/headlines` – Returns the latest headlines; updated every 10 seconds.
  4. `/user/profile` – Personalized profile page, requires authentication.

Your goals:

  • Minimize latency for users worldwide.
  • Avoid overloading the origin.
  • Keep data fresh where it matters.

Task 1: Propose TTLs

For each path, decide on an approximate default TTL and whether CloudFront should cache it at all.

  • `/assets/*`: TTL? Cache? Why?
  • `/images/*`: TTL? Cache? Why?
  • `/api/headlines`: TTL? Cache? Why?
  • `/user/profile`: TTL? Cache? Why?

Task 2: Cache key design

For each path, decide what should be part of the cache key:

  • Query strings? Cookies? Headers (for example, Authorization, Accept-Language)?

Write down your answers, then compare to this reference solution:

Reference solution (high level)

  • `/assets/*`: Long TTL (days). Cache aggressively. Cache key usually path only.
  • `/images/*`: Medium-long TTL (hours or more). Cache. Key usually path, maybe language.
  • `/api/headlines`: Short TTL (5–15 seconds). Cache GETs. Include query params that affect content.
  • `/user/profile`: Do not cache (or use very short TTL with `Authorization` header in cache key). Include auth and user-identifying info if cached, but usually better to skip caching.
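To see why even a 10-second TTL on `/api/headlines` protects the origin, consider a rough upper bound: each cache refetches an object at most about once per TTL, no matter how many viewers ask for it. The sketch below assumes one independent cache per location and a single cache key; the viewer rate and cache count are hypothetical.

```python
def origin_rps_upper_bound(viewer_rps, cache_nodes, ttl_seconds):
    """Rough ceiling on origin requests per second for one cached object."""
    if ttl_seconds <= 0:
        return viewer_rps  # nothing is reused, every request reaches the origin
    return min(viewer_rps, cache_nodes / ttl_seconds)


# Hypothetical: 5,000 viewer requests per second spread over 100 caches.
print(origin_rps_upper_bound(5000, cache_nodes=100, ttl_seconds=10))  # 10.0
print(origin_rps_upper_bound(5000, cache_nodes=100, ttl_seconds=0))   # 5000
```

Under these assumptions, the origin load depends on the number of caches and the TTL rather than on viewer traffic, which is the effect the short-TTL answer relies on.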

Route 53 Latency-Based Routing and Global Architectures

What Latency-Based Routing Does

Route 53 latency-based routing returns the DNS record for the Region that offers the lowest latency to the user, based on AWS latency measurements between Regions and the network of the user's DNS resolver.
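Conceptually, the selection is a minimum over measured latencies. The sketch below models only that decision; the Regions and millisecond values are invented, and Route 53 relies on its own measurements rather than anything you supply.

```python
def pick_region(latency_ms):
    """Return the Region with the lowest measured latency."""
    return min(latency_ms, key=latency_ms.get)


# Hypothetical latencies as seen from a resolver in Europe.
from_eu_resolver = {"us-east-1": 95, "eu-west-1": 24, "ap-southeast-1": 160}
print(pick_region(from_eu_resolver))  # eu-west-1
```

Only Regions that have a latency record configured are candidates, so a Region without a record is never returned.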

CloudFront vs Regional Endpoints

For websites, point Route 53 to a single CloudFront distribution. For latency-sensitive APIs or regional data, create regional ALBs or API endpoints and use latency-based routing.

Exam Patterns

Single Region plus CloudFront is simple and common. Multi-Region active-active with latency-based routing is used when the question stresses global low-latency dynamic access and resilience.

Putting It Together: A Global, High-Performance Web Application

Requirements Recap

You need a globally accessible web app with low latency, high availability, and cost-aware design. Static and dynamic content must both perform well.

Core Architecture

Deploy a multi-AZ VPC in us-east-1, with private EC2 behind an ALB, static content in S3, and a CloudFront distribution using both as origins with path-based cache behaviors.

DNS and Multi-Region Option

Point `www.example.com` to CloudFront via a Route 53 alias. For even lower dynamic latency worldwide, add a second Region and use latency-based routing between regional endpoints.

Quiz 1: CloudFront and Caching Behavior

Check your understanding of CloudFront origins and caching.

Your company hosts an API in us-east-1 behind an Application Load Balancer. Global clients experience high latency. You add CloudFront with the ALB as origin. Which configuration best reduces latency while avoiding caching personalized responses that depend on an Authorization header?

  1. Configure CloudFront to cache all headers and set a long default TTL for all paths.
  2. Create a cache policy that includes the Authorization header in the cache key and set default TTL to 0 seconds for API paths.
  3. Disable caching entirely on the distribution so CloudFront only accelerates TCP connections.
  4. Use an S3 bucket as origin instead of the ALB so CloudFront can cache all responses.

Answer: 2) Create a cache policy that includes the Authorization header in the cache key and set default TTL to 0 seconds for API paths.

You want CloudFront to accelerate the API without mixing responses across users. Including the Authorization header in the cache key keeps each user's responses separate, and a default TTL of 0 seconds means that, unless the origin sends its own caching headers, CloudFront checks with the origin on every request. Clients still benefit from performance features like HTTP/2 and TLS termination at the edge. Option 1 risks serving one user's data to another. Option 3 removes caching benefits entirely, even for responses that could safely be cached. Option 4 does not apply because the API is dynamic and not stored in S3.

Quiz 2: Route 53 Latency-Based Routing vs CloudFront

Check your understanding of Route 53 and global performance.

You operate a video streaming platform. Video files are stored in S3 in us-west-2. Viewers are in North America, Europe, and Asia. Which design gives the best global performance with the least operational complexity?

  1. Create separate S3 buckets in each Region and use Route 53 latency-based routing directly to the regional S3 endpoints.
  2. Keep a single S3 bucket in us-west-2, place CloudFront in front of it, and point your domain to the CloudFront distribution using a Route 53 alias record.
  3. Create EC2 instances in each Region that proxy requests to the S3 bucket, and use Route 53 weighted routing between the instances.
  4. Create two CloudFront distributions, one for North America and one for the rest of the world, and manually direct users based on their country.

Answer: 2) Keep a single S3 bucket in us-west-2, place CloudFront in front of it, and point your domain to the CloudFront distribution using a Route 53 alias record.

Using a single S3 bucket with a CloudFront distribution in front of it is the standard pattern for global content delivery. CloudFront automatically uses its global edge network to cache content close to users, and Route 53 simply points the domain to the distribution. The other options add unnecessary complexity or do not leverage CloudFront's caching effectively.

Key Term Review

Review these terms to reinforce core concepts.

Amazon CloudFront
A global content delivery network (CDN) service that securely delivers data, videos, applications, and APIs to users with low latency and high transfer speeds using a worldwide network of edge locations.
Origin (CloudFront)
The source location where CloudFront retrieves your content from when there is a cache miss, such as an S3 bucket, an Application Load Balancer, an EC2 instance, or another HTTP/HTTPS server.
TTL (Time To Live) in caching
The amount of time that a cached object is considered valid and can be served from cache before the cache must revalidate or refetch the object from the origin.
Route 53 Latency-Based Routing
A DNS routing policy that directs user requests to the AWS Region that provides the lowest latency, based on measurements between AWS Regions and the user's DNS resolver location.
Public subnet
A subnet within a VPC that has a route to an Internet Gateway, allowing resources in the subnet to communicate directly with the internet.
Private subnet
A subnet within a VPC that does not have a direct route to an Internet Gateway; instances in the subnet typically reach the internet via a NAT gateway or NAT instance.
Application Load Balancer (ALB)
A Layer 7 load balancer that routes HTTP/HTTPS traffic based on advanced request-level information such as host, path, headers, and query strings.
Network Load Balancer (NLB)
A Layer 4 load balancer that handles millions of requests per second with ultra-low latency, routing TCP, UDP, and TLS traffic based on IP protocol data.
Regional edge cache (CloudFront)
A larger cache location between CloudFront edge locations and your origin that helps improve cache hit ratio and reduce origin load, especially for less frequently accessed objects.
CloudFront invalidation
A mechanism to remove objects from CloudFront edge caches before their TTL expires, ensuring that subsequent requests are fetched from the origin.

Key Terms

Origin
In CloudFront, the source location where content is stored and from which CloudFront retrieves objects when they are not in the cache.
Amazon VPC
A logically isolated section of the AWS Cloud where you can launch AWS resources in a virtual network that you define, including IP ranges, subnets, route tables, and gateways.
Amazon Route 53
A highly available and scalable cloud Domain Name System (DNS) web service that routes end-user requests to internet applications running on AWS or elsewhere.
TTL (Time To Live)
A value that specifies how long a DNS resolver or cache should consider data valid before it must be refreshed from the authoritative source.
Regional edge cache
A CloudFront cache layer between edge locations and your origin that helps improve cache efficiency and reduce origin load by storing content that may not be frequently requested at every edge location.
Latency-based routing
A Route 53 routing policy that directs traffic to the Region that provides the best latency for the user, based on AWS measurements.
Network Load Balancer (NLB)
A Layer 4 load balancer designed to handle millions of requests per second while maintaining ultra-low latency, routing traffic based on IP protocol data.
Application Load Balancer (ALB)
A Layer 7 load balancer that routes HTTP/HTTPS traffic based on request content such as host, path, headers, and query strings.
Elastic Network Interface (ENI)
A virtual network interface that you can attach to an instance in a VPC, with its own IP addresses, security groups, and network traffic limits.
