Resilient Compute Architectures with Amazon EC2 and AWS Auto Scaling — AWS Solutions Architect Associate (SAA-C03): Complete Exam-Ready Masterclass

From Single EC2 Instance to Resilient Fleets

Why Single EC2 Is Fragile

A single EC2 instance is a single point of failure. If it crashes, is patched badly, or its AZ has an outage, your app goes down and your SLA is at risk.

Reliability Pillar Link

The Reliability pillar of the AWS Well-Architected Framework focuses on workloads performing their intended function correctly and consistently. For compute, that means replaceable fleets, not hand-maintained "pet" servers.

Core Resilient Pattern

Standard resilient compute stack: EC2 instances across multiple AZs, managed by an Auto Scaling group, behind an Elastic Load Balancer, with health checks.

Think in RTO/RPO

As you learn each feature, ask: how does this reduce downtime (recovery time objective, RTO) or data loss (recovery point objective, RPO) when an instance or AZ fails, or when traffic suddenly spikes?

Amazon EC2 Building Blocks for Resilience

EC2 Types and Families

Instance families: t (burstable), m (general purpose), c (compute optimized), r (memory optimized), p (GPU accelerated). Resilient fleets may mix types or combine Spot with On-Demand capacity.

Region, AZ, Subnet

Regions are geographic; AZs are isolated locations within a Region; each subnet lives in exactly one AZ. Subnet choice fixes the AZ of an instance.

Multi-AZ for HA

Exam hint: “highly available in a single Region” almost always implies Multi-AZ EC2 plus a load balancer, not just bigger instances in one AZ.

Stateless App Servers

Keep app servers stateless: move sessions and data to RDS, DynamoDB, ElastiCache, or S3. Then Auto Scaling can safely kill and replace instances.

Multi-AZ EC2 Architectures: The Core Resilience Pattern

What Multi-AZ Means

Multi-AZ EC2 architectures spread instances across at least two Availability Zones in one Region, reducing risk from an AZ-level failure.

Typical Web Tier Layout

Public subnets with an ALB in multiple AZs; private subnets in the same AZs; an Auto Scaling group launching EC2 into those private subnets.

Multi-AZ vs Multi-Region

Multi-AZ handles AZ failures with low complexity; Multi-Region addresses Region failures but is more complex and used for stricter SLAs.

Exam Gotchas

An ALB in multiple AZs but targets in one AZ is not truly Multi-AZ. Ensure your ASG uses subnets in at least two AZs.
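The gotcha above can be checked mechanically. The following is a minimal sketch, assuming you already have a mapping from subnet ID to AZ; the subnet IDs, AZ names, and the `spans_multiple_azs` helper are hypothetical and not part of any AWS SDK.

```python
from typing import Dict, List


def spans_multiple_azs(asg_subnets: List[str],
                       subnet_to_az: Dict[str, str],
                       minimum: int = 2) -> bool:
    """Return True if the ASG's subnets cover at least `minimum` distinct AZs."""
    azs = {subnet_to_az[subnet] for subnet in asg_subnets}
    return len(azs) >= minimum


# Hypothetical subnet IDs and AZ names, for illustration only.
SUBNET_TO_AZ = {
    "subnet-private-a1": "us-east-1a",
    "subnet-private-a2": "us-east-1a",
    "subnet-private-b1": "us-east-1b",
}

# Two subnets, but both in the same AZ: not Multi-AZ.
print(spans_multiple_azs(["subnet-private-a1", "subnet-private-a2"], SUBNET_TO_AZ))
# One subnet in each of two AZs: Multi-AZ.
print(spans_multiple_azs(["subnet-private-a1", "subnet-private-b1"], SUBNET_TO_AZ))
```

The point of the check is that counting subnets is not enough; what matters is the number of distinct AZs behind them.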

AWS Auto Scaling Groups: Concepts and Lifecycle

What an ASG Manages

An Auto Scaling group manages a fleet of EC2 instances: min, desired, and max counts, plus which subnets/AZs they run in.

Launch Templates

Launch templates define AMI, instance type, security groups, and user data. The ASG uses them to create new instances consistently.

Instance Lifecycle

The ASG launches an instance, the instance boots and runs its user data, registers with the load balancer, and enters service once it passes health checks. If it later fails a health check, the ASG terminates and replaces it.

Health Checks for Self-Healing

Combine EC2 and ELB health checks so the ASG can detect both infrastructure and application failures and replace bad instances.
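To see why combining the two checks matters, here is a small illustrative model. It is a sketch only: the `InstanceHealth` class, the `instances_to_replace` function, and the instance IDs are assumptions made for this example, not an AWS API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class InstanceHealth:
    instance_id: str
    ec2_status_ok: bool       # infrastructure-level status checks
    elb_target_healthy: bool  # application-level result from the load balancer


def instances_to_replace(fleet: List[InstanceHealth],
                         health_check_type: str = "ELB") -> List[str]:
    """Return the IDs the group would treat as unhealthy under the chosen check type."""
    unhealthy = []
    for inst in fleet:
        failed = not inst.ec2_status_ok
        if health_check_type == "ELB":
            # With ELB checks enabled, an application-level failure also counts.
            failed = failed or not inst.elb_target_healthy
        if failed:
            unhealthy.append(inst.instance_id)
    return unhealthy


# Hypothetical fleet: one healthy instance, one hung application, one host failure.
fleet = [
    InstanceHealth("i-aaa", ec2_status_ok=True, elb_target_healthy=True),
    InstanceHealth("i-bbb", ec2_status_ok=True, elb_target_healthy=False),
    InstanceHealth("i-ccc", ec2_status_ok=False, elb_target_healthy=False),
]
print(instances_to_replace(fleet, "EC2"))  # only the host failure is caught
print(instances_to_replace(fleet, "ELB"))  # the hung application is caught too
```

With EC2 checks alone, the instance whose application has hung stays in the group even though it cannot serve traffic.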

Scaling Policies: Keeping Capacity in Sync With Load

Why Scaling Policies Matter

Scaling policies let your ASG adjust instance count with demand. They add capacity under load to protect performance and remove it when demand drops to control cost, without giving up resilience.

Target Tracking

Set a target metric (like 50% CPU); Auto Scaling adjusts capacity to keep the metric near that value. It is usually the easiest choice.
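A rough way to build intuition for target tracking is a proportional estimate: if the metric is above target, capacity grows by roughly the same ratio. The sketch below is a simplified model under that assumption. It is not the service's actual algorithm, which also accounts for warm-up, cooldown, and more conservative scale-in; the function name and numbers are hypothetical.

```python
import math


def estimate_desired_capacity(current_capacity: int,
                              metric_value: float,
                              target_value: float,
                              min_size: int,
                              max_size: int) -> int:
    """Proportional estimate of capacity needed to bring the metric back to target."""
    raw = math.ceil(current_capacity * metric_value / target_value)
    # The group never goes below its minimum or above its maximum.
    return max(min_size, min(max_size, raw))


# 4 instances at 80% average CPU with a 50% target: 4 * 80 / 50 = 6.4, rounded up to 7.
print(estimate_desired_capacity(4, 80.0, 50.0, min_size=2, max_size=10))
# The same load with max=6 is capped at the group maximum.
print(estimate_desired_capacity(4, 80.0, 50.0, min_size=2, max_size=6))
# 4 instances at 25% CPU: half the capacity would be enough.
print(estimate_desired_capacity(4, 25.0, 50.0, min_size=2, max_size=6))
```

The clamping step is the part worth remembering for the exam: a scaling policy only moves desired capacity within the group's min and max.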

Step and Scheduled Scaling

Step scaling reacts differently at various thresholds; scheduled scaling changes capacity at fixed times for predictable patterns.
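The "reacts differently at various thresholds" idea can be sketched as a lookup over breach intervals. This is an illustrative model only: the step boundaries, the 60% alarm threshold, and the `step_adjustment` helper are assumptions chosen for the example.

```python
from typing import List, Optional, Tuple

# (lower_bound, upper_bound, instances_to_add); bounds are offsets above the alarm threshold.
# An upper bound of None means "no upper limit".
Step = Tuple[float, Optional[float], int]


def step_adjustment(metric_value: float, threshold: float, steps: List[Step]) -> int:
    """Return how many instances to add for the size of the threshold breach."""
    breach = metric_value - threshold
    for lower, upper, adjustment in steps:
        if breach >= lower and (upper is None or breach < upper):
            return adjustment
    return 0  # metric is below the threshold: no scale-out


# Hypothetical steps: a small breach adds 1 instance, a large breach adds 4.
SCALE_OUT_STEPS: List[Step] = [(0, 10, 1), (10, 20, 2), (20, None, 4)]

for cpu in (55, 65, 75, 95):
    print(cpu, step_adjustment(cpu, threshold=60, steps=SCALE_OUT_STEPS))
```

Compared with target tracking, the trade-off is more control over the response at the cost of maintaining the step table yourself.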

Choosing the Right Policy

Predictable traffic → scheduled. Maintain a metric level → target tracking. Complex rules per threshold → step scaling.

Elastic Load Balancing: ALB vs NLB for Resilience

Role of ELB

Elastic Load Balancing spreads traffic across instances and stops sending requests to unhealthy ones, boosting availability.

ALB vs NLB

ALB is Layer 7 for HTTP/HTTPS with smart routing. NLB is Layer 4 for ultra-high performance, static IPs, and non-HTTP protocols.

Health Checks and Draining

Health checks detect bad targets. Deregistration delay lets connections drain before removing an instance from service.
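Load balancer health checks typically change a target's state only after several consecutive results, which avoids flapping on a single failed probe. The sketch below models that behavior; the threshold values and the `final_health` function are illustrative assumptions, not defaults taken from AWS.

```python
from typing import Iterable


def final_health(results: Iterable[bool],
                 healthy_threshold: int = 3,
                 unhealthy_threshold: int = 2,
                 start_healthy: bool = True) -> bool:
    """Replay check results (True = pass) and return the target's final state."""
    healthy = start_healthy
    passes = fails = 0
    for ok in results:
        if ok:
            passes, fails = passes + 1, 0
            # An unhealthy target needs several consecutive passes to recover.
            if not healthy and passes >= healthy_threshold:
                healthy = True
        else:
            passes, fails = 0, fails + 1
            # A healthy target is marked unhealthy after consecutive failures.
            if healthy and fails >= unhealthy_threshold:
                healthy = False
    return healthy


print(final_health([False, True, False]))               # isolated failures: still healthy
print(final_health([True, False, False]))               # two failures in a row: unhealthy
print(final_health([False, False, True, True, True]))   # three passes in a row: recovered
```

Lower thresholds detect failures faster but are more sensitive to transient errors, so the values are a tuning decision.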

Exam Clues

HTTP routing and WAF → ALB. Need static IPs or raw TCP/UDP → NLB. Always pair with Multi-AZ targets for resilience.
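As a memory aid, the exam clues above reduce to a small decision function. This is a deliberate simplification with hypothetical names; real designs weigh more factors than protocol and static IP requirements.

```python
HTTP_PROTOCOLS = {"HTTP", "HTTPS", "GRPC"}


def choose_load_balancer(protocol: str, needs_static_ip: bool = False) -> str:
    """Map the exam clues to a load balancer type (simplified rule of thumb)."""
    if needs_static_ip or protocol.upper() not in HTTP_PROTOCOLS:
        return "NLB"  # raw TCP/UDP/TLS traffic or a static IP requirement
    return "ALB"      # HTTP-level routing, WAF integration


print(choose_load_balancer("HTTPS"))
print(choose_load_balancer("UDP"))
print(choose_load_balancer("HTTPS", needs_static_ip=True))
```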

Design Walkthrough: Resilient Web Tier With EC2, ALB, and Auto Scaling

Scenario Requirements

Single-Region web app, must survive an AZ failure, handle spiky traffic, and avoid overprovisioning. Classic exam-style setup.

Network Layout

Create a VPC with two public subnets for the ALB and two private subnets for EC2, placing one public and one private subnet in each of two AZs.
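One way to plan this layout is to carve equal-sized blocks out of the VPC range with Python's standard `ipaddress` module. The sketch below assumes a `10.0.0.0/16` VPC, `/24` subnets, and two example AZ names; all of these values and the `plan_subnets` helper are hypothetical.

```python
import ipaddress
from itertools import islice
from typing import Dict, List


def plan_subnets(vpc_cidr: str, azs: List[str], new_prefix: int = 24) -> List[Dict[str, str]]:
    """Allocate one public and one private subnet per AZ from the VPC range."""
    vpc = ipaddress.ip_network(vpc_cidr)
    # Take only as many blocks as needed: one public and one private per AZ.
    blocks = list(islice(vpc.subnets(new_prefix=new_prefix), 2 * len(azs)))
    plan = []
    for i, az in enumerate(azs):
        plan.append({"name": f"public-{az}", "az": az, "cidr": str(blocks[i])})
    for i, az in enumerate(azs):
        plan.append({"name": f"private-{az}", "az": az, "cidr": str(blocks[len(azs) + i])})
    return plan


for subnet in plan_subnets("10.0.0.0/16", ["us-east-1a", "us-east-1b"]):
    print(subnet)
```

The ALB would use the two public subnets and the Auto Scaling group the two private ones, so both tiers span the same pair of AZs.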

ALB and Health Checks

Deploy an internet-facing ALB in both public subnets, with HTTP/HTTPS listeners and a `/health` path for target health checks.

ASG Configuration

Use a launch template, choose the two private subnets, set min=2, desired=2, max=6, attach the ALB target group, and enable ELB health checks.
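The settings above can be expressed as a request payload. The sketch below only builds and validates a dictionary; it makes no AWS call. The keys follow the parameter names of the boto3 Auto Scaling client's `create_auto_scaling_group` operation, while the group name, template name, subnet IDs, target group ARN, and the 120-second grace period are placeholder assumptions.

```python
from typing import Dict, List


def build_asg_request(name: str,
                      launch_template_name: str,
                      private_subnet_ids: List[str],
                      target_group_arn: str,
                      min_size: int = 2,
                      desired: int = 2,
                      max_size: int = 6) -> Dict:
    """Build the parameters for creating the walkthrough's Auto Scaling group."""
    if not (min_size <= desired <= max_size):
        raise ValueError("expected min_size <= desired <= max_size")
    return {
        "AutoScalingGroupName": name,
        "LaunchTemplate": {"LaunchTemplateName": launch_template_name, "Version": "$Latest"},
        "MinSize": min_size,
        "MaxSize": max_size,
        "DesiredCapacity": desired,
        # Subnets are passed as one comma-separated string; they should sit in different AZs.
        "VPCZoneIdentifier": ",".join(private_subnet_ids),
        "TargetGroupARNs": [target_group_arn],
        "HealthCheckType": "ELB",       # also replace instances the ALB reports unhealthy
        "HealthCheckGracePeriod": 120,  # assumed value: seconds to let a new instance boot
    }


# All identifiers below are placeholders, not real resources.
params = build_asg_request(
    name="web-asg",
    launch_template_name="web-template",
    private_subnet_ids=["subnet-private-a", "subnet-private-b"],
    target_group_arn="arn:aws:elasticloadbalancing:region:account-id:targetgroup/web/example",
)
print(params)
```

With credentials configured, a dictionary like this could presumably be passed as `boto3.client("autoscaling").create_auto_scaling_group(**params)`; that call is not executed here.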

Thought Exercise: Mapping RTO/RPO to EC2 Resilience Patterns

How to Use This Exercise

For each scenario, decide: single instance, Multi-AZ, or Multi-Region, and which EC2, ASG, and ELB features are essential.

Scenario A: Internal App

Internal finance app, business hours only, RTO 4h, RPO 24h. Think: how much Multi-AZ or Auto Scaling is really needed?

Scenario B: E-commerce

Public e-commerce with strict SLA, RTO 15 minutes for AZ failure. Likely needs robust Multi-AZ ALB+ASG and strong database HA.

Scenario C: Batch Jobs

Nightly batch processing with retry tolerance. Focus more on scalable fleets and job retries than strict uptime.

Resilience Patterns for Web, Application, and Batch Workloads

Web Frontend Pattern

Internet-facing ALB, Multi-AZ ASG of stateless EC2 instances, session data offloaded, scaling based on CPU or request count.

Application/API Tier

Internal ALB in private subnets, ASG across AZs, possibly NLB for non-HTTP protocols, often consumed by other services.

Batch Processing Pattern

Workers in an ASG consume from SQS or streams, scale based on queue depth, heavily use Spot with retries for resilience.
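Scaling on queue depth usually means working from an acceptable backlog per worker rather than the raw message count. The sketch below shows that arithmetic under assumed numbers; the function name and all values are hypothetical.

```python
import math


def workers_for_backlog(queue_depth: int,
                        backlog_per_worker: int,
                        min_size: int,
                        max_size: int) -> int:
    """Workers needed so that each one handles at most `backlog_per_worker` messages."""
    if backlog_per_worker <= 0:
        raise ValueError("backlog_per_worker must be positive")
    needed = math.ceil(queue_depth / backlog_per_worker)
    return max(min_size, min(max_size, needed))


# 1500 queued messages, each worker can absorb a backlog of 100: 15 workers.
print(workers_for_backlog(1500, 100, min_size=1, max_size=20))
# The same backlog with max=10 is capped by the group maximum.
print(workers_for_backlog(1500, 100, min_size=1, max_size=10))
# An empty queue falls back to the group minimum.
print(workers_for_backlog(0, 100, min_size=1, max_size=20))
```

The acceptable backlog per worker would come from how fast one worker processes a message and how long a message may wait, both of which are workload-specific.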

Mapping to Pillars

Reliability: Multi-AZ and self-healing. Performance efficiency: right metrics for scaling. Cost optimization: Spot and scaling in.

Quiz 1: Auto Scaling and ELB Basics

Check your understanding of Auto Scaling groups and load balancers in resilient architectures.

You are designing a highly available web application in a single AWS Region. Which combination best ensures that the application remains available if one Availability Zone fails and that capacity adjusts automatically to traffic spikes?

A) A single EC2 instance with an Elastic IP and a larger instance size
B) An Auto Scaling group spanning subnets in two AZs, behind an Application Load Balancer with health checks
C) Two EC2 instances in the same subnet with a Network Load Balancer and no Auto Scaling
D) An Auto Scaling group in one AZ with a Classic Load Balancer and manual scaling

Answer: B) An Auto Scaling group spanning subnets in two AZs, behind an Application Load Balancer with health checks

The correct answer is the Auto Scaling group spanning subnets in two AZs, behind an ALB with health checks. This provides Multi-AZ high availability, self-healing, and automatic scaling. A single instance, instances in one subnet/AZ, or manual scaling do not meet the resilience and elasticity requirements.

Quiz 2: Choosing Scaling Policies and Patterns

Test how well you can map requirements to scaling policies and workload patterns.

A marketing site has predictable traffic peaks every weekday from 09:00 to 11:00 and low traffic at night. The business wants to minimize cost while keeping performance acceptable. Which Auto Scaling configuration is the BEST fit?

A) Target tracking scaling on CPU utilization only
B) Step scaling policies based on ALB request count
C) Scheduled scaling to increase desired capacity before 09:00 and decrease it after 11:00
D) No Auto Scaling; use a single large instance sized for peak load

Answer: C) Scheduled scaling to increase desired capacity before 09:00 and decrease it after 11:00

Scheduled scaling is ideal when traffic patterns are predictable. You can increase capacity before the known peak window and reduce it afterwards to save cost. Target tracking and step scaling react to metrics but do not exploit the known schedule as efficiently. A single large instance wastes resources and is not resilient.
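The scheduled approach in this answer could be described as two recurring actions. The sketch below only builds the payloads and makes no AWS call. The keys follow the parameter names of the boto3 Auto Scaling client's `put_scheduled_update_group_action` operation; the group name, action names, capacities, and exact times (08:30 and 11:15 on weekdays) are assumptions for illustration.

```python
from typing import Dict, List


def weekday_peak_actions(asg_name: str,
                         peak_capacity: int,
                         off_peak_capacity: int) -> List[Dict]:
    """Build a scale-out action before the peak and a scale-in action after it."""
    return [
        {
            "AutoScalingGroupName": asg_name,
            "ScheduledActionName": "scale-out-before-peak",
            # Cron format: minute hour day-of-month month day-of-week (1-5 = Mon-Fri).
            "Recurrence": "30 8 * * 1-5",
            "DesiredCapacity": peak_capacity,
        },
        {
            "AutoScalingGroupName": asg_name,
            "ScheduledActionName": "scale-in-after-peak",
            "Recurrence": "15 11 * * 1-5",
            "DesiredCapacity": off_peak_capacity,
        },
    ]


# Hypothetical group name and capacities.
for action in weekday_peak_actions("marketing-asg", peak_capacity=6, off_peak_capacity=2):
    print(action)
```

Recurrence times are interpreted in UTC unless a `TimeZone` is also supplied, so the schedule needs adjusting for the site's local business hours, and the desired capacities must fall within the group's min and max.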

Key Term Review: EC2 Resilience and Scaling

Review these terms to reinforce key concepts and terminology for resilient EC2 architectures.

Auto Scaling group (ASG): A logical group of EC2 instances managed by Amazon EC2 Auto Scaling, which keeps the group between a specified minimum and maximum capacity, replaces instances that fail health checks, and optionally adjusts the desired capacity through scaling policies.
Multi-AZ EC2 architecture: An EC2 deployment pattern where instances are distributed across at least two Availability Zones in a Region to improve availability and fault tolerance.
Application Load Balancer (ALB): A Layer 7 load balancer that distributes HTTP/HTTPS and gRPC traffic, supports advanced routing (host/path-based), and integrates with target groups and health checks.
Target tracking scaling policy: An Auto Scaling policy type where you define a target value for a metric (such as CPU utilization), and the ASG automatically adjusts capacity to keep the metric near that value.
Scheduled scaling: An Auto Scaling feature that changes the minimum, maximum, or desired capacity of an Auto Scaling group at specific times based on a schedule.
Stateless application server: An EC2-based application component that does not store user session or critical data locally, allowing instances to be freely terminated and replaced without data loss.
Health check (ELB): A periodic test performed by a load balancer to determine whether a registered target is healthy and should receive traffic.
Deregistration delay (connection draining): A load balancer setting that defines how long to keep existing connections open to a target after it is removed from service, allowing in-flight requests to complete.
Spot Instance (in resilient fleets): A discounted EC2 capacity type that AWS can reclaim after a two-minute interruption notice; often used in Auto Scaling groups for batch or fault-tolerant workloads to reduce cost.
Network Load Balancer (NLB): A Layer 4 load balancer designed for extreme performance and low latency, supporting TCP, UDP, and TLS traffic with static IP addresses.
