AWS Auto Scaling and Loosely Coupled Architectures with Queues and Streams — AWS Solutions Architect Associate (SAA‑C03): Complete Exam-Ready Masterclass

Big Picture: Scaling and Decoupling for Resilience

Why This Module Matters

Here you connect Multi-AZ and load balancing with two key patterns: automatic scaling and loose coupling using queues and streams. These appear constantly in Solutions Architect Associate scenarios.

What You Will Learn

You will design Auto Scaling groups, pick scaling policies, decouple components with SQS/SNS, and use Kinesis/EventBridge to handle variable load and failures gracefully.

Well-Architected Context

This module lives in the Reliability and Performance Efficiency pillars of the AWS Well-Architected Framework, helping workloads stay available and responsive as demand changes and components fail.

Your Exam Job

In scenarios, you must recognize when to scale with ASGs, when to buffer with queues or streams, and which AWS service best matches the workload pattern.

Auto Scaling Groups: Core Concepts and AZ Design

What Is an Auto Scaling Group?

An ASG is a logical group of EC2 instances that can automatically increase, decrease, or replace instances based on rules. It launches instances from a launch template (or an older launch configuration).

Capacity Settings

An ASG tracks minimum, maximum, and desired capacity. Minimum keeps a baseline, maximum caps costs, desired is the current target number of instances.
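The relationship between the three settings can be sketched as a simple clamp. This is a toy model rather than an AWS API call, and the function name `clamp_desired` is an assumption made for illustration: whatever a scaling policy asks for, the resulting capacity stays between minimum and maximum.

```python
def clamp_desired(requested, minimum, maximum):
    """Keep a requested capacity inside the group's min/max bounds."""
    if minimum > maximum:
        raise ValueError("minimum must not exceed maximum")
    return max(minimum, min(requested, maximum))


# A policy asks for 25 instances, but the group is capped at 20.
print(clamp_desired(25, minimum=2, maximum=20))
# A policy asks for 0 instances, but the baseline is 2.
print(clamp_desired(0, minimum=2, maximum=20))
```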

Multi-AZ by Default

For resilience, place ASGs in at least two AZs and attach them to an ALB or NLB so traffic goes only to healthy instances across zones.

Health Checks and Replacement

The ASG uses EC2 status checks by default and can also use ELB health checks. When an instance is marked unhealthy, the ASG terminates it and launches a replacement to maintain desired capacity.
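The replacement behavior can be pictured as a reconcile loop. The sketch below is a simplified in-memory model, not how AWS implements it; `launch_instance`, `reconcile`, and the instance IDs are all hypothetical.

```python
import itertools

_counter = itertools.count(1)


def launch_instance():
    """Create a stand-in instance record with a made-up ID."""
    return {"id": f"i-{next(_counter):04d}", "healthy": True}


def reconcile(instances, desired):
    """Drop unhealthy instances, then launch replacements up to desired."""
    survivors = [inst for inst in instances if inst["healthy"]]
    while len(survivors) < desired:
        survivors.append(launch_instance())
    return survivors


fleet = [launch_instance(), launch_instance()]
fleet[0]["healthy"] = False  # simulate a failed health check
fleet = reconcile(fleet, desired=2)
print([inst["id"] for inst in fleet])
```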

Exam Clues

Phrases like "automatic recovery across AZs" or "replace failed instances with no manual work" strongly hint at using an ASG plus a load balancer.

Scaling Policies: Target Tracking, Step, and Scheduled

Why Policies Matter

An ASG needs rules for when to add or remove instances. Scaling policies define this behavior based on metrics, thresholds, or time schedules.

Target Tracking

Target tracking keeps a metric, such as average CPU or ALB requests per target, near a chosen value. It is like a thermostat and is often the default choice.
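One way to build intuition for the thermostat analogy is a proportional model: if the metric is above target, capacity grows by roughly the same ratio. The sketch below assumes that simplified model; the real service also applies cooldowns, warm-up, and conservative scale-in, none of which are shown.

```python
import math


def target_tracking_capacity(current, metric_value, target_value, minimum, maximum):
    """Estimate the capacity that would bring the metric back to target."""
    proposed = math.ceil(current * metric_value / target_value)
    return max(minimum, min(proposed, maximum))


# 4 instances averaging 90% CPU with a 50% target.
print(target_tracking_capacity(4, 90, 50, minimum=2, maximum=20))   # 8
# 10 instances averaging 20% CPU with a 50% target.
print(target_tracking_capacity(10, 20, 50, minimum=2, maximum=20))  # 4
```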

Step Scaling

Step scaling uses CloudWatch alarms with thresholds. Different ranges trigger different step adjustments, giving fine-grained control over how much to scale.
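A step policy can be thought of as a lookup table keyed by how far the metric has breached the alarm threshold. The ranges and adjustments below are illustrative values, not recommendations, and the helper `step_adjustment` is a hypothetical name.

```python
# (lower bound, upper bound, instances to add), measured relative to the threshold.
# Lower bounds are inclusive, upper bounds exclusive; None means "no upper limit".
STEPS = [
    (0, 10, 1),
    (10, 20, 2),
    (20, None, 4),
]


def step_adjustment(metric_value, threshold, steps=STEPS):
    """Return how many instances to add for the current breach size."""
    breach = metric_value - threshold
    if breach < 0:
        return 0  # alarm not breached, no scale-out
    for lower, upper, adjustment in steps:
        if breach >= lower and (upper is None or breach < upper):
            return adjustment
    return 0


for cpu in (50, 65, 70, 95):
    print(cpu, step_adjustment(cpu, threshold=60))
```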

Scheduled Scaling

Scheduled scaling changes capacity at specific times, ideal for predictable patterns like business hours or known monthly peaks.
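As a sketch of a business-hours schedule, the snippet below builds the keyword arguments that the boto3 Auto Scaling client's `put_scheduled_update_group_action` call accepts. The group name, action names, capacities, and cron expressions are assumptions for illustration, and the API call itself is left commented out so the example runs without AWS credentials.

```python
def scheduled_action(group, name, cron, minimum, maximum, desired):
    """Build kwargs for one recurring scheduled action (cron is UTC by default)."""
    return {
        "AutoScalingGroupName": group,
        "ScheduledActionName": name,
        "Recurrence": cron,
        "MinSize": minimum,
        "MaxSize": maximum,
        "DesiredCapacity": desired,
    }


# Hypothetical group name and capacities.
scale_up = scheduled_action("web-asg", "weekday-open", "0 8 * * 1-5", 4, 20, 6)
scale_down = scheduled_action("web-asg", "weekday-close", "0 18 * * 1-5", 2, 20, 2)

# With credentials configured, each dict could be passed as:
#   boto3.client("autoscaling").put_scheduled_update_group_action(**scale_up)
print(scale_up["Recurrence"], scale_down["Recurrence"])
```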

Exam Selection Hints

Unpredictable spikes → target tracking; predictable time-based load → scheduled; specific threshold-based behavior → step scaling.

Designing an ASG Behind an ALB: A Worked Scenario

Scenario Overview

A university web app has low baseline traffic but huge spikes at semester start. You need high availability across AZs and automatic scaling without manual work.

Network and Template

Use at least two public subnets for the ALB and two private subnets for EC2. Create a launch template with AMI, instance type, security groups, and user data.

Auto Scaling Group Setup

Create an ASG that spans both private subnets with min 2, max 20, and desired 2 instances. Because the ASG balances instances across its AZs, this baseline normally keeps at least one instance running in each AZ for high availability.

ALB and Target Group

Deploy an ALB in the public subnets, create a target group, attach the ASG, and configure health checks on an endpoint like /health.

Scaling Policy Choice

Use target tracking on ALB RequestCountPerTarget, plus an optional scheduled action to raise baseline capacity during the known registration week.
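The target tracking part of this choice might look like the following request for the boto3 `put_scaling_policy` call. The policy name, group name, target value, and the load balancer and target group identifiers inside `ResourceLabel` are all made up for illustration; the call is commented out so the sketch runs offline.

```python
def request_count_policy(group, resource_label, target_requests):
    """Build kwargs for a target tracking policy on ALBRequestCountPerTarget."""
    return {
        "AutoScalingGroupName": group,
        "PolicyName": "track-requests-per-target",  # hypothetical name
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ALBRequestCountPerTarget",
                "ResourceLabel": resource_label,
            },
            "TargetValue": float(target_requests),
        },
    }


# ResourceLabel joins the ALB and target group suffixes; these IDs are invented.
label = "app/uni-alb/0123456789abcdef/targetgroup/uni-web/fedcba9876543210"
policy = request_count_policy("uni-web-asg", label, target_requests=500)

# boto3.client("autoscaling").put_scaling_policy(**policy)
print(policy["TargetTrackingConfiguration"]["TargetValue"])
```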

Decoupling with Amazon SQS and Amazon SNS

Why Decouple?

Tightly coupled components fail together. Decoupling with queues and topics lets producers and consumers operate at different speeds and survive temporary failures.

Amazon SQS Basics

SQS is a managed message queue. Producers send messages; consumers poll and process them asynchronously, then delete them on success.
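The send, receive, process, delete cycle can be modeled in memory. `TinyQueue` below is a hypothetical stand-in, not the SQS API: it only illustrates that a received message stays "in flight" until the consumer deletes it, and becomes visible again if processing fails (in SQS this happens when the visibility timeout expires).

```python
from collections import deque


class TinyQueue:
    """Toy queue with visible and in-flight messages."""

    def __init__(self):
        self._visible = deque()
        self._in_flight = {}
        self._next_handle = 0

    def send(self, body):
        self._visible.append(body)

    def receive(self):
        """Hand out one message and hide it from other consumers."""
        if not self._visible:
            return None
        self._next_handle += 1
        self._in_flight[self._next_handle] = self._visible.popleft()
        return self._next_handle, self._in_flight[self._next_handle]

    def delete(self, handle):
        """Called by the consumer after successful processing."""
        self._in_flight.pop(handle, None)

    def release(self, handle):
        """Simulate a visibility timeout: the message becomes visible again."""
        self._visible.append(self._in_flight.pop(handle))

    def depth(self):
        return len(self._visible) + len(self._in_flight)


queue = TinyQueue()
queue.send("resize-image-42")
handle, body = queue.receive()
queue.release(handle)           # first attempt failed, message returns
handle, body = queue.receive()  # second attempt
queue.delete(handle)            # success, message is gone
print(queue.depth())
```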

Standard vs FIFO Queues

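Standard queues offer very high throughput with at-least-once delivery and best-effort ordering, so consumers may occasionally see duplicate or out-of-order messages. FIFO queues preserve order within a message group and provide exactly-once processing, at lower throughput.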
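Because at-least-once delivery can hand the same message to a consumer twice, consumers of standard queues are usually written to be idempotent. A minimal sketch, assuming each message carries a unique ID and that an in-memory set is an acceptable stand-in for a durable deduplication store:

```python
def process_once(messages, handler, seen=None):
    """Run handler for each message ID at most once, skipping duplicates."""
    seen = set() if seen is None else seen
    for message_id, body in messages:
        if message_id in seen:
            continue  # duplicate delivery, already handled
        handler(body)
        seen.add(message_id)
    return seen


charged = []
deliveries = [("m1", "order-1"), ("m2", "order-2"), ("m1", "order-1")]
process_once(deliveries, charged.append)
print(charged)  # ['order-1', 'order-2']
```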

Amazon SNS Basics

SNS is a pub/sub service. Publishers send to a topic; multiple subscribers like SQS queues, Lambda, or HTTP endpoints each receive a copy.

Fan-Out Pattern

Use SNS to fan out events to multiple SQS queues so billing, analytics, and notifications can process the same event independently and at their own pace.
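The fan-out idea can be sketched with an in-memory topic that copies each published event into every subscribed queue. The `Topic` class and the subscriber names are hypothetical; the point is only that each consumer gets its own copy and drains it independently.

```python
from collections import deque


class Topic:
    """Toy pub/sub topic that delivers a copy to every subscribed queue."""

    def __init__(self):
        self.queues = {}

    def subscribe(self, name):
        self.queues[name] = deque()
        return self.queues[name]

    def publish(self, event):
        for queue in self.queues.values():
            queue.append(dict(event))  # each subscriber gets its own copy


orders = Topic()
billing = orders.subscribe("billing")
analytics = orders.subscribe("analytics")

orders.publish({"order_id": 1001, "total": 59.90})
billing.popleft()  # billing processes immediately
print(len(billing), len(analytics))  # 0 1
```

Adding a new consumer is one more `subscribe` call; the publisher code does not change.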

Streams and Event Buses: Kinesis and EventBridge

Beyond Queues

Queues handle work items well, but some systems need continuous event streams or flexible routing between many services. That is where Kinesis and EventBridge help.

Kinesis Data Streams

Kinesis Data Streams stores events in shards and preserves order within each shard. Producers write records with a partition key; multiple consumers read independently and can replay events during the retention window.
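Records with the same partition key land on the same shard, which is what preserves their relative order. Kinesis maps the MD5 hash of the partition key into a 128-bit hash key space divided among shards; the sketch below assumes the shards split that space into equal ranges, which is a simplification.

```python
import hashlib

HASH_SPACE = 2 ** 128  # MD5 produces a 128-bit value


def shard_for(partition_key, shard_count):
    """Map a partition key to a shard index, assuming equal hash ranges."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_key = int.from_bytes(digest, "big")
    return hash_key * shard_count // HASH_SPACE


for key in ("device-1", "device-2", "device-1"):
    print(key, shard_for(key, shard_count=4))
```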

EventBridge as Event Bus

EventBridge is a serverless event bus. It matches events based on patterns and routes them to targets like Lambda, SQS, SNS, and Step Functions.
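Pattern matching can be illustrated with a simplified matcher. EventBridge event patterns list the allowed values for each field, and nested objects are matched recursively; the sketch below handles only that exact-value case (no prefix, numeric, or other operators), and the `shop.orders` source and event fields are made up.

```python
def matches(pattern, event):
    """Return True if every field in the pattern matches the event."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested pattern: recurse into the nested event object.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            # Leaf pattern: a list of allowed values.
            return False
    return True


pattern = {
    "source": ["shop.orders"],
    "detail-type": ["OrderPlaced"],
    "detail": {"status": ["paid", "pending"]},
}
event = {
    "source": "shop.orders",
    "detail-type": "OrderPlaced",
    "detail": {"status": "paid", "order_id": 1001},
}
print(matches(pattern, event))  # True
```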

Choosing Between Services

SQS for work queues, SNS for fan-out notifications, Kinesis for high-volume ordered streams, EventBridge for rich event routing and SaaS integrations.

Exam Clues

Phrases like "event bus" or "route events between multiple AWS services and SaaS" point to EventBridge, not SNS or SQS.

Graceful Degradation and Backpressure

What Is Graceful Degradation?

Graceful degradation means the system still works in a reduced mode under stress, instead of failing completely when components are overloaded.

Queues as Backpressure

SQS queues absorb bursts when consumers or databases are slow. The queue length grows instead of causing timeouts or crashes.

Protecting Downstream Systems

Have applications enqueue work and let a controlled set of consumers write to the database at a safe rate, preventing overload.
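A small simulation shows the effect: arrivals are bursty, but the consumers drain at a fixed rate the database can tolerate, so the burst shows up as a temporary backlog rather than as database overload. The arrival numbers and drain rate are invented for illustration.

```python
def simulate_backlog(arrivals, drain_rate):
    """Return the queue depth after each tick for a fixed-rate consumer."""
    backlog = 0
    history = []
    for arrived in arrivals:
        backlog += arrived
        backlog -= min(backlog, drain_rate)  # never drain more than is queued
        history.append(backlog)
    return history


# Two ticks of heavy burst, then normal traffic; the database sees at most 20 writes per tick.
print(simulate_backlog([5, 50, 50, 5, 5, 5], drain_rate=20))
# [0, 30, 60, 45, 30, 15]
```

The backlog peaks and then shrinks once the burst ends, which is the "accept delays instead of failures" trade-off.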

Load Shedding

Under extreme load, APIs can disable non-essential features, return simpler responses, or reject excess requests with HTTP 429 (Too Many Requests) and a hint about when to retry.
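A minimal load-shedding check might look like the sketch below. The in-flight limit and retry delay are arbitrary illustration values, and using the `Retry-After` header for the retry hint is an assumption about how the API communicates it.

```python
def admit(in_flight, limit, retry_after_seconds=5):
    """Reject new work with 429 once the in-flight limit is reached."""
    if in_flight >= limit:
        return 429, {"Retry-After": str(retry_after_seconds)}
    return 200, {}


print(admit(in_flight=40, limit=100))   # (200, {})
print(admit(in_flight=100, limit=100))  # (429, {'Retry-After': '5'})
```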

Exam Clues

Phrases like "continue operating but accept delays" or "protect the database from spikes" usually point to inserting a queue or stream.

Thought Exercise: Tight vs Loosely Coupled Design

Use this exercise to practice identifying tight coupling and redesigning with queues or streams.

Scenario A (tight coupling):

A mobile app calls an API Gateway endpoint.
The API triggers a Lambda function.
The Lambda function calls an RDS database synchronously for each request.
During peak usage, RDS saturates and the Lambda invocations start timing out, causing user-facing failures.

Questions to think through:

Where is the tight coupling in this design?
How could you use SQS or Kinesis to decouple the system and introduce backpressure?
How would this change the user experience and consistency model (immediate vs eventual)?
Which metrics would you monitor to know when the system is under stress?

One possible redesign:

API Gateway and Lambda validate requests and quickly enqueue them to SQS.
A fleet of consumers (Lambda or EC2 in an ASG) reads from the queue and writes to RDS at a controlled rate.
The mobile app receives an immediate acknowledgment like "request accepted" and can poll or subscribe to a status update.
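The redesign above can be sketched with in-memory stand-ins. Here a `deque` plays the role of SQS, a dict plays the role of a status store, and `write` stands in for the RDS insert; the field name `user_id` and the 202 response shape are assumptions for illustration.

```python
import uuid
from collections import deque

work_queue = deque()  # stand-in for the SQS queue
statuses = {}         # stand-in for a status store the app can poll


def accept_request(payload):
    """Validate quickly, enqueue, and acknowledge without touching the database."""
    if "user_id" not in payload:
        return {"statusCode": 400, "body": "user_id is required"}
    request_id = str(uuid.uuid4())
    work_queue.append((request_id, payload))
    statuses[request_id] = "accepted"
    return {"statusCode": 202, "requestId": request_id}


def drain(batch_size, write):
    """Consumer side: write at most batch_size items per call."""
    for _ in range(min(batch_size, len(work_queue))):
        request_id, payload = work_queue.popleft()
        write(payload)
        statuses[request_id] = "done"


saved = []
response = accept_request({"user_id": 7, "action": "enroll"})
print(response["statusCode"], statuses[response["requestId"]])  # 202 accepted
drain(batch_size=10, write=saved.append)
print(statuses[response["requestId"]])  # done
```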

Reflection prompts:

In exam questions, what phrases suggest that eventual consistency and delayed processing are acceptable?
When would this pattern not be appropriate (e.g., payment authorization that must be synchronous)?

Write down your own variant of this pattern using EventBridge or Kinesis instead of SQS, and note why you chose that service.

Quiz 1: Auto Scaling and Policies

Test your understanding of ASGs and scaling policies.

A startup runs a web app on EC2 behind an Application Load Balancer in one Region. Traffic is unpredictable and can spike suddenly after social media posts. They want to maintain performance, minimize manual intervention, and keep costs low during quiet periods. Which configuration best meets these needs?

A) Use an Auto Scaling group across two AZs with minimum capacity 2, attach it to the ALB, and configure target tracking scaling on ALB RequestCountPerTarget.
B) Increase the EC2 instance size to the largest available type and keep a single instance behind the ALB.
C) Create an Auto Scaling group in one AZ with scheduled scaling to increase capacity every day at 9 AM and decrease at 5 PM.
D) Use an Auto Scaling group with step scaling on CPU utilization, but fix minimum, maximum, and desired capacity to the same high value.

Answer: A) Use an Auto Scaling group across two AZs with minimum capacity 2, attach it to the ALB, and configure target tracking scaling on ALB RequestCountPerTarget.

Option A uses an ASG across multiple AZs for high availability, attaches it to the ALB, and uses target tracking on a load-related metric to adjust capacity automatically. This keeps costs low when idle and scales out for spikes. Option B is a single point of failure and wastes capacity. Option C only handles predictable time-based patterns, not unpredictable spikes, and runs in a single AZ. Option D disables true scaling by fixing capacity at a high value.

Quiz 2: Queues, Topics, Streams, and Event Buses

Check your understanding of decoupling choices.

An e-commerce platform needs to process "order placed" events. Multiple systems must react: billing, inventory, email notifications, and analytics. Each system should process events independently at its own pace. The architects also want to add new consumers in the future without changing the producers. Which design is most appropriate?

A) Have the order service write directly to four different RDS databases, one for each downstream system.
B) Publish events to an SNS topic and subscribe separate SQS queues for billing, inventory, email, and analytics.
C) Send events to a single SQS FIFO queue that all four systems read from in turn.
D) Write events to a single Kinesis shard and have each system read from the shard using a single shared consumer application.

Answer: B) Publish events to an SNS topic and subscribe separate SQS queues for billing, inventory, email, and analytics.

Option B uses SNS for pub/sub fan-out and SQS queues for each consumer so they can process at their own pace. Producers do not need to know about individual consumers, and new subscribers can be added easily. Option A tightly couples the order service to each system. Option C forces consumers to compete for messages instead of each getting a copy. Option D could work with Kinesis, but a single shard and shared consumer app limit independence and scalability; SNS + SQS is the clearer exam-level answer for this pattern.

Key Term Review: Auto Scaling and Decoupling

Review these key terms to reinforce core concepts.

Auto Scaling group (ASG): A logical group of EC2 instances that can automatically increase, decrease, or replace instances based on scaling policies, health checks, and capacity settings (min, max, desired).
Target tracking scaling policy: An Auto Scaling policy where you choose a metric and a target value, and AWS adjusts capacity to keep the metric near that target, similar to a thermostat.
Step scaling policy: An Auto Scaling policy that uses CloudWatch alarms and threshold ranges to apply different scaling adjustments (steps) depending on how far the metric is beyond the threshold.
Scheduled scaling: An Auto Scaling feature that adjusts capacity at specific times, ideal for predictable daily or weekly traffic patterns.
Amazon SQS: A fully managed message queue service that decouples producers and consumers, allowing asynchronous processing and backpressure via queueing.
Amazon SNS: A fully managed pub/sub messaging service where publishers send messages to topics and multiple subscribers each receive a copy.
Amazon Kinesis Data Streams: A managed streaming data service for high-throughput, ordered event streams with multiple consumers and the ability to replay data within the retention period.
Amazon EventBridge: A serverless event bus service that routes events from AWS services, SaaS apps, and custom apps to targets based on event patterns.
Graceful degradation: A design approach where a system continues to operate with reduced functionality or quality under failure or overload, instead of failing completely.
Backpressure: A mechanism to handle overload by slowing down or buffering incoming work, often implemented with queues so that downstream systems are not overwhelmed.

Key Terms

Amazon SNS: A fully managed pub/sub messaging service that delivers messages from publishers to multiple subscribers via topics.
Amazon SQS: A fully managed message queue service that decouples producers and consumers, enabling asynchronous processing and buffering.
Backpressure: A technique for handling overload by slowing, limiting, or buffering incoming work, often implemented using queues or streams.
Scheduled scaling: A feature that changes Auto Scaling group capacity at specified times, useful for predictable load patterns.
Amazon EventBridge: A serverless event bus that routes events from AWS services, SaaS providers, and custom apps to various targets based on matching rules.
Step scaling policy: An Auto Scaling policy that applies different scaling adjustments based on how far a metric deviates from a threshold, using CloudWatch alarms.
Graceful degradation: The ability of a system to continue operating in a reduced or limited mode when parts of it fail or are overloaded.
Auto Scaling group (ASG): A logical group of EC2 instances that can automatically increase, decrease, or replace instances based on scaling policies, health checks, and capacity settings.
Amazon Kinesis Data Streams: A managed streaming data service for high-throughput, ordered event ingestion and processing with multiple consumers.
Target tracking scaling policy: An Auto Scaling policy that keeps a chosen metric near a target value by automatically adjusting capacity.
