Chapter 5 of 9
High-Performing Compute: EC2, Autoscaling, and Serverless Choices
From legacy lift‑and‑shift to modern serverless, the exam loves to ask which compute option is “best” for a particular workload. This module walks through the tradeoffs between EC2, Auto Scaling, containers, and Lambda so the right answer jumps out at you.
Big Picture: Choosing the Right Compute Model
Compute Spectrum
In AWS, compute options range from most control to most managed: EC2, Auto Scaling with load balancers, containers (ECS/EKS, often with Fargate), and AWS Lambda.
Your Exam Job
On the exam you must: 1) identify the workload pattern, 2) choose the right compute primitive, and 3) pick a cost model and scaling approach that fit the scenario.
Stable Patterns
Although AWS keeps adding features, the core patterns for choosing between EC2, containers, and Lambda have stayed stable and are what the exam focuses on.
Step 1: EC2 Instance Families – Match the Hardware to the Job
Why EC2?
Use EC2 when you need full control over the OS, networking, or specialized hardware. The key is matching instance families to workload needs.
General vs Specialized
General purpose families (M, T, A) fit most web/app workloads. Compute optimized (C) favors CPU-heavy tasks; memory optimized (R, X) fits in-memory DBs and caches.
Storage & Accelerators
Storage optimized (I, D, H) targets high IOPS and throughput. Accelerated computing (P, G, Trn, Inf) provides GPUs or purpose-built ML chips (Trainium, Inferentia) for ML training, inference, or graphics.
Exam Heuristics
Keywords guide you: 'in-memory' → memory optimized; 'high IOPS' → storage optimized; 'GPU/ML' → accelerated; no special need → general purpose.
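The keyword heuristics above can be sketched as a small lookup. This is only a study aid: the function name `suggest_family` and the keyword lists are assumptions made for illustration, not part of any AWS API.

```python
# Hypothetical helper: maps scenario keywords to an EC2 instance family category.
# The keyword lists are illustrative and follow the heuristics in this section.
FAMILY_KEYWORDS = [
    ("memory optimized (R, X)", ("in-memory", "large cache")),
    ("storage optimized (I, D, H)", ("high iops", "high throughput")),
    ("accelerated (P, G, Trn, Inf)", ("gpu", "machine learning", "inference")),
    ("compute optimized (C)", ("cpu-heavy", "cpu-intensive")),
]


def suggest_family(scenario: str) -> str:
    """Return the first family whose keywords appear in the scenario text."""
    text = scenario.lower()
    for family, keywords in FAMILY_KEYWORDS:
        if any(keyword in text for keyword in keywords):
            return family
    # No special need mentioned: fall back to general purpose.
    return "general purpose (M, T, A)"


print(suggest_family("An in-memory cache for session data"))
print(suggest_family("A standard web server"))
```

Real exam questions often combine clues, so treat the first match as a starting point rather than a final answer.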
Step 2: EC2 Purchasing Options – Balance Cost and Flexibility
On-Demand First
On-Demand is pay-as-you-go with no commitment. Use it for new, spiky, or short-lived workloads where you cannot yet predict usage.
Commit for Savings
Reserved Instances and Savings Plans offer discounts of up to 72% off On-Demand in exchange for a 1- or 3-year commitment. Use them for steady-state, always-on workloads.
Spot for Cheap, Flexible Work
Spot Instances use spare EC2 capacity at discounts of up to 90% off On-Demand, but AWS can reclaim them with a two-minute warning. They suit stateless, fault-tolerant, or batch jobs, not a single critical DB server.
Exam Pattern
Baseline steady load → RIs/Savings Plans; unpredictable → On-Demand; interrupt-tolerant extra capacity → Spot; Lambda/Fargate commitments → Compute Savings Plans (Reserved Instances do not apply).
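To see why mixing purchasing options pays off, here is a rough cost sketch. Every rate and discount in it is a made-up illustration value, not an AWS price, and the function name `monthly_cost` is hypothetical.

```python
HOURS_PER_MONTH = 730  # common approximation for monthly estimates

# All values below are assumptions for illustration, not real AWS prices.
ON_DEMAND_RATE = 0.10    # USD per instance-hour
COMMIT_DISCOUNT = 0.40   # assumed RI / Savings Plan discount
SPOT_DISCOUNT = 0.70     # assumed Spot discount


def monthly_cost(baseline: int, burst_hours: float, batch_hours: float) -> dict:
    """Estimate a month's bill for a fleet split across three purchasing options.

    baseline    -- instances running 24/7, covered by a commitment
    burst_hours -- unpredictable instance-hours, billed On-Demand
    batch_hours -- interrupt-tolerant instance-hours, billed at Spot
    """
    baseline_hours = baseline * HOURS_PER_MONTH
    committed = baseline_hours * ON_DEMAND_RATE * (1 - COMMIT_DISCOUNT)
    on_demand = burst_hours * ON_DEMAND_RATE
    spot = batch_hours * ON_DEMAND_RATE * (1 - SPOT_DISCOUNT)
    # Comparison point: the same hours bought entirely On-Demand.
    everything_on_demand = (baseline_hours + burst_hours + batch_hours) * ON_DEMAND_RATE
    return {
        "committed": round(committed, 2),
        "on_demand": round(on_demand, 2),
        "spot": round(spot, 2),
        "total": round(committed + on_demand + spot, 2),
        "all_on_demand": round(everything_on_demand, 2),
    }


print(monthly_cost(baseline=4, burst_hours=500, batch_hours=1000))
```

With these assumed numbers the mixed fleet comes to roughly $255 per month against roughly $442 if everything ran On-Demand; the point is the shape of the saving, not the exact figures.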
Step 3: Auto Scaling Groups – Scaling EC2 Up and Down
What an ASG Does
An Auto Scaling group automatically adjusts EC2 instance count between a min and max, often across multiple AZs, based on demand.
Scaling Policies
Target tracking keeps a metric (like CPU) near a target. Step scaling uses thresholds and steps. Scheduled scaling follows known time patterns.
Lifecycle Hooks
Lifecycle hooks let you run custom actions when instances launch or terminate, such as configuration or connection draining.
Exam Hints
CPU-based automatic scaling → target tracking; predictable hours → scheduled; need warm-up or cleanup logic → lifecycle hooks.
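As a rough illustration of target tracking, the sketch below shows the parameter shape accepted by the Auto Scaling PutScalingPolicy API (for example through boto3's `put_scaling_policy`) and a simplified proportional model of how capacity follows the metric. The group name, policy name, and the `estimate_desired_capacity` helper are hypothetical; the real service relies on CloudWatch alarms and instance warm-up, so this is only an approximation.

```python
import math

# Parameters in the shape expected by PutScalingPolicy.
# The group and policy names are hypothetical.
cpu_target_policy = {
    "AutoScalingGroupName": "web-asg",
    "PolicyName": "keep-cpu-near-50",
    "PolicyType": "TargetTrackingScaling",
    "TargetTrackingConfiguration": {
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization"
        },
        "TargetValue": 50.0,
    },
}


def estimate_desired_capacity(current: int, metric: float, target: float,
                              minimum: int, maximum: int) -> int:
    """Simplified model: scale capacity in proportion to metric / target,
    then clamp to the group's min and max size."""
    raw = math.ceil(current * metric / target)
    return max(minimum, min(maximum, raw))


target = cpu_target_policy["TargetTrackingConfiguration"]["TargetValue"]
print(estimate_desired_capacity(4, 80.0, target, minimum=2, maximum=10))  # 7
print(estimate_desired_capacity(4, 20.0, target, minimum=2, maximum=10))  # 2
```

The clamp at the end mirrors the min and max bounds of the Auto Scaling group: no policy can push capacity outside them.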
Step 4: Elastic Load Balancing – Match the Load Balancer to the App
Why Load Balancers?
Elastic Load Balancing spreads traffic across multiple targets to improve availability and scalability. It is commonly paired with Auto Scaling groups.
Application Load Balancer
ALB is Layer 7 for HTTP/HTTPS. It supports path- and host-based routing, WebSockets, and can route to EC2, IPs, or Lambda functions.
Network & Gateway
NLB is Layer 4 for TCP/UDP/TLS with static IPs and very high performance. GWLB inserts virtual appliances like firewalls into traffic flows.
Exam Hints
HTTP routing or microservices → ALB. Non-HTTP or static IP → NLB. Virtual firewall appliance → GWLB. CLB is legacy.
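The hints above reduce to a short decision function. This is a sketch for practice only; `choose_load_balancer` and its parameters are hypothetical names, and real designs weigh more factors than these three.

```python
def choose_load_balancer(protocol: str, needs_static_ip: bool = False,
                         inline_appliance: bool = False) -> str:
    """Pick a load balancer type from the exam-style clues in this section."""
    if inline_appliance:
        # Virtual firewall or similar appliance in the traffic path.
        return "GWLB"
    if needs_static_ip or protocol.upper() not in ("HTTP", "HTTPS"):
        # Non-HTTP protocols or a static IP requirement point to Layer 4.
        return "NLB"
    # HTTP/HTTPS with path- or host-based routing is Layer 7 territory.
    return "ALB"


print(choose_load_balancer("HTTPS"))                        # ALB
print(choose_load_balancer("TCP"))                          # NLB
print(choose_load_balancer("HTTPS", needs_static_ip=True))  # NLB
print(choose_load_balancer("TCP", inline_appliance=True))   # GWLB
```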
Step 5: EC2 + ASG + ALB Pattern (Visualized)
Scenario
A legacy three-tier web app is lifted to AWS. Users access an HTTPS endpoint, and you must design a scalable, highly available architecture.
Architecture Pieces
Use an ALB in public subnets, an Auto Scaling group of EC2 instances across at least two AZs, and an RDS or Aurora DB in private subnets.
How It Flows
Users hit the ALB over HTTPS; ALB forwards to EC2 targets in private subnets; ASG scales instances; instances talk to the DB privately.
Why It Fits
This pattern provides high availability, horizontal scaling, and reasonable cost. If the question wants 'no server management', consider containers or Lambda instead.
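One way to internalize the pattern is to write it down as data and check it against the rules stated above. The sketch below is an assumption-laden study aid: the `Tier` class, the `check_three_tier` function, and the AZ names are all hypothetical and unrelated to any AWS SDK.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Tier:
    name: str
    subnet_type: str               # "public" or "private"
    availability_zones: List[str]


def check_three_tier(alb: Tier, app: Tier, db: Tier) -> List[str]:
    """Return a list of problems; an empty list means the layout matches the pattern."""
    problems = []
    if alb.subnet_type != "public":
        problems.append("ALB should sit in public subnets")
    if app.subnet_type != "private":
        problems.append("EC2 targets should sit in private subnets")
    if len(set(app.availability_zones)) < 2:
        problems.append("Auto Scaling group should span at least two AZs")
    if db.subnet_type != "private":
        problems.append("Database should sit in private subnets")
    return problems


zones = ["us-east-1a", "us-east-1b"]  # example AZ names
print(check_three_tier(
    Tier("alb", "public", zones),
    Tier("app-asg", "private", zones),
    Tier("db", "private", zones),
))  # []
```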
Step 6: Serverless Compute – Lambda and Fargate
What is Serverless?
Serverless means AWS manages the servers and scaling. You focus on code and config. Lambda and Fargate are the main serverless compute options.
AWS Lambda
Lambda runs functions on demand, triggered by events. It scales automatically, bills per invocation and duration, and is great for spiky, event-driven tasks.
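The per-invocation and per-duration billing can be approximated with a few lines of arithmetic. The prices below are assumptions for illustration (check the current AWS Lambda pricing page for real values), the function name is hypothetical, and the free tier is ignored.

```python
# Assumed illustrative prices in USD; not authoritative.
PRICE_PER_MILLION_REQUESTS = 0.20
PRICE_PER_GB_SECOND = 0.0000166667


def lambda_monthly_cost(invocations: int, avg_duration_ms: float, memory_mb: int) -> float:
    """Estimate monthly cost as request charges plus duration charges."""
    request_cost = invocations / 1_000_000 * PRICE_PER_MILLION_REQUESTS
    # Duration is billed in GB-seconds: seconds run times memory in GB.
    gb_seconds = invocations * (avg_duration_ms / 1000) * (memory_mb / 1024)
    return round(request_cost + gb_seconds * PRICE_PER_GB_SECOND, 2)


# 10 million invocations, 200 ms each, 512 MB of memory.
print(lambda_monthly_cost(10_000_000, 200, 512))
# No traffic means no charge, which is the "no idle cost" property.
print(lambda_monthly_cost(0, 200, 512))
```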
AWS Fargate
Fargate runs containers defined via ECS or EKS without you managing EC2 nodes. You pay per vCPU and memory requested for each task.
Choosing Among Them
Event-driven, short tasks → Lambda. Containerized microservices without node management → Fargate. Legacy or long-running workloads → EC2.
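These rules of thumb can be written as a tiny decision function. It is a simplification for exam practice; `choose_compute` and its parameters are hypothetical, and the only hard limit encoded is Lambda's 15-minute maximum timeout.

```python
LAMBDA_MAX_SECONDS = 15 * 60  # Lambda's maximum timeout is 15 minutes


def choose_compute(event_driven: bool, max_runtime_seconds: int,
                   containerized: bool, needs_os_control: bool) -> str:
    """Apply the section's heuristics in priority order."""
    if needs_os_control:
        # Custom drivers, OS tuning, or specialized hardware need EC2.
        return "EC2"
    if event_driven and max_runtime_seconds <= LAMBDA_MAX_SECONDS:
        return "Lambda"
    if containerized:
        # Containers without node management.
        return "Fargate"
    # Legacy or long-running workloads default to EC2.
    return "EC2"


print(choose_compute(True, 30, False, False))        # Lambda
print(choose_compute(False, 86_400, True, False))    # Fargate
print(choose_compute(False, 86_400, False, True))    # EC2
```

Checking `needs_os_control` first reflects the idea that a hard requirement for OS or hardware access overrides any preference for less server management.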
Step 7: Decision Drill – Which Compute Option?
Work through these scenarios mentally. Say your answer out loud or jot it down, then check the suggested solution below each.
- Scenario A
- A video transcoding job runs overnight, can be retried if interrupted, and must be as cheap as possible.
- Needs GPU acceleration for performance.
- Your choice? Why?
Suggested answer:
- Use EC2 with GPU instances (e.g., P or G family) in an Auto Scaling group using Spot Instances.
- Reason: GPU requirement pushes you to EC2; fault-tolerant batch job fits Spot for maximum savings.
- Scenario B
- A public REST API with unpredictable traffic spikes.
- You want zero capacity planning and pay only when requests arrive.
- Your choice? Why?
Suggested answer:
- Use API Gateway + AWS Lambda.
- Reason: event-driven HTTP API, spiky traffic, desire for no idle cost all point to serverless Lambda.
- Scenario C
- A microservices app already containerized using Docker.
- Team does not want to manage EC2 nodes but needs long-running services.
- Your choice? Why?
Suggested answer:
- Use Amazon ECS on Fargate (or EKS on Fargate if Kubernetes skills are strong).
- Reason: containers + no node management → Fargate.
- Scenario D
- A legacy Windows application that requires specific drivers and runs 24/7.
- Traffic is stable and predictable.
- Your choice? Why?
Suggested answer:
- Use Windows EC2 instances with Reserved Instances or Savings Plans.
- Reason: legacy + custom drivers → EC2; steady 24/7 load → long-term commitment pricing.
Step 8: Quick Check – Scaling and Load Balancing
Answer this exam-style question, then read the explanation.
A startup runs a stateless web application on EC2 instances behind an Application Load Balancer. Traffic is highly variable and unpredictable. They want to automatically scale capacity based on CPU utilization while minimizing operational effort. Which configuration is MOST appropriate?
- A) Use a fixed-size Auto Scaling group with manual scaling when CPU is high.
- B) Use an Auto Scaling group with target tracking scaling policy based on average CPU utilization.
- C) Use an Auto Scaling group with scheduled scaling actions for peak hours.
- D) Replace the ALB with a Network Load Balancer and use step scaling policies.
Answer: B) Use an Auto Scaling group with target tracking scaling policy based on average CPU utilization.
Target tracking scaling based on average CPU is the simplest and most appropriate choice for unpredictable traffic. It automatically adjusts capacity to keep the metric near its target. Manual scaling (A) and scheduled scaling (C) do not fit highly variable, unpredictable traffic. Option D changes the load balancer type without addressing the scaling requirement, and step scaling is more complex than needed.
Step 9: Quick Check – Cost and Serverless
Test your understanding of pricing models and serverless choices.
You have several Lambda functions and Fargate tasks that run continuously for a steady, predictable workload. You want to reduce cost over the next 3 years without losing flexibility to change instance families or Regions. What is the BEST option?
- A) Purchase 3-year Standard Reserved Instances for EC2.
- B) Purchase 3-year Compute Savings Plans.
- C) Use Spot Instances for Lambda and Fargate.
- D) Purchase 1-year EC2 Instance Savings Plans.
Answer: B) Purchase 3-year Compute Savings Plans.
Compute Savings Plans apply to Lambda, Fargate, and EC2 across instance families and Regions, making them ideal for steady, predictable serverless and container workloads. EC2 RIs and EC2 Instance Savings Plans (A and D) do not apply to Lambda or Fargate. Spot Instances (C) are an EC2 purchasing option and do not exist for Lambda; Fargate Spot is available for ECS tasks, but it is interruptible, so it does not suit continuously running workloads and is not a 3-year commitment model.
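A Savings Plan commitment is expressed in dollars per hour, which is easy to misread. The sketch below models one hour of billing under a simplifying assumption: a single uniform discount across all usage (real discounts vary by service, Region, and term). The function name and numbers are hypothetical.

```python
def hourly_bill(on_demand_usage: float, commitment: float, discount: float) -> float:
    """Estimate one hour's bill under a Savings Plan.

    on_demand_usage -- the hour's usage valued at On-Demand rates (USD)
    commitment      -- committed spend per hour at plan rates (USD)
    discount        -- assumed uniform plan discount, between 0 and 1 (exclusive of 1)
    """
    usage_at_plan_rate = on_demand_usage * (1 - discount)
    if usage_at_plan_rate <= commitment:
        # The commitment is charged in full even when it is not fully used.
        return commitment
    # Usage beyond the commitment falls back to On-Demand rates.
    uncovered_at_plan_rate = usage_at_plan_rate - commitment
    overage_on_demand = uncovered_at_plan_rate / (1 - discount)
    return commitment + overage_on_demand


# Assumed 50% discount and a $5/hour commitment.
print(hourly_bill(8.0, 5.0, 0.5))   # 5.0: usage fits inside the commitment
print(hourly_bill(14.0, 5.0, 0.5))  # 9.0: $5 commitment plus $4 On-Demand overage
```

This is why the exam pairs Savings Plans with steady workloads: a commitment sized above actual usage is paid for whether or not it is consumed.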
Step 10: Flashcard Review – Key Terms and Patterns
Flip through these cards to reinforce the main concepts.
- When to choose General Purpose EC2 (M/T/A)?
- When the workload has balanced CPU, memory, and networking needs, such as standard web/app servers and small databases, and no special optimization requirement is mentioned.
- Key clue for Memory Optimized EC2 (R/X)?
- Keywords like "in-memory", "large cache", "big analytics datasets in RAM", or very memory-heavy databases.
- Best pricing model for steady 24/7 production workloads?
- Reserved Instances or Savings Plans (often Compute Savings Plans for flexibility), because they trade long-term commitment for significant discounts.
- When are Spot Instances appropriate?
- For fault-tolerant, stateless, or batch workloads that can handle interruptions and flexible start/end times, such as offline processing or background jobs.
- Target tracking scaling vs scheduled scaling?
- Target tracking keeps a metric (e.g., CPU) near a target for unpredictable traffic. Scheduled scaling adjusts capacity based on known time-based patterns (e.g., business hours).
- ALB vs NLB – exam keywords?
- ALB: HTTP/HTTPS, path/host routing, microservices, WebSockets. NLB: TCP/UDP, static IPs, extremely high performance or low latency, non-HTTP protocols.
- When to use AWS Lambda?
- For event-driven, short-lived tasks (up to 15 minutes), spiky workloads, or APIs where you want no idle cost and do not want to manage servers.
- When to use AWS Fargate?
- For containerized applications where you want the flexibility of containers but do not want to manage EC2 nodes, such as microservices or long-running workers.
- Classic pattern: scalable web app on EC2?
- Application Load Balancer in public subnets → Auto Scaling group of EC2 instances across multiple AZs in private subnets → database in private subnets (RDS/Aurora).
- Graviton instances – why and when?
- ARM-based EC2 (e.g., m7g, c7g) that often provide better price/performance. Use them when your applications can run on ARM and the question emphasizes cost optimization.
Key Terms
- AWS Lambda
- Serverless compute service that runs code in response to events, automatically managing scaling and infrastructure.
- Amazon EC2
- Elastic Compute Cloud; virtual servers in the cloud where you manage the OS and runtime.
- AWS Fargate
- Serverless compute engine for containers that runs tasks or pods without you managing EC2 instances.
- Savings Plans
- Flexible pricing model where you commit to a consistent amount of compute usage (e.g., $/hour) for 1 or 3 years in exchange for lower prices.
- Spot Instances
- Discounted spare EC2 capacity that AWS can reclaim with a two-minute warning, suited for fault-tolerant and flexible workloads.
- Instance family
- A group of EC2 instance types optimized for a particular resource profile (general purpose, compute, memory, storage, or accelerated).
- Reserved Instances
- EC2 pricing option offering discounts in exchange for committing to a specific instance configuration for 1 or 3 years.
- Target tracking scaling
- An Auto Scaling policy that adjusts capacity to maintain a specific metric target, such as average CPU utilization.
- Auto Scaling group (ASG)
- A group of EC2 instances that can automatically increase or decrease in number based on scaling policies.
- Network Load Balancer (NLB)
- A Layer 4 load balancer for TCP/UDP/TLS traffic, optimized for high performance and static IPs.
- Elastic Load Balancing (ELB)
- AWS service that distributes incoming traffic across multiple targets to improve availability and scalability.
- Application Load Balancer (ALB)
- A Layer 7 load balancer for HTTP/HTTPS traffic, supporting advanced routing such as path- and host-based rules.