Chapter 21 of 26
Cost-Optimized Storage Architectures with Amazon S3 and Related Services
Storage can quietly dominate your bill; learn how to mix S3 storage classes, lifecycle policies, and access patterns to control cost without surprising performance drops.
Big Picture: S3, Cost Optimization, and the Exam
Why Storage Costs Matter
Storage often becomes a silent cost driver: data grows daily and rarely shrinks. For the exam, you must design S3-based storage that is cheap, durable, and aligned to access patterns.
Link to Cost Optimization Pillar
The cost optimization pillar includes the continual process of refinement and improvement of a system over its entire lifecycle to build and operate cost-aware systems that achieve business outcomes and minimize costs.
What You Will Learn
You will choose S3 storage classes, design lifecycle policies, and decide when to use archive tiers like S3 Glacier classes, all while controlling cost and avoiding performance surprises.
Historical vs Current View
S3 has existed since 2006, but storage classes evolved. As of 2026, focus on Standard, Intelligent-Tiering, Standard-IA, One Zone-IA, Glacier Instant, Glacier Flexible, Glacier Deep Archive, and S3 Express One Zone.
Core S3 Concepts: Buckets, Objects, Durability, Availability
Buckets and Objects
A bucket is a regional container with a globally unique name. Objects are files plus metadata, addressed as s3://bucket/key. You pay for GB stored, requests, and data transfer out.
Durability vs Availability
Durability is about not losing data; S3 is designed for 99.999999999% (11 nines) durability. Availability is about being able to access data when you request it; cheaper classes may reduce availability targets to lower cost.
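To make the two terms concrete, here is a small arithmetic sketch. The object count and the availability figures passed in are illustrative inputs, and the function names are made up for this example; the results are design-target arithmetic, not guarantees.

```python
# Back-of-the-envelope arithmetic for durability and availability targets.
# Both figures are design targets, not guarantees; the inputs are illustrative.

MINUTES_PER_YEAR = 365 * 24 * 60


def expected_annual_object_loss(num_objects: int, durability_nines: int) -> float:
    """Expected number of objects lost per year at N nines of annual durability."""
    annual_loss_probability = 10.0 ** -durability_nines
    return num_objects * annual_loss_probability


def max_downtime_minutes_per_year(availability: float) -> float:
    """Minutes per year the data may be unreachable at a given availability."""
    return (1.0 - availability) * MINUTES_PER_YEAR


if __name__ == "__main__":
    loss = expected_annual_object_loss(10_000_000, durability_nines=11)
    print(f"Expected objects lost per year: {loss:.6f}")
    print(f"Downtime budget at 99.99%: {max_downtime_minutes_per_year(0.9999):.1f} min/year")
    print(f"Downtime budget at 99.5%: {max_downtime_minutes_per_year(0.995):.0f} min/year")
```

With ten million objects at 11 nines, the expected loss is about one object every 10,000 years, while the downtime budget changes noticeably as the availability target drops. That asymmetry is why cost tradeoffs usually show up in availability rather than durability.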
Multi-AZ vs Single-AZ
Most S3 classes are multi-AZ, replicating across at least three AZs. One Zone-IA and S3 Express One Zone are single-AZ, cheaper but not protected against AZ failure.
Exam Angle
On the exam, cost tradeoffs usually affect availability, retrieval time, and AZ redundancy, not durability. Watch for hints about tolerance for downtime or AZ loss.
Tour of S3 Storage Classes (Active and Infrequent Access)
S3 Standard
S3 Standard is multi-AZ with 11 9s durability and 99.99% availability, low latency, and no minimum duration. Use it for frequently accessed, performance-sensitive data.
S3 Intelligent-Tiering
Intelligent-Tiering automatically moves objects between tiers based on access, for a small monitoring fee. Ideal when you cannot predict which objects will become cold.
S3 Standard-IA
Standard-IA is multi-AZ with lower storage cost than Standard, but it adds a per-GB retrieval charge and a 30-day minimum storage duration. Use it for infrequently accessed but still quickly needed data.
S3 One Zone-IA
One Zone-IA is single-AZ and cheaper, with similar retrieval and minimum duration rules. Use it for re-creatable or non-critical data where AZ loss is acceptable.
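The tradeoff between a hot class and an infrequent access class can be sketched as a break-even calculation. This is a simplified model that ignores request fees and minimum durations. The `ClassPricing` type and both functions are hypothetical helpers, and the prices in the usage note are placeholders, not current AWS rates; check the pricing page for your Region.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClassPricing:
    """Per-GB prices for one storage class (fill in from the pricing page)."""
    storage_per_gb_month: float
    retrieval_per_gb: float = 0.0


def monthly_cost(pricing: ClassPricing, gb_stored: float, gb_retrieved: float) -> float:
    """Storage plus retrieval cost for one month, ignoring request fees."""
    return (gb_stored * pricing.storage_per_gb_month
            + gb_retrieved * pricing.retrieval_per_gb)


def break_even_retrieval_fraction(hot: ClassPricing, cold: ClassPricing) -> float:
    """Fraction of stored data read per month at which `cold` stops being cheaper."""
    storage_saving = hot.storage_per_gb_month - cold.storage_per_gb_month
    extra_retrieval = cold.retrieval_per_gb - hot.retrieval_per_gb
    if extra_retrieval <= 0:
        # The cold class never costs more to read, so it always wins on this model.
        return float("inf")
    return storage_saving / extra_retrieval


if __name__ == "__main__":
    # PLACEHOLDER prices for illustration only.
    standard = ClassPricing(storage_per_gb_month=0.023)
    standard_ia = ClassPricing(storage_per_gb_month=0.0125, retrieval_per_gb=0.01)
    fraction = break_even_retrieval_fraction(standard, standard_ia)
    print(f"IA stays cheaper until about {fraction:.0%} of the data is read each month")
```

The point of the model is the shape of the decision: an IA class pays off when the monthly read volume is a small fraction of what is stored, and loses when data is read heavily.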
Archive Classes: Glacier Instant, Glacier Flexible, Glacier Deep Archive
Glacier Instant Retrieval
Glacier Instant Retrieval is multi-AZ, millisecond access, cheaper than Standard-IA, but with higher retrieval cost and minimum duration. Use it for rare but time-sensitive reads.
Glacier Flexible Retrieval
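Glacier Flexible Retrieval offers very low storage cost, but objects must be restored through a retrieval job before they can be read. Retrieval takes minutes to hours depending on the tier you choose (Expedited, Standard, or Bulk). Use it when occasional delay is acceptable.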
Glacier Deep Archive
Glacier Deep Archive is the cheapest storage, with retrieval in hours (typically within 12 hours for standard retrievals) and a 180-day minimum storage duration. It suits long-term compliance data that is almost never accessed.
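Minimum storage durations matter because deleting or transitioning an object early is still billed as if it had stayed for the minimum. The sketch below models that rule. The day counts reflect the published minimums as understood at the time of writing, so verify them against current AWS documentation; `billable_days` is a hypothetical helper, and the keys are the storage class names used by the S3 API.

```python
# Minimum storage duration per class, in days (verify against current AWS docs).
MIN_STORAGE_DAYS = {
    "STANDARD": 0,
    "INTELLIGENT_TIERING": 0,
    "STANDARD_IA": 30,
    "ONEZONE_IA": 30,
    "GLACIER_IR": 90,     # Glacier Instant Retrieval
    "GLACIER": 90,        # Glacier Flexible Retrieval
    "DEEP_ARCHIVE": 180,  # Glacier Deep Archive
}


def billable_days(storage_class: str, days_stored: int) -> int:
    """Days you are charged for, given how long the object actually stayed."""
    if days_stored < 0:
        raise ValueError("days_stored must be non-negative")
    return max(days_stored, MIN_STORAGE_DAYS[storage_class])


if __name__ == "__main__":
    for storage_class, days in [("STANDARD", 10), ("STANDARD_IA", 10), ("DEEP_ARCHIVE", 10)]:
        print(f"{storage_class}: stored {days} days, billed {billable_days(storage_class, days)} days")
```

An object that lives only 10 days costs 10 days of Standard but a full minimum period in the colder classes, so short-lived data rarely belongs in an archive class.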
Exam Hints
Watch for keywords: 'immediate access' rules out Flexible/Deep Archive; '7+ year retention, rarely accessed' strongly suggests Glacier Deep Archive.
Special Case: S3 Express One Zone and Data Lakes
What is S3 Express One Zone?
S3 Express One Zone is single-AZ, ultra-low-latency object storage with very high request rates, using directory buckets and a distinct pricing model.
When to Use It
Use S3 Express One Zone for performance-critical data like real-time analytics or ML feature stores, not for backups or long-term archives.
Cost Optimization Role
It may not be the cheapest per GB, but can lower total cost by shortening compute jobs and reducing reliance on more expensive storage like EBS.
Data Lake Pattern
Typical pattern: hot zones in Standard or Intelligent-Tiering, ultra-hot subsets in Express One Zone, and cold partitions in IA or Glacier via lifecycle rules.
Lifecycle Policies: Automating Transitions and Expirations
What Lifecycle Policies Do
Lifecycle policies automatically transition objects between storage classes or expire them, based on age, prefix, or tags, at bucket or prefix scope.
Example: Logs Pattern
Logs might stay 0–30 days in Standard, 31–90 days in Standard-IA, 91–365 days in Glacier Flexible Retrieval, then be expired or moved to Deep Archive.
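The logs pattern is essentially a lookup from object age to storage class. A minimal sketch, assuming the age ranges quoted above and expiration as the final step (swap in `"DEEP_ARCHIVE"` if you keep the data instead). The function name is hypothetical, and the model mirrors the ranges in the text rather than the exact timing of S3 lifecycle runs.

```python
from bisect import bisect_left

# Inclusive upper bound, in days, of each stage in the logs pattern.
STAGE_UPPER_BOUNDS = [30, 90, 365]
STAGE_CLASSES = ["STANDARD", "STANDARD_IA", "GLACIER"]
AFTER_LAST_STAGE = "EXPIRED"  # assumption: logs are deleted after one year


def storage_class_for_age(age_days: int) -> str:
    """Return the storage class a log object of this age should be in."""
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    # bisect_left finds the first stage whose upper bound is >= age_days.
    stage = bisect_left(STAGE_UPPER_BOUNDS, age_days)
    if stage == len(STAGE_CLASSES):
        return AFTER_LAST_STAGE
    return STAGE_CLASSES[stage]


if __name__ == "__main__":
    for age in (0, 30, 31, 90, 91, 365, 366):
        print(f"day {age:>3}: {storage_class_for_age(age)}")
```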
Example: Backups Pattern
Backups might start in Standard-IA for fast restore, move to Glacier Flexible Retrieval after a month, and be deleted after several years per retention rules.
Design Tips and Exam Angle
Watch minimum storage durations, use tags to separate policies, and remember that exam questions often ask which class and age thresholds best fit a scenario.
Worked Example: Designing a Cost-Optimized Storage Plan
Scenario Overview
You ingest clickstream logs, need 7 days for real-time dashboards, 90 days for ad-hoc queries, and 3 years total retention, with minimal storage cost.
Hot Data: 0–7 Days
Frequent access and heavy writes mean S3 Standard for `s3://prod-logs/raw/`, providing low latency and high throughput.
Warm Data: 8–90 Days
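Occasional queries suggest S3 Intelligent-Tiering or Standard-IA, depending on how predictable your access patterns are. Lifecycle rules cannot move objects from Standard to Standard-IA until they are at least 30 days old, so a transition at day 7 is only possible with Intelligent-Tiering.
Cold Archive and Lifecycle Rules
From 91 days to 3 years, use Glacier Flexible Retrieval, with lifecycle rules to transition at 7 days (to Intelligent-Tiering) and 90 days, and to expire objects after 3 years.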
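The plan for this scenario can be sketched as a lifecycle configuration built in Python. This sketch assumes Intelligent-Tiering as the warm tier, because lifecycle rules only allow transitions into Standard-IA or One Zone-IA once objects are at least 30 days old. The rule ID, the `raw/` prefix (taken from the example bucket path), and both function names are illustrative choices. The dictionary has the shape that boto3's `put_bucket_lifecycle_configuration` accepts as `LifecycleConfiguration`, but nothing here calls AWS.

```python
import json

THREE_YEARS_IN_DAYS = 3 * 365
IA_CLASSES = {"STANDARD_IA", "ONEZONE_IA"}
IA_MINIMUM_AGE_DAYS = 30


def build_clickstream_lifecycle(prefix: str = "raw/",
                                warm_class: str = "INTELLIGENT_TIERING",
                                warm_after_days: int = 7) -> dict:
    """Build the hot -> warm -> cold -> expire plan as a lifecycle configuration."""
    return {
        "Rules": [
            {
                "ID": "clickstream-tiering",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                "Transitions": [
                    {"Days": warm_after_days, "StorageClass": warm_class},
                    {"Days": 90, "StorageClass": "GLACIER"},  # Flexible Retrieval
                ],
                "Expiration": {"Days": THREE_YEARS_IN_DAYS},
            }
        ]
    }


def find_early_ia_transitions(config: dict) -> list:
    """List transitions that move objects into an IA class before day 30."""
    problems = []
    for rule in config["Rules"]:
        for transition in rule.get("Transitions", []):
            if (transition["StorageClass"] in IA_CLASSES
                    and transition["Days"] < IA_MINIMUM_AGE_DAYS):
                problems.append(
                    f"{rule['ID']}: {transition['StorageClass']} at day {transition['Days']}"
                )
    return problems


if __name__ == "__main__":
    config = build_clickstream_lifecycle()
    print(json.dumps(config, indent=2))
    print("Early IA transitions:", find_early_ia_transitions(config))
```

Calling `build_clickstream_lifecycle(warm_class="STANDARD_IA")` with the default day 7 is flagged by the check, which is a quick way to catch the most common mistake in this kind of design.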
Practical: Sample S3 Lifecycle Configuration (JSON)
Here is an example of an S3 lifecycle configuration that implements a common pattern. You do not need to memorize JSON syntax for the exam, but seeing it helps solidify how rules map to behavior.
This rule:
- Applies to objects with prefix `logs/`.
- Transitions them to Standard-IA after 30 days.
- Transitions them to Glacier Flexible Retrieval after 90 days.
- Expires them after 365 days.
```json
{
  "Rules": [
    {
      "ID": "logs-ia-and-glacier",
      "Status": "Enabled",
      "Filter": {
        "Prefix": "logs/"
      },
      "Transitions": [
        {
          "Days": 30,
          "StorageClass": "STANDARD_IA"
        },
        {
          "Days": 90,
          "StorageClass": "GLACIER"
        }
      ],
      "Expiration": {
        "Days": 365
      }
    }
  ]
}
```
On the exam, you will see this kind of logic described in words, not JSON. For example: "After 30 days, move logs to a lower-cost infrequent access storage tier and delete them after 1 year." You must recognize that as a lifecycle rule using Standard-IA and an expiration policy.
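To practice moving between the two forms, the following sketch turns a lifecycle rule into an exam-style sentence. It parses the same JSON as the sample configuration; `describe_rule` and the display-name table are hypothetical helpers written for this lesson, not part of any AWS SDK.

```python
import json

# Friendly names for the API storage class identifiers used in lifecycle rules.
STORAGE_CLASS_NAMES = {
    "STANDARD_IA": "S3 Standard-IA",
    "ONEZONE_IA": "S3 One Zone-IA",
    "INTELLIGENT_TIERING": "S3 Intelligent-Tiering",
    "GLACIER_IR": "S3 Glacier Instant Retrieval",
    "GLACIER": "S3 Glacier Flexible Retrieval",
    "DEEP_ARCHIVE": "S3 Glacier Deep Archive",
}

SAMPLE_CONFIG = """
{"Rules": [{"ID": "logs-ia-and-glacier", "Status": "Enabled",
            "Filter": {"Prefix": "logs/"},
            "Transitions": [{"Days": 30, "StorageClass": "STANDARD_IA"},
                            {"Days": 90, "StorageClass": "GLACIER"}],
            "Expiration": {"Days": 365}}]}
"""


def describe_rule(rule: dict) -> str:
    """Render one lifecycle rule as a plain-English sentence."""
    prefix = rule.get("Filter", {}).get("Prefix", "")
    ordered = sorted(rule.get("Transitions", []), key=lambda item: item["Days"])
    steps = []
    for transition in ordered:
        name = STORAGE_CLASS_NAMES.get(transition["StorageClass"], transition["StorageClass"])
        steps.append(f"after {transition['Days']} days move to {name}")
    expiration_days = rule.get("Expiration", {}).get("Days")
    if expiration_days is not None:
        steps.append(f"delete after {expiration_days} days")
    return f"Rule {rule['ID']!r} on prefix {prefix!r}: " + "; ".join(steps)


if __name__ == "__main__":
    for rule in json.loads(SAMPLE_CONFIG)["Rules"]:
        print(describe_rule(rule))
```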
Thought Exercise: Mapping Requirements to Storage Classes
Work through this exercise to practice thinking like a solutions architect. You do not need to write code; just reason through each requirement.
Requirements
You are designing storage for three datasets:
- User-uploaded media
  - Photos and videos uploaded by users.
  - Frequently accessed for the first month, then access drops but never goes to zero.
  - Data must be resilient to AZ failure.
- Daily database snapshots
  - Created once per day from a production database.
  - Most restores happen from the last 7 days; older snapshots are rarely used.
  - Must be retained for 1 year for recovery and audit.
- Regulatory email archives
  - Must be stored for 10 years.
  - Very rarely accessed, but when requested, regulators expect access within a few hours.
Your task
For each dataset, pick:
- Primary S3 storage class for the first 30 days.
- Storage class after 30 days.
- Whether to use lifecycle expiration, and when.
Pause and decide before reading a suggested answer.
Suggested reasoning (check yourself)
- User media: S3 Standard (0–30 days), then S3 Intelligent-Tiering or Standard-IA (after 30 days). No expiration (user data).
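- Snapshots: S3 Standard (0–7 days), then Glacier Instant or Glacier Flexible Retrieval (7–365 days), expire after 1 year. Standard-IA is a reasonable starting class only if snapshots stay there at least 30 days, because it bills a 30-day minimum storage duration.
- Email archives: S3 Standard (initial ingestion), then Glacier Deep Archive for up to 10 years, expire after 10 years. Deep Archive standard retrievals typically complete within 12 hours; if 'a few hours' is a hard limit, Glacier Flexible Retrieval is the safer choice.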
Compare your choices. Where did you trade latency vs cost differently? How would you justify your design to a non-technical stakeholder?
Quiz 1: Picking Storage Classes
Answer this question to check your understanding of S3 storage classes and lifecycle design.
A company stores application logs that are heavily accessed for 14 days, occasionally queried for up to 90 days, and then kept for 3 years for compliance. Queries after 90 days are extremely rare and can tolerate hours of retrieval delay. Which combination is the MOST cost-optimized while meeting requirements?
- A) Keep all logs in S3 Standard for 3 years.
- B) Store logs in S3 Standard for 14 days, then S3 Standard-IA until 90 days, then S3 Glacier Deep Archive until 3 years.
- C) Store logs in S3 Standard for 14 days, then S3 Intelligent-Tiering until 90 days, then S3 Glacier Flexible Retrieval until 3 years.
- D) Store logs in S3 Standard-IA for 90 days, then delete them after 90 days.
Answer: C) Store logs in S3 Standard for 14 days, then S3 Intelligent-Tiering until 90 days, then S3 Glacier Flexible Retrieval until 3 years.
Option C is best: S3 Standard for hot access (0–14 days), Intelligent-Tiering for unpredictable but less frequent access (15–90 days), and Glacier Flexible Retrieval for a long-term archive whose rare reads can tolerate minutes to hours of delay. Option B looks cheaper on paper, but lifecycle rules cannot move objects from Standard to Standard-IA before they are 30 days old, so its day-14 transition is not valid as written. Deep Archive retrievals also typically take up to 12 hours, which is a riskier fit for a requirement stated only as 'hours'. Option A is too expensive, and Option D violates the 3-year retention requirement.
Quiz 2: Lifecycle and Availability Tradeoffs
Another quick check on lifecycle policies and availability tradeoffs.
You are designing a backup solution for a non-critical internal system. Backups must be kept for 6 months. The data is re-creatable from source systems, and management wants the lowest possible storage cost. Occasional restores are needed within minutes. Which design is MOST appropriate?
- A) Store all backups in S3 Standard for 6 months.
- B) Store backups in S3 One Zone-IA for 6 months using a lifecycle rule to expire them after 6 months.
- C) Store backups in S3 Glacier Deep Archive for 6 months.
- D) Store backups in S3 Express One Zone for 6 months.
Answer: B) Store backups in S3 One Zone-IA for 6 months using a lifecycle rule to expire them after 6 months.
S3 One Zone-IA is single-AZ, cheaper than multi-AZ classes, and suitable for re-creatable, non-critical data. Like the other IA classes, it serves objects with millisecond latency, so a restore within minutes is easily met. A lifecycle rule can expire backups after 6 months. S3 Standard is more expensive than needed, Glacier Deep Archive has hours-level retrieval delays, and S3 Express One Zone is for ultra-low-latency workloads, not backups.
Key Terms Review
Review these terms to reinforce core S3 cost-optimization concepts.
- Durability vs Availability in S3
- Durability is the probability that data is not lost (S3 commonly offers 11 9s). Availability is the percentage of time data is accessible. Cheaper S3 classes usually trade availability or retrieval characteristics, not durability.
- S3 Standard-IA
- A multi-AZ storage class with lower storage cost but higher retrieval cost and a minimum storage duration. Used for infrequently accessed data that still requires rapid access when needed.
- S3 One Zone-IA
- A single-AZ infrequent access class with lower cost than Standard-IA. Suitable for re-creatable or non-critical data where loss of an AZ is acceptable.
- S3 Intelligent-Tiering
- A storage class that automatically moves objects between access tiers based on access patterns, for a small monitoring fee, to optimize cost when access patterns are unpredictable.
- S3 Glacier Instant Retrieval
- Archive storage with millisecond retrieval but higher per-GB retrieval cost and minimum storage duration. Used for rarely accessed data that must still be retrieved immediately.
- S3 Glacier Flexible Retrieval
- Low-cost archive storage with retrieval in minutes to hours using retrieval jobs. Suitable for data accessed a few times per year where some delay is acceptable.
- S3 Glacier Deep Archive
- The lowest-cost S3 storage class with retrieval in hours and strict minimum duration. Ideal for long-term compliance archives that are almost never accessed.
- S3 Lifecycle Policy
- A set of rules that automatically transition objects between storage classes or expire them based on object age, prefixes, or tags, helping control storage costs over time.
- S3 Express One Zone
- Single-AZ, ultra-low-latency, high-throughput object storage for performance-critical workloads. Not typically used for backups or long-term archives.
- Cost Optimization Pillar (Well-Architected)
- The cost optimization pillar includes the continual process of refinement and improvement of a system over its entire lifecycle to build and operate cost-aware systems that achieve business outcomes and minimize costs.
Bringing It Together: Patterns, Exam Signals, and Next Steps
Key Patterns
Remember patterns: data lakes use Standard/Intelligent-Tiering then IA/Glacier; backups use IA then Glacier; compliance archives lean on Deep Archive with lifecycle rules.
Reading Exam Signals
Immediate low-latency access points away from slow Glacier classes; re-creatable or non-critical data with single-AZ tolerance points toward One Zone-IA, or toward Express One Zone when the workload is also latency-critical.
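These signals can be written down as a small decision function. This is one possible encoding of the rules of thumb in this chapter, not an official AWS decision tree. The `Access` categories and `suggest_storage_class` are invented for this sketch, and it leaves out S3 Express One Zone, which is chosen for latency rather than access frequency.

```python
from enum import Enum


class Access(Enum):
    FREQUENT = "frequent"            # hot, performance-sensitive data
    UNPREDICTABLE = "unpredictable"  # cannot tell which objects go cold
    INFREQUENT = "infrequent"        # read occasionally, needed quickly
    RARE = "rare"                    # a few reads per year
    ALMOST_NEVER = "almost_never"    # long-term compliance data


def suggest_storage_class(access: Access,
                          needs_immediate_access: bool,
                          tolerates_az_loss: bool = False) -> str:
    """Map exam-style requirement signals to an S3 API storage class name."""
    if access is Access.FREQUENT:
        return "STANDARD"
    if access is Access.UNPREDICTABLE:
        return "INTELLIGENT_TIERING"
    if access is Access.INFREQUENT:
        # Single-AZ is only an option when losing an AZ is acceptable.
        return "ONEZONE_IA" if tolerates_az_loss else "STANDARD_IA"
    # Archive territory: 'immediate access' rules out the slower Glacier classes.
    if needs_immediate_access:
        return "GLACIER_IR"
    if access is Access.RARE:
        return "GLACIER"  # Glacier Flexible Retrieval
    return "DEEP_ARCHIVE"


if __name__ == "__main__":
    print(suggest_storage_class(Access.INFREQUENT, True, tolerates_az_loss=True))
    print(suggest_storage_class(Access.ALMOST_NEVER, needs_immediate_access=False))
```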
Link to Well-Architected
S3 choices affect cost optimization, reliability (multi-AZ vs single-AZ), performance efficiency (hot data), and sustainability (deleting or archiving unused data).
Your Next Step
Use Skarp’s diagnostic or mock exam for storage. Missed items will appear in your spaced review, and the gap guide will deepen coverage where you need it most.
Key Terms
- Bucket
- A container for objects stored in Amazon S3, scoped to an AWS Region and globally unique in name.
- Object
- The fundamental entity stored in Amazon S3, consisting of data and metadata and identified by a key within a bucket.
- Amazon S3
- A highly scalable, durable, and secure object storage service from AWS used for storing and retrieving any amount of data at any time.
- Durability
- The probability that data stored will not be lost over a given year; S3 commonly offers 99.999999999% durability.
- S3 Standard
- The default S3 storage class offering high durability, high availability, low latency, and no minimum storage duration for frequently accessed data.
- Availability
- The percentage of time that a system or service is operational and accessible when requested.
- S3 One Zone-IA
- An S3 storage class that stores data in a single Availability Zone at lower cost, suitable for re-creatable or non-critical data.
- S3 Standard-IA
- An S3 storage class for infrequently accessed data that still requires rapid access, with lower storage cost and higher retrieval cost than S3 Standard.
- S3 Express One Zone
- A single-AZ, ultra-low-latency S3 storage class for performance-critical workloads requiring very high request rates.
- S3 Lifecycle Policy
- A configuration attached to an S3 bucket that defines rules for automatic transition of objects between storage classes and expiration of objects.
- S3 Intelligent-Tiering
- An S3 storage class that automatically moves objects between access tiers based on changing access patterns to optimize cost.
- S3 Glacier Deep Archive
- The lowest-cost S3 storage class, designed for long-term data archiving that is rarely accessed and can tolerate hours of retrieval time.
- Cost optimization pillar
- The cost optimization pillar includes the continual process of refinement and improvement of a system over its entire lifecycle to build and operate cost-aware systems that achieve business outcomes and minimize costs.
- S3 Glacier Instant Retrieval
- An S3 archive storage class that provides millisecond retrieval for rarely accessed data with lower storage cost and higher retrieval cost.
- S3 Glacier Flexible Retrieval
- An S3 archive storage class with very low storage cost and retrieval in minutes to hours via retrieval jobs.
- AWS Well-Architected Framework
- The AWS Well-Architected Framework provides a consistent set of best practices for customers and partners to evaluate architectures, and a set of questions you can use to evaluate how well an architecture is aligned to AWS best practices.