Chapter 7 of 27

Planning Object Storage and Lifecycle Management with Cloud Storage

From hot content delivery to long-term archives, Cloud Storage underpins many solutions; design buckets, classes, and lifecycles that meet performance and cost goals.

Cloud Storage as Foundational Object Storage

Cloud Storage in Google Cloud

Cloud Storage is Google Cloud's foundational object storage service for unstructured data like images, logs, backups, and static website assets.

Objects and Buckets

You store data as objects inside buckets. Each object has data plus metadata, and each bucket has settings for location, class, access control, and lifecycle.

Key Properties

Cloud Storage offers globally unique bucket names, 11 nines (99.999999999%) of annual durability, high availability, and strong read-after-write and list-after-write consistency.

Why It Matters for the Exam

Cloud Storage shows up when you plan, deploy, and secure solutions. You must be able to design buckets, pick classes, and configure lifecycle rules.

Buckets, Locations, and Access Patterns

Bucket Locations

Cloud Storage buckets live in a region, dual-region, or multi-region, which affects latency, redundancy, and cost.

Region vs Dual-region vs Multi-region

A region bucket stores data in one region, a dual-region bucket in a specific pair of regions, and a multi-region bucket across a large geographic area such as `us` or `eu`.

Access Patterns

Think in terms of hot, warm, and cold data: how frequently data is read, and how quickly it must be available.

Exam Mindset

For scenarios, match bucket location and class to user or compute location, access frequency, and compliance or durability needs.

Cloud Storage Classes (including Regional Persistent Disk)

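The Storage Classes to Know

Cloud Storage itself has four storage classes: Standard, Nearline, Coldline, and Archive. This chapter lists Regional Persistent Disk alongside them, so be ready to enumerate all five items: Standard, Nearline, Coldline, Archive, Regional Persistent Disk.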

Standard and Nearline

Standard is for hot, frequently accessed data and has no minimum storage duration. Nearline is cheaper storage for data accessed about once a month or less, with a 30-day minimum storage duration.

Coldline and Archive

Coldline targets data read once a quarter or less and has a 90-day minimum storage duration. Archive is the lowest-cost option for data accessed less than once a year and has a 365-day minimum storage duration.
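
The rules of thumb above can be written down as a small lookup. This is a minimal sketch, assuming the only input is an estimated number of reads per year; the function name `suggest_storage_class` and the exact cut-offs (12, 4, and 1 reads per year) are illustrative choices derived from the descriptions in this chapter, not an official sizing rule.

```python
def suggest_storage_class(reads_per_year: float) -> str:
    """Map an estimated read frequency to a Cloud Storage class name.

    The thresholds mirror this chapter's rules of thumb; they are a study aid,
    not a pricing calculation.
    """
    if reads_per_year < 0:
        raise ValueError("reads_per_year must be non-negative")
    if reads_per_year < 1:
        return "ARCHIVE"   # accessed less than once a year
    if reads_per_year <= 4:
        return "COLDLINE"  # about once a quarter or less
    if reads_per_year <= 12:
        return "NEARLINE"  # about once a month
    return "STANDARD"      # hot, frequently accessed data


if __name__ == "__main__":
    examples = [("daily dashboard", 365), ("monthly report", 12),
                ("quarterly audit", 4), ("legal archive", 0.2)]
    for label, reads in examples:
        print(f"{label}: {suggest_storage_class(reads)}")
```

In a real design you would also weigh minimum storage durations and retrieval charges, not access frequency alone.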

Regional Persistent Disk

Regional Persistent Disk is block storage for VMs, synchronously replicated across two zones in a region. It is not object storage and not a Cloud Storage class; it appears in this list only as a related storage option.

Choosing Storage Classes for Real Workloads

Scenario 1: Global Images

Global image hosting: frequent reads, global users, high availability. Use Standard in a multi-region like `us` or `eu`.

Scenario 2: Backups

Daily DB backups, rarely read, 1-year retention: start in Nearline, then lifecycle to Coldline or Archive after 90–180 days.
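
One way to express the backup scenario as a lifecycle configuration is sketched below. It assumes backups are uploaded to a bucket whose default class is Nearline, moves them to Coldline after 90 days, and deletes them at the end of the 1-year retention period. The function name `backup_lifecycle` and the default day counts are hypothetical; adjust them to your own retention policy.

```python
import json


def backup_lifecycle(coldline_after_days: int = 90,
                     delete_after_days: int = 365) -> dict:
    """Build a lifecycle configuration for rarely read backups."""
    if not 0 < coldline_after_days < delete_after_days:
        raise ValueError("transition must happen before deletion")
    return {
        "rule": [
            {
                # Only objects still in Nearline are moved to Coldline.
                "action": {"type": "SetStorageClass", "storageClass": "COLDLINE"},
                "condition": {"age": coldline_after_days,
                              "matchesStorageClass": ["NEARLINE"]},
            },
            {
                # Retention is over: remove the backup regardless of class.
                "action": {"type": "Delete"},
                "condition": {"age": delete_after_days},
            },
        ]
    }


if __name__ == "__main__":
    # The printed JSON has the same shape as a lifecycle configuration file.
    print(json.dumps(backup_lifecycle(), indent=2))
```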

Scenario 3: Analytics Logs

Logs heavily queried for a short time, then rarely: keep in Standard initially, then transition to Nearline/Coldline, then Archive.

Exam Signal Words

Look for "frequently accessed" vs "rarely accessed" and "regulatory retention" to choose among Standard, Nearline, Coldline, Archive.

Data Durability, Availability, and Cost Trade-offs

Durability

All Cloud Storage classes target 11 nines of durability using redundant storage within a region, dual-region, or multi-region.
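
To get a feel for what 11 nines means, the arithmetic can be worked through in a few lines. This is a simplified model that treats the durability figure as an independent annual loss probability of 1e-11 per object; the function names are illustrative, and real-world durability is a design target rather than a guarantee for any single object.

```python
# Eleven nines of annual durability leaves a 1e-11 annual loss probability.
ANNUAL_LOSS_PROBABILITY = 1e-11


def expected_annual_losses(object_count: int) -> float:
    """Expected number of objects lost per year under the simplified model."""
    if object_count < 0:
        raise ValueError("object_count must be non-negative")
    return object_count * ANNUAL_LOSS_PROBABILITY


def years_per_expected_loss(object_count: int) -> float:
    """Average number of years between single-object losses."""
    if object_count <= 0:
        raise ValueError("object_count must be positive")
    return 1 / expected_annual_losses(object_count)


if __name__ == "__main__":
    count = 1_000_000_000  # one billion stored objects
    print(expected_annual_losses(count))
    print(years_per_expected_loss(count))
```

With one billion objects, this model gives about 0.01 expected losses per year, or roughly one lost object every 100 years.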

Availability

Availability is about uptime. Multi-region and dual-region Standard offer higher availability than single-region or colder classes.

Cost Components

Costs include storage per GB per month, operation and retrieval charges, and network egress. Colder classes cut storage cost but raise access cost.
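
The trade-off between storage cost and access cost can be illustrated with a toy calculation. The prices below are placeholder numbers chosen only to make the arithmetic visible; they are NOT real Cloud Storage prices, and the sketch ignores operation charges and network egress. The names `ClassPricing`, `HOT`, `COLD`, and `monthly_cost` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClassPricing:
    storage_per_gb_month: float  # placeholder price, not a real rate
    retrieval_per_gb: float      # placeholder price, not a real rate


# Illustrative values only: a "hot" class with pricier storage and free
# retrieval, and a "cold" class with cheap storage and paid retrieval.
HOT = ClassPricing(storage_per_gb_month=0.02, retrieval_per_gb=0.0)
COLD = ClassPricing(storage_per_gb_month=0.002, retrieval_per_gb=0.05)


def monthly_cost(pricing: ClassPricing, stored_gb: float, retrieved_gb: float) -> float:
    """Storage charge plus retrieval charge for one month."""
    return (stored_gb * pricing.storage_per_gb_month
            + retrieved_gb * pricing.retrieval_per_gb)


if __name__ == "__main__":
    stored = 1000.0
    for retrieved in (0.0, 1000.0):
        print(retrieved,
              monthly_cost(HOT, stored, retrieved),
              monthly_cost(COLD, stored, retrieved))
```

With these placeholder prices, the cold class is cheaper when nothing is read but becomes more expensive than the hot class once the full dataset is read in a month, which is the pattern behind the exam trap that follows.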

Common Exam Trap

Do not pick Archive for data accessed weekly. High retrieval and operation costs make it unsuitable for frequent access.

Lifecycle Management: Concepts and Conditions

What Is Lifecycle Management?

Lifecycle management automates actions like changing storage class or deleting objects based on rules at the bucket level.

Actions and Conditions

Each rule has an action (`SetStorageClass`, `Delete`) and conditions (`age`, `createdBefore`, `isLive`, `matchesStorageClass`, etc.).

Common Conditions

Use `age`, `numNewerVersions`, `isLive`, and `matchesStorageClass` to target the right objects at the right time.

Exam Expectation

You may see JSON-like rules or descriptions and must decide how to transition or delete objects to meet retention and cost goals.
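
When reading a rule, it helps to remember that every condition in the rule must hold before the action applies. The sketch below models that check for four of the conditions named above. It is a simplified mental model, not the service's implementation; `ObjectState` and `condition_matches` are hypothetical names.

```python
from dataclasses import dataclass

SUPPORTED_CONDITIONS = {"age", "matchesStorageClass", "isLive", "numNewerVersions"}


@dataclass
class ObjectState:
    age_days: int
    storage_class: str
    is_live: bool = True
    newer_versions: int = 0


def condition_matches(obj: ObjectState, condition: dict) -> bool:
    """Return True only if every listed condition holds (logical AND)."""
    unknown = set(condition) - SUPPORTED_CONDITIONS
    if unknown:
        raise ValueError(f"conditions not modelled here: {sorted(unknown)}")
    # age: the object must be at least this many days old.
    if "age" in condition and obj.age_days < condition["age"]:
        return False
    # matchesStorageClass: the current class must be one of the listed classes.
    if ("matchesStorageClass" in condition
            and obj.storage_class not in condition["matchesStorageClass"]):
        return False
    # isLive: true targets live versions, false targets noncurrent versions.
    if "isLive" in condition and obj.is_live != condition["isLive"]:
        return False
    # numNewerVersions: at least this many newer versions must exist.
    if ("numNewerVersions" in condition
            and obj.newer_versions < condition["numNewerVersions"]):
        return False
    return True


if __name__ == "__main__":
    obj = ObjectState(age_days=40, storage_class="STANDARD")
    print(condition_matches(obj, {"age": 30, "matchesStorageClass": ["STANDARD"]}))
```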

Lifecycle Rule Examples in JSON and gcloud

These examples show how you might configure lifecycle rules for common policies. You are not required to memorize exact syntax for the exam, but understanding the structure helps you reason about questions.

Example 1: Transition to Coldline after 30 days, Archive after 365 days

```json
{
  "rule": [
    {
      "action": {"type": "SetStorageClass", "storageClass": "COLDLINE"},
      "condition": {"age": 30, "matchesStorageClass": ["STANDARD"]}
    },
    {
      "action": {"type": "SetStorageClass", "storageClass": "ARCHIVE"},
      "condition": {"age": 365, "matchesStorageClass": ["COLDLINE"]}
    }
  ]
}
```

Example 2: Delete objects older than 730 days

```json
{
  "rule": [
    {
      "action": {"type": "Delete"},
      "condition": {"age": 730}
    }
  ]
}
```

Using gcloud to set lifecycle configuration

```bash
# Save your lifecycle JSON to a file
cat > lifecycle.json << 'EOF'
{
  "rule": [
    {
      "action": {"type": "Delete"},
      "condition": {"age": 365}
    }
  ]
}
EOF

# Apply the lifecycle policy to a bucket
gcloud storage buckets update gs://my-bucket \
  --lifecycle-file=lifecycle.json
```

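On the exam, if you see `matchesStorageClass` in a rule, remember that it limits the rule to objects currently in the listed classes. In Example 1, an object that has moved to Coldline no longer matches the first rule and is transitioned again only when it meets the second rule's conditions.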
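
The effect of `matchesStorageClass` can be traced by stepping through Example 1 one day at a time. This is a simplified simulation under the assumption that rules are evaluated once per day and take effect immediately; the real service applies lifecycle actions asynchronously. The function name `class_at_age` is hypothetical.

```python
# The two rules from Example 1, written as Python data.
EXAMPLE_1_RULES = [
    {"action": {"type": "SetStorageClass", "storageClass": "COLDLINE"},
     "condition": {"age": 30, "matchesStorageClass": ["STANDARD"]}},
    {"action": {"type": "SetStorageClass", "storageClass": "ARCHIVE"},
     "condition": {"age": 365, "matchesStorageClass": ["COLDLINE"]}},
]


def class_at_age(initial_class: str, age_days: int, rules=EXAMPLE_1_RULES) -> str:
    """Return the storage class an object would have at the given age."""
    current = initial_class
    for day in range(age_days + 1):
        for rule in rules:
            condition = rule["condition"]
            old_enough = day >= condition["age"]
            class_matches = current in condition["matchesStorageClass"]
            if old_enough and class_matches:
                current = rule["action"]["storageClass"]
    return current


if __name__ == "__main__":
    for age in (29, 30, 364, 365):
        print(age, class_at_age("STANDARD", age))
    # An object uploaded as Nearline matches neither rule and never moves.
    print(400, class_at_age("NEARLINE", 400))
```

In this model an object that starts in Standard stays there through day 29, is in Coldline from day 30, and reaches Archive on day 365, while an object that starts in Nearline is untouched because neither rule lists its class.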

Design a Lifecycle Policy (Thought Exercise)

Imagine you are the Associate Cloud Engineer for a company that runs an online learning platform (similar to this course). You need to design a lifecycle policy for three types of data in a single bucket:

  1. Course videos
     • Frequently watched for the first 6 months after release.
     • After 6 months, views drop sharply but some students still access them occasionally.
     • Videos must be retained for at least 5 years for contractual reasons.
  2. Access logs
     • Generated continuously.
     • Used heavily for 14 days for troubleshooting and analytics.
     • After 14 days, rarely queried; kept for 1 year for security investigations.
  3. Database backups
     • Daily full backups.
     • Usually not restored unless there is an incident.
     • Must be retained for 7 years for compliance.

Your task (mentally or on paper):

  • For each data type, decide:
  1. Initial storage class.
  2. When to transition to colder classes.
  3. When (if ever) to delete.
  • Then, sketch lifecycle rules in plain language, for example:
  • "If object is a log file and age >= 15 days, change storage class to Coldline. If age >= 365 days, delete."

Think about:

  • Access frequency over time.
  • Retention requirements.
  • Avoiding unnecessary early deletion or overly frequent transitions.
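
Once you have tried the exercise yourself, here is one possible answer sketched in Python. It assumes the three data types are separated by object name prefixes (`videos/`, `logs/`, and `backups/` are hypothetical) and targeted with the `matchesPrefix` lifecycle condition, and that all objects are uploaded as Standard. The day counts are one reasonable reading of the requirements, not the only correct one.

```python
import json

# Rounded up so that leap years cannot shorten the 7-year retention.
SEVEN_YEARS_DAYS = 7 * 366


def learning_platform_lifecycle() -> dict:
    """One candidate lifecycle configuration for the single shared bucket."""
    return {
        "rule": [
            # Course videos: hot for about 6 months, then occasional access.
            # No Delete rule, because videos must be kept at least 5 years.
            {"action": {"type": "SetStorageClass", "storageClass": "NEARLINE"},
             "condition": {"age": 183, "matchesPrefix": ["videos/"],
                           "matchesStorageClass": ["STANDARD"]}},
            # Access logs: hot for 14 days, then rarely queried, gone after 1 year.
            {"action": {"type": "SetStorageClass", "storageClass": "COLDLINE"},
             "condition": {"age": 15, "matchesPrefix": ["logs/"],
                           "matchesStorageClass": ["STANDARD"]}},
            {"action": {"type": "Delete"},
             "condition": {"age": 365, "matchesPrefix": ["logs/"]}},
            # Database backups: rarely restored, kept 7 years for compliance.
            {"action": {"type": "SetStorageClass", "storageClass": "ARCHIVE"},
             "condition": {"age": 30, "matchesPrefix": ["backups/"],
                           "matchesStorageClass": ["STANDARD"]}},
            {"action": {"type": "Delete"},
             "condition": {"age": SEVEN_YEARS_DAYS, "matchesPrefix": ["backups/"]}},
        ]
    }


if __name__ == "__main__":
    print(json.dumps(learning_platform_lifecycle(), indent=2))
```

Separate buckets per data type would be an equally valid design and would avoid the need for prefix conditions.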

Quiz 1: Storage Classes and Access Patterns

Test your understanding of Cloud Storage classes and when to use them.

A healthcare company stores medical images that must be retained for 10 years. Images are accessed frequently in the first 30 days, then only a few times per year after that. Which strategy best balances performance and cost?

  1. Store all images in Standard class in a multi-region bucket for 10 years.
  2. Store images in Standard for 30 days, then transition to Archive using lifecycle rules.
  3. Store images in Archive from day 1 to minimize storage cost.
  4. Store images in Nearline for 10 years, because it is cheaper than Standard and suitable for frequent access.

Answer: 2. Store images in Standard for 30 days, then transition to Archive using lifecycle rules.

Option 2 is best: use Standard while access is frequent, then transition to Archive when access becomes rare. Option 1 is overly expensive for long-term storage. Option 3 would make early frequent access too costly due to retrieval charges. Option 4 misuses Nearline for frequent access; Nearline is intended for roughly once-a-month access.

Quiz 2: Lifecycle Rules and Locations

Check your understanding of lifecycle management and bucket locations.

You manage logs in a bucket located in `us-central1`. Logs are heavily queried for 7 days, then rarely accessed but must be kept for 180 days total. Which lifecycle configuration is most appropriate?

  1. No lifecycle rules; keep all logs in Standard for 180 days.
  2. After 7 days, SetStorageClass to Nearline; after 180 days, Delete objects.
  3. After 7 days, Delete objects; after 180 days, SetStorageClass to Archive.
  4. Immediately SetStorageClass to Archive; after 7 days, Delete objects.

Answer: 2. After 7 days, SetStorageClass to Nearline; after 180 days, Delete objects.

Option 2 matches the requirements: keep logs hot in Standard for 7 days, then move to Nearline for cheaper storage while still allowing occasional access, and delete after 180 days. Option 1 is more expensive. Option 3 deletes too early. Option 4 uses Archive for hot data and deletes too early.

Planning Bucket Configurations for Region, Dual-region, and Multi-region

Region Buckets

Use region buckets when compute is in the same region and latency, cost, or data residency are priorities.

Dual-region Buckets

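Dual-region buckets store data in two specific regions for high availability and resilience to a regional outage, ideal for critical production data. Replication between the two regions is asynchronous by default; optional turbo replication targets a 15-minute RPO.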

Multi-region Buckets

Multi-region buckets spread data across multiple regions in a large area, great for global or continental content delivery.

Reading Scenarios

Look for cues like "regional outage", "global users", or "same region as VMs" to choose between region, dual-region, and multi-region.
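
The scenario cues above can be summarized as a small decision function. This is a minimal sketch of the chapter's guidance, assuming only two cues matter and that global reach takes priority over outage resilience; the function name `choose_location_type` and its parameters are hypothetical.

```python
def choose_location_type(*, global_users: bool,
                         must_survive_regional_outage: bool) -> str:
    """Pick a bucket location type from two common scenario cues."""
    if global_users:
        # "Global users" or continental content delivery.
        return "multi-region"
    if must_survive_regional_outage:
        # "Regional outage" resilience for critical production data.
        return "dual-region"
    # "Same region as VMs", lowest latency and cost, or data residency.
    return "region"


if __name__ == "__main__":
    print(choose_location_type(global_users=False,
                               must_survive_regional_outage=True))
```

Real scenarios often add cost or data residency constraints that can override these defaults.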

Key Term Review: Cloud Storage and Lifecycle

Review these core concepts before you move on.

Cloud Storage
Google Cloud's foundational object storage service for unstructured data, using buckets and objects with high durability and strong consistency.
Bucket
A top-level container in Cloud Storage that holds objects and defines location, default storage class, access control, and lifecycle rules.
Standard storage class
Cloud Storage class optimized for frequently accessed hot data, with low latency and high throughput.
Nearline storage class
Cloud Storage class for data accessed about once a month or less, with lower storage cost but higher access and retrieval costs than Standard.
Coldline storage class
Cloud Storage class for data accessed about once a quarter or less, with very low storage cost and higher access and retrieval costs.
Archive storage class
Cloud Storage class with the lowest storage cost, intended for data accessed less than once a year, with the highest access and retrieval costs.
Regional Persistent Disk
Block storage for Compute Engine and GKE, synchronously replicated across two zones in a region for high availability.
Lifecycle management
A Cloud Storage feature that automatically performs actions such as changing storage class or deleting objects based on age, version, or other conditions.
Region vs Dual-region vs Multi-region
Region: one location; Dual-region: two paired regions; Multi-region: multiple regions in a large geographic area for higher availability and broader access.
Data durability vs availability
Durability is the probability data is not lost; availability is the probability data is accessible when needed. Cloud Storage offers very high durability; availability depends on location and class.

Key Terms

Object
A piece of data stored in Cloud Storage, consisting of the data itself and associated metadata, identified by a unique object name within a bucket.
Durability
The probability that data is not lost over a given period; Cloud Storage is designed for 99.999999999% annual durability.
Availability
The probability that data is accessible when requested, influenced by bucket location and storage class.
Delete action
A lifecycle action that permanently removes an object from a Cloud Storage bucket when its conditions are met.
Region bucket
A bucket whose data is stored in a single Google Cloud region, offering low latency to resources in that region and supporting data residency requirements.
Lifecycle rule
A configuration element consisting of an action and conditions that determine when Cloud Storage automatically transitions or deletes objects.
SetStorageClass
A lifecycle action that changes an object's storage class, for example from Standard to Nearline, Coldline, or Archive.
Dual-region bucket
A bucket whose data is stored redundantly in two specific regions within the same continent, providing high availability and regional outage resilience.
Multi-region bucket
A bucket whose data is stored redundantly across multiple regions in a large geographic area such as us, eu, or asia, improving availability and access for distributed users.
Cloud Storage classes
Storage options within Cloud Storage, including Standard, Nearline, Coldline, and Archive, each optimized for different access patterns and costs.
