Chapter 13 of 26

Deploying Serverless Containers and Functions: Cloud Run and Cloud Functions

Ship code quickly with Cloud Run and Cloud Functions, configuring deployments, traffic splitting, triggers, and autoscaling behavior.

Serverless on Google Cloud: Cloud Run vs Cloud Functions

From Servers to Serverless

Cloud Run and Cloud Functions let you run code without managing VMs or clusters. You focus on code or containers; Google Cloud handles provisioning, scaling, and infrastructure.

Cloud Run: Containers

Cloud Run runs container images as services. You deploy a container, get an HTTPS endpoint, and configure concurrency, autoscaling, and traffic splitting between revisions.

Cloud Functions: Functions

Cloud Functions runs single-purpose functions in supported runtimes. Functions can be HTTP-triggered or event-triggered from services like Pub/Sub and Cloud Storage.

Exam-Relevant Skills

For the Associate Cloud Engineer exam, know how to deploy to Cloud Run, configure revisions and traffic, deploy Cloud Functions with triggers, and understand autoscaling behavior.

Core Concepts: Services, Revisions, Triggers, and Identity

Cloud Run Concepts

Cloud Run has services and revisions. A service has a stable URL and config. Each deployment creates a new immutable revision that can receive a percentage of traffic.
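
The service/revision relationship can be pictured with a small in-memory model. This is only a sketch for intuition, not how Cloud Run is implemented: the class names, the revision naming scheme, and the rule that the newest revision takes all traffic are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Revision:
    """Immutable snapshot of image plus configuration."""
    name: str
    image: str
    env: tuple = ()  # (key, value) pairs; a tuple keeps the snapshot immutable


@dataclass
class Service:
    """Stable wrapper that owns the revisions and the traffic map."""
    name: str
    revisions: list = field(default_factory=list)
    traffic: dict = field(default_factory=dict)  # revision name -> percent

    def deploy(self, image, env=()):
        # Every deployment appends a new revision; old ones are never edited.
        rev = Revision(f"{self.name}-{len(self.revisions) + 1:05d}", image, env)
        self.revisions.append(rev)
        # Assumption for this sketch: the newest revision takes all traffic.
        self.traffic = {rev.name: 100}
        return rev


svc = Service("my-service")
v1 = svc.deploy("app:v1")
v2 = svc.deploy("app:v2")
print(svc.traffic)  # {'my-service-00002': 100}
```

Because `Revision` is frozen, trying to assign `v1.image` raises an error, which mirrors the rule that you change traffic rather than edit a revision.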

Cloud Functions Concepts

Each Cloud Function is deployed with an entry point, the function in your source that the runtime calls. Triggers define how it is invoked: HTTP, Pub/Sub, Cloud Storage, and more via Eventarc in 2nd gen.

IAM and Service Accounts

IAM controls who can deploy or invoke. A service account is a special kind of account used by a workload to call APIs and access resources securely.

Two Security Questions

Always ask: 1) Who can invoke this Cloud Run service or function? 2) When it runs as its service account, what resources and APIs is it allowed to access?

Deploying a Container to Cloud Run (Console and gcloud)

Scenario Setup

You have a web app container in Artifact Registry and must deploy it to Cloud Run in a specific region with minimal downtime and a stable HTTPS URL.

Console Deployment Steps

In Cloud Run: Create service, pick region, specify container image, choose public or private access, configure resources and instance limits, then create the service.

gcloud Deployment: Public

Use: `gcloud run deploy my-service --image=REGION-docker.pkg.dev/PROJECT/repo/app:v1 --region=us-central1 --platform=managed --allow-unauthenticated`.

gcloud Deployment: Private

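Make it private by passing `--no-allow-unauthenticated`. Omitting `--allow-unauthenticated` also leaves the service private, although an interactive run may prompt you for the choice. Callers must then authenticate as an identity that holds the invoker role, either directly or through an authenticated proxy.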
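
For scripted deployments, the public and private variants can be assembled from one helper. The sketch below only builds the argument list from the flags quoted above; the helper name `build_deploy_command` is an assumption, and the image path is the same placeholder used in the text.

```python
def build_deploy_command(service, image, region, public=False):
    """Return the gcloud argument list for deploying a Cloud Run service."""
    cmd = [
        "gcloud", "run", "deploy", service,
        f"--image={image}",
        f"--region={region}",
        "--platform=managed",
    ]
    # State the access choice explicitly so it is never left to a prompt.
    cmd.append("--allow-unauthenticated" if public else "--no-allow-unauthenticated")
    return cmd


private_cmd = build_deploy_command(
    "my-service",
    "REGION-docker.pkg.dev/PROJECT/repo/app:v1",  # placeholder path from the text
    "us-central1",
)
print(" ".join(private_cmd))
```

On a machine with gcloud installed and authenticated, the list could be passed to `subprocess.run(private_cmd, check=True)`.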

Cloud Run Revisions and Traffic Splitting

What Is a Revision?

A Cloud Run revision is an immutable snapshot of code plus configuration. Every deployment that changes config or image creates a new revision for the same service.

Splitting Traffic

A service can send percentages of traffic to different revisions, for example 90% to stable v1 and 10% to canary v2, adjusted via Console or gcloud.

gcloud Traffic Command

Example: `gcloud run services update-traffic my-service --region=us-central1 --to-revisions old-rev=90,new-rev=10` sets a 90/10 split between two revisions.

Rollback Strategy

To roll back, move 100% traffic back to the old revision. You do not edit revisions; you only change which revision receives traffic.
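
A canary and its rollback differ only in the percentages handed to `--to-revisions`. The sketch below formats that value and builds both commands. The helper names are hypothetical, and requiring the split to total exactly 100 is a choice made for this example so the result is unambiguous, not a documented gcloud rule.

```python
def to_revisions_arg(split):
    """Format {"revision": percent} as the value for --to-revisions."""
    if any(pct < 0 for pct in split.values()):
        raise ValueError("percentages must be non-negative")
    if sum(split.values()) != 100:
        raise ValueError("percentages must add up to 100")
    return ",".join(f"{rev}={pct}" for rev, pct in split.items())


def update_traffic_command(service, region, split):
    """Return the gcloud argument list that applies the given split."""
    return [
        "gcloud", "run", "services", "update-traffic", service,
        f"--region={region}",
        "--to-revisions", to_revisions_arg(split),
    ]


canary = update_traffic_command("my-service", "us-central1", {"old-rev": 90, "new-rev": 10})
rollback = update_traffic_command("my-service", "us-central1", {"old-rev": 100})
print(canary[-1])    # old-rev=90,new-rev=10
print(rollback[-1])  # old-rev=100
```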

Cloud Run Autoscaling and Concurrency

What Is Concurrency?

Concurrency is how many requests a single Cloud Run container instance can handle at once. Default is 80; you can lower it for isolation or raise it for efficiency.

Scaling Up and Down

Cloud Run adds instances as existing ones approach their concurrency limit (CPU utilization is also a scaling signal) and removes idle instances, scaling to zero when min-instances is 0.

Tuning Autoscaling

Use `--concurrency`, `--min-instances`, and `--max-instances` in gcloud deploy to balance latency, cold starts, and backend protection.

Common Exam Pitfalls

Remember: Cloud Run can scale to zero, concurrency is per instance, and min-instances > 0 reduces cold starts but increases baseline cost.
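
A back-of-the-envelope model makes these settings easier to reason about. The function below is a deliberate simplification, not the real autoscaler, which also weighs other signals such as CPU utilization. The function name and the `max_instances=100` default are assumptions for the sketch.

```python
import math


def estimated_instances(in_flight, concurrency=80, min_instances=0, max_instances=100):
    """Rough instance count for a given number of simultaneous requests."""
    # Concurrency is per instance, so divide and round up.
    needed = math.ceil(in_flight / concurrency)
    # Clamp between the warm baseline and the configured ceiling.
    return max(min_instances, min(needed, max_instances))


print(estimated_instances(0))                   # 0   (scaled to zero)
print(estimated_instances(0, min_instances=2))  # 2   (warm baseline, no cold start)
print(estimated_instances(400))                 # 5   (400 / 80)
print(estimated_instances(400, concurrency=1))  # 100 (capped by max_instances)
```

The last line shows why lowering concurrency to 1 multiplies the instance count: isolation is bought with more instances, and `max_instances` becomes the limit that protects backends.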

Deploying Cloud Functions (2nd gen) with HTTP and Background Triggers

HTTP Cloud Function

Use HTTP-triggered Cloud Functions when you want a simple HTTPS endpoint. Deploy with `--trigger-http` and optionally `--allow-unauthenticated`.
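
A minimal sketch of what such a function might look like in Python. In the Python runtime the handler receives a Flask-style request object; this example touches only its `args` mapping, so it can be exercised locally with a stand-in. The name `hello_http` is hypothetical and would be the entry point named at deploy time.

```python
from types import SimpleNamespace


def hello_http(request):
    """HTTP entry point: reads a query parameter and returns (body, status)."""
    name = request.args.get("name", "World")
    return f"Hello, {name}!", 200


# Local check with a stand-in request; no Cloud Functions runtime is needed.
fake_request = SimpleNamespace(args={"name": "Ada"})
print(hello_http(fake_request))  # ('Hello, Ada!', 200)
```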

Pub/Sub-Triggered Function

For async processing, deploy a 2nd gen function with `--trigger-topic=my-topic`. Pub/Sub messages then invoke your function automatically.
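
Inside the function, the message body arrives base64-encoded within the event payload. The sketch below shows the decoding step on a hand-built sample. The payload layout (`message` then `data`) follows the usual shape of a Pub/Sub event, but treat it as an assumption and check it against a real event; the helper name and the sample fields are made up for illustration.

```python
import base64
import json


def decode_pubsub_payload(event_data):
    """Extract and decode the base64 message body from a Pub/Sub event payload."""
    encoded = event_data["message"]["data"]
    return base64.b64decode(encoded).decode("utf-8")


# Hand-built sample shaped like a Pub/Sub event payload (assumed layout).
sample = {
    "message": {
        "data": base64.b64encode(json.dumps({"order_id": 42}).encode()).decode(),
        "attributes": {"source": "checkout"},
    }
}
order = json.loads(decode_pubsub_payload(sample))
print(order["order_id"])  # 42
```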

Cloud Storage-Triggered Function

To react to file uploads, use 2nd gen event filters like `type=google.cloud.storage.object.v1.finalized` and `bucket=my-bucket` in the deploy command.
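
As a sketch, the full deploy command with both filters might be assembled like this. The flag names (`--gen2`, `--runtime`, `--entry-point`, `--trigger-event-filters`) reflect the gcloud CLI as best understood here, so confirm them with `gcloud functions deploy --help`; the function name, runtime, and entry point are hypothetical values.

```python
def build_storage_function_command(name, region, runtime, entry_point, bucket):
    """Return the gcloud argument list for a 2nd gen function fired on object finalize."""
    return [
        "gcloud", "functions", "deploy", name,
        "--gen2",
        f"--region={region}",
        f"--runtime={runtime}",
        f"--entry-point={entry_point}",
        # One flag per filter: the event type first, then the bucket to watch.
        "--trigger-event-filters=type=google.cloud.storage.object.v1.finalized",
        f"--trigger-event-filters=bucket={bucket}",
    ]


cmd = build_storage_function_command(
    "make-thumbnail", "us-central1", "python312", "on_finalize", "my-bucket"
)
print(" ".join(cmd))
```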

Eventarc Under the Hood

In 2nd gen, event triggers are built on Eventarc. For the exam, remember that Cloud Storage and other events reach Cloud Functions or Cloud Run through Eventarc-based triggers.

Event-Driven Architectures with Pub/Sub, Cloud Storage, and Eventarc

Pub/Sub Patterns

Publish messages to a Pub/Sub topic and process them asynchronously with Cloud Functions or Cloud Run, decoupling producers and consumers.

Cloud Storage Events

Use object create/update/delete events to trigger functions or Cloud Run services for tasks like image processing or metadata extraction.

Role of Eventarc

Eventarc connects event sources like Cloud Storage or Audit Logs to Cloud Run and 2nd gen Cloud Functions, with powerful filtering options.
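
Filtering can be thought of as matching event attributes against a set of required values. The sketch below models exact-match filtering only; it is a conceptual illustration, not Eventarc's implementation, and the function name and sample object names are assumptions.

```python
def matches(filters, event_attributes):
    """True when every filter key is present with exactly the expected value."""
    return all(event_attributes.get(key) == value for key, value in filters.items())


filters = {
    "type": "google.cloud.storage.object.v1.finalized",
    "bucket": "my-bucket",
}

upload = {
    "type": "google.cloud.storage.object.v1.finalized",
    "bucket": "my-bucket",
    "name": "cat.png",
}
other_bucket = dict(upload, bucket="another-bucket")

print(matches(filters, upload))        # True
print(matches(filters, other_bucket))  # False
```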

Choosing the Right Service

Pick Cloud Functions for simple, code-only handlers; choose Cloud Run for containerized apps, custom runtimes, or advanced resource control.

Security and Identity for Cloud Run and Cloud Functions

Invoker Permissions

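Control who can call Cloud Run services or HTTP Cloud Functions with the invoker roles `roles/run.invoker` and `roles/cloudfunctions.invoker`, granted to specific identities or to `allUsers` for public access. Because 2nd gen functions run on Cloud Run, their invocation is governed by `roles/run.invoker` on the underlying service.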

Runtime Permissions

The code runs as a service account. Grant that account only the roles it needs, such as Pub/Sub Publisher (`roles/pubsub.publisher`) or Storage Object Admin (`roles/storage.objectAdmin`), scoped to specific resources rather than the whole project.

Avoid Over-Privileged Defaults

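Default service accounts may have broad permissions; the Compute Engine default service account, for example, can hold the project-wide Editor role. Prefer a dedicated service account per service or function, with least-privilege IAM roles.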

Think in Two Layers

For each service or function, ask: who is allowed to invoke it, and what is it allowed to do when it runs as its service account?
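
The two layers can be kept apart in a toy model: one table for who may invoke, another for what the runtime identity may touch. Everything here (the tables, the account name, the resource strings) is hypothetical and stands in for real IAM policy, which is evaluated by Google Cloud, not by your code.

```python
# Layer 1: identities allowed to invoke each service.
INVOKERS = {
    "my-service": {"serviceAccount:frontend@example.iam.gserviceaccount.com"},
    "public-site": {"allUsers"},
}

# Layer 2: (role, resource) pairs granted to each service's runtime service account.
RUNTIME_GRANTS = {
    "my-service": {("roles/pubsub.publisher", "topics/orders")},
}


def can_invoke(service, caller):
    """Layer 1: is this caller allowed to send requests to the service?"""
    allowed = INVOKERS.get(service, set())
    return "allUsers" in allowed or caller in allowed


def can_access(service, role, resource):
    """Layer 2: does the service's runtime identity hold this role on this resource?"""
    return (role, resource) in RUNTIME_GRANTS.get(service, set())


print(can_invoke("my-service", "user:stranger@example.com"))              # False
print(can_invoke("public-site", "user:stranger@example.com"))             # True
print(can_access("my-service", "roles/pubsub.publisher", "topics/orders"))  # True
print(can_access("my-service", "roles/storage.objectAdmin", "buckets/x"))   # False
```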

Design Exercise: Choose Cloud Run or Cloud Functions

Work through these scenarios mentally. For each, decide: Cloud Run or Cloud Functions (2nd gen), and why. Then compare your reasoning to the guidance.

  1. Image thumbnail generation
  • Scenario: Whenever a user uploads an image to a Cloud Storage bucket, you want to generate a thumbnail and store it in another bucket. The code is a small Python script.
  • Recommended: Cloud Functions 2nd gen with a Cloud Storage finalize event trigger. It is lightweight, event-driven, and you do not need a custom runtime.
  2. Existing containerized API
  • Scenario: Your team already has a containerized Node.js REST API that runs on-prem. You want to move it to Google Cloud with minimal changes and expose it over HTTPS.
  • Recommended: Cloud Run. You already have a container; Cloud Run gives you an HTTPS endpoint, autoscaling, and traffic splitting without rewriting the app as functions.
  3. Custom binary processing
  • Scenario: A workload needs a custom OS package and binary tool not available in standard function runtimes. It is triggered by Pub/Sub messages.
  • Recommended: Cloud Run with an Eventarc Pub/Sub trigger. You can build a custom container image with the tools you need.
  4. Webhook receiver
  • Scenario: You need to receive occasional webhooks from a third-party SaaS product. Traffic is low and bursty.
  • Recommended: Either works, but HTTP Cloud Function 2nd gen is usually simpler. If you need more control over runtime or networking, choose Cloud Run.

As you practice, tie each choice back to: packaging model (function vs container), trigger type, runtime requirements, and configuration control (CPU, memory, networking).
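
The guidance in the four scenarios can be condensed into a small decision helper. This is a study aid under simplifying assumptions, not an official selection rule; the function and parameter names are made up for the example.

```python
def choose_service(has_container=False, needs_custom_runtime=False,
                   needs_advanced_control=False):
    """Suggest a platform from packaging, runtime, and control requirements."""
    if has_container or needs_custom_runtime or needs_advanced_control:
        return "Cloud Run"
    # Small, code-only handlers in a supported runtime fit functions best.
    return "Cloud Functions (2nd gen)"


print(choose_service())                             # thumbnail script, plain webhook
print(choose_service(has_container=True))           # existing containerized API
print(choose_service(needs_custom_runtime=True))    # custom binary processing
print(choose_service(needs_advanced_control=True))  # webhook needing networking control
```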

Quiz: Cloud Run Deployments and Traffic

Test your understanding of Cloud Run services, revisions, and traffic splitting.

You deployed a new version of a Cloud Run service, but users are still hitting the old version. You want to send 20% of traffic to the new revision and keep 80% on the old one. Which action is most appropriate?

  1. Update the container image in the existing revision and restart the service.
  2. Use `gcloud run services update-traffic` to split traffic 80/20 between the two revisions.
  3. Increase the max instances so the new revision can scale up.
  4. Create a new Cloud Run service with the new image and configure a separate HTTP(S) Load Balancer.

Answer: 2. Use `gcloud run services update-traffic` to split traffic 80/20 between the two revisions.

Cloud Run creates a new immutable revision for each deployment. To control which revision receives requests, you adjust traffic at the service level. The correct approach is to use `gcloud run services update-traffic` (or the Console) to send 80% of traffic to the old revision and 20% to the new one. You cannot edit an existing revision in place, and you do not need a separate HTTP(S) Load Balancer for basic Cloud Run traffic management.

Quiz: Autoscaling and Triggers

Check your understanding of autoscaling behavior and event triggers.

A team complains that their Cloud Run service experiences high latency on the first request after periods of no traffic. They want to reduce this "cold start" impact. Which configuration change is most appropriate?

  1. Decrease concurrency from 80 to 1.
  2. Increase min-instances from 0 to a small positive number.
  3. Disable autoscaling for the service.
  4. Set max-instances to a very high value.

Answer: 2. Increase min-instances from 0 to a small positive number.

Cold starts happen when Cloud Run must start a new container instance to serve a request, most visibly after the service has scaled to zero. Increasing `min-instances` keeps a baseline of warm instances, which reduces cold start latency at the price of a higher baseline cost. Decreasing concurrency does not fix cold starts (it tends to require more instances, not fewer), disabling autoscaling is not an option for fully managed Cloud Run, and increasing `max-instances` only affects scaling under load, not idle behavior.

Key Term Review: Cloud Run and Cloud Functions

Use these flashcards to reinforce core concepts and definitions.

Cloud Run service
A regional, fully managed serverless resource that runs a container image and exposes a stable HTTPS endpoint. It consists of one or more immutable revisions, and you can configure autoscaling, concurrency, and traffic splitting at the service level.
Cloud Run revision
An immutable snapshot of a Cloud Run service's container image and configuration (environment variables, resources, concurrency). Each deployment that changes code or config creates a new revision, and traffic can be routed across revisions.
Cloud Functions trigger
The mechanism that invokes a Cloud Function, such as an HTTP(S) request, a Pub/Sub message, a Cloud Storage object event, or other Eventarc-supported events. In 2nd gen, triggers are defined via event types and filters.
Concurrency (Cloud Run)
The number of simultaneous requests that a single Cloud Run container instance can handle. Default is 80; you can adjust it to trade off between isolation and resource efficiency.
Identity and Access Management (IAM)
Identity and Access Management (IAM) lets you manage access control by defining who (identity) has what access (role) for which resource.
service account
A service account is a special kind of account used by an application or compute workload, not a person, to make authorized API calls and access Google Cloud resources.
Eventarc
A Google Cloud service that routes events from many sources (such as Cloud Storage, Pub/Sub, and Audit Logs) to Cloud Run services and Cloud Functions 2nd gen, with support for filtering by event type and attributes.
Cloud Functions 2nd gen
The newer generation of Cloud Functions that runs on top of Cloud Run and Eventarc, providing better concurrency, configuration options, and integration with event sources compared to 1st gen.
Invoker IAM role (Cloud Run)
`roles/run.invoker`, which controls who is allowed to send requests to a Cloud Run service. Grant it to identities like users, groups, or service accounts to allow invocation.
Invoker IAM role (Cloud Functions)
`roles/cloudfunctions.invoker`, which controls who can invoke an HTTP Cloud Function or trigger its execution.

Key Terms

Pub/Sub
Google Cloud's asynchronous messaging service that decouples senders and receivers using topics and subscriptions, often used with Cloud Functions and Cloud Run in event-driven architectures.
Eventarc
A Google Cloud event routing service that delivers events from various sources to Cloud Run and Cloud Functions 2nd gen, with flexible event type and attribute filtering.
Cloud Run
A fully managed, regional serverless platform that runs containerized applications, automatically handling provisioning, autoscaling, and traffic management, and exposing services via HTTPS endpoints.
run.invoker
An IAM role (roles/run.invoker) that grants permission to invoke a Cloud Run service.
Cloud Functions
A serverless compute service for running single-purpose functions in supported runtimes, triggered by HTTP(S) requests or events from services such as Pub/Sub and Cloud Storage.
Cloud Storage event
A notification generated when an object in a Cloud Storage bucket is created, updated, or deleted, which can trigger Cloud Functions or Cloud Run services via Eventarc.
Revision (Cloud Run)
An immutable version of a Cloud Run service that captures a specific container image and configuration; new revisions are created on deployment when configuration or code changes.
cloudfunctions.invoker
An IAM role (roles/cloudfunctions.invoker) that grants permission to invoke a Cloud Function.
Cloud Functions 2nd gen
The current generation of Cloud Functions that runs on Cloud Run and Eventarc, offering improved performance, concurrency, and configuration options for both HTTP and event-driven functions.
Concurrency (Cloud Run)
The maximum number of simultaneous requests a single Cloud Run container instance can handle; defaults to 80 and can be configured per service or revision.
Traffic splitting (Cloud Run)
A mechanism to distribute incoming requests across multiple revisions of a Cloud Run service using percentage-based allocations, enabling canary and gradual rollouts.
