Compute Engine
Difficulty: medium
Overview
Google Compute Engine (GCE) provides scalable virtual machines on Google's global infrastructure.
Machine Families:
| Family | Best For |
|---|---|
| E2 | Cost-optimized general purpose |
| N2/N2D | Balanced price-performance (recommended) |
| C2/C3 | Compute-intensive (high per-core perf) |
| M2/M3 | Memory-optimized (in-memory DBs, SAP HANA) |
| A2/A3 | GPU-based (ML training, HPC) |
VM Pricing Models:
- On-demand: Standard rate, no commitment.
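- Sustained Use Discounts (SUDs): Automatic discounts of up to 30% for VMs that run for more than 25% of a billing month. The discount grows with usage, and the maximum depends on the machine family (E2, for example, is not eligible).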
- Committed Use Discounts (CUDs): 1 or 3-year resource commitment for 20–57% savings.
- Spot VMs (the successor to preemptible VMs): Up to 91% cheaper than on-demand; Compute Engine can preempt (stop or delete) them with a 30-second notice when it needs the capacity back. Ideal for fault-tolerant, batch, or stateless workloads.
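To make the trade-offs above concrete, here is a minimal arithmetic sketch that applies the maximum discount quoted for each pricing model to a single VM. The base hourly rate and the 730-hour month are assumptions chosen for illustration, not published prices, and real discounts vary by machine family, region, and usage.

```python
# Illustrative only: the base rate is a made-up number, not a published price.
BASE_HOURLY_USD = 0.10   # hypothetical on-demand rate for one VM
HOURS_PER_MONTH = 730    # common approximation for one month of uptime

# Maximum discount for each model, as quoted in the list above.
MAX_DISCOUNT = {
    "on_demand": 0.00,
    "sustained_use_max": 0.30,
    "committed_use_max": 0.57,
    "spot_max": 0.91,
}


def effective_hourly(base: float, discount: float) -> float:
    """Return the hourly rate after applying a fractional discount."""
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must be in [0, 1)")
    return base * (1.0 - discount)


def monthly_cost(base: float, discount: float, hours: int = HOURS_PER_MONTH) -> float:
    """Return the cost of running one VM for `hours` at the discounted rate."""
    return effective_hourly(base, discount) * hours


if __name__ == "__main__":
    for model, discount in MAX_DISCOUNT.items():
        cost = monthly_cost(BASE_HOURLY_USD, discount)
        print(f"{model:>18}: ${cost:.2f}/month")
```

The comparison shows only the best case for each model; a Spot VM that is preempted often may need extra run time to redo lost work, which narrows the gap.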
Instance Groups:
- Managed Instance Group (MIG): Creates and manages identical VMs from an Instance Template.
  - Supports autoscaling (CPU utilization, HTTP(S) load balancing utilization, Pub/Sub queue depth, custom metrics).
  - Supports autohealing (replaces VMs that fail health checks).
  - Supports rolling updates (zero-downtime deployments).
  - Recommended for all production workloads.
- Unmanaged Instance Group: Manual collection of heterogeneous VMs. No autoscaling or autohealing. Legacy use only.
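The idea behind CPU-based autoscaling in a MIG can be sketched with a simple proportional rule: grow or shrink the group so that average utilization moves toward a target, within minimum and maximum bounds. This is a simplified model for intuition only, not Compute Engine's actual autoscaler algorithm; the function name and parameters are hypothetical.

```python
def recommended_size(
    current_size: int,
    observed_cpu_pct: int,
    target_cpu_pct: int,
    min_size: int,
    max_size: int,
) -> int:
    """Return a group size that would bring average CPU near the target.

    Simplified proportional rule (an assumption, not GCE's real algorithm):
        size = ceil(current_size * observed / target), clamped to [min, max].
    Percentages are integers (e.g. 60 for 60%) to avoid float rounding.
    """
    if target_cpu_pct <= 0:
        raise ValueError("target_cpu_pct must be positive")
    if min_size > max_size:
        raise ValueError("min_size must not exceed max_size")
    total_load = current_size * observed_cpu_pct
    # Integer ceiling division: ceil(total_load / target_cpu_pct).
    wanted = -(-total_load // target_cpu_pct)
    return max(min_size, min(max_size, wanted))


if __name__ == "__main__":
    # Daytime spike: 4 VMs at 90% CPU with a 60% target.
    print(recommended_size(4, 90, 60, min_size=2, max_size=10))
    # Overnight lull: 4 VMs at 15% CPU, held at the configured minimum.
    print(recommended_size(4, 15, 60, min_size=2, max_size=10))
```

The clamp to `min_size` and `max_size` mirrors the bounds an autoscaling policy would carry; the real autoscaler also applies stabilization and cool-down behavior that this sketch leaves out.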
Instance Templates: Immutable blueprint (machine type, boot disk, network, startup script) used by MIGs.
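Metadata Server: Every GCE VM can query http://metadata.google.internal for instance metadata and short-lived service account tokens, so code running on the VM can call Google Cloud APIs without storing service account keys. Requests must include the Metadata-Flavor: Google header.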
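A minimal sketch of reading the metadata server from Python using only the standard library is shown below. The helper names are hypothetical, the code assumes the VM has a default service account attached, and `fetch_access_token` only works when run on a Compute Engine VM. In practice the Google Cloud client libraries perform this lookup automatically.

```python
import json
import urllib.request

METADATA_ROOT = "http://metadata.google.internal/computeMetadata/v1"


def build_metadata_request(path: str) -> urllib.request.Request:
    """Build a request for a metadata path such as 'instance/id'."""
    return urllib.request.Request(
        f"{METADATA_ROOT}/{path.lstrip('/')}",
        # The metadata server rejects requests that lack this header.
        headers={"Metadata-Flavor": "Google"},
    )


def fetch_access_token(timeout: float = 2.0) -> str:
    """Return an OAuth2 access token for the VM's default service account.

    Only works on a Compute Engine VM; elsewhere the request fails
    with an OSError (for example, a DNS or connection error).
    """
    req = build_metadata_request("instance/service-accounts/default/token")
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)["access_token"]


# Example usage on a VM:
#   token = fetch_access_token()
#   headers = {"Authorization": f"Bearer {token}"}
```

Splitting request construction from the network call keeps the URL and header logic checkable on any machine, while the token fetch stays a thin wrapper.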
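Live Migration: During host hardware or software maintenance, GCE transparently moves running VMs to healthy hosts without rebooting them, so workloads keep running. VMs with attached GPUs and Spot VMs cannot live migrate and are stopped instead.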
Practice Linked Questions
Q1. A data pipeline runs nightly batch jobs that can be restarted from a checkpoint if interrupted. Which Compute Engine option provides the lowest cost for this workload?
Q2. A company runs a stateless web application on Compute Engine VMs. Traffic spikes 10x during business hours and drops overnight. The architecture must automatically scale, replace unhealthy VMs, and distribute traffic. Which GCP architecture achieves this?