Compute Engine provides scalable virtual machines on Google’s infrastructure. Use it for custom software stacks, legacy applications, batch processing, and workloads requiring full OS control. With per-second billing, sustained use discounts, and Google’s global network, Compute Engine balances flexibility with cost efficiency.

Machine Types

Family     Optimized For                    Examples                   Best For
E2         Cost-effective general purpose   e2-medium, e2-standard-4   Web servers, dev/test
N2         Balanced performance             n2-standard-4              Production apps
N2D        AMD-based, cost-efficient        n2d-standard-4             Cost-sensitive production
C2/C2D     Compute-intensive                c2-standard-8              Scientific computing
M1/M2/M3   Memory-intensive                 m2-megamem-416             In-memory databases
A2/A3      GPU / ML workloads               a2-highgpu-1g              Training, inference
T2D        Scale-out workloads              t2d-standard-4             Stateless microservices

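Compute Engine bills per second, with a one-minute minimum. Sustained use discounts apply automatically on eligible machine families once a VM has run for more than 25% of the month, and grow to as much as 30% for a VM that runs the whole month. The maximum varies by family, and some families (E2, for example) are not eligible.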
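
To make the discount concrete, the arithmetic can be sketched in shell. This is an illustration only: it assumes the four-tier schedule Google has published for N1 machine types (each successive quarter of the month billed at 100%, 80%, 60%, and 40% of the base rate), and the function name sud_discount_pct is hypothetical. Other eligible families use different tiers and lower maximums.

```shell
#!/bin/sh
# Effective sustained use discount under an ASSUMED N1-style tier table.
# $1: whole-number percentage of the month the VM ran (0-100).
sud_discount_pct() {
  usage=$1
  if [ "$usage" -le 0 ]; then
    echo "0.0"
  else
    billed=0
    lower=0
    # Each 25% slice of the month is billed at a lower rate than the one before.
    for rate in 100 80 60 40; do
      upper=$((lower + 25))
      if [ "$usage" -gt "$lower" ]; then
        top=$usage
        if [ "$top" -gt "$upper" ]; then top=$upper; fi
        billed=$((billed + (top - lower) * rate))
      fi
      lower=$upper
    done
    # billed / usage is the effective rate in percent; the discount is the rest.
    awk -v b="$billed" -v u="$usage" 'BEGIN { printf "%.1f\n", 100 - b / u }'
  fi
}
```

Under these assumptions, sud_discount_pct 50 prints 10.0 and sud_discount_pct 100 prints 30.0, which lines up with the "up to 30%" figure.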

Create a VM

gcloud compute instances create web-server-01 \
  --zone=us-central1-a \
  --machine-type=e2-medium \
  --image-family=ubuntu-2204-lts \
  --image-project=ubuntu-os-cloud \
  --boot-disk-size=20GB \
  --boot-disk-type=pd-balanced \
  --tags=http-server \
  --metadata=enable-oslogin=TRUE

# Allow HTTP and HTTPS traffic
gcloud compute firewall-rules create allow-http \
  --allow tcp:80,tcp:443 \
  --target-tags=http-server \
  --source-ranges=0.0.0.0/0

# SSH with OS Login (no SSH keys stored in metadata)
gcloud compute ssh web-server-01 --zone=us-central1-a
  

Disks

Type                        Use Case                           IOPS           Cost
Standard PD (pd-standard)   Boot disks, dev/test               Moderate       Lowest
Balanced PD (pd-balanced)   General production                 Good balance   Medium
SSD PD (pd-ssd)             Databases, high IOPS               High           Higher
Extreme PD (pd-extreme)     Top-tier databases                 Configurable   Highest
Local SSD                   Temporary, high-throughput cache   Very high      Ephemeral

Attach a persistent disk:

gcloud compute disks create data-disk-01 \
  --size=100GB \
  --type=pd-balanced \
  --zone=us-central1-a

gcloud compute instances attach-disk web-server-01 \
  --disk=data-disk-01 \
  --zone=us-central1-a \
  --device-name=data-disk
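
A newly attached disk still has to be formatted and mounted inside the VM. The sketch below assumes an ext4 filesystem and a hypothetical mount point, /mnt/disks/data. It relies on the Linux guest environment exposing the disk at /dev/disk/by-id/google-<device-name>, which for the --device-name used above is google-data-disk. The helper names are illustrative.

```shell
#!/bin/sh
# Stable device path for a disk attached with --device-name=$1.
disk_by_id() {
  echo "/dev/disk/by-id/google-$1"
}

# One /etc/fstab line for the disk ($1 = device name, $2 = mount point).
# nofail keeps the VM bootable if the disk is later detached.
fstab_entry() {
  printf '%s %s ext4 discard,defaults,nofail 0 2\n' "$(disk_by_id "$1")" "$2"
}

# Inside the VM (requires root; mkfs.ext4 destroys any data already on the disk):
#   sudo mkfs.ext4 -m 0 "$(disk_by_id data-disk)"
#   sudo mkdir -p /mnt/disks/data
#   sudo mount -o discard,defaults "$(disk_by_id data-disk)" /mnt/disks/data
#   fstab_entry data-disk /mnt/disks/data | sudo tee -a /etc/fstab

# Print the fstab line only; nothing on the system is modified here.
fstab_entry data-disk /mnt/disks/data
```

Referring to the disk by its by-id path rather than /dev/sdb avoids surprises when device letters change after a reboot or when more disks are attached.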
  

Snapshots and Schedules

# Manual snapshot
gcloud compute disks snapshot data-disk-01 \
  --snapshot-names=data-disk-01-$(date +%Y%m%d) \
  --zone=us-central1-a

# Snapshot schedule (daily at 02:00 UTC, retain 7 days)
gcloud compute resource-policies create snapshot-schedule daily-backup \
  --max-retention-days=7 \
  --daily-schedule \
  --start-time=02:00 \
  --region=us-central1

gcloud compute disks add-resource-policies data-disk-01 \
  --resource-policies=daily-backup \
  --zone=us-central1-a
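
The schedule's retention setting prunes only the snapshots the schedule itself creates, so manual snapshots with a date suffix accumulate until someone deletes them. One possible helper for spotting old ones is sketched below. It assumes the data-disk-01-YYYYMMDD naming from the manual snapshot command, and expired_snapshots is a hypothetical name.

```shell
#!/bin/sh
# Reads snapshot names on stdin and prints those whose YYYYMMDD suffix
# is earlier than the cutoff date given as $1 (also YYYYMMDD).
expired_snapshots() {
  cutoff=$1
  while IFS= read -r name; do
    stamp=${name##*-}   # text after the last hyphen
    case $stamp in
      [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]) ;;
      *) continue ;;    # skip names without a date suffix
    esac
    if [ "$stamp" -lt "$cutoff" ]; then
      echo "$name"
    fi
  done
}

# Possible usage (GNU date syntax; review the list before deleting anything):
#   gcloud compute snapshots list --filter="name~^data-disk-01-" \
#     --format="value(name)" | expired_snapshots "$(date -d '7 days ago' +%Y%m%d)"
```

The function only lists candidates; deletion is left as a separate, deliberate step.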
  

Managed Instance Groups (MIGs)

MIGs provide auto-healing and auto-scaling for identical VMs:

gcloud compute instance-templates create web-template \
  --machine-type=e2-medium \
  --image-family=ubuntu-2204-lts \
  --image-project=ubuntu-os-cloud \
  --tags=http-server \
  --metadata=startup-script='#!/bin/bash
    apt-get update && apt-get install -y nginx
    systemctl start nginx'

gcloud compute instance-groups managed create web-mig \
  --base-instance-name=web \
  --template=web-template \
  --size=3 \
  --zone=us-central1-a

gcloud compute instance-groups managed set-autoscaling web-mig \
  --max-num-replicas=10 \
  --min-num-replicas=2 \
  --target-cpu-utilization=0.7 \
  --zone=us-central1-a

# Rolling update with zero downtime (web-template-v2 must already exist)
gcloud compute instance-groups managed rolling-action start-update web-mig \
  --version=template=web-template-v2 \
  --max-surge=3 --max-unavailable=0 \
  --zone=us-central1-a
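
To build intuition for what --target-cpu-utilization=0.7 means, the sizing decision can be approximated as a proportional calculation. This is a deliberately simplified model, not the autoscaler's actual algorithm, which also accounts for stabilization and instance initialization. The function name recommended_replicas is hypothetical.

```shell
#!/bin/sh
# Simplified model: how many replicas bring average CPU down to the target.
# Args: current_replicas current_utilization target_utilization min max
recommended_replicas() {
  awk -v c="$1" -v u="$2" -v t="$3" -v lo="$4" -v hi="$5" 'BEGIN {
    need = c * u / t                 # total load expressed in target-sized VMs
    n = int(need)
    if (n < need - 1e-9) n++         # round up; epsilon guards float noise
    if (n < lo) n = lo               # clamp to --min-num-replicas
    if (n > hi) n = hi               # clamp to --max-num-replicas
    print n
  }'
}

# Three VMs averaging 90% CPU against a 70% target, with limits 2..10:
recommended_replicas 3 0.9 0.7 2 10
```

With the limits configured above, three instances at 90% CPU would grow to four, while a nearly idle group would shrink no further than the minimum of two.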
  

Spot and Preemptible VMs

Type                   Discount    Interruption Notice   Use Case
Spot VM                Up to 91%   30-second notice      Batch, rendering, CI
Preemptible (legacy)   Up to 80%   30-second notice      Being replaced by Spot

gcloud compute instances create batch-worker \
  --zone=us-central1-a \
  --machine-type=n2-standard-4 \
  --provisioning-model=SPOT \
  --instance-termination-action=DELETE \
  --image-family=ubuntu-2204-lts \
  --image-project=ubuntu-os-cloud
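
With only about 30 seconds of notice, a batch worker has to save its progress quickly and be able to resume on a replacement VM. The following is a minimal sketch of that pattern, assuming the worker process receives SIGTERM when the guest OS begins shutting down. The checkpoint path and the function names are hypothetical, and the actual work is left as a placeholder.

```shell
#!/bin/sh
# Hypothetical checkpoint location; use durable storage for real workloads.
CHECKPOINT_FILE="${CHECKPOINT_FILE:-/tmp/batch-worker.checkpoint}"
stop_requested=0

# On SIGTERM, finish the current item and stop picking up new ones.
trap 'stop_requested=1' TERM

# Process items $1 (inclusive) to $2 (exclusive), checkpointing after each one.
process_items() {
  i=$1
  while [ "$i" -lt "$2" ] && [ "$stop_requested" -eq 0 ]; do
    # Real work for item $i would run here.
    i=$((i + 1))
    echo "$i" > "$CHECKPOINT_FILE"
  done
}

# Index to resume from: the last checkpoint if present, otherwise 0.
resume_index() {
  if [ -s "$CHECKPOINT_FILE" ]; then
    cat "$CHECKPOINT_FILE"
  else
    echo 0
  fi
}

# Typical entry point on a worker:
#   process_items "$(resume_index)" 1000
```

Because the instance above is created with --instance-termination-action=DELETE, its boot disk goes away with the VM, so a real checkpoint would need to live somewhere that outlasts the instance.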
  

Real-World Scenario: Web Tier Behind Load Balancer

  Internet → External HTTP(S) LB → Instance Group (MIG)
                                      ├── web-abc1 (zone a)
                                      ├── web-abc2 (zone b)
                                      └── web-abc3 (zone c)
  
  1. Create instance template with startup script
  2. Create regional MIG across three zones
  3. Configure health check on /health
  4. Attach MIG to backend service of HTTP(S) load balancer
  5. Enable Cloud CDN for static assets
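
The steps above can be sketched as a gcloud sequence. Treat this as an outline under stated assumptions rather than a finished deployment: it reuses web-template from the MIG section for step 1, and every other resource name (web-mig-regional, web-health-check, web-backend, web-map, web-proxy, web-rule) is hypothetical. The run wrapper only prints each command unless DRY_RUN=0 is set, so the plan can be reviewed before anything is created.

```shell
#!/bin/sh
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode; execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

deploy_web_tier() {
  region=us-central1
  # Step 2: regional MIG spread over three zones, plus a named port for the LB.
  run gcloud compute instance-groups managed create web-mig-regional \
    --base-instance-name=web --template=web-template --size=3 \
    --region="$region" \
    --zones=us-central1-a,us-central1-b,us-central1-c
  run gcloud compute instance-groups managed set-named-ports web-mig-regional \
    --named-ports=http:80 --region="$region"
  # Step 3: health check against /health.
  run gcloud compute health-checks create http web-health-check \
    --port=80 --request-path=/health --global
  # Step 4: backend service, then URL map, proxy, and forwarding rule.
  run gcloud compute backend-services create web-backend \
    --protocol=HTTP --port-name=http --health-checks=web-health-check --global
  run gcloud compute backend-services add-backend web-backend \
    --instance-group=web-mig-regional --instance-group-region="$region" --global
  run gcloud compute url-maps create web-map --default-service=web-backend
  run gcloud compute target-http-proxies create web-proxy --url-map=web-map
  run gcloud compute forwarding-rules create web-rule \
    --global --target-http-proxy=web-proxy --ports=80
  # Step 5: Cloud CDN on the backend service.
  run gcloud compute backend-services update web-backend --enable-cdn --global
}

deploy_web_tier
```

This outline serves plain HTTP on port 80; an HTTPS front end would additionally need a certificate and a target HTTPS proxy.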

Common Mistakes

Mistake                     Impact                      Fix
Single-zone deployment      Zone outage = downtime      Regional MIG across zones
pd-standard for databases   Poor IOPS, latency spikes   Use pd-ssd or pd-balanced
No snapshot schedule        Data loss on disk failure   Automated snapshot policies
External IP on every VM     Attack surface, cost        Use Cloud NAT for outbound only
Ignoring sustained use      Missing automatic savings   Run steady workloads on same VM type

Best Practices

  • Use OS Config for patch management and compliance scanning
  • Enable Cloud Monitoring Ops Agent for metrics and logs
  • Apply firewall rules with least-privilege source ranges
  • Use Spot VMs for fault-tolerant batch workloads (up to 91% cheaper)
  • Snapshot disks regularly with snapshot schedules
  • Prefer custom images for repeatable deployments (Packer + gcloud compute images create)
  • Use OS Login instead of project-wide SSH keys
  • Right-size using Recommender's idle VM and machine type recommendations

Troubleshooting

VM won’t start — quota exceeded:

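gcloud compute project-info describe --project=learning-gcp-dev
# Regional quotas (CPUs, disks) are listed per region:
gcloud compute regions describe us-central1 --project=learning-gcp-dev
# Request quota increase in Console → IAM & Admin → Quotas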
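
The describe output is long, so it can help to filter it down to the quotas that are close to their limit. A small sketch, assuming rows of "metric usage limit" on stdin; the function name quota_hotspots and the 80% default threshold are illustrative choices.

```shell
#!/bin/sh
# Reads "METRIC USAGE LIMIT" rows on stdin and prints the metrics whose
# usage is at or above the threshold percentage ($1, default 80).
quota_hotspots() {
  awk -v t="${1:-80}" '$3 > 0 && ($2 / $3) * 100 >= t {
    printf "%s %.0f%%\n", $1, ($2 / $3) * 100
  }'
}

# Possible usage against a region:
#   gcloud compute regions describe us-central1 --flatten="quotas[]" \
#     --format="value(quotas.metric,quotas.usage,quotas.limit)" | quota_hotspots 80

# Sample rows (made-up numbers) to show the output shape:
printf 'CPUS 20 24\nDISKS_TOTAL_GB 100 4096\n' | quota_hotspots 80
```

With the sample rows, only CPUS is reported, since 20 of 24 is above the 80% threshold.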
  

SSH connection refused:

gcloud compute instances describe web-server-01 --zone=us-central1-a \
  --format="get(networkInterfaces[0].accessConfigs[0].natIP)"
# Check firewall allows tcp:22 from your IP
gcloud compute firewall-rules list --filter="name~allow-ssh"
  

MIG instances unhealthy:

gcloud compute instance-groups managed list-instances web-mig --zone=us-central1-a
gcloud compute health-checks describe http-health-check
# Review startup script logs: sudo journalctl -u google-startup-scripts
  

Disk full:

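gcloud compute disks resize data-disk-01 --size=200GB --zone=us-central1-a
# Then extend the filesystem inside the VM (confirm the device with lsblk):
#   unpartitioned ext4 disk:  sudo resize2fs /dev/sdb
#   partitioned disk:         sudo growpart /dev/sdb 1 && sudo resize2fs /dev/sdb1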
  

Compute Engine offers VM flexibility with Google’s network performance and per-second billing economics.

Next: Cloud Storage — object storage and lifecycle management.