Compute Engine
Compute Engine provides scalable virtual machines on Google’s infrastructure. Use it for custom software stacks, legacy applications, batch processing, and workloads requiring full OS control. With per-second billing, sustained use discounts, and Google’s global network, Compute Engine balances flexibility with cost efficiency.
Machine Types
| Family | Optimized For | Examples | Best For |
|---|---|---|---|
| E2 | Cost-effective general purpose | e2-medium, e2-standard-4 | Web servers, dev/test |
| N2 | Balanced performance | n2-standard-4 | Production apps |
| N2D | AMD-based, cost-efficient | n2d-standard-4 | Cost-sensitive production |
| C2/C2D | Compute-intensive | c2-standard-8 | Scientific computing |
| M1/M2/M3 | Memory-intensive | m2-megamem-416 | In-memory databases |
| A2/A3 | GPU / ML workloads | a2-highgpu-1g | Training, inference |
| T2D | Scale-out workloads | t2d-standard-4 | Stateless microservices |
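Compute Engine bills per second (with a one-minute minimum) and applies sustained use discounts automatically. Discounts begin once a VM has run for more than 25% of the month and reach up to 30% when it runs the whole month. Eligibility and the maximum discount depend on the machine family; E2, for example, is not eligible.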
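To see where the 30% figure comes from, here is a minimal sketch of the arithmetic. It assumes the four-tier schedule published for N1 machine types, where each successive quarter of the month is billed at 100%, 80%, 60%, and 40% of the base rate. Other families use different tiers, so treat the numbers as illustrative and check the current pricing page. The function name `sud_multiplier` is hypothetical.

```shell
# sud_multiplier FRACTION
# Prints the effective price multiplier for a VM that runs FRACTION (0..1)
# of the month. ASSUMPTION: N1-style tiers, where each quarter of the month
# is billed at 100%, 80%, 60%, then 40% of the base rate.
sud_multiplier() {
  awk -v f="$1" 'BEGIN {
    split("1.0 0.8 0.6 0.4", rate, " ")
    billed = 0
    for (i = 1; i <= 4; i++) {
      lo = (i - 1) * 0.25              # start of this usage tier
      used = f - lo                    # usage that falls into this tier
      if (used > 0.25) used = 0.25
      if (used > 0) billed += used * rate[i]
    }
    # Effective multiplier = discounted cost / undiscounted cost
    if (f > 0) printf "%.2f\n", billed / f
    else       printf "%.2f\n", 1
  }'
}
```

Under these assumptions, `sud_multiplier 1` prints `0.70` (a 30% discount for a full month), while `sud_multiplier 0.5` prints `0.90`.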
Create a VM
gcloud compute instances create web-server-01 \
--zone=us-central1-a \
--machine-type=e2-medium \
--image-family=ubuntu-2204-lts \
--image-project=ubuntu-os-cloud \
--boot-disk-size=20GB \
--boot-disk-type=pd-balanced \
--tags=http-server \
--metadata=enable-oslogin=TRUE
# Allow HTTP and HTTPS traffic to instances tagged http-server
gcloud compute firewall-rules create allow-http \
--allow tcp:80,tcp:443 \
--target-tags=http-server \
--source-ranges=0.0.0.0/0
# SSH with OS Login (no SSH keys stored in metadata)
gcloud compute ssh web-server-01 --zone=us-central1-a
Disks
| Type | Use Case | IOPS | Cost |
|---|---|---|---|
| Standard PD (pd-standard) | Boot disks, dev/test | Low (HDD-backed) | Lowest |
| Balanced PD (pd-balanced) | General production | Good balance | Medium |
| SSD PD (pd-ssd) | Databases, high IOPS | High | Higher |
| Extreme PD (pd-extreme) | Top-tier databases | Configurable | Highest |
| Local SSD | Temporary, high-throughput cache | Very high | Ephemeral |
Attach a persistent disk:
gcloud compute disks create data-disk-01 \
--size=100GB \
--type=pd-balanced \
--zone=us-central1-a
gcloud compute instances attach-disk web-server-01 \
--disk=data-disk-01 \
--zone=us-central1-a \
--device-name=data-disk
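A newly attached blank disk still needs a filesystem and a mount point inside the VM. The following is a minimal sketch, assuming ext4 and a hypothetical mount point `/mnt/disks/data`. The device path `/dev/disk/by-id/google-data-disk` follows from the `--device-name=data-disk` flag above. The function name `format_and_mount` is illustrative.

```shell
# Run inside the VM. WARNING: mkfs erases any data already on the disk.
# ASSUMPTIONS: ext4 filesystem, no partition table, hypothetical mount point.
format_and_mount() {
  device="$1"        # e.g. /dev/disk/by-id/google-data-disk
  mount_point="$2"   # e.g. /mnt/disks/data

  # Create the filesystem directly on the disk.
  sudo mkfs.ext4 -m 0 -E lazy_itable_init=0,lazy_journal_init=0,discard "$device"

  # Create the mount point and mount the disk.
  sudo mkdir -p "$mount_point"
  sudo mount -o discard,defaults "$device" "$mount_point"
}
```

Call it as `format_and_mount /dev/disk/by-id/google-data-disk /mnt/disks/data`. To keep the mount across reboots, an `/etc/fstab` entry with the `nofail` option is the usual next step.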
Snapshots and Schedules
# Manual snapshot
gcloud compute disks snapshot data-disk-01 \
--snapshot-names=data-disk-01-$(date +%Y%m%d) \
--zone=us-central1-a
# Snapshot schedule (daily at 02:00 UTC, retain 7 days)
gcloud compute resource-policies create snapshot-schedule daily-backup \
--max-retention-days=7 \
--daily-schedule \
--start-time=02:00 \
--region=us-central1
gcloud compute disks add-resource-policies data-disk-01 \
--resource-policies=daily-backup \
--zone=us-central1-a
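Snapshot names follow the general Compute Engine resource naming rule (1 to 63 characters, lowercase letters, digits, and hyphens, starting with a letter), so a date-stamped name built from a variable is worth validating before use. The helper below is a small sketch of that check; `snapshot_name` is a hypothetical function, not part of gcloud.

```shell
# snapshot_name DISK [STAMP]
# Builds "<disk>-<stamp>" (STAMP defaults to today's date as YYYYMMDD) and
# checks it against the resource naming rule: 1-63 characters, lowercase
# letters, digits, and hyphens, starting with a letter, not ending in a hyphen.
snapshot_name() {
  disk="$1"
  stamp="${2:-$(date +%Y%m%d)}"
  name="${disk}-${stamp}"
  if [ "${#name}" -le 63 ] &&
     printf '%s\n' "$name" | grep -Eq '^[a-z]([-a-z0-9]*[a-z0-9])?$'; then
    printf '%s\n' "$name"
  else
    echo "invalid snapshot name: $name" >&2
    return 1
  fi
}
```

For example, `--snapshot-names="$(snapshot_name data-disk-01)"` can replace the inline `$(date ...)` expression in the manual snapshot command.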
Managed Instance Groups (MIGs)
MIGs provide auto-healing and auto-scaling for identical VMs:
gcloud compute instance-templates create web-template \
--machine-type=e2-medium \
--image-family=ubuntu-2204-lts \
--image-project=ubuntu-os-cloud \
--tags=http-server \
--metadata=startup-script='#!/bin/bash
apt-get update && apt-get install -y nginx
systemctl start nginx'
gcloud compute instance-groups managed create web-mig \
--base-instance-name=web \
--template=web-template \
--size=3 \
--zone=us-central1-a
gcloud compute instance-groups managed set-autoscaling web-mig \
--max-num-replicas=10 \
--min-num-replicas=2 \
--target-cpu-utilization=0.7 \
--zone=us-central1-a
# Rolling update without reducing capacity (web-template-v2 must already exist)
gcloud compute instance-groups managed rolling-action start-update web-mig \
--version=template=web-template-v2 \
--max-surge=3 --max-unavailable=0 \
--zone=us-central1-a
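The rolling update refers to `web-template-v2`, which has to be created first because instance templates are immutable. One way to chain the steps is sketched below. It assumes the new template differs only in whatever you change (image, startup script, machine type), and the function name `roll_out_v2` is hypothetical.

```shell
# roll_out_v2: create the new template, start the rolling update, then block
# until the group has no pending changes. Names mirror the examples above.
roll_out_v2() {
  zone="us-central1-a"

  # A changed configuration means a new template; edit these flags to
  # reflect what actually differs in v2.
  gcloud compute instance-templates create web-template-v2 \
    --machine-type=e2-medium \
    --image-family=ubuntu-2204-lts \
    --image-project=ubuntu-os-cloud \
    --tags=http-server || return 1

  gcloud compute instance-groups managed rolling-action start-update web-mig \
    --version=template=web-template-v2 \
    --max-surge=3 --max-unavailable=0 \
    --zone="$zone" || return 1

  # Wait for the managed instance group to become stable.
  gcloud compute instance-groups managed wait-until web-mig \
    --stable --zone="$zone"
}
```

Running `roll_out_v2` from a deployment script gives a single exit status to act on if any step fails.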
Spot and Preemptible VMs
| Type | Discount | Interruption Notice | Use Case |
|---|---|---|---|
| Spot VM | Up to 91% | 30-second notice | Batch, rendering, CI |
| Preemptible (legacy) | Up to 80% | 30-second notice | Existing batch jobs; 24-hour runtime limit; superseded by Spot |
gcloud compute instances create batch-worker \
--zone=us-central1-a \
--machine-type=n2-standard-4 \
--provisioning-model=SPOT \
--instance-termination-action=DELETE \
--image-family=ubuntu-2204-lts \
--image-project=ubuntu-os-cloud
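Because a Spot VM gets only a short notice before it is stopped, batch jobs usually save their state from a shutdown script. The sketch below checks the metadata server's `preempted` value and calls a checkpoint hook. `save_checkpoint` is a placeholder you would replace, and the function names are hypothetical.

```shell
# Sketch of a shutdown script for a Spot VM. Compute Engine runs shutdown
# scripts on a best-effort basis during the preemption notice period.
METADATA_URL="http://metadata.google.internal/computeMetadata/v1/instance/preempted"

# Succeeds (exit 0) if the metadata server reports this VM as preempted.
is_preempted() {
  [ "$(curl -s -H 'Metadata-Flavor: Google' "$METADATA_URL")" = "TRUE" ]
}

# PLACEHOLDER: replace the body with whatever persists your job state
# (for example, copying work files to a bucket).
save_checkpoint() {
  echo "checkpoint saved"
}

on_shutdown() {
  if is_preempted; then
    echo "preempted: saving checkpoint"
    save_checkpoint
  else
    echo "normal shutdown"
  fi
}
```

To use it, add a final line that calls `on_shutdown`, save the file (say, as `preempt.sh`), and pass it at creation time with `--metadata-from-file=shutdown-script=preempt.sh`.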
Real-World Scenario: Web Tier Behind Load Balancer
Internet → External HTTP(S) LB → Instance Group (MIG)
├── web-abc1 (zone a)
├── web-abc2 (zone b)
└── web-abc3 (zone c)
- Create instance template with startup script
- Create regional MIG across three zones
- Configure health check on /health
- Attach MIG to backend service of HTTP(S) load balancer
- Enable Cloud CDN for static assets
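The steps above can be sketched with gcloud roughly as follows. The resource names (`web-health-check`, `web-mig-regional`, `web-backend`) are hypothetical, `web-template` is the instance template created earlier, and the sketch assumes the application answers `/health` with HTTP 200, which a default nginx install does not do on its own.

```shell
# create_web_tier: health check, regional MIG, and CDN-enabled backend service.
# Resource names are illustrative; adjust region and sizes as needed.
create_web_tier() {
  region="us-central1"

  gcloud compute health-checks create http web-health-check \
    --port=80 --request-path=/health || return 1

  # Regional MIG spread over three zones, with autohealing on the health check.
  gcloud compute instance-groups managed create web-mig-regional \
    --template=web-template --size=3 \
    --region="$region" \
    --zones="${region}-a,${region}-b,${region}-c" \
    --health-check=web-health-check --initial-delay=120 || return 1

  gcloud compute backend-services create web-backend \
    --protocol=HTTP --health-checks=web-health-check \
    --enable-cdn --global || return 1

  gcloud compute backend-services add-backend web-backend \
    --instance-group=web-mig-regional \
    --instance-group-region="$region" --global
}
```

A complete load balancer also needs a URL map, a target proxy, and a forwarding rule in front of the backend service; those are omitted here to keep the sketch short.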
Common Mistakes
| Mistake | Impact | Fix |
|---|---|---|
| Single-zone deployment | Zone outage = downtime | Regional MIG across zones |
| pd-standard for databases | Poor IOPS, latency spikes | Use pd-ssd or pd-balanced |
| No snapshot schedule | Data loss on disk failure | Automated snapshot policies |
| External IP on every VM | Attack surface, cost | Use Cloud NAT for outbound only |
| Ignoring sustained use | Missing automatic savings | Run steady workloads on same VM type |
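The Cloud NAT fix from the table can be sketched as below: a Cloud Router, a NAT configuration on it, and a VM created without an external IP. The router, NAT, and VM names are hypothetical, and the `default` network is assumed.

```shell
# create_private_vm_with_nat: outbound-only internet access via Cloud NAT.
# ASSUMPTIONS: default network, illustrative resource names.
create_private_vm_with_nat() {
  region="us-central1"

  gcloud compute routers create nat-router \
    --network=default --region="$region" || return 1

  gcloud compute routers nats create nat-config \
    --router=nat-router --region="$region" \
    --auto-allocate-nat-external-ips \
    --nat-all-subnet-ip-ranges || return 1

  # --no-address omits the external IP; egress goes through Cloud NAT.
  gcloud compute instances create private-vm-01 \
    --zone="${region}-a" --machine-type=e2-medium \
    --image-family=ubuntu-2204-lts --image-project=ubuntu-os-cloud \
    --no-address
}
```

With this layout the VM can still reach package mirrors and APIs, but nothing on the internet can open a connection to it directly.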
Best Practices
- Use OS Config for patch management and compliance scanning
- Enable Cloud Monitoring Ops Agent for metrics and logs
- Apply firewall rules with least-privilege source ranges
- Use preemptible/spot VMs for batch workloads (up to 91% cheaper)
- Snapshot disks regularly with snapshot schedules
- Prefer custom images for repeatable deployments (Packer + gcloud compute images create)
- Use OS Login instead of project-wide SSH keys
- Right-size with Recommender idle VM and rightsizing suggestions
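As one illustration of least-privilege source ranges, SSH can be limited to the range Google uses for Identity-Aware Proxy TCP forwarding (35.235.240.0/20) rather than opened to 0.0.0.0/0. The rule name `allow-ssh-iap` and the wrapper function are hypothetical; the target tag matches the earlier examples.

```shell
# allow_ssh_via_iap: permit SSH only from the IAP TCP-forwarding range
# instead of the whole internet. Rule name is illustrative.
allow_ssh_via_iap() {
  gcloud compute firewall-rules create allow-ssh-iap \
    --allow=tcp:22 \
    --source-ranges=35.235.240.0/20 \
    --target-tags=http-server
}
```

Connections then go through the tunnel, for example `gcloud compute ssh web-server-01 --zone=us-central1-a --tunnel-through-iap`. The calling user also needs IAM permission to use the IAP tunnel.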
Troubleshooting
VM won’t start — quota exceeded:
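gcloud compute project-info describe --project=learning-gcp-dev
# Regional quotas such as CPUS are listed per region
gcloud compute regions describe us-central1 --project=learning-gcp-dev
# Request a quota increase in Console → IAM & Admin → Quotas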
SSH connection refused:
gcloud compute instances describe web-server-01 --zone=us-central1-a \
--format="get(networkInterfaces[0].accessConfigs[0].natIP)"
# Check firewall allows tcp:22 from your IP
gcloud compute firewall-rules list --filter="name~allow-ssh"
MIG instances unhealthy:
gcloud compute instance-groups managed list-instances web-mig --zone=us-central1-a
gcloud compute health-checks describe http-health-check
# Review startup script logs: sudo journalctl -u google-startup-scripts
Disk full:
gcloud compute disks resize data-disk-01 --size=200GB --zone=us-central1-a
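# Then, inside the VM, grow the partition and extend the ext4 filesystem
# (confirm the device name with lsblk first):
# sudo growpart /dev/sdb 1 && sudo resize2fs /dev/sdb1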
Compute Engine offers VM flexibility with Google’s network performance and per-second billing economics.
Next: Cloud Storage — object storage and lifecycle management.