Google Kubernetes Engine
Google Kubernetes Engine (GKE) is Google’s managed Kubernetes service. Google operates the control plane; in Standard mode you manage node pools and workloads, while Autopilot also manages the nodes for you. Kubernetes itself originated at Google, drawing on the company's experience with its internal Borg cluster manager. For containerized workloads at scale, GKE offers the deepest GCP integration for identity, networking, and observability.
GKE Modes
| Mode | Control Plane | Node Management | Use Case |
|---|---|---|---|
| Standard | Google-managed | You manage node pools | Full control, custom nodes, GPU |
| Autopilot | Google-managed | Fully managed | Hands-off, pay per pod resources |
| GKE Enterprise | Google-managed | Standard or Autopilot clusters grouped into fleets | Large-scale, multi-cluster ops |
Standard vs. Autopilot
| Criteria | Standard | Autopilot |
|---|---|---|
| Node management | You configure pools | Google manages everything |
| Pricing | Per node (VM cost) | Per pod CPU/memory request |
| Customization | Full (taints, GPU, local SSD) | Limited to supported configs |
| Security | Your responsibility | Hardened by default |
| Best for | GPU, specialized hardware | Most stateless workloads |
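The comparison above can be made concrete with a minimal sketch of creating an Autopilot cluster. The cluster name `learning-autopilot` and the `run` dry-run helper are hypothetical and only for illustration; by default the snippet prints the command instead of executing it.

```shell
# Sketch: create an Autopilot cluster (node pools are managed by Google).
# run() is a hypothetical helper: it prints the command unless DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

CLUSTER_NAME="learning-autopilot"   # hypothetical cluster name
REGION="us-central1"                # Autopilot clusters are regional

run gcloud container clusters create-auto "$CLUSTER_NAME" \
  --region="$REGION" \
  --release-channel=regular
```

Set `DRY_RUN=0` to actually create the cluster. No machine type or node count is passed, because Autopilot provisions nodes from the pod resource requests.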
Create a Standard Cluster
# Create a zonal Standard cluster with autoscaling and Workload Identity
gcloud container clusters create learning-cluster \
  --zone=us-central1-a \
  --num-nodes=2 \
  --machine-type=e2-medium \
  --enable-autoscaling \
  --min-nodes=1 \
  --max-nodes=5 \
  --enable-ip-alias \
  --workload-pool=learning-gcp-dev.svc.id.goog \
  --enable-shielded-nodes \
  --release-channel=regular
# Fetch kubeconfig credentials for kubectl
gcloud container clusters get-credentials learning-cluster \
  --zone=us-central1-a
# Confirm the nodes have registered
kubectl get nodes
Regional Cluster (Production)
gcloud container clusters create prod-cluster \
  --region=us-central1 \
  --num-nodes=1 \
  --machine-type=e2-standard-4 \
  --enable-autoscaling \
  --min-nodes=1 \
  --max-nodes=10 \
  --enable-ip-alias \
  --workload-pool=learning-gcp-dev.svc.id.goog \
  --release-channel=stable \
  --enable-network-policy
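Regional clusters replicate the control plane and spread nodes across three zones by default, which gives zone-level fault tolerance. In a regional cluster the node-count flags (--num-nodes, --min-nodes, --max-nodes) apply per zone, not to the cluster as a whole.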
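Because the node-count flags are per zone in a regional cluster, the totals for the prod-cluster command are easy to misjudge. The arithmetic can be sketched as follows, assuming the default of three zones in the region:

```shell
# Values taken from the prod-cluster command above.
ZONES=3          # assumption: default of three zones for a regional cluster
NUM_NODES=1      # --num-nodes (per zone)
MIN_NODES=1      # --min-nodes (per zone)
MAX_NODES=10     # --max-nodes (per zone)

INITIAL_TOTAL=$((ZONES * NUM_NODES))
MIN_TOTAL=$((ZONES * MIN_NODES))
MAX_TOTAL=$((ZONES * MAX_NODES))

echo "initial=${INITIAL_TOTAL} min=${MIN_TOTAL} max=${MAX_TOTAL}"
# prints: initial=3 min=3 max=30
```

The practical consequence is cost: the autoscaler may grow this pool to 30 e2-standard-4 nodes, not 10.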
Deploy a Workload
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      serviceAccountName: web-app-sa
      containers:
        - name: web-app
          image: us-central1-docker.pkg.dev/learning-gcp-dev/app/web:v1
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: 250m
              memory: 256Mi
            limits:
              cpu: 500m
              memory: 512Mi
          readinessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 5
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 15
kubectl apply -f deployment.yaml
kubectl apply -f service.yaml
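The commands above apply a service.yaml that is not shown. A minimal sketch of what it might contain follows; the Service type (ClusterIP), the Service name, and port 80 are assumptions chosen for illustration, while the selector and target port match the Deployment.

```shell
# Write a minimal Service manifest for the web-app Deployment.
# Assumptions: ClusterIP type, Service port 80, Service name web-app.
cat > service.yaml <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: web-app
spec:
  type: ClusterIP
  selector:
    app: web-app          # must match the pod labels in the Deployment
  ports:
    - name: http
      port: 80            # port exposed inside the cluster (assumed)
      targetPort: 8080    # containerPort from the Deployment
EOF
echo "wrote service.yaml"
```

A ClusterIP Service is only reachable inside the cluster; external traffic would still need an Ingress or Gateway in front of it.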
Workload Identity
Workload Identity binds Kubernetes service accounts to GCP service accounts, so pods authenticate to Google Cloud APIs without key files:
# Create GCP service account
gcloud iam service-accounts create web-app-gcp
# Grant GCP permissions
gcloud projects add-iam-policy-binding learning-gcp-dev \
  --member="serviceAccount:web-app-gcp@learning-gcp-dev.iam.gserviceaccount.com" \
  --role="roles/cloudsql.client"
# Bind K8s SA to GCP SA
gcloud iam service-accounts add-iam-policy-binding \
  web-app-gcp@learning-gcp-dev.iam.gserviceaccount.com \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:learning-gcp-dev.svc.id.goog[default/web-app-sa]"
# Create the K8s SA referenced by the Deployment (skip if it already exists), then annotate it
kubectl create serviceaccount web-app-sa
kubectl annotate serviceaccount web-app-sa \
  iam.gke.io/gcp-service-account=web-app-gcp@learning-gcp-dev.iam.gserviceaccount.com
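Most Workload Identity failures come from a mismatch between the three strings used above: the GCP service account email, the IAM member, and the annotation value. One way to reduce typos is to derive all of them from a few variables, sketched here with the names from this section (the variable names themselves are illustrative):

```shell
# Derive the Workload Identity identifiers from one set of inputs.
PROJECT_ID="learning-gcp-dev"
NAMESPACE="default"
KSA_NAME="web-app-sa"     # Kubernetes service account
GSA_NAME="web-app-gcp"    # GCP service account

GSA_EMAIL="${GSA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com"
WI_MEMBER="serviceAccount:${PROJECT_ID}.svc.id.goog[${NAMESPACE}/${KSA_NAME}]"
ANNOTATION="iam.gke.io/gcp-service-account=${GSA_EMAIL}"

echo "GSA email:  ${GSA_EMAIL}"
echo "IAM member: ${WI_MEMBER}"
echo "Annotation: ${ANNOTATION}"
```

The resulting values can then be passed to the gcloud and kubectl commands in place of the hand-typed strings.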
Autoscaling
| Scaler | Scales | Trigger |
|---|---|---|
| HPA (Horizontal Pod Autoscaler) | Pod replicas | CPU, memory, custom metrics |
| VPA (Vertical Pod Autoscaler) | Pod resources | Historical usage |
| Cluster Autoscaler | Nodes | Pending pods cannot schedule |
| KEDA | Pod replicas | External events (Pub/Sub, etc.) |
kubectl autoscale deployment web-app \
  --cpu-percent=70 --min=2 --max=10
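The 70% target is measured against each pod's CPU request, not its limit. The following sketch works through the standard HPA scaling rule, desired = ceil(current replicas × current utilization / target utilization), using the 250m request from the Deployment. The observed utilization of 90% is a hypothetical input, and the sketch ignores the HPA's tolerance band and stabilization windows.

```shell
CPU_REQUEST_M=250      # cpu request from the Deployment, in millicores
TARGET_PCT=70          # --cpu-percent
CURRENT_REPLICAS=3
CURRENT_PCT=90         # hypothetical observed average utilization
MIN_REPLICAS=2         # --min
MAX_REPLICAS=10        # --max

# Average per-pod CPU usage that corresponds to the target.
TARGET_M=$((CPU_REQUEST_M * TARGET_PCT / 100))

# Integer ceiling of CURRENT_REPLICAS * CURRENT_PCT / TARGET_PCT.
DESIRED=$(( (CURRENT_REPLICAS * CURRENT_PCT + TARGET_PCT - 1) / TARGET_PCT ))

# Clamp to the configured bounds.
if [ "$DESIRED" -lt "$MIN_REPLICAS" ]; then DESIRED=$MIN_REPLICAS; fi
if [ "$DESIRED" -gt "$MAX_REPLICAS" ]; then DESIRED=$MAX_REPLICAS; fi

echo "target=${TARGET_M}m per pod, desired replicas=${DESIRED}"
# prints: target=175m per pod, desired replicas=4
```

This is also why missing resource requests break CPU-based autoscaling: without a request there is nothing to compute utilization against.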
Networking
| Component | Purpose |
|---|---|
| GKE Ingress | HTTP(S) routing with Cloud Load Balancing |
| Gateway API | Next-gen ingress (recommended for new deployments) |
| Network Policies | Pod-to-pod firewall rules |
| Private clusters | Nodes have only private IPs |
| Cloud Service Mesh | mTLS, traffic management (GKE Enterprise) |
# Private cluster (nodes not reachable from internet)
gcloud container clusters create private-cluster \
  --region=us-central1 \
  --enable-private-nodes \
  --master-ipv4-cidr=172.16.0.0/28 \
  --enable-ip-alias
Real-World Scenario: Production Microservices
A fintech platform runs 15 microservices on GKE:
- Regional cluster across us-central1 (3 zones)
- Separate node pools: general (e2-standard-4), memory (n2-highmem-4), gpu (a2-highgpu-1g)
- Workload Identity for Cloud SQL, Secret Manager, Pub/Sub access
- Gateway API for ingress with Cloud Armor WAF
- HPA on all deployments; Cluster Autoscaler for node pools
- Backup for GKE for Kubernetes resource configuration and persistent volume data
- Managed Prometheus for metrics; Cloud Trace for distributed tracing
Common Mistakes
| Mistake | Impact | Fix |
|---|---|---|
| Zonal cluster in production | Zone outage = full downtime | Regional cluster |
| No resource requests/limits | Noisy neighbor, OOM kills | Set requests and limits on all pods |
| latest image tag | Unpredictable deployments | Pin image tags or digests |
| SA keys mounted in pods | Credential leakage | Workload Identity |
| No network policies | Any pod talks to any pod | Implement default-deny policies |
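The default-deny fix in the last row can be sketched as a single NetworkPolicy per namespace. The policy name and the choice of the default namespace are assumptions; enforcement also requires that the cluster has network policy support enabled.

```shell
# Write a default-deny ingress policy for one namespace.
# Assumptions: policy name default-deny-ingress, namespace default.
cat > default-deny.yaml <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: default
spec:
  podSelector: {}        # empty selector matches every pod in the namespace
  policyTypes:
    - Ingress            # no ingress rules listed, so all ingress is denied
EOF
echo "wrote default-deny.yaml"
```

After applying it with kubectl apply -f default-deny.yaml, each service needs an explicit allow policy for the traffic it should accept.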
Best Practices
- Use Artifact Registry for container images with vulnerability scanning
- Enable GKE release channels (Regular or Stable) for managed upgrades
- Enforce Pod Security Standards with Pod Security Admission
- Use Backup for GKE to protect Kubernetes resource configuration and persistent volume data
- Monitor with Cloud Monitoring GKE dashboards and Managed Prometheus
- Run node auto-repair and auto-upgrade for node health
- Use Workload Identity instead of service account keys
- Implement readiness and liveness probes on every deployment
- Set PodDisruptionBudgets for critical services during node maintenance
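As an illustration of the PodDisruptionBudget practice, here is a minimal sketch for the web-app Deployment used earlier. The budget name and the minAvailable value of 2 are assumptions; with 3 replicas this lets node drains evict one pod at a time.

```shell
# Write a PodDisruptionBudget for the web-app pods.
# Assumptions: name web-app-pdb, minAvailable 2 (Deployment runs 3 replicas).
cat > pdb.yaml <<'EOF'
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-app-pdb
spec:
  minAvailable: 2        # voluntary evictions may not go below this count
  selector:
    matchLabels:
      app: web-app       # must match the Deployment's pod labels
EOF
echo "wrote pdb.yaml"
```

A budget applies only to voluntary disruptions such as node upgrades and drains; it does not protect against node failures.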
Troubleshooting
Pods stuck in Pending:
kubectl describe pod POD_NAME # Check events
kubectl get nodes # Verify nodes are Ready
# Common causes: insufficient CPU/memory, taints, image pull errors
Image pull errors:
kubectl describe pod POD_NAME | grep -A5 "Failed"
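# Verify the node service account can read the repository (roles/artifactregistry.reader)
gcloud artifacts repositories list
gcloud artifacts repositories get-iam-policy REPO_NAME --location=LOCATION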
Workload Identity not working:
kubectl describe sa web-app-sa | grep gcp-service-account
# Verify annotation matches GCP SA email exactly
Node NotReady:
kubectl describe node NODE_NAME
gcloud compute instances describe NODE_NAME --zone=ZONE
# Check disk space, kubelet logs: journalctl -u kubelet
GKE combines Kubernetes portability with deep GCP integration for identity, networking, and observability.