Google Kubernetes Engine

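Google Kubernetes Engine (GKE) is Google’s managed Kubernetes service. Google operates the control plane; in Standard mode you manage node pools and workloads, while in Autopilot mode Google manages the nodes as well. Kubernetes originated at Google, building on its experience running containers internally with Borg, and GKE is Google Cloud's managed offering of it. For containerized workloads at scale, GKE integrates closely with GCP identity (IAM, Workload Identity), networking (VPC-native clusters, Cloud Load Balancing), and observability (Cloud Monitoring, Managed Prometheus).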

GKE Modes

Mode	Control Plane	Node Management	Use Case
Standard	Google-managed	You manage node pools	Full control, custom nodes, GPU
Autopilot	Google-managed	Fully managed	Hands-off, pay per pod resources
GKE Enterprise	Multi-cluster fleet	Advanced features	Large-scale, multi-cluster ops

Standard vs. Autopilot

Criteria	Standard	Autopilot
Node management	You configure pools	Google manages everything
Pricing	Per node (VM cost)	Per pod CPU/memory request
Customization	Full (taints, GPU, local SSD)	Limited to supported configs
Security	Your responsibility	Hardened by default
Best for	GPU, specialized hardware	Most stateless workloads

Create a Standard Cluster

gcloud container clusters create learning-cluster \
  --zone=us-central1-a \
  --num-nodes=2 \
  --machine-type=e2-medium \
  --enable-autoscaling \
  --min-nodes=1 \
  --max-nodes=5 \
  --enable-ip-alias \
  --workload-pool=learning-gcp-dev.svc.id.goog \
  --enable-shielded-nodes \
  --release-channel=regular

gcloud container clusters get-credentials learning-cluster \
  --zone=us-central1-a

kubectl get nodes

Regional Cluster (Production)

gcloud container clusters create prod-cluster \
  --region=us-central1 \
  --num-nodes=1 \
  --machine-type=e2-standard-4 \
  --enable-autoscaling \
  --min-nodes=1 \
  --max-nodes=10 \
  --enable-ip-alias \
  --workload-pool=learning-gcp-dev.svc.id.goog \
  --release-channel=stable \
  --enable-network-policy

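Regional clusters replicate the control plane and, by default, spread nodes across three zones in the region, which provides zone-level fault tolerance. Node counts are per zone: `--num-nodes=1` creates three nodes in total, and `--max-nodes=10` lets the autoscaler grow the pool to 30.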

Deploy a Workload

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      serviceAccountName: web-app-sa
      containers:
      - name: web-app
        image: us-central1-docker.pkg.dev/learning-gcp-dev/app/web:v1
        ports:
        - containerPort: 8080
        resources:
          requests:
            cpu: 250m
            memory: 256Mi
          limits:
            cpu: 500m
            memory: 512Mi
        readinessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 5
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 15

kubectl apply -f deployment.yaml
kubectl apply -f service.yaml
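The `service.yaml` applied above is not shown on this page. As a minimal sketch, it could be a ClusterIP Service that selects the Deployment's pods; the Service name, type, and port 80 are assumptions made for illustration, not taken from the original.

```shell
# Write a minimal Service manifest for the web-app Deployment.
# Assumption: name, type, and port 80 are illustrative choices.
cat > service.yaml <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: web-app
spec:
  type: ClusterIP
  selector:
    app: web-app        # matches the pod labels in the Deployment
  ports:
  - name: http
    port: 80            # port exposed inside the cluster
    targetPort: 8080    # containerPort declared in the Deployment
EOF
```

The `selector` must match the pod template labels (`app: web-app`), otherwise the Service has no endpoints.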

Workload Identity

Bind Kubernetes service accounts to GCP service accounts — no key files:

# Create GCP service account
gcloud iam service-accounts create web-app-gcp

# Grant GCP permissions
gcloud projects add-iam-policy-binding learning-gcp-dev \
  --member="serviceAccount:web-app-gcp@learning-gcp-dev.iam.gserviceaccount.com" \
  --role="roles/cloudsql.client"

# Bind K8s SA to GCP SA
gcloud iam service-accounts add-iam-policy-binding \
  web-app-gcp@learning-gcp-dev.iam.gserviceaccount.com \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:learning-gcp-dev.svc.id.goog[default/web-app-sa]"

# Annotate the K8s SA (it must already exist in the default namespace)
kubectl annotate serviceaccount web-app-sa \
  iam.gke.io/gcp-service-account=web-app-gcp@learning-gcp-dev.iam.gserviceaccount.com
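The annotation can also be kept in version control rather than applied imperatively. The following is a sketch of a declarative equivalent, assuming the Kubernetes service account lives in the `default` namespace as the IAM binding above implies; the file name is a hypothetical choice.

```shell
# Write a ServiceAccount manifest carrying the Workload Identity annotation.
# Assumption: serviceaccount.yaml is an illustrative file name.
cat > serviceaccount.yaml <<'EOF'
apiVersion: v1
kind: ServiceAccount
metadata:
  name: web-app-sa
  namespace: default
  annotations:
    # Must match the GCP service account email exactly
    iam.gke.io/gcp-service-account: web-app-gcp@learning-gcp-dev.iam.gserviceaccount.com
EOF
```

Applying it with `kubectl apply -f serviceaccount.yaml` both creates the service account if it is missing and sets the annotation.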

Autoscaling

Scaler	Scales	Trigger
HPA (Horizontal Pod Autoscaler)	Pod replicas	CPU, memory, custom metrics
VPA (Vertical Pod Autoscaler)	Pod resources	Historical usage
Cluster Autoscaler	Nodes	Pending pods cannot schedule
KEDA	Pod replicas	External events (Pub/Sub, etc.)

kubectl autoscale deployment web-app \
  --cpu-percent=70 --min=2 --max=10
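The same autoscaling policy can be written as a manifest. This is a sketch of an `autoscaling/v2` HorizontalPodAutoscaler that mirrors the command above; the file name and the HPA object name are assumptions.

```shell
# Write an HPA manifest equivalent to the kubectl autoscale command.
# Assumption: hpa.yaml and the object name web-app are illustrative.
cat > hpa.yaml <<'EOF'
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70   # percent of the CPU request, not the limit
EOF
```

Utilization is measured against the container's CPU request. With the 250m request from the Deployment, the 70% target corresponds to an average of about 175m CPU per pod.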

Networking

Component	Purpose
GKE Ingress	HTTP(S) routing with Cloud Load Balancing
Gateway API	Next-gen ingress (recommended for new deployments)
Network Policies	Pod-to-pod firewall rules
Private clusters	Nodes have only private IPs
Cloud Service Mesh	mTLS, traffic management (GKE Enterprise)

# Private cluster (nodes have no public IPs and are not reachable from the internet)
gcloud container clusters create private-cluster \
  --region=us-central1 \
  --enable-private-nodes \
  --master-ipv4-cidr=172.16.0.0/28 \
  --enable-ip-alias
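For the Gateway API row in the table above, a minimal sketch of an external HTTP Gateway and a route might look like the following. It assumes the Gateway API is enabled on the cluster, that the `gke-l7-global-external-managed` GatewayClass is available, and that a Service named `web-app` listening on port 80 exists; the Gateway and route names are hypothetical.

```shell
# Write a Gateway plus an HTTPRoute that sends all traffic to one Service.
# Assumptions: object names, the backend Service web-app:80, and the
# GatewayClass are illustrative; check which classes your cluster offers
# with "kubectl get gatewayclass".
cat > gateway.yaml <<'EOF'
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: external-http
spec:
  gatewayClassName: gke-l7-global-external-managed
  listeners:
  - name: http
    protocol: HTTP
    port: 80
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: web-app
spec:
  parentRefs:
  - name: external-http      # attach the route to the Gateway above
  rules:
  - backendRefs:
    - name: web-app          # backend Service name (assumed)
      port: 80
EOF
```

Splitting the Gateway (owned by platform teams) from the HTTPRoute (owned by application teams) is the main design difference from a single Ingress object.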

Real-World Scenario: Production Microservices

A fintech platform runs 15 microservices on GKE:

Regional cluster across us-central1 (3 zones)
Separate node pools: general (e2-standard-4), memory (n2-highmem-4), gpu (a2-highgpu-1g)
Workload Identity for Cloud SQL, Secret Manager, Pub/Sub access
Gateway API for ingress with Cloud Armor WAF
HPA on all deployments; Cluster Autoscaler for node pools
Backup for GKE for Kubernetes resource and persistent volume backups
Managed Prometheus for metrics; Cloud Trace for distributed tracing

Common Mistakes

Mistake	Impact	Fix
Zonal cluster in production	Zone outage = full downtime	Regional cluster
No resource requests/limits	Noisy neighbor, OOM kills	Set requests and limits on all pods
`latest` image tag	Unpredictable deployments	Pin image tags or digests
SA keys mounted in pods	Credential leakage	Workload Identity
No network policies	Any pod talks to any pod	Implement default-deny policies
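The default-deny fix in the last row can be sketched as two NetworkPolicy objects: one that blocks all ingress in a namespace and one that re-allows a specific path. This assumes network policy enforcement is enabled on the cluster (for example with `--enable-network-policy`, as in the regional cluster above); the `app: frontend` label and the file name are hypothetical.

```shell
# Write a default-deny ingress policy and one explicit allow rule.
# Assumptions: the frontend label and the file name are illustrative.
cat > network-policies.yaml <<'EOF'
# Deny all ingress to every pod in the namespace unless another policy allows it.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: default
spec:
  podSelector: {}          # empty selector matches all pods in the namespace
  policyTypes:
  - Ingress
---
# Re-allow traffic to web-app on 8080, only from pods labeled app=frontend.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-web-app
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: web-app
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 8080
EOF
```

Network policies are additive: once any policy selects a pod, only traffic matched by some policy is allowed, so the allow rule works alongside the default deny.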

Best Practices

Use Artifact Registry for container images with vulnerability scanning
Enable GKE release channels (Regular or Stable) for managed upgrades
Enforce Pod Security Standards with Pod Security Admission
Use Backup for GKE to back up Kubernetes resources and persistent volume data
Monitor with Cloud Monitoring GKE dashboards and Managed Prometheus
Keep node auto-repair and auto-upgrade enabled for node health
Use Workload Identity instead of service account keys
Implement readiness and liveness probes on every deployment
Set PodDisruptionBudgets for critical services during node maintenance
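As a sketch of the PodDisruptionBudget practice in the list above, the following manifest protects the `web-app` Deployment during voluntary disruptions such as node upgrades. The `minAvailable` value and the file name are assumptions chosen to fit the three-replica example.

```shell
# Write a PodDisruptionBudget for the web-app pods.
# Assumption: minAvailable: 2 is an illustrative value for 3 replicas.
cat > pdb.yaml <<'EOF'
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-app
spec:
  minAvailable: 2          # keep at least 2 pods running during drains
  selector:
    matchLabels:
      app: web-app
EOF
```

With 3 replicas and `minAvailable: 2`, a node drain can evict only one pod at a time and must wait for a replacement to become ready before evicting the next.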

Troubleshooting

Pods stuck in Pending:

kubectl describe pod POD_NAME  # Check events
kubectl get nodes              # Verify nodes are Ready
# Common causes: insufficient CPU/memory, taints, image pull errors

Image pull errors:

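kubectl describe pod POD_NAME | grep -A5 "Failed"
# Verify the node service account has roles/artifactregistry.reader on the repository
gcloud artifacts repositories list
gcloud artifacts repositories get-iam-policy REPO_NAME --location=LOCATION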

Workload Identity not working:

kubectl describe sa web-app-sa | grep gcp-service-account
# Verify annotation matches GCP SA email exactly

Node NotReady:

kubectl describe node NODE_NAME
gcloud compute instances describe NODE_NAME --zone=ZONE
# Check disk space, kubelet logs: journalctl -u kubelet

GKE combines Kubernetes portability with deep GCP integration for identity, networking, and observability.

Next: Architecture Best Practices — reliability, security, and design patterns.
