GCP costs can grow quickly without governance. Google provides tools and pricing models to forecast, monitor, and optimize spending across projects and teams. Cost optimization is not a one-time exercise; it is a continuous practice shared between engineering and finance, often called FinOps.

Cost Management Tools

| Tool | Purpose |
|---|---|
| Cloud Billing reports | Analyze spend by project, service, label |
| Budgets & alerts | Set spending thresholds with notifications |
| Recommender | Right-sizing, idle resource, CUD recommendations |
| Labels | Allocate costs to teams, environments, projects |
| Billing export (BigQuery) | Custom cost analysis and dashboards |
| Pricing Calculator | Estimate costs before deployment |

Analyze Spend

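# List billing accounts
gcloud billing accounts list

# Describe project billing (shows the linked account and whether billing is enabled)
gcloud billing projects describe learning-gcp-dev

# List all projects (ID and name only; billing status is not part of this output,
# so check it per project with the describe command above)
gcloud projects list --format="table(projectId,name)"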
  

In the Console: Billing → Reports, then filter by project, service, or label.

BigQuery Billing Analysis

Enable billing export, then query:

-- Top 20 service/SKU combinations by cost over the last 30 days
-- (aliases differ from the service and sku columns so GROUP BY is unambiguous)
SELECT
  service.description AS service_name,
  sku.description AS sku_name,
  SUM(cost) AS total_cost,
  SUM(usage.amount) AS total_usage
FROM `billing_export.gcp_billing_export_v1_XXXXX`
WHERE usage_start_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
GROUP BY service_name, sku_name
ORDER BY total_cost DESC
LIMIT 20;

-- Cost by label (team); rows without any labels drop out of this join
SELECT
  label.value AS team,
  SUM(cost) AS total_cost
FROM `billing_export.gcp_billing_export_v1_XXXXX`,
  UNNEST(labels) AS label
WHERE label.key = 'team'
  AND usage_start_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
GROUP BY team
ORDER BY total_cost DESC;
  

Labeling Strategy

| Label | Example Values | Purpose |
|---|---|---|
| environment | dev, staging, prod | Separate environment costs |
| team | platform, data, frontend | Team allocation |
| cost-center | engineering, marketing | Department billing |
| application | web-app, api, pipeline | App-level tracking |

# Add labels to an existing instance
gcloud compute instances add-labels web-server-01 \
  --labels=environment=prod,team=platform,application=web-app \
  --zone=us-central1-a

# Label at resource creation (preferred)
gcloud compute instances create web-server-02 \
  --labels=environment=prod,team=platform \
  --zone=us-central1-a \
  --machine-type=e2-medium \
  --image-family=ubuntu-2204-lts \
  --image-project=ubuntu-os-cloud
  

Important: Labels applied after creation do not retroactively tag historical billing data.
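The effect of late labeling can be illustrated with a small sketch. The rows below are hypothetical and only loosely shaped like billing export records; the point is that usage recorded before the label existed stays unallocated.

```python
from collections import defaultdict
from datetime import date

# Hypothetical usage rows; each keeps the labels the resource had at usage time.
ROWS = [
    {"usage_date": date(2024, 5, 1), "cost": 10.0, "labels": {}},
    {"usage_date": date(2024, 5, 2), "cost": 10.0, "labels": {}},
    # Assume the team label was added on 2024-05-03; only later rows carry it.
    {"usage_date": date(2024, 5, 3), "cost": 10.0, "labels": {"team": "platform"}},
    {"usage_date": date(2024, 5, 4), "cost": 10.0, "labels": {"team": "platform"}},
]


def cost_by_label(rows, key):
    """Sum cost per label value; rows without the label go to 'unlabeled'."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["labels"].get(key, "unlabeled")] += row["cost"]
    return dict(totals)


print(cost_by_label(ROWS, "team"))
```

Half of the spend in this example lands in the unlabeled bucket, which is why labeling at creation is preferred.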

Committed Use Discounts (CUDs)

| Type | Commitment | Flexibility | Discount |
|---|---|---|---|
| Resource-based CUD | Specific vCPU/memory in a region | Low (specific machine family) | Up to 57% (3-year) |
| Spend-based CUD | Hourly spend in a region | Higher (not tied to a machine family) | Up to 47% (3-year) |
| Sustained use | Automatic, no commitment | Full | Up to 30% |

# View CUD recommendations (this recommender is regional, so pass a region)
gcloud recommender recommendations list \
  --project=learning-gcp-dev \
  --location=us-central1 \
  --recommender=google.compute.commitment.UsageCommitmentRecommender
  

Purchase CUDs for steady baseline workloads; use on-demand for variable spikes.
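The reasoning behind that rule can be checked with a rough model. This is a sketch under stated assumptions: the usage profile, the on-demand rate, and the 37% discount are illustrative values, not published prices.

```python
def monthly_cost(hourly_usage, committed_vcpus, on_demand_rate, cud_discount):
    """Cost of a usage profile with a fixed vCPU commitment.

    Committed vCPUs are billed every hour at the discounted rate, used or not;
    usage above the commitment is billed at the on-demand rate.
    """
    committed_rate = on_demand_rate * (1 - cud_discount)
    total = 0.0
    for used in hourly_usage:
        overflow = max(0, used - committed_vcpus)
        total += committed_vcpus * committed_rate + overflow * on_demand_rate
    return total


# Assumed profile: 20 vCPUs steady, 60 vCPUs during 4 peak hours per day, 30 days.
USAGE = ([20] * 20 + [60] * 4) * 30
RATE = 0.03      # assumed on-demand price per vCPU-hour
DISCOUNT = 0.37  # assumed commitment discount, for illustration only

for commit in (0, 20, 60):
    print(commit, round(monthly_cost(USAGE, commit, RATE, DISCOUNT), 2))
```

With these numbers, committing to the 20 vCPU baseline is cheapest, while committing to the 60 vCPU peak costs more than running fully on demand because the commitment is paid for during the 20 off-peak hours too.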

Spot and Preemptible Savings

| Workload Type | On-Demand Cost | Spot Cost | Savings |
|---|---|---|---|
| Batch processing | $100/month | $9/month | 91% |
| CI/CD runners | $50/month | $8/month | 84% |
| Dev/test VMs | $30/month | $5/month | 83% |

# Create a Spot VM that is deleted when preempted
gcloud compute instances create batch-job \
  --provisioning-model=SPOT \
  --instance-termination-action=DELETE \
  --machine-type=n2-standard-4 \
  --zone=us-central1-a \
  --image-family=ubuntu-2204-lts \
  --image-project=ubuntu-os-cloud
  

Budgets and Alerts

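# Thresholds are fractions of the budget amount (0.5 = 50%, 1.0 = 100%).
# --filter-projects scopes the budget to one project instead of the whole account.
gcloud billing budgets create \
  --billing-account=BILLING_ACCOUNT_ID \
  --display-name="Monthly Dev Budget" \
  --budget-amount=500USD \
  --filter-projects=projects/learning-gcp-dev \
  --threshold-rule=percent=0.50 \
  --threshold-rule=percent=0.90 \
  --threshold-rule=percent=1.0 \
  --notifications-rule-pubsub-topic=projects/learning-gcp-dev/topics/billing-alerts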
  

Set budgets per project and per team. Connect to Pub/Sub for automated responses (e.g., shut down dev VMs at 100%).
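The decision step of such an automated response might look like the sketch below. It assumes the budget notification arrives as base64-encoded JSON with `costAmount` and `budgetAmount` fields; `should_stop_dev_vms` and `make_message` are hypothetical names, and the actual VM shutdown call is deliberately left out.

```python
import base64
import json


def should_stop_dev_vms(pubsub_message, stop_at=1.0):
    """Return True when actual spend has reached `stop_at` times the budget."""
    payload = json.loads(base64.b64decode(pubsub_message["data"]))
    budget = payload["budgetAmount"]
    return budget > 0 and payload["costAmount"] / budget >= stop_at


def make_message(cost, budget):
    """Build a fake notification for local testing (hypothetical helper)."""
    body = {
        "budgetDisplayName": "Monthly Dev Budget",
        "costAmount": cost,
        "budgetAmount": budget,
        "currencyCode": "USD",
    }
    return {"data": base64.b64encode(json.dumps(body).encode("utf-8"))}


print(should_stop_dev_vms(make_message(450.0, 500.0)))  # below the budget
print(should_stop_dev_vms(make_message(510.0, 500.0)))  # over the budget
```

Comparing the amounts directly, rather than reacting to every message, matters because budget notifications can be delivered repeatedly during the month, not only when a threshold is crossed.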

Cost Optimization Checklist

  • Delete idle resources (VMs, disks, IPs, snapshots, unused load balancers)
  • Right-size VMs using Recommender suggestions
  • Use preemptible/spot VMs for fault-tolerant batch jobs
  • Move infrequent data to Nearline, Coldline, or Archive storage
  • Enable autoscaling to match demand (MIGs, GKE HPA, Cloud Run)
  • Shut down dev/test environments outside business hours
  • Review billing reports monthly with engineering and finance
  • Purchase CUDs for predictable baseline compute
  • Use GKE Autopilot to avoid paying for over-provisioned nodes
  • Set Cloud Run min-instances=0 for dev services
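For the storage item in the checklist, a lifecycle configuration can be generated rather than hand-written. The sketch below builds the JSON structure Cloud Storage uses for `SetStorageClass` rules; the age thresholds are assumptions for illustration and `lifecycle_config` is a hypothetical helper.

```python
import json


def lifecycle_config(transitions):
    """Build a Cloud Storage lifecycle config from (storage_class, age_days) pairs."""
    rules = [
        {
            "action": {"type": "SetStorageClass", "storageClass": storage_class},
            "condition": {"age": age_days},
        }
        for storage_class, age_days in transitions
    ]
    return {"rule": rules}


# Assumed thresholds for illustration; tune them to real access patterns.
config = lifecycle_config([("NEARLINE", 30), ("COLDLINE", 90), ("ARCHIVE", 365)])
print(json.dumps(config, indent=2))
```

The printed JSON could be saved to a file and applied with `gsutil lifecycle set FILE gs://BUCKET`, where FILE and BUCKET are placeholders.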

FinOps Practices

| Phase | Activities | Participants |
|---|---|---|
| Inform | Dashboards, cost reports, showback | Finance + Engineering |
| Optimize | Right-sizing, CUDs, architecture changes | Engineering |
| Operate | Budgets, policies, accountability, chargeback | Finance + Leadership |

Real-World Scenario: Reducing a $50K Monthly Bill

A startup’s GCP bill grew to $50K/month. Analysis revealed:

| Finding | Monthly Cost | Action | Savings |
|---|---|---|---|
| 40 idle VMs (dev) | $12K | Auto-shutdown scheduler | $12K |
| Over-provisioned Cloud SQL | $8K | Right-size db-custom-8 → db-custom-4 | $4K |
| No storage lifecycle rules | $5K | Move to Nearline after 30 days | $3K |
| On-demand baseline compute | $15K | 1-year CUD for steady workloads | $6K |
| No autoscaling on GKE | $10K | Enable HPA + Cluster Autoscaler | $4K |

Total savings: ~$29K/month (58%) with no architecture changes, just governance and right-sizing.
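The totals can be reproduced from the findings in the scenario; the short check below uses only the figures from the table.

```python
# (finding, monthly cost in USD, monthly savings in USD), taken from the scenario
FINDINGS = [
    ("40 idle VMs (dev)", 12_000, 12_000),
    ("Over-provisioned Cloud SQL", 8_000, 4_000),
    ("No storage lifecycle rules", 5_000, 3_000),
    ("On-demand baseline compute", 15_000, 6_000),
    ("No autoscaling on GKE", 10_000, 4_000),
]

total_cost = sum(cost for _, cost, _ in FINDINGS)
total_savings = sum(savings for _, _, savings in FINDINGS)
savings_pct = round(100 * total_savings / total_cost)

print(total_cost, total_savings, savings_pct)  # 50000 29000 58
```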

Common Mistakes

| Mistake | Impact | Fix |
|---|---|---|
| No labels on resources | Cannot allocate costs | Label at creation time |
| Ignoring idle resources | Paying for unused VMs/disks | Weekly Recommender review |
| CUD for variable workloads | Paying for unused commitment | CUD only for steady baseline |
| No budgets | Surprise invoices | Budgets on every project |
| Dev environments running 24/7 | 3x unnecessary compute cost | Auto-shutdown outside hours |
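The roughly 3x figure for always-on dev environments follows from simple arithmetic. The sketch below assumes a 10-hour working day, five days a week; other schedules give a different ratio.

```python
HOURS_PER_WEEK = 24 * 7  # an always-on environment runs 168 hours per week


def always_on_ratio(hours_per_day=10, days_per_week=5):
    """How many times more compute hours 24/7 uses than a working-hours schedule."""
    return HOURS_PER_WEEK / (hours_per_day * days_per_week)


print(round(always_on_ratio(), 2))  # 3.36
```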

Best Practices

  • Export billing to BigQuery from day one
  • Review Recommender weekly for idle and overprovisioned resources
  • Use showback (not chargeback initially) to build cost awareness
  • Include cost estimates in architecture reviews and PR descriptions
  • Automate dev environment shutdown with Cloud Scheduler + Cloud Functions
  • Negotiate enterprise discounts when spend exceeds $100K/month
  • Track unit economics (cost per user, per request) not just total spend
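Unit economics can be tracked with very little code. The figures below are invented for illustration: total spend rises 20% month over month while cost per 1,000 requests falls, which total spend alone would hide.

```python
def cost_per_thousand(total_cost, requests):
    """Cost per 1,000 requests; returns None when there is no traffic."""
    if requests == 0:
        return None
    return total_cost / requests * 1000


# Hypothetical months: (total spend in USD, requests served)
MONTHS = {"month_1": (50_000, 1_000_000), "month_2": (60_000, 1_500_000)}

for name, (spend, requests) in MONTHS.items():
    print(name, spend, round(cost_per_thousand(spend, requests), 2))
```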

Troubleshooting

Unexpected charges:

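-- Today's highest-cost services (run against the BigQuery billing export)
SELECT service.description AS service_name, SUM(cost) AS total_cost
FROM `billing_export.gcp_billing_export_v1_XXXXX`
WHERE DATE(usage_start_time) = CURRENT_DATE()
GROUP BY service_name
ORDER BY total_cost DESC;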
  

CUD not applying: Verify the committed resources match actual usage (machine family, region). Resource-based CUDs are inflexible.

Label not appearing in billing: Labels only show up on usage recorded after the label was applied, so set them at resource creation. Retroactive labels do not affect historical data.

Cloud cost optimization is a continuous practice shared between engineering and finance.

Next: CI/CD with Cloud Build — pipelines, triggers, and deployments.