Cost Optimization

AWS’s pay-as-you-go model means costs scale with usage — both up and down. Without active management, cloud bills grow silently from idle resources, over-provisioned instances, and forgotten snapshots. This guide covers the tools and strategies professionals use to optimize AWS costs without sacrificing performance or reliability.

Understand Your Bill

AWS charges fall into major categories:

Category	Examples	Typical % of Bill
Compute	EC2, Lambda, Fargate	40-60%
Storage	S3, EBS, EFS	10-20%
Database	RDS, DynamoDB, ElastiCache	15-25%
Networking	Data transfer, NAT Gateway, CloudFront	5-15%
Other	CloudWatch, KMS, support plan	5-10%

# Cost Explorer CLI: cost by service for May 2024 (the End date is exclusive)
aws ce get-cost-and-usage \
    --time-period Start=2024-05-01,End=2024-06-01 \
    --granularity MONTHLY \
    --metrics BlendedCost \
    --group-by Type=DIMENSION,Key=SERVICE

# Daily costs for EC2
aws ce get-cost-and-usage \
    --time-period Start=2024-06-01,End=2024-06-13 \
    --granularity DAILY \
    --metrics UnblendedCost \
    --filter '{"Dimensions":{"Key":"SERVICE","Values":["Amazon Elastic Compute Cloud - Compute"]}}'
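
The JSON that `get-cost-and-usage` returns can be ranked by service in a few lines of Python. The sketch below assumes the response shape of the grouped query above (`ResultsByTime` → `Groups` → `Metrics`); the dollar figures are made up for illustration, and `rank_services` is a hypothetical helper name, not part of any AWS SDK.

```python
import json

# Trimmed, made-up sample in the shape returned by
# `aws ce get-cost-and-usage --group-by Type=DIMENSION,Key=SERVICE`.
SAMPLE_RESPONSE = json.loads("""
{
  "ResultsByTime": [{
    "TimePeriod": {"Start": "2024-05-01", "End": "2024-06-01"},
    "Groups": [
      {"Keys": ["Amazon Simple Storage Service"],
       "Metrics": {"BlendedCost": {"Amount": "150.00", "Unit": "USD"}}},
      {"Keys": ["Amazon Elastic Compute Cloud - Compute"],
       "Metrics": {"BlendedCost": {"Amount": "600.00", "Unit": "USD"}}},
      {"Keys": ["Amazon Relational Database Service"],
       "Metrics": {"BlendedCost": {"Amount": "250.00", "Unit": "USD"}}}
    ]
  }]
}
""")


def rank_services(response, metric="BlendedCost"):
    """Return (service, cost, share_of_total) tuples, most expensive first."""
    totals = {}
    for period in response["ResultsByTime"]:
        for group in period["Groups"]:
            service = group["Keys"][0]
            amount = float(group["Metrics"][metric]["Amount"])
            totals[service] = totals.get(service, 0.0) + amount
    grand_total = sum(totals.values())
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return [(name, cost, cost / grand_total if grand_total else 0.0)
            for name, cost in ranked]


for name, cost, share in rank_services(SAMPLE_RESPONSE):
    print(f"{name:45s} ${cost:9.2f} {share:6.1%}")
```

Saving the CLI output to a file and loading it with `json.load` in place of the sample would give the same ranking for a real account.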

Tagging Strategy

Tags enable cost allocation — without them, you can’t answer “which team spent what?”

Tag Key	Example Values	Purpose
Environment	dev, staging, production	Separate dev spend
Project	ecommerce, analytics	Per-project billing
Owner	team-platform, team-data	Team accountability
CostCenter	CC-1234	Finance integration

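# Enforce tagging with an AWS Organizations SCP or a Config rule
# Example SCP: deny EC2 launch when either required tag is missing.
# Keys inside one condition block are ANDed, so each tag gets its own statement.
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyRunInstancesWithoutEnvironmentTag",
            "Effect": "Deny",
            "Action": "ec2:RunInstances",
            "Resource": "arn:aws:ec2:*:*:instance/*",
            "Condition": {
                "Null": {"aws:RequestTag/Environment": "true"}
            }
        },
        {
            "Sid": "DenyRunInstancesWithoutProjectTag",
            "Effect": "Deny",
            "Action": "ec2:RunInstances",
            "Resource": "arn:aws:ec2:*:*:instance/*",
            "Condition": {
                "Null": {"aws:RequestTag/Project": "true"}
            }
        }
    ]
}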

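Activate these tag keys under Billing Console → Cost Allocation Tags. Until a key is activated it does not appear in Cost Explorer or cost allocation reports, and activation can take up to 24 hours.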
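
A tag audit can be sketched as a small check against the four keys in the table above. The helper name `missing_tags` and the sample resource are hypothetical; the input uses the list-of-dicts tag shape that EC2 describe calls return.

```python
REQUIRED_TAGS = ("Environment", "Project", "Owner", "CostCenter")


def missing_tags(resource_tags, required=REQUIRED_TAGS):
    """Return the required tag keys that are absent or have an empty value.

    `resource_tags` looks like [{"Key": "Environment", "Value": "dev"}].
    """
    present = {t["Key"] for t in resource_tags if t.get("Value", "").strip()}
    return [key for key in required if key not in present]


# Made-up example resource: two of the four required tags are set.
tags = [
    {"Key": "Environment", "Value": "dev"},
    {"Key": "Project", "Value": "analytics"},
]
print(missing_tags(tags))  # ['Owner', 'CostCenter']
```

Running a check like this over the output of a describe call gives a list of resources to fix before their spend shows up as unallocated.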

Right-Sizing

Many EC2 instances are over-provisioned. Use utilization data, not guesses:

# Compute Optimizer recommendations
aws compute-optimizer get-ec2-instance-recommendations

# CloudWatch CPU utilization over 14 days
aws cloudwatch get-metric-statistics \
    --namespace AWS/EC2 \
    --metric-name CPUUtilization \
    --dimensions Name=InstanceId,Value=i-xxx \
    --start-time 2024-05-30T00:00:00Z \
    --end-time 2024-06-13T00:00:00Z \
    --period 86400 \
    --statistics Average Maximum

CPU Avg	Action
< 20%	Downsize instance type
20-70%	Right-sized
> 70% sustained	Upsize or add instances
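
The thresholds in the table can be applied to the `Datapoints` that `get-metric-statistics` returns. This is a minimal sketch: it treats "sustained" as the mean of the daily averages over the window, which is an assumption, and both function names are hypothetical.

```python
from statistics import mean


def daily_averages(metric_response):
    """Pull the Average values out of a get-metric-statistics response."""
    return [dp["Average"] for dp in metric_response["Datapoints"]]


def rightsizing_action(daily_avg_cpu):
    """Map daily average CPU percentages to the action in the table above."""
    if not daily_avg_cpu:
        raise ValueError("need at least one datapoint")
    avg = mean(daily_avg_cpu)
    if avg < 20:
        return "Downsize instance type"
    if avg <= 70:
        return "Right-sized"
    return "Upsize or add instances"


# Made-up response fragment with two daily datapoints.
response = {"Datapoints": [{"Average": 10.0, "Maximum": 35.0},
                           {"Average": 14.0, "Maximum": 41.0}]}
print(rightsizing_action(daily_averages(response)))
```

CPU alone can mislead for memory-bound or I/O-bound workloads, so it is worth checking the `Maximum` values and other metrics before downsizing.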

Also check Trusted Advisor (Business/Enterprise support) for underutilized EBS volumes, idle ELBs, and unused Elastic IPs.

Reserved Capacity vs Savings Plans

Option	Commitment	Flexibility	Savings
On-Demand	None	Full	0% (baseline)
Savings Plans (Compute)	$/hour for 1 or 3 years	Any instance family/region	Up to 66%
Reserved Instances (EC2)	Specific instance type for 1 or 3 years	Low: tied to instance type and Region or AZ	Up to 72%
Spot Instances	None (can be interrupted)	Any available capacity	Up to 90%

# Purchase Compute Savings Plan (Console recommended for first time)
# Example: $0.50/hour commitment for 1 year, no upfront
# Applies to EC2, Fargate, Lambda automatically

Strategy: Steady-state baseline on Savings Plans; burst capacity on On-Demand; fault-tolerant workloads on Spot.
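
To see why the commitment should cover only the steady-state baseline, it helps to work the arithmetic. The sketch below uses the $0.50/hour example from above; the 30% discount is a hypothetical figure chosen for illustration, since the real rate depends on term, payment option, and instance family.

```python
HOURS_PER_YEAR = 8760  # 365 days; ignores leap years


def annual_commitment(hourly_commitment):
    """Total spend committed per year for a given $/hour commitment."""
    return hourly_commitment * HOURS_PER_YEAR


def hourly_cost_with_plan(hourly_commitment, discount, on_demand_usage):
    """Hourly cost when a Savings Plan covers part of the usage.

    on_demand_usage: what the hour's compute would cost at On-Demand rates.
    discount: assumed Savings Plan discount as a fraction (0.30 = 30%).
    """
    # A commitment of C dollars at discounted rates covers C / (1 - discount)
    # dollars' worth of On-Demand usage.
    covered = hourly_commitment / (1 - discount)
    overflow = max(0.0, on_demand_usage - covered)  # billed at On-Demand
    return hourly_commitment + overflow


print(annual_commitment(0.50))                  # 4380.0
print(hourly_cost_with_plan(0.50, 0.30, 1.00))  # busy hour: cheaper than 1.00
print(hourly_cost_with_plan(0.50, 0.30, 0.40))  # quiet hour: still pays 0.50
```

The last line is the risk: in an hour where usage would have cost $0.40 On-Demand, the commitment is still billed in full, so over-committing turns the discount into a loss.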

Spot Instances

Ideal for batch processing, CI/CD workers, and stateless workloads:

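# Launch Spot instances via an Auto Scaling group
# subnet-xxx/subnet-yyy and the instance types below are placeholders
aws autoscaling create-auto-scaling-group \
    --auto-scaling-group-name batch-workers \
    --vpc-zone-identifier "subnet-xxx,subnet-yyy" \
    --mixed-instances-policy '{
        "LaunchTemplate": {
            "LaunchTemplateSpecification": {"LaunchTemplateName": "batch-lt", "Version": "$Latest"},
            "Overrides": [{"InstanceType": "m5.large"}, {"InstanceType": "m5a.large"}, {"InstanceType": "m6i.large"}]
        },
        "InstancesDistribution": {
            "OnDemandBaseCapacity": 0,
            "OnDemandPercentageAboveBaseCapacity": 0,
            "SpotAllocationStrategy": "capacity-optimized"
        }
    }' \
    --min-size 0 --max-size 20 --desired-capacity 5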

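Handle Spot interruptions gracefully: poll the instance metadata service (IMDS) path /latest/meta-data/spot/instance-action, which returns an interruption notice about two minutes before the instance is reclaimed.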
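
A worker can watch for the notice with the standard library alone. The sketch below assumes IMDSv2 (a session token fetched with PUT, then sent as a header) and the `spot/instance-action` metadata path, which returns 404 until a notice exists. `fetch_instance_action` only works on an EC2 instance; the parsing helpers are hypothetical names and can be exercised anywhere.

```python
import json
import urllib.error
import urllib.request
from datetime import datetime, timezone

IMDS = "http://169.254.169.254/latest"


def parse_instance_action(body):
    """Parse the instance-action document into (action, UTC datetime)."""
    doc = json.loads(body)
    when = datetime.strptime(doc["time"], "%Y-%m-%dT%H:%M:%SZ")
    return doc["action"], when.replace(tzinfo=timezone.utc)


def seconds_remaining(body, now):
    """Seconds between `now` (an aware datetime) and the scheduled action."""
    _, when = parse_instance_action(body)
    return (when - now).total_seconds()


def fetch_instance_action(timeout=1.0):
    """Return the raw notice body, or None if no interruption is pending."""
    token_req = urllib.request.Request(
        f"{IMDS}/api/token", method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "60"})
    token = urllib.request.urlopen(token_req, timeout=timeout).read().decode()
    req = urllib.request.Request(
        f"{IMDS}/meta-data/spot/instance-action",
        headers={"X-aws-ec2-metadata-token": token})
    try:
        return urllib.request.urlopen(req, timeout=timeout).read().decode()
    except urllib.error.HTTPError as err:
        if err.code == 404:  # no interruption scheduled yet
            return None
        raise


# Example notice body in the documented format (timestamp is made up).
notice = '{"action": "terminate", "time": "2024-06-13T08:22:00Z"}'
now = datetime(2024, 6, 13, 8, 20, tzinfo=timezone.utc)
print(parse_instance_action(notice)[0], seconds_remaining(notice, now))
```

In a real worker, a loop would call `fetch_instance_action` every few seconds and, once it returns a body, stop taking new jobs and checkpoint in-flight work within the remaining time.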

Storage Cost Optimization

S3 Lifecycle Policies

{
    "Rules": [{
        "ID": "TieredStorage",
        "Status": "Enabled",
        "Filter": {"Prefix": "data/"},
        "Transitions": [
            {"Days": 30, "StorageClass": "STANDARD_IA"},
            {"Days": 90, "StorageClass": "GLACIER"},
            {"Days": 365, "StorageClass": "DEEP_ARCHIVE"}
        ]
    }]
}
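
The effect of the rule can be sketched as a lookup from object age to storage class. This assumes objects are uploaded to STANDARD and ignores the fact that S3 applies lifecycle actions asynchronously, so real transitions can lag the nominal day; `storage_class_at` is a hypothetical helper.

```python
# Transitions copied from the TieredStorage rule above.
TRANSITIONS = [
    {"Days": 30, "StorageClass": "STANDARD_IA"},
    {"Days": 90, "StorageClass": "GLACIER"},
    {"Days": 365, "StorageClass": "DEEP_ARCHIVE"},
]


def storage_class_at(age_days, transitions=TRANSITIONS, initial="STANDARD"):
    """Return the storage class an object of the given age should be in."""
    current = initial
    for step in sorted(transitions, key=lambda t: t["Days"]):
        if age_days >= step["Days"]:
            current = step["StorageClass"]
    return current


for age in (10, 45, 120, 400):
    print(age, storage_class_at(age))
```

Only keys under the `data/` prefix are affected, because of the rule's `Filter`.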

EBS Optimization

# Find unattached EBS volumes (paying for storage with no instance)
aws ec2 describe-volumes \
    --filters Name=status,Values=available \
    --query 'Volumes[*].[VolumeId,Size,CreateTime]' \
    --output table

# Delete unused volumes (verify first!)
aws ec2 delete-volume --volume-id vol-xxx

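Switch gp2 → gp3 for up to 20% lower per-GB cost. gp3 includes a baseline of 3,000 IOPS and 125 MiB/s at any size; large gp2 volumes that already exceed that baseline need extra provisioned IOPS or throughput on gp3, which reduces the savings.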
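
A quick comparison shows where the savings hold and where they shrink. The prices below are assumed us-east-1 list prices and may be out of date or differ by Region; the function names are hypothetical.

```python
# Assumed us-east-1 list prices in USD per month; check current pricing.
GP2_PER_GB = 0.10
GP3_PER_GB = 0.08
GP3_PER_EXTRA_IOPS = 0.005   # per provisioned IOPS above the baseline
GP3_PER_EXTRA_MIBPS = 0.04   # per MiB/s above the baseline
GP3_BASE_IOPS = 3000
GP3_BASE_MIBPS = 125


def gp2_monthly_cost(size_gb):
    """gp2 is priced on size only; IOPS scale with size at 3 per GB."""
    return size_gb * GP2_PER_GB


def gp3_monthly_cost(size_gb, iops=GP3_BASE_IOPS, throughput=GP3_BASE_MIBPS):
    """gp3 storage cost plus any IOPS or throughput above the baseline."""
    extra_iops = max(0, iops - GP3_BASE_IOPS)
    extra_mibps = max(0, throughput - GP3_BASE_MIBPS)
    return (size_gb * GP3_PER_GB
            + extra_iops * GP3_PER_EXTRA_IOPS
            + extra_mibps * GP3_PER_EXTRA_MIBPS)


# 500 GB at baseline performance: the full 20% saving.
print(gp2_monthly_cost(500), gp3_monthly_cost(500))
# 2,000 GB gp2 delivers 6,000 IOPS; matching that on gp3 costs extra.
print(gp2_monthly_cost(2000), gp3_monthly_cost(2000, iops=6000))
```

Under these assumed prices the 2,000 GB volume still comes out cheaper on gp3, but by about 12% rather than 20%.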

NAT Gateway Costs

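NAT Gateway is often a surprise line item: roughly $32/month per gateway plus $0.045 per GB processed, and high-availability designs run one gateway per AZ.

Alternative	Approximate Cost	Trade-off
Gateway VPC endpoints for S3/DynamoDB	Free	S3/DynamoDB only
Interface endpoints for AWS APIs	~$7/month per endpoint per AZ, plus per-GB processing	Per-service cost
NAT Instance (t3.micro)	~$8/month	You manage HA and patching
VPC endpoints + no NAT	Endpoint charges only (maximum savings)	Limited to AWS services; no general internet egress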

Audit NAT Gateway data processing charges monthly.
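
For the monthly audit, the bill can be estimated from the gateway count and the gigabytes processed. The hourly rate below is an assumed us-east-1 list price (0.045 × 730 hours gives the roughly $32/month quoted above); the traffic figures in the example are made up.

```python
HOURS_PER_MONTH = 730
NAT_HOURLY = 0.045   # assumed $/hour per gateway (us-east-1 list price)
NAT_PER_GB = 0.045   # $/GB processed, as quoted above


def nat_monthly_cost(gateways, gb_processed):
    """Estimated monthly NAT Gateway charge: fixed hours plus data processed."""
    fixed = gateways * NAT_HOURLY * HOURS_PER_MONTH
    variable = gb_processed * NAT_PER_GB
    return fixed + variable


# One idle gateway vs. three gateways pushing 2 TB in total.
print(round(nat_monthly_cost(1, 0), 2))
print(round(nat_monthly_cost(3, 2000), 2))
```

When the variable part dominates, moving S3 and DynamoDB traffic to gateway endpoints is usually the first thing to try, since that traffic then bypasses the NAT Gateway entirely.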

Budgets and Alerts

# Create budget via CLI (account ID and email address are placeholders)
aws budgets create-budget \
    --account-id 123456789012 \
    --budget '{
        "BudgetName": "monthly-total",
        "BudgetLimit": {"Amount": "500", "Unit": "USD"},
        "TimeUnit": "MONTHLY",
        "BudgetType": "COST"
    }' \
    --notifications-with-subscribers '[{
        "Notification": {
            "NotificationType": "ACTUAL",
            "ComparisonOperator": "GREATER_THAN",
            "Threshold": 80,
            "ThresholdType": "PERCENTAGE"
        },
        "Subscribers": [{"SubscriptionType": "EMAIL", "Address": "alerts@example.com"}]
    }]'

Set budgets per team/project using cost allocation tags.
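
The notification above fires once actual spend is greater than 80% of the $500 limit, that is, above $400. The same comparison can be sketched for several thresholds at once; `crossed_thresholds` is a hypothetical helper and the default thresholds are only an example.

```python
def crossed_thresholds(actual_spend, budget_limit, thresholds=(50, 80, 100)):
    """Return the percentage thresholds that actual spend has exceeded.

    Uses a strict comparison to mirror the GREATER_THAN operator.
    """
    if budget_limit <= 0:
        raise ValueError("budget_limit must be positive")
    percent_used = actual_spend * 100 / budget_limit
    return [t for t in thresholds if percent_used > t]


print(crossed_thresholds(410, 500))  # [50, 80]
print(crossed_thresholds(400, 500))  # [50]
```

Spend of exactly $400 does not trigger the 80% alert, because the operator is strictly greater-than.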

Real-World Scenario: Startup Cost Review

Finding	Monthly Cost	Action	Savings
3 idle t3.large (dev)	$180	Stop after hours / use t3.micro	$150
Unattached 500 GB EBS	$50	Delete after snapshot	$50
NAT Gateway (single AZ dev)	$45	VPC endpoints for S3/APIs	$30
RDS db.r5.xlarge (20% CPU)	$350	Downsize to db.r5.large	$175
S3 Standard for 2TB logs	$46	Lifecycle to Glacier after 30d	$35
Total			~$440/month
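
The totals in the table can be checked with a few lines of arithmetic, using only the figures listed above:

```python
# (finding, monthly cost, monthly savings) copied from the table above.
FINDINGS = [
    ("3 idle t3.large (dev)", 180, 150),
    ("Unattached 500 GB EBS", 50, 50),
    ("NAT Gateway (single AZ dev)", 45, 30),
    ("RDS db.r5.xlarge (20% CPU)", 350, 175),
    ("S3 Standard for 2TB logs", 46, 35),
]

total_cost = sum(cost for _, cost, _ in FINDINGS)
total_savings = sum(saved for _, _, saved in FINDINGS)
annual_savings = total_savings * 12

print(total_cost, total_savings, f"{total_savings / total_cost:.0%}")  # 671 440 66%
print(annual_savings)  # 5280
```

The five findings cost $671/month between them, so the review recovers about two thirds of that spend, or $5,280 over a year.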

FinOps Best Practices

Monthly cost review — dedicated meeting with engineering and finance
Showback/chargeback — teams see their own cloud costs
Automate shutdown — dev/staging resources off nights and weekends
Use AWS Free Tier wisely for experiments, not production
Review Reserved/Savings Plan utilization quarterly
Delete unused resources — EIPs, snapshots, AMIs, old log groups
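
The payoff from automated shutdown is easy to quantify. The sketch below computes the share of always-on instance hours a schedule avoids; the 12-hours-by-5-days schedule is a hypothetical example, and the result covers compute hours only, since attached EBS volumes are still billed while an instance is stopped.

```python
HOURS_PER_WEEK = 24 * 7


def scheduled_savings(hours_per_day, days_per_week):
    """Fraction of always-on instance hours avoided by a stop/start schedule."""
    running = hours_per_day * days_per_week
    if not 0 <= running <= HOURS_PER_WEEK:
        raise ValueError("schedule must fit within the hours of a week")
    return 1 - running / HOURS_PER_WEEK


# Dev environment running 12 hours a day, Monday to Friday.
print(f"{scheduled_savings(12, 5):.0%}")  # 64%
```

A dev fleet that only runs during working hours therefore costs roughly a third of one left on around the clock.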

Common Cost Mistakes

Leaving dev environments running 24/7 — schedule stop/start
Over-provisioned RDS — db.r5 for a dev database with 5 connections
No lifecycle policies on S3 — logs accumulate in Standard class forever
Multiple NAT Gateways in dev — one is enough for non-production
Ignoring data transfer costs — cross-AZ and cross-region transfer adds up
Unused Reserved Instances — buy RIs only for proven steady-state workloads

Troubleshooting Unexpected Bills

Spike Source	How to Find	Fix
EC2	Cost Explorer → EC2 → by instance ID	Stop/terminate idle instances
Data transfer	Cost Explorer → Data Transfer	CloudFront for outbound; same-AZ placement
NAT Gateway	VPC → NAT Gateways → monitoring	VPC endpoints; reduce cross-AZ traffic
CloudWatch Logs	Log groups → stored bytes	Set retention; reduce log verbosity
S3 storage	S3 → Metrics → BucketSizeBytes, NumberOfObjects	Lifecycle policies; Intelligent-Tiering

Best Practices Summary

Tag everything from day one — retroactive tagging is painful
Use Cost Explorer and Cost Anomaly Detection weekly
Purchase Savings Plans for steady-state compute after 3 months of stable usage
Use Spot for fault-tolerant and batch workloads
Apply S3 lifecycle policies to every bucket
Enable AWS Budgets with alerts at 50%, 80%, 100%
Run Trusted Advisor or Compute Optimizer monthly
Automate dev environment shutdown with Instance Scheduler or Lambda

Next: DevOps with CodePipeline.
