I've seen startups burning $15k on AWS. The waste is usually hiding in plain sight - you just need to know where to look.
Where The Money Goes
1. Right-Size Your EC2 Instances
Most instances are oversized. Check actual utilization:
# Get hourly average CPU utilization for the last 2 weeks (timestamps in UTC).
# `date -v-14d` is the BSD/macOS form; on GNU/Linux use: date -u -d '14 days ago' +%Y-%m-%dT%H:%M:%S
aws cloudwatch get-metric-statistics \
  --namespace AWS/EC2 \
  --metric-name CPUUtilization \
  --dimensions Name=InstanceId,Value=i-1234567890 \
  --start-time "$(date -u -v-14d +%Y-%m-%dT%H:%M:%S)" \
  --end-time "$(date -u +%Y-%m-%dT%H:%M:%S)" \
  --period 3600 \
  --statistics Average
If average CPU is under 20%, you can probably downsize.
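To apply that rule of thumb to the CLI output, here is a minimal sketch that averages the returned datapoints. The `isDownsizeCandidate` helper and the sample values are hypothetical; the `Datapoint` shape follows the `Datapoints` array that `get-metric-statistics` returns when you request the `Average` statistic.

```typescript
// One entry of the "Datapoints" array from `aws cloudwatch get-metric-statistics`.
interface Datapoint {
  Timestamp: string;
  Average: number;
  Unit: string;
}

// Hypothetical helper: flags an instance as a downsizing candidate when its
// mean CPU stays under the threshold. The 20% default mirrors the rule of
// thumb above; it is not an AWS-defined limit.
function isDownsizeCandidate(datapoints: Datapoint[], thresholdPct: number = 20): boolean {
  if (datapoints.length === 0) {
    return false; // no data, no decision
  }
  const total = datapoints.reduce((sum, dp) => sum + dp.Average, 0);
  return total / datapoints.length < thresholdPct;
}

// Made-up sample values for illustration.
const cpuSample: Datapoint[] = [
  { Timestamp: "2024-05-01T00:00:00Z", Average: 8.5, Unit: "Percent" },
  { Timestamp: "2024-05-01T01:00:00Z", Average: 12.0, Unit: "Percent" },
  { Timestamp: "2024-05-01T02:00:00Z", Average: 15.5, Unit: "Percent" },
];
console.log(isDownsizeCandidate(cpuSample)); // prints true (mean is 12%)
```

Average CPU alone can hide short spikes and says nothing about memory, so treat a `true` here as a prompt to look closer, not as a verdict.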
2. Use Spot Instances for Non-Critical Workloads
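Spot Instances can be up to 90% cheaper than On-Demand, but AWS can reclaim them with two minutes' notice. Use them for workloads that tolerate interruption: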
- Dev/staging environments
- Batch processing
- CI/CD runners
- Stateless workers
// AWS CDK example. Assumes `ec2` is imported from 'aws-cdk-lib/aws-ec2'
// and that `fleetRole` and `ami` are defined elsewhere in the stack.
const spotFleet = new ec2.CfnSpotFleet(this, 'SpotFleet', {
  spotFleetRequestConfigData: {
    iamFleetRole: fleetRole.roleArn,
    targetCapacity: 10,
    launchSpecifications: [{
      instanceType: 't3.medium',
      imageId: ami.imageId,
      spotPrice: '0.02', // Max hourly price you're willing to pay (USD)
    }],
  },
});
3. Reserved Instances & Savings Plans
If you know you'll use EC2 for a year, commit to it.
Compute Savings Plans are more flexible than Reserved Instances - they apply across instance families, sizes, and regions.
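Whether a commitment pays off depends on how much of the year the workload actually runs, because you pay the committed rate for every hour whether you use it or not. The sketch below compares the two options; the rate and the discount are illustrative assumptions, not published AWS prices, and `yearlySavings` is a hypothetical helper.

```typescript
// All figures are illustrative assumptions, not AWS list prices.
interface CommitmentInput {
  onDemandHourly: number; // on-demand $/hour for the workload
  discountPct: number;    // assumed commitment discount, e.g. 30 for 30%
  utilizationPct: number; // share of the year the workload actually runs
}

const HOURS_PER_YEAR = 8760;

// Positive result: the commitment is cheaper. Negative: on-demand is cheaper.
function yearlySavings(input: CommitmentInput): number {
  // On-demand is billed only for the hours actually used.
  const onDemandCost =
    input.onDemandHourly * HOURS_PER_YEAR * (input.utilizationPct / 100);
  // A commitment is billed for every hour of the year, used or not.
  const committedCost =
    input.onDemandHourly * (1 - input.discountPct / 100) * HOURS_PER_YEAR;
  return onDemandCost - committedCost;
}

const alwaysOn = yearlySavings({ onDemandHourly: 0.1, discountPct: 30, utilizationPct: 100 });
const halfTime = yearlySavings({ onDemandHourly: 0.1, discountPct: 30, utilizationPct: 50 });
console.log(alwaysOn.toFixed(2)); // prints "262.80"
console.log(halfTime.toFixed(2)); // prints "-175.20"
```

With these assumed numbers the break-even sits at 70% utilization (100% minus the discount), which is why commitments suit steady baseline load and not bursty workloads.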
4. NAT Gateway Costs Are Sneaky
NAT Gateways charge $0.045 per GB processed (us-east-1 pricing), on top of an hourly charge. If your Lambda functions are hitting external APIs, that adds up fast.
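To get a feel for how fast, here is a rough estimate of the data processing charge for Lambda traffic going through a NAT Gateway. The invocation count and payload size are made-up inputs, `natDataCost` is a hypothetical helper, and the sketch uses decimal units (1 GB = 1,000,000 KB) as an approximation.

```typescript
// Rough monthly NAT Gateway data processing cost for Lambda traffic.
// pricePerGb defaults to the $0.045/GB figure quoted above.
function natDataCost(
  invocationsPerMonth: number,
  kbPerInvocation: number, // request + response bytes through the NAT, in KB
  pricePerGb: number = 0.045,
): number {
  const gbProcessed = (invocationsPerMonth * kbPerInvocation) / 1_000_000;
  return gbProcessed * pricePerGb;
}

// Assumed workload: 10M invocations a month, about 200 KB each way combined.
const natMonthly = natDataCost(10_000_000, 200);
console.log(natMonthly.toFixed(2)); // prints "90.00"
```

This leaves out the gateway's hourly charge and any data transfer fees, so the real bill would be higher.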
Solutions:
- Does the Lambda really need to be in a VPC?
- Use VPC Endpoints for AWS services (S3, DynamoDB, etc.)
- Consolidate outbound traffic
// VPC Endpoint for S3 - no NAT needed
const s3Endpoint = vpc.addGatewayEndpoint('S3Endpoint', {
  service: ec2.GatewayVpcEndpointAwsService.S3,
});
5. S3 Lifecycle Policies
Data just sits there forever, costing money:
{
  "Rules": [
    {
      "ID": "Move to IA after 30 days",
      "Status": "Enabled",
      "Filter": { "Prefix": "" },
      "Transitions": [
        {
          "Days": 30,
          "StorageClass": "STANDARD_IA"
        },
        {
          "Days": 90,
          "StorageClass": "GLACIER"
        }
      ],
      "Expiration": {
        "Days": 365
      }
    }
  ]
}
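To see what a rule like this does to the bill, the sketch below maps an object's age to the storage class it would be in under the policy above and prices it. The per-GB prices are illustrative assumptions (check current pricing for your region), and both helpers are hypothetical; minimum storage durations, retrieval fees, and transition request fees are ignored.

```typescript
type StorageTier = "STANDARD" | "STANDARD_IA" | "GLACIER" | "EXPIRED";

// Mirrors the policy above: IA at 30 days, Glacier at 90, deletion at 365.
function storageTierAtAge(ageDays: number): StorageTier {
  if (ageDays >= 365) return "EXPIRED";
  if (ageDays >= 90) return "GLACIER";
  if (ageDays >= 30) return "STANDARD_IA";
  return "STANDARD";
}

// Assumed $/GB-month prices, for illustration only.
const assumedPricePerGb: Record<StorageTier, number> = {
  STANDARD: 0.023,
  STANDARD_IA: 0.0125,
  GLACIER: 0.0036,
  EXPIRED: 0,
};

function monthlyStorageCost(sizeGb: number, ageDays: number): number {
  return sizeGb * assumedPricePerGb[storageTierAtAge(ageDays)];
}

console.log(storageTierAtAge(45));                     // prints STANDARD_IA
console.log(monthlyStorageCost(1000, 10).toFixed(2));  // prints "23.00"
console.log(monthlyStorageCost(1000, 120).toFixed(2)); // prints "3.60"
```

Under these assumed prices, a terabyte that has aged into Glacier costs a fraction of what it would in Standard.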
6. Clean Up Orphaned Resources
Resources that get forgotten:
# Find unattached EBS volumes
aws ec2 describe-volumes \
  --filters Name=status,Values=available \
  --query 'Volumes[*].[VolumeId,Size,CreateTime]'

# Find unused Elastic IPs
aws ec2 describe-addresses \
  --query 'Addresses[?AssociationId==`null`].[PublicIp,AllocationId]'

# Old snapshots
aws ec2 describe-snapshots --owner-ids self \
  --query 'Snapshots[?StartTime<`2024-01-01`].[SnapshotId,StartTime,VolumeSize]'
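Once you have the list of unattached volumes, it helps to put a number on it. A minimal sketch, assuming the JSON rows produced by the `describe-volumes` query above and an assumed flat price of $0.08 per GiB-month (the real price depends on volume type and region); `orphanedVolumeMonthlyCost` and the volume IDs are hypothetical.

```typescript
// One row of the `Volumes[*].[VolumeId,Size,CreateTime]` query output:
// [volume ID, size in GiB, creation time].
type VolumeRow = [string, number, string];

// pricePerGiBMonth is an assumption, not a quoted AWS price.
function orphanedVolumeMonthlyCost(rows: VolumeRow[], pricePerGiBMonth: number = 0.08): number {
  const totalGiB = rows.reduce((sum, row) => sum + row[1], 0);
  return totalGiB * pricePerGiBMonth;
}

// Made-up rows for illustration.
const unattachedRows: VolumeRow[] = [
  ["vol-0aaa", 100, "2023-03-01T10:00:00Z"],
  ["vol-0bbb", 500, "2022-11-15T08:30:00Z"],
];
console.log(orphanedVolumeMonthlyCost(unattachedRows).toFixed(2)); // prints "48.00"
```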
7. RDS Optimization
- Dev/staging: Single-AZ, smaller instance, stop outside work hours
- Production: Consider Aurora Serverless for variable workloads
# Stop RDS instance (saves instance-hour charges when not in use; storage is still billed)
aws rds stop-db-instance --db-instance-identifier dev-database
# AWS automatically restarts a stopped instance after 7 days, so you need to stop it again
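How much does stopping outside work hours save? The sketch below computes the share of instance hours you avoid on a given schedule. The 10 hours a day, 5 days a week schedule is an assumed example and `instanceHoursSavedPct` is a hypothetical helper; storage charges continue while the instance is stopped, so the saving applies to the instance-hour part of the bill only.

```typescript
const HOURS_PER_WEEK = 168;

// Percentage of weekly instance hours avoided by running only on a schedule.
function instanceHoursSavedPct(hoursPerDay: number, daysPerWeek: number): number {
  if (hoursPerDay < 0 || hoursPerDay > 24 || daysPerWeek < 0 || daysPerWeek > 7) {
    throw new RangeError("hoursPerDay must be 0-24 and daysPerWeek 0-7");
  }
  const runningHours = hoursPerDay * daysPerWeek;
  return (1 - runningHours / HOURS_PER_WEEK) * 100;
}

// Assumed schedule: dev database up 10 hours a day, Monday to Friday.
console.log(instanceHoursSavedPct(10, 5).toFixed(1)); // prints "70.2"
```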
Quick Wins Checklist
- [ ] Enable Cost Explorer and set budget alerts
- [ ] Right-size EC2 instances (check CloudWatch)
- [ ] Use Spot for dev/staging/batch
- [ ] Buy Savings Plans for predictable workloads
- [ ] Add S3 lifecycle policies
- [ ] Use VPC Endpoints instead of NAT for AWS services
- [ ] Delete orphaned EBS volumes and old snapshots
- [ ] Stop dev/staging databases outside work hours
- [ ] Enable S3 Intelligent-Tiering for uncertain access patterns
Tools That Help
- AWS Cost Explorer - Built-in, see where money goes
- AWS Compute Optimizer - Right-sizing recommendations
- Infracost - See cost impact in PR reviews
- Spot.io - Automated spot instance management
Further Reading
- AWS Well-Architected Cost Optimization
- AWS Pricing Calculator
- Last Week in AWS Newsletter - Great for staying current
AWS cost optimization isn't a one-time thing. Set up alerts, review monthly, and question every resource. That $15k bill can shrink with some attention.
