Cloud costs can spiral out of control without proper management. This guide covers practical strategies to reduce spending while maintaining performance and reliability.
Cost Visibility
You can't optimize what you can't measure. Start with comprehensive cost tracking:
Tagging Strategy
Tag all resources with:
- Environment: production, staging, development
- Team: ownership and accountability
- Project: cost allocation
- CostCenter: business unit tracking
# Launch an instance with cost-allocation tags
# (substitute a real AMI ID and instance type)
aws ec2 run-instances \
  --image-id <ami-id> \
  --instance-type t3.medium \
  --tag-specifications 'ResourceType=instance,Tags=[
    {Key=Environment,Value=production},
    {Key=Team,Value=platform},
    {Key=Project,Value=api-backend}
  ]'
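The tagging convention can also be enforced programmatically. A minimal sketch, assuming resources are represented as plain tag dictionaries (the `REQUIRED_TAGS` set and `missing_tags` helper are illustrative, not part of any AWS SDK):

```python
# Required cost-allocation tag keys (this project's convention, not an AWS default)
REQUIRED_TAGS = {"Environment", "Team", "Project", "CostCenter"}

def missing_tags(tags: dict) -> set:
    """Return the required tag keys absent from a resource's tag map."""
    return REQUIRED_TAGS - tags.keys()

# Example: the tags applied by the CLI command above
tags = {"Environment": "production", "Team": "platform", "Project": "api-backend"}
print(missing_tags(tags))  # {'CostCenter'}
```

A check like this can run in CI or a nightly audit to flag resources that would otherwise be invisible in cost reports.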
Instance Rightsizing
Many instances are provisioned larger than their workloads require. Monitor actual utilization and resize accordingly.
Analysis Process
- Collect CPU and memory metrics for 14 days
- Identify instances with <50% average utilization
- Test smaller instance types in staging
- Gradually migrate production workloads
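The filtering step above can be sketched as a simple pass over collected metrics. In practice the numbers would come from CloudWatch or a similar monitoring source; the data shape here is an assumption:

```python
def underutilized(instances: list[dict], threshold: float = 50.0) -> list[str]:
    """Return IDs of instances whose average CPU utilization is below threshold (%)."""
    return [i["id"] for i in instances if i["avg_cpu"] < threshold]

# Hypothetical 14-day averages
fleet = [
    {"id": "i-api-1", "avg_cpu": 30.0},  # candidate for downsizing
    {"id": "i-db-1", "avg_cpu": 72.0},   # adequately sized
]
print(underutilized(fleet))  # ['i-api-1']
```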
Example savings from downsizing:
# Before: r5.2xlarge (8 vCPU, 64 GB RAM) = $0.504/hour
# Actual usage: 30% CPU, 20 GB RAM
# After: r5.xlarge (4 vCPU, 32 GB RAM) = $0.252/hour
# Savings: 50% ≈ $184/month per instance (at 730 hours/month)
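The arithmetic behind those comments, as a reusable sketch (730 hours/month is the common averaging convention, 8,760 hours/year divided by 12):

```python
HOURS_PER_MONTH = 730  # average hours in a month (8,760 / 12)

def monthly_savings(before_hourly: float, after_hourly: float) -> float:
    """Monthly savings from moving between two on-demand hourly rates."""
    return (before_hourly - after_hourly) * HOURS_PER_MONTH

print(round(monthly_savings(0.504, 0.252)))  # 184
```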
Storage Optimization
Storage costs add up, especially with snapshots and old data.
Lifecycle Policies
Implement automatic tiering:
- Move to infrequent access after 30 days
- Archive to Glacier after 90 days
- Delete old snapshots after 180 days
{
  "Rules": [{
    "Id": "archive-old-data",
    "Status": "Enabled",
    "Filter": {},
    "Transitions": [
      { "Days": 30, "StorageClass": "STANDARD_IA" },
      { "Days": 90, "StorageClass": "GLACIER" }
    ],
    "Expiration": { "Days": 365 }
  }]
}
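The same tiering logic, expressed as a lookup by object age. The thresholds mirror the policy above; the function name is illustrative:

```python
def storage_class_for_age(age_days: int) -> str:
    """Map an object's age to the storage class the lifecycle policy above assigns."""
    if age_days >= 365:
        return "EXPIRED"      # object is deleted by the Expiration rule
    if age_days >= 90:
        return "GLACIER"
    if age_days >= 30:
        return "STANDARD_IA"
    return "STANDARD"

print(storage_class_for_age(45))  # STANDARD_IA
```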
Reserved Instances
For stable workloads, reserved instances offer significant savings:
- 1-year commitment: 30-40% discount
- 3-year commitment: 50-60% discount
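Those discount ranges translate directly into effective hourly rates. A minimal sketch, using the r5.xlarge on-demand rate from the rightsizing example above (the discount percentages are the rough ranges stated here, not quotes for any specific instance type or region):

```python
def effective_hourly(on_demand: float, discount_pct: float) -> float:
    """Effective hourly rate after a reserved-instance discount."""
    return on_demand * (1 - discount_pct / 100)

print(round(effective_hourly(0.252, 40), 4))  # 1-year, upper end of 30-40%
print(round(effective_hourly(0.252, 60), 4))  # 3-year, upper end of 50-60%
```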
When to Use
Reserve capacity for:
- Databases running 24/7
- Baseline compute capacity
- Predictable batch workloads
Use spot instances for:
- Fault-tolerant processing
- Development environments
- CI/CD runners
Automated Scaling
Scale resources based on actual demand:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api
  minReplicas: 3
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
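Per the Kubernetes documentation, the HPA computes its desired replica count as ceil(currentReplicas × currentMetricValue / targetMetricValue), clamped to the min/max bounds. A sketch of that formula with the bounds from the manifest above:

```python
import math

def desired_replicas(current: int, current_util: float, target_util: float,
                     min_r: int = 3, max_r: int = 20) -> int:
    """Kubernetes HPA scaling formula, clamped to the HPA's replica bounds."""
    desired = math.ceil(current * current_util / target_util)
    return max(min_r, min(max_r, desired))

# 5 pods averaging 105% CPU against a 70% target -> scale out to 8
print(desired_replicas(5, 105, 70))  # 8
```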
Schedule-Based Scaling
Reduce resources during off-hours:
# Scale down at night
0 22 * * * kubectl scale deployment api --replicas=2
# Scale up in morning
0 7 * * * kubectl scale deployment api --replicas=10
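The same schedule can be expressed as a function of the hour, e.g. as a sanity check of the crontab or the core of a custom scaling script. The hours and replica counts match the cron entries above:

```python
def replicas_for_hour(hour: int) -> int:
    """Desired replica count: scaled up 07:00-21:59, scaled down 22:00-06:59."""
    return 10 if 7 <= hour < 22 else 2

print(replicas_for_hour(9))   # 10 (business hours)
print(replicas_for_hour(23))  # 2 (overnight)
```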
Summary
Cloud cost optimization is ongoing work. Start with visibility through tagging, rightsize instances based on actual usage, optimize storage with lifecycle policies, use reserved instances for baseline capacity, and implement autoscaling. Review costs monthly and adjust strategies based on workload changes.