Cloud Cost Optimization: The $100K You're Probably Wasting
30–40% of enterprise cloud spend is waste. Here are the seven strategies that consistently cut bills by $50–200K annually without sacrificing performance or reliability.
The Overspend Problem
Gartner estimates that worldwide end-user spending on public cloud services will reach $723 billion in 2025. Flexera's annual State of the Cloud report consistently finds that organizations self-report wasting 30–35% of their cloud budget — and the actual number is likely higher because many teams can't accurately attribute costs to services.
We've audited cloud environments for 40+ clients across AWS, Azure, and GCP. The average waste we find on the first pass is $8,400/month — or just over $100K annually. For enterprise clients, it's often $300K+. The kicker? Most of these savings require zero architectural changes. They're configuration optimizations that can be implemented in days.
Strategy 1: Kill the Zombies
Every cloud environment has them — zombie resources that were spun up for testing, a demo, or a one-off analysis and never torn down. They're the #1 source of waste and the easiest to fix.
What to look for:
- Unattached EBS volumes / Azure Managed Disks — Created when an instance was deleted but the volume persisted. Often $0.10/GB/month × hundreds of GB.
- Idle load balancers — ALBs/NLBs with zero targets or zero request count. Each costs ~$16–22/month minimum.
- Orphaned Elastic IPs / Static IPs — AWS charges $3.65/month for unattached Elastic IPs. It adds up.
- Snapshots from deleted instances — Old AMI snapshots accumulating at $0.05/GB/month.
- Dev/staging environments running 24/7 — If nobody's in the office at 2 AM, why is the staging cluster awake?
Cross-reference your provider's billing data with utilization metrics (CloudWatch, Azure Monitor, Cloud Monitoring) and flag resources with zero network I/O for 30+ days. We've found $2,000–$8,000/month in zombies on the first scan for nearly every client.
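For AWS, a minimal boto3 sketch along these lines (assuming credentials are already configured and scanning one region at a time) catches the two easiest zombie types; load balancers and snapshots take a few more calls:

```python
# Hedged sketch: find unattached EBS volumes and unassociated Elastic IPs.
# Assumes AWS credentials are configured; the region is a placeholder.
import boto3

REGION = "us-east-1"  # run once per region you actually use
ec2 = boto3.client("ec2", region_name=REGION)

# Volumes in the "available" state are attached to nothing.
volumes = ec2.describe_volumes(
    Filters=[{"Name": "status", "Values": ["available"]}]
)["Volumes"]
idle_gb = sum(v["Size"] for v in volumes)
print(f"{len(volumes)} unattached volumes, {idle_gb} GB "
      f"(~${idle_gb * 0.10:,.0f}/mo at the $0.10/GB gp2 rate)")

# Elastic IPs with no AssociationId are sitting idle and still billed.
addresses = ec2.describe_addresses()["Addresses"]
orphans = [a for a in addresses if "AssociationId" not in a]
print(f"{len(orphans)} unassociated Elastic IPs (~${len(orphans) * 3.65:.2f}/mo)")
```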
Strategy 2: Right-Sizing Instances
Most teams pick an instance size based on peak load estimates — then never revisit the decision. The result: instances running at 5–15% average CPU utilization, with memory even more idle.
The Right-Sizing Process:
- Collect metrics — 14 days minimum of CPU, memory, network, and disk I/O data
- Identify the P95 — Size for the 95th percentile, not the peak. Let auto-scaling handle bursts (see the sketch after this list).
- Drop one size — An m5.xlarge running at 12% CPU should be an m5.large. That's ~50% savings on that instance.
- Consider Graviton / ARM — AWS Graviton instances (t4g, m7g, c7g) typically cost ~20% less per hour than comparable x86 instances and deliver equal or better price performance for most workloads.
- Automate — AWS Compute Optimizer, Azure Advisor, and GCP Recommender all provide right-sizing suggestions. Trust them.
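The measurement half of steps 1–2 is easy to script. A hedged sketch, assuming AWS CloudWatch, a placeholder instance ID, and an illustrative 40% threshold (not a provider default):

```python
# Pulls 14 days of CPU data and reports the worst hourly p95 utilization.
# Instance ID and the 40% threshold are placeholders for illustration.
from datetime import datetime, timedelta, timezone

import boto3

INSTANCE_ID = "i-0123456789abcdef0"  # placeholder

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")
end = datetime.now(timezone.utc)
start = end - timedelta(days=14)

resp = cloudwatch.get_metric_statistics(
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "InstanceId", "Value": INSTANCE_ID}],
    StartTime=start,
    EndTime=end,
    Period=3600,                 # one datapoint per hour
    ExtendedStatistics=["p95"],  # size for P95, not the absolute peak
)

p95 = max(
    (dp["ExtendedStatistics"]["p95"] for dp in resp["Datapoints"]),
    default=0.0,
)
print(f"Worst hourly p95 CPU over 14 days: {p95:.1f}%")
if p95 < 40:
    print("Candidate to drop one size; check memory and I/O before acting.")
```

Note that EC2 doesn't publish memory utilization by default; install the CloudWatch agent (or check your APM) before downsizing anything memory-bound.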
| Instance Type | On-Demand ($/hr, us-east-1 Linux) | Right-Sized Alt | Alt Cost ($/hr) | Annual Savings (24/7) |
|---|---|---|---|---|
| m5.2xlarge | $0.384 | m7g.xlarge | $0.153 | $2,024 |
| r5.xlarge | $0.252 | r7g.large | $0.100 | $1,332 |
| c5.4xlarge | $0.680 | c7g.2xlarge | $0.271 | $3,583 |
| t3.xlarge | $0.166 | t4g.large | $0.067 | $867 |
Strategy 3: Reserved Instances & Savings Plans
If you know a workload will run for 12+ months, paying on-demand is lighting money on fire. Reserved Instances (AWS/Azure) and Committed Use Discounts (GCP) offer 30–72% savings for 1–3 year commitments.
The Decision Matrix:
- 1-year, no upfront — 30–40% savings, no cash outlay, easy to start
- 1-year, all upfront — 40–50% savings, best for predictable workloads
- 3-year, all upfront — 60–72% savings, only for mature, stable workloads
- Savings Plans (AWS) — Flexibility to change instance families. Slightly less discount than specific RIs but much more forgiving.
Don't reserve everything day one. Start with 50% coverage using Savings Plans for your baseline, then add specific RIs for workloads that haven't changed in 6+ months. This protects you from over-committing before you understand your true steady-state usage.
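Back-of-the-envelope arithmetic for that 50% starting point (every rate below is illustrative, not a quote; pull real numbers from your own console before committing):

```python
# Illustrative only: rates and discount are placeholders, not quoted prices.
# Shows why even partial Savings Plan coverage is worth taking early.
HOURS_PER_YEAR = 8760
ON_DEMAND_RATE = 0.096   # $/hr, roughly one m5.large in us-east-1
BASELINE_INSTANCES = 10  # steady-state fleet you expect to keep running
COVERAGE = 0.50          # start with ~50% coverage, per the advice above
DISCOUNT = 0.30          # ballpark 1-year, no-upfront Compute Savings Plan

on_demand = BASELINE_INSTANCES * ON_DEMAND_RATE * HOURS_PER_YEAR
blended = on_demand * COVERAGE * (1 - DISCOUNT) + on_demand * (1 - COVERAGE)
print(f"All on-demand:     ${on_demand:,.0f}/yr")
print(f"50% SP coverage:   ${blended:,.0f}/yr (saves ${on_demand - blended:,.0f})")
```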
Strategy 4: Spot & Preemptible Instances
Spot Instances (AWS) and Spot VMs (Azure and GCP, where they were formerly called Preemptible VMs) offer 60–90% discounts on compute in exchange for the provider being able to reclaim capacity on short notice: two minutes on AWS, as little as 30 seconds on Azure and GCP.
Ideal workloads for spot:
- CI/CD pipelines — Build failures from interruption are a retry, not a disaster
- Batch processing / ETL — Jobs that can checkpoint and resume
- ML training — Managed services like SageMaker (Managed Spot Training) and Vertex AI support spot capacity and can checkpoint and resume around interruptions
- Stateless web workers — Behind a load balancer with auto-scaling, an interrupted instance is seamlessly replaced
- Data analysis & rendering — Embarrassingly parallel workloads that can distribute across many small instances
The key is designing for interruption. If your application can't tolerate a random restart, spot isn't for you. But if it can, you're leaving 70%+ savings on the table.
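If you do adopt spot on AWS, the interruption signal is just an instance-metadata endpoint. A minimal watcher sketch (IMDSv2; `checkpoint_and_drain` is a placeholder for your own shutdown logic):

```python
# Runs on the spot instance itself. Polls the EC2 instance metadata service
# (IMDSv2) for a spot interruption notice; checkpoint_and_drain() is a stub.
import time
import urllib.error
import urllib.request

IMDS = "http://169.254.169.254/latest"

def imds_token() -> str:
    req = urllib.request.Request(
        f"{IMDS}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "300"},
    )
    return urllib.request.urlopen(req, timeout=2).read().decode()

def interruption_pending() -> bool:
    # The endpoint returns 404 until AWS issues the ~2-minute reclaim warning.
    req = urllib.request.Request(
        f"{IMDS}/meta-data/spot/instance-action",
        headers={"X-aws-ec2-metadata-token": imds_token()},
    )
    try:
        urllib.request.urlopen(req, timeout=2)
        return True
    except urllib.error.HTTPError:
        return False

def checkpoint_and_drain() -> None:
    print("Interruption notice received: checkpoint state, deregister from the LB.")

if __name__ == "__main__":
    while not interruption_pending():
        time.sleep(5)
    checkpoint_and_drain()
```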
Strategy 5: Storage Tiering
Storage is the silent budget killer. It grows monotonically — teams add data but rarely delete it — and costs compound monthly.
The Tiering Strategy:
| Storage Tier | Use Case | S3 Pricing ($/GB/mo) | vs Standard |
|---|---|---|---|
| S3 Standard | Active data, <30 days old | $0.023 | Baseline |
| S3 IA | Accessed <1x/month | $0.0125 | -46% |
| S3 Glacier Instant | Quarterly access, ms retrieval | $0.004 | -83% |
| S3 Glacier Deep | Archive, 12-hour retrieval | $0.00099 | -96% |
Enable S3 Intelligent-Tiering (AWS), blob lifecycle management to the Cool/Archive tiers (Azure), or Autoclass with Nearline/Coldline (GCP) to automatically move data to cheaper tiers based on access patterns. Intelligent-Tiering's monitoring fee (~$0.0025 per 1,000 objects per month) pays for itself almost immediately.
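When access patterns are predictable (logs, exports, old analytics partitions), an explicit lifecycle rule is even cheaper because it skips the monitoring fee. A sketch with a placeholder bucket name and thresholds you should tune:

```python
# Placeholder bucket and day thresholds: tune to your own access patterns.
# Objects age from Standard -> Standard-IA -> Glacier IR -> Deep Archive.
import boto3

s3 = boto3.client("s3")
s3.put_bucket_lifecycle_configuration(
    Bucket="my-analytics-bucket",  # placeholder
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "tier-down-old-data",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # whole bucket
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    {"Days": 90, "StorageClass": "GLACIER_IR"},
                    {"Days": 365, "StorageClass": "DEEP_ARCHIVE"},
                ],
                # Bonus: abandoned multipart uploads are pure waste.
                "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7},
            }
        ]
    },
)
```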
Strategy 6: Taming Data Egress
Cloud providers charge near-zero for data ingress but $0.05–$0.12/GB for egress. If you're moving significant data between regions, between providers, or to the internet, egress can become your largest line item.
Mitigation strategies:
- Keep compute near data — Don't run analytics in us-east-1 against data stored in eu-west-1
- Use CDNs aggressively — CloudFront/Cloud CDN per-GB rates drop below direct instance egress as volume grows, and on AWS the origin-to-CloudFront transfer is free
- Compress everything — gzip/brotli on API responses, Parquet for analytics data (see the sketch after this list)
- VPC Endpoints / PrivateLink — Routing inter-service traffic through private endpoints avoids NAT gateway and internet egress charges (gateway endpoints for S3 and DynamoDB are free; interface endpoints carry a small hourly and per-GB fee)
- Negotiate — Above $50K/month in egress, all three providers will negotiate custom pricing
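On the compression point, the payoff is easy to demo locally. The payload below is synthetic and compresses better than real traffic will, and the $0.09/GB egress rate is an assumption, but the principle holds:

```python
# Synthetic demo of compression vs. egress cost. Real-world JSON typically
# shrinks 5-10x with gzip; this repetitive payload will do better than that.
import gzip
import json

payload = json.dumps(
    [{"id": i, "status": "active", "region": "us-east-1"} for i in range(10_000)]
).encode()
compressed = gzip.compress(payload)

ratio = len(payload) / len(compressed)
print(f"raw: {len(payload) / 1e6:.2f} MB, gzipped: {len(compressed) / 1e6:.2f} MB "
      f"({ratio:.0f}x smaller)")

# At an assumed ~$0.09/GB of internet egress, a 5x reduction on 10 TB/month
# of API responses is on the order of $700/month saved.
egress_rate = 0.09   # $/GB, assumption
monthly_gb = 10_000  # 10 TB of API responses per month
print(f"5x compression saves ~${monthly_gb * egress_rate * (1 - 1 / 5):,.0f}/month")
```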
Strategy 7: Building a FinOps Culture
The most impactful optimization isn't technical — it's organizational. FinOps (Cloud Financial Operations) is the practice of bringing financial accountability to cloud spending.
FinOps Maturity Levels:
- Crawl — Basic tagging and cost allocation. Every resource has an owner tag and a project tag. Monthly cost review meetings.
- Walk — Team-level budgets and alerts. Showback reports that let teams see their spending. Automated right-sizing recommendations.
- Run — Real-time cost anomaly detection. Unit economics (cost per transaction, cost per user). Engineering KPIs include cost efficiency metrics.
We've never seen a client with 100% tag compliance. The average is closer to 40–60%. Without tags, you can't allocate costs to teams, which means nobody is accountable for spending. Rule: untagged resources get auto-terminated after 72 hours. Harsh, but it works.
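Enforcement only works if you can see the offenders. A minimal AWS-only sketch that reports running instances missing `owner` or `project` tags (the required tag names are examples; use whatever your policy actually mandates):

```python
# Reports running EC2 instances missing required tags. Tag names are
# examples; plug in whatever your tagging policy requires.
import boto3

REQUIRED_TAGS = {"owner", "project"}

ec2 = boto3.client("ec2", region_name="us-east-1")
pages = ec2.get_paginator("describe_instances").paginate(
    Filters=[{"Name": "instance-state-name", "Values": ["running"]}]
)

for page in pages:
    for reservation in page["Reservations"]:
        for instance in reservation["Instances"]:
            tags = {t["Key"].lower() for t in instance.get("Tags", [])}
            missing = REQUIRED_TAGS - tags
            if missing:
                print(f"{instance['InstanceId']}: missing {sorted(missing)}")
```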
The Quick-Win Checklist
Here's a prioritized list you can execute this week:
- 🔴 Day 1: Delete unattached volumes, unused Elastic IPs, idle load balancers (typical savings: $500–2,000/mo)
- 🔴 Day 1: Schedule dev/staging environments to auto-stop at 7 PM and auto-start at 8 AM (typical: $300–1,500/mo; see the scheduler sketch after this list)
- 🟡 Day 2: Enable S3 Intelligent-Tiering on all non-critical buckets (typical: $200–800/mo)
- 🟡 Day 3: Right-size your top 10 most expensive instances (typical: $1,000–4,000/mo)
- 🟢 Week 2: Purchase Savings Plans for your baseline compute (typical: $2,000–8,000/mo)
- 🟢 Week 3: Move CI/CD to spot instances (typical: $500–2,000/mo)
- 🔵 Month 2: Implement tagging policy and team-level showback (prevents future waste)
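For the Day 1 scheduling item, here's a low-tech AWS version, assuming your dev/staging instances carry an `env=staging` tag: run the stop action on a 7 PM schedule and the start action at 8 AM (EventBridge cron plus Lambda, or even a plain cron box).

```python
# Stops or starts instances tagged env=staging (tag key/value are assumptions).
# Intended to be invoked on a schedule: "stop" in the evening, "start" at 8 AM.
import sys

import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

def staging_instance_ids(state: str) -> list[str]:
    pages = ec2.get_paginator("describe_instances").paginate(
        Filters=[
            {"Name": "tag:env", "Values": ["staging"]},
            {"Name": "instance-state-name", "Values": [state]},
        ]
    )
    return [
        inst["InstanceId"]
        for page in pages
        for res in page["Reservations"]
        for inst in res["Instances"]
    ]

if __name__ == "__main__":
    action = sys.argv[1] if len(sys.argv) > 1 else "stop"
    if action == "start":
        ids = staging_instance_ids("stopped")
        if ids:
            ec2.start_instances(InstanceIds=ids)
    else:
        ids = staging_instance_ids("running")
        if ids:
            ec2.stop_instances(InstanceIds=ids)
    print(f"{action}: {len(ids)} staging instance(s)")
```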
Want a Free Cloud Cost Audit?
We'll analyze your AWS/Azure/GCP bill and deliver a prioritized list of savings — typically finding $50–200K in annual waste within 3 business days.