We Audited 64 Fargate Accounts. Average Bill Was 50% Higher Than Needed.
A growth-stage SaaS we worked with in early 2026 was running 180 Fargate tasks across their production AWS accounts. Their monthly Fargate bill: $54,000. Their CTO had been told by their AWS rep that Fargate was "automatically optimized" because it was serverless. They had never run a cost audit on Fargate.
We ran a 5-day audit. The findings:
- 142 task definitions had vCPU set 2-4x higher than actual peak usage
- 89 tasks were on x86 when they could have been on ARM/Graviton (20% savings sitting unclaimed)
- 63 tasks were Fargate on-demand when Fargate Spot would have been safe (70% savings)
- Zero Compute Savings Plans purchased despite $35,000/month of steady-state Fargate usage
- 38 task definitions had wrong vCPU/memory ratios (forced into expensive combinations)
- Significant NAT Gateway charges from Fargate tasks pulling ECR images through the NAT Gateway instead of VPC endpoints
- Verbose CloudWatch Logs ingestion at $0.50/GB across all tasks
After 9 weeks of changes (zero application code rewrites, just configuration), their bill dropped to $19,000/month. Annual savings: $420,000. Task performance was unchanged or improved.
This pattern is consistent across 64 Fargate audits we ran in 2025-2026: the average Fargate bill is 50% higher than necessary due to a small set of recurring waste patterns. Like Lambda, Fargate's per-second billing creates the illusion of automatic optimization. In reality, Fargate is one of the most over-provisioned AWS compute services because the configuration burden is hidden in task definitions that nobody revisits.
This post is the actual fix list: 10 specific waste patterns, each with real cost math and a concrete fix you can apply this week.
The 10 Waste Patterns (Ranked by Frequency)
Across 64 audits, these are the patterns we find. Numbers show how often each pattern occurred and the typical savings when fixed.
| # | Pattern | Found in | Typical Savings |
|---|---|---|---|
| 1 | Oversized vCPU/memory in task definitions | 89% of accounts | 25-40% |
| 2 | Missed ARM/Graviton migration | 73% of accounts | 20% |
| 3 | No Fargate Spot for tolerable workloads | 67% of accounts | 50-70% (on Spot-eligible) |
| 4 | Missing Compute Savings Plans | 61% of accounts | 15-30% (on steady baseline) |
| 5 | Wrong vCPU/memory ratio (forced upgrades) | 47% of accounts | 10-20% |
| 6 | NAT Gateway egress for ECR pulls | 44% of accounts | 15-30% (network) |
| 7 | Excessive ephemeral storage allocation | 39% of accounts | 5-15% |
| 8 | Verbose CloudWatch Logs ingestion | 36% of accounts | 5-12% |
| 9 | No capacity provider strategy mix | 31% of accounts | 10-20% |
| 10 | Idle dev/test tasks running 24/7 | 28% of accounts | 5-15% |
The numbers don't add to 100% because they overlap. Fixing the top 4 alone typically cuts Fargate costs 45-65%.
Pattern 1: Oversized vCPU/Memory In Task Definitions
The trap: Fargate bills per-second for both vCPU and memory you allocated, regardless of what your container actually used. A task set to 1 vCPU and 4GB memory costs the same whether your container uses 200MB and 0.1 vCPU or hits the limit.
Why teams overpay: Defaults from old documentation (1 vCPU / 2GB was once standard). Engineers copy-paste task definitions across services without measuring actual usage. "Increase memory to fix performance" pattern repeated until tasks are 4x oversized.
The fix: Pull 7+ days of actual CPU and memory usage from CloudWatch Container Insights. Right-size to:
- vCPU: 2x average usage (allows for spikes)
- Memory: 1.5x peak usage (memory leaks bad, OOM bad)
Real cost math:
- 100 tasks at 1 vCPU / 4GB running 24/7
- Fargate cost: 100 × ($0.04048 × 1 + $0.004445 × 4) × 720 = $4,195/month
- After right-sizing to 0.5 vCPU / 1.5GB:
- Fargate cost: 100 × ($0.04048 × 0.5 + $0.004445 × 1.5) × 720 = $1,937/month
- 54% savings, $2,257/month
For a 200-task production deployment, this single fix often saves $5K-$25K/month.
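The arithmetic above can be reproduced with a small helper. This is a sketch, assuming the us-east-1 Linux/x86 on-demand rates quoted in this post and a 720-hour month; `monthly_cost` is an illustrative name, not an AWS API.

```python
# Rates quoted in this post (assumed: us-east-1, Linux/x86, on-demand).
VCPU_HOUR = 0.04048
GB_HOUR = 0.004445
HOURS_PER_MONTH = 720


def monthly_cost(tasks: int, vcpu: float, memory_gb: float,
                 hours: float = HOURS_PER_MONTH) -> float:
    """Monthly Fargate cost for `tasks` identical tasks running `hours` hours."""
    per_task_hour = VCPU_HOUR * vcpu + GB_HOUR * memory_gb
    return tasks * per_task_hour * hours


before = monthly_cost(100, vcpu=1, memory_gb=4)
after = monthly_cost(100, vcpu=0.5, memory_gb=1.5)
print(f"before ${before:,.0f}, after ${after:,.0f}, "
      f"saved {100 * (before - after) / before:.0f}%")
```

Swap in your own region's rates and task counts before trusting the result; the rates differ for ARM and for other regions.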
Pattern 2: Missed ARM/Graviton Migration
The trap: AWS Fargate has supported ARM/Graviton since 2021 with a 20% pricing discount over x86. But task definitions default to x86, and most teams never updated.
Why teams overpay: Inertia. The architecture parameter is one line in task definition ("runtimePlatform": { "cpuArchitecture": "ARM64" }) but nobody touches working task definitions.
The fix: Check container compatibility. For Node.js, Python, Go, Ruby, and most modern runtimes: works out of the box. For Java: confirm dependencies have ARM builds (post-2024 is generally fine). For .NET: .NET 6+ supports ARM. Set cpuArchitecture to ARM64 and rebuild your container image with docker buildx build --platform linux/arm64.
Real cost math:
- An x86 Fargate workload costing $5,000/month
- Same workload on ARM: $4,000/month
- $1,000/month savings, immediate
For a $20K/month Fargate bill, that's $48,000/year just from this one change.
Pattern 3: No Fargate Spot For Tolerable Workloads
The trap: Fargate Spot offers up to a 70% discount over Fargate on-demand. Most teams either don't know it exists or have never configured it. The default ECS launch type is on-demand.
Why teams overpay: Cluster capacity providers must be explicitly configured. Spot interruption fear ("we'll get pages at 3am") prevents adoption. Most teams never test interruption handling.
The fix: Configure capacity provider strategy at the cluster or service level:
capacityProviderStrategy:
  - capacityProvider: FARGATE
    weight: 1
    base: 2    # Always 2 on-demand for stability
  - capacityProvider: FARGATE_SPOT
    weight: 4  # 80% Spot beyond base
Use Fargate Spot for: stateless web/API tier, async workers (SQS), CI/CD runners, batch jobs. Avoid Fargate Spot for: databases (you shouldn't run on Fargate anyway), leader-elected services, single-replica services, sticky-session services.
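To see how a base and weights turn into task counts, here is a minimal sketch. It assumes base tasks are placed on the base provider first and the remainder is shared in proportion to the weights; the rounding of uneven splits is an assumption, and `split_tasks` is a hypothetical helper, not an ECS API.

```python
def split_tasks(desired: int, base: int, od_weight: int, spot_weight: int) -> dict:
    """Approximate the on-demand/Spot split for a service's desired count."""
    on_demand = min(base, desired)      # base tasks always land on FARGATE
    remaining = desired - on_demand
    # Share the rest by weight; how ECS rounds uneven splits is assumed here.
    spot = round(remaining * spot_weight / (od_weight + spot_weight))
    on_demand += remaining - spot
    return {"FARGATE": on_demand, "FARGATE_SPOT": spot}


# base=2, weights 1:4, as in the strategy above
print(split_tasks(12, base=2, od_weight=1, spot_weight=4))
```

With 12 desired tasks this puts 4 on on-demand and 8 on Spot, so the overall Spot share is about 67%, not 80%. The base matters most for small services.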
Real cost math:
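- 50 stateless tasks at 0.5 vCPU / 1GB on-demand: 50 × ($0.04048 × 0.5 + $0.004445 × 1) × 720 = $889/month
- Same tasks at 80% Fargate Spot (70% discount) / 20% on-demand: $889 × (0.2 + 0.8 × 0.3) = $391/month
- $498/month savings, 56% reduction (the full 70% applies only if every task runs on Spot)
For a $30K/month stateless Fargate workload, an 80% Spot mix saves roughly $17K/month. (See our Spot Instances Decision Framework for which workloads tolerate Spot.)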
Pattern 4: Missing Compute Savings Plans
The trap: AWS Compute Savings Plans (SP) apply to Fargate (and Lambda and EC2). For steady-state Fargate usage, SP saves 15-30% with 1- or 3-year commitments.
Why teams overpay: SP is associated with EC2 Reserved Instances mentally. Many teams don't realize Compute SP applies to Fargate. Procurement teams don't see Fargate as commitment-eligible.
The fix: Calculate your steady-state Fargate baseline (75th percentile of last 90 days). Buy a Compute SP for that level:
- 1-year SP: 17% discount (no upfront)
- 1-year SP partial upfront: 19% discount
- 3-year SP no upfront: 28% discount
- 3-year SP all upfront: 32% discount
Real cost math:
- $10,000/month steady-state Fargate spend, no SP
- After 1-year Compute SP: $10,000 × 0.83 = $8,300/month
- $1,700/month savings, $20,400/year, no operational changes
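Note: Compute SP does not apply to Fargate Spot usage; it only discounts on-demand usage. The best strategy combines both: move interruption-tolerant tasks to Spot (up to 70% off), then size the SP to the on-demand baseline that remains (another 17% off that portion).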
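Savings Plans are purchased as an hourly dollar commitment measured at the discounted rate, so the monthly baseline has to be converted before buying. A minimal sketch, assuming the 17% one-year no-upfront discount quoted above and a 720-hour month; `savings_plan_commitment` is an illustrative helper.

```python
def savings_plan_commitment(monthly_on_demand: float, discount: float,
                            hours: float = 720) -> dict:
    """Convert a steady on-demand baseline into an hourly SP commitment."""
    on_demand_hourly = monthly_on_demand / hours
    return {
        # Commitment is expressed in post-discount dollars per hour.
        "commitment_per_hour": on_demand_hourly * (1 - discount),
        "monthly_saving": monthly_on_demand * discount,
        "annual_saving": monthly_on_demand * discount * 12,
    }


plan = savings_plan_commitment(10_000, discount=0.17)
print({name: round(value, 2) for name, value in plan.items()})
```

The discount rate is the input to check most carefully, since it varies by term, payment option, and region.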
Pattern 5: Wrong vCPU/Memory Ratio (Forced Upgrades)
The trap: AWS allows specific vCPU/memory combinations for Fargate. If your task needs 0.5 vCPU and 5GB memory, AWS forces you to upgrade to 1 vCPU (because 0.5 vCPU max is 4GB). You pay double for vCPU you don't need just to get the memory.
Why teams overpay: Engineers don't know the constraint table. Memory needs grow over time, forcing surprise vCPU increases.
The fix: Reference the allowed combinations:
| vCPU | Memory Options |
|---|---|
| 0.25 | 0.5, 1, 2 GB |
| 0.5 | 1, 2, 3, 4 GB |
| 1 | 2, 3, 4, 5, 6, 7, 8 GB |
| 2 | 4, 5, 6, 7, ... 16 GB |
| 4 | 8, 9, 10, ... 30 GB |
| 8 | 16, 20, ... 60 GB |
| 16 | 32, 40, ... 120 GB |
If your task definition is 1 vCPU / 5GB but actual usage is 0.3 vCPU and 3GB, the 5GB memory setting is what forces 1 vCPU, because 0.5 vCPU caps at 4GB. Test whether 0.5 vCPU / 4GB works: memory still has headroom, and the vCPU cost halves.
Real cost math:
- 30 tasks at 1 vCPU / 5GB (vCPU forced up by the 5GB memory setting): 30 × ($0.04048 × 1 + $0.004445 × 5) × 720 = $1,354/month
- After trimming memory to 4GB and dropping to 0.5 vCPU: 30 × ($0.04048 × 0.5 + $0.004445 × 4) × 720 = $821/month
- 39% savings, $533/month
This is a hidden trap: the tasks were sized for memory, and the memory setting forced the vCPU up.
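The constraint table can be encoded so a requested size snaps to the smallest allowed combination. This is a sketch built only from the table in this section; `MEMORY_OPTIONS` and `smallest_fit` are illustrative names.

```python
# Allowed memory sizes (GB) per vCPU tier, copied from the table above.
MEMORY_OPTIONS = {
    0.25: [0.5, 1, 2],
    0.5: [1, 2, 3, 4],
    1: list(range(2, 9)),          # 2..8 GB in 1 GB steps
    2: list(range(4, 17)),         # 4..16 GB in 1 GB steps
    4: list(range(8, 31)),         # 8..30 GB in 1 GB steps
    8: list(range(16, 61, 4)),     # 16..60 GB in 4 GB steps
    16: list(range(32, 121, 8)),   # 32..120 GB in 8 GB steps
}


def smallest_fit(vcpu_needed: float, memory_needed_gb: float) -> tuple:
    """Return the smallest allowed (vCPU, GB) pair covering both needs."""
    for vcpu in sorted(MEMORY_OPTIONS):
        if vcpu < vcpu_needed:
            continue
        for memory in MEMORY_OPTIONS[vcpu]:
            if memory >= memory_needed_gb:
                return vcpu, memory
    raise ValueError("request exceeds the largest Fargate task size")


print(smallest_fit(0.3, 5))  # memory need pushes this into the 1 vCPU tier
print(smallest_fit(0.3, 4))  # fits in the 0.5 vCPU tier
```

Running candidate sizes through a check like this before editing a task definition makes the forced vCPU jump visible at the point where memory crosses a tier boundary.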
Pattern 6: NAT Gateway Egress For ECR Pulls
The trap: Fargate tasks in private subnets pull container images from ECR through NAT Gateway. NAT Gateway charges $0.045/GB. For tasks pulling 1GB images, this adds ~$0.045 per task launch on top of Fargate cost.
Why teams overpay: Default VPC patterns route everything through NAT Gateway. ECR Interface Endpoint isn't enabled.
The fix: Add ECR Interface VPC Endpoint:
aws ec2 create-vpc-endpoint \
  --vpc-id vpc-xxx \
  --service-name com.amazonaws.us-east-1.ecr.api \
  --vpc-endpoint-type Interface \
  --subnet-ids subnet-xxx subnet-yyy \
  --security-group-ids sg-xxx
Plus ECR DKR endpoint (same subnets and security group):
aws ec2 create-vpc-endpoint \
  --vpc-id vpc-xxx \
  --service-name com.amazonaws.us-east-1.ecr.dkr \
  --vpc-endpoint-type Interface \
  --subnet-ids subnet-xxx subnet-yyy \
  --security-group-ids sg-xxx
Plus S3 Gateway Endpoint (free) for ECR storage backend.
Real cost math:
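- 200 task launches/day × 1.5GB image × $0.045 = $13.50/day = $405/month NAT data processing
- After ECR Interface Endpoints (2 endpoints × 3 AZs × $7.30/month, plus $0.01/GB for traffic through them; image layers travel over the free S3 Gateway Endpoint): ~$45/month
- ~$360/month savings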
For high-launch-frequency workloads (autoscaling, CI/CD), this saves $1K-$10K/month easily. (Full network cost coverage in AWS Network Cost Decisions.)
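Whether the endpoints pay for themselves depends on launch frequency and image size. A rough comparison, assuming us-east-1 list prices ($0.045/GB NAT processing, $0.01 per interface endpoint per AZ-hour, $0.01/GB through the endpoint) and that image layers flow over the free S3 Gateway Endpoint; the function names are illustrative.

```python
NAT_PER_GB = 0.045        # NAT Gateway data processing (assumed us-east-1)
ENDPOINT_AZ_HOUR = 0.01   # per interface endpoint, per AZ (assumed us-east-1)
ENDPOINT_PER_GB = 0.01    # data processed through an interface endpoint


def nat_pull_cost(launches_per_day: float, image_gb: float, days: int = 30) -> float:
    """Monthly NAT processing cost of pulling images on every task launch."""
    return launches_per_day * image_gb * NAT_PER_GB * days


def endpoint_cost(azs: int, endpoints: int = 2, hours: float = 730,
                  gb_through_endpoints: float = 0.0) -> float:
    """Monthly cost of the ecr.api + ecr.dkr interface endpoints."""
    return azs * endpoints * ENDPOINT_AZ_HOUR * hours + gb_through_endpoints * ENDPOINT_PER_GB


nat = nat_pull_cost(launches_per_day=200, image_gb=1.5)
endpoints = endpoint_cost(azs=3)
print(f"NAT ${nat:,.2f}/month vs endpoints ${endpoints:,.2f}/month")
```

For services that launch only a handful of tasks per day, the fixed hourly endpoint charge can exceed the NAT charge it replaces, so run the numbers per VPC.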
Pattern 7: Excessive Ephemeral Storage Allocation
The trap: Fargate tasks default to 20GB ephemeral storage (free). You can request up to 200GB at $0.000111/GB-hour. Most teams don't realize this is configurable and accept default — or worse, request 200GB "for headroom" without measuring actual disk usage.
Why teams overpay: Defensive over-allocation. Storage seems cheap until it adds up across many tasks.
The fix: Monitor actual disk usage via Container Insights. If average usage is under 10GB, stay at default 20GB free. If you've allocated 200GB but actually use 15GB, drop to 30GB.
Real cost math:
- 100 tasks at 200GB ephemeral storage running 24/7
- Cost: 100 × 180GB extra × $0.000111 × 720 = $1,438/month
- After right-sizing to 30GB total (10GB above the free 20GB):
- Cost: 100 × 10GB × $0.000111 × 720 = $80/month
- 94% savings on storage line item
Most workloads need under 50GB total. The 200GB max is rare.
Pattern 8: Verbose CloudWatch Logs Ingestion
The trap: Default awslogs log driver in Fargate ships every container log line to CloudWatch Logs at $0.50/GB ingested. Verbose application logging accumulates fast.
Why teams overpay: DEBUG-level logging in production. console.log statements left in code. JSON-formatted logs that are 5x larger than necessary.
The fix:
- Set log level to INFO in production environments
- Use structured logging libraries that compress fields
- Use FireLens with Fluent Bit to filter logs before CloudWatch ingestion
- Ship verbose logs to S3 + Athena instead of CloudWatch (10x cheaper for long retention)
Real cost math:
- 100 tasks logging 100MB/day each = 10GB/day = 300GB/month
- CloudWatch ingestion: 300 × $0.50 = $150/month
- After filtering to 30GB ingestion + S3 archive: 30 × $0.50 = $15/month CloudWatch + ~$1/month S3
- Roughly 89% savings on logs line item
For high-traffic services with verbose logging, this can save $1K-$5K/month easily.
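The cheapest filter is the one inside the application. A minimal sketch of an environment-driven log level using Python's standard `logging` module; the `LOG_LEVEL` variable name and the `app` logger name are assumptions for illustration, set per environment in the task definition.

```python
import logging
import os

LEVELS = {"DEBUG": logging.DEBUG, "INFO": logging.INFO,
          "WARNING": logging.WARNING, "ERROR": logging.ERROR}


def configure_logging() -> logging.Logger:
    """Configure root logging from LOG_LEVEL (hypothetical), defaulting to INFO."""
    level = LEVELS.get(os.environ.get("LOG_LEVEL", "INFO").upper(), logging.INFO)
    # force=True replaces handlers installed earlier (Python 3.8+).
    logging.basicConfig(level=level,
                        format="%(levelname)s %(name)s %(message)s",
                        force=True)
    return logging.getLogger("app")


log = configure_logging()
log.debug("cache miss for key=%s", "user:42")  # dropped unless LOG_LEVEL=DEBUG
log.info("request served in %d ms", 18)        # emitted when LOG_LEVEL is INFO or DEBUG
```

Dropped lines never reach the awslogs driver, so they are never billed as ingestion.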
Pattern 9: No Capacity Provider Strategy Mix
The trap: Most teams set ECS service to use only FARGATE capacity provider. This means 100% on-demand, even for workloads that could be 80% Spot.
Why teams overpay: Capacity provider strategy isn't part of basic Fargate setup tutorials. Teams think Spot is "EC2 only."
The fix: Configure capacity provider strategy at service or cluster level (see Pattern 3 example). For services tolerant of interruption: 80% Spot, 20% on-demand. For critical services: keep 100% on-demand but apply Compute SP.
Real cost math:
- 80 tasks at 0.5 vCPU, pure on-demand (vCPU charge only): 80 × $0.04048 × 0.5 × 720 = $1,166/month
- Same tasks at 80% Spot / 20% on-demand:
- Spot tasks: 64 × $0.04048 × 0.5 × 720 × 0.30 = $280
- On-demand tasks: 16 × $0.04048 × 0.5 × 720 = $233
- Total: $513/month
- 56% savings, $653/month
This compounds with right-sizing (Pattern 1) and ARM (Pattern 2).
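The blended saving depends on how much of the fleet actually runs on Spot. A short sketch, assuming a flat 70% Spot discount (real Spot pricing varies); `blended_cost` is an illustrative helper.

```python
def blended_cost(on_demand_cost: float, spot_share: float,
                 spot_discount: float = 0.70) -> float:
    """Cost when `spot_share` of the tasks run on Spot and the rest on-demand."""
    return on_demand_cost * ((1 - spot_share) + spot_share * (1 - spot_discount))


baseline = 80 * 0.04048 * 0.5 * 720  # 80 tasks at 0.5 vCPU, vCPU charge only
for share in (0.5, 0.8, 1.0):
    cost = blended_cost(baseline, share)
    print(f"{share:.0%} Spot: ${cost:,.0f}/month, "
          f"{100 * (1 - cost / baseline):.0f}% below on-demand")
```

At an 80% Spot share and a 70% discount the blended reduction is 0.8 × 0.7 = 56%; the full 70% only applies at 100% Spot.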
Pattern 10: Idle Dev/Test Tasks Running 24/7
The trap: Engineers spin up dev/test ECS services for testing and forget to scale them down. Services keep running tasks at 24/7 cost.
Why teams overpay: No automated lifecycle for dev/test. Cost ownership unclear. Engineers move on without cleanup.
The fix: Implement scheduled scaling for dev/test environments:
- Scale to 0 tasks at 6pm
- Scale up at 8am weekdays
- Stay at 0 on weekends
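Use Application Auto Scaling scheduled actions for the ECS service (CloudFormation fragment of an AWS::ApplicationAutoScaling::ScalableTarget; schedules run in UTC unless a Timezone is set):
ServiceNamespace: ecs
ScalableDimension: ecs:service:DesiredCount
ResourceId: service/dev-cluster/dev-api
ScheduledActions:
  - ScheduledActionName: dev-api-evening-shutdown
    Schedule: cron(0 18 ? * MON-FRI *)
    ScalableTargetAction:
      MinCapacity: 0
      MaxCapacity: 0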
Real cost math:
- 20 dev/test tasks at 0.5 vCPU / 1GB running 24/7 (720 hours): 20 × ($0.04048 × 0.5 + $0.004445 × 1) × 720 = $355/month
- Same tasks running 50 hours/week (5 days × 10 hours, about 214 hours/month): $106/month
- 70% savings, about $250/month
Multiplied across 5+ dev environments = $1,250+/month.
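The saving from a schedule is just the fraction of the week the tasks stay up. A small sketch under the same assumptions as the rest of this post (us-east-1 x86 on-demand rates, 720-hour month); the helper names are illustrative.

```python
VCPU_HOUR = 0.04048   # assumed us-east-1 Linux/x86 on-demand rates
GB_HOUR = 0.004445


def running_fraction(hours_per_day: float, days_per_week: int) -> float:
    """Share of the week a scheduled environment is actually running."""
    return hours_per_day * days_per_week / (24 * 7)


def scheduled_monthly_cost(tasks: int, vcpu: float, memory_gb: float,
                           hours_per_day: float, days_per_week: int,
                           month_hours: float = 720) -> float:
    always_on = tasks * (VCPU_HOUR * vcpu + GB_HOUR * memory_gb) * month_hours
    return always_on * running_fraction(hours_per_day, days_per_week)


full = scheduled_monthly_cost(20, 0.5, 1, hours_per_day=24, days_per_week=7)
office = scheduled_monthly_cost(20, 0.5, 1, hours_per_day=10, days_per_week=5)
print(f"24/7 ${full:,.0f}, office hours ${office:,.0f}, "
      f"saved {100 * (1 - office / full):.0f}%")
```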
The Decision Framework: 5 Questions Before Deploying Fargate
When defining a new Fargate task, ask:
Question 1: What vCPU and memory does this task actually need?
Test in staging with realistic load. Right-size from day one rather than copying defaults.
Question 2: Should this task be on ARM?
Default to ARM unless you have a specific x86 dependency. Test in staging first.
Question 3: Can this task tolerate interruption?
If yes → Fargate Spot capacity provider. If no → on-demand with Compute SP.
Question 4: Does this task need a private subnet behind a NAT Gateway?
Fargate tasks always run in a VPC (awsvpc networking). Private subnets are needed for private resource access, but NAT Gateway charges come with them. Use VPC endpoints for AWS service traffic.
Question 5: What is the appropriate logging volume?
Set log level to INFO in production. Use FireLens for filtering. Ship long-retention logs to S3, not CloudWatch.
When Fargate Is The Wrong Choice (Pick EC2 Or Lambda Instead)
Fargate isn't always the right answer. Consider alternatives when:
Switch to EC2 ECS launch type when:
- You have steady-state workloads with predictable scale (EC2 with Reserved Instances saves 50-70%)
- You can co-locate multiple services on EC2 instances for better packing
- You need GPU access (limited Fargate GPU support in 2026)
- You need access to underlying instance OS (security tooling, custom kernels)
Switch to Lambda when:
- Your task is event-driven with sporadic execution (Lambda's 1ms billing wins)
- Execution is under 1 minute and infrequent
- You don't need persistent network state
Switch to Cloud Run (GCP) when:
- You're considering moving off AWS anyway
- Your workload benefits from per-100ms billing and concurrent request handling per container
For workloads that fit Fargate well, the 10 patterns above cut costs 45-80%. For workloads that don't fit Fargate well, see Cloud Run vs Fargate vs Lambda decision framework.
A 5-Day Fargate Cost Audit
If your Fargate bill is over $5,000/month, run this audit. Typical finding: 45-65% savings.
Day 1: Inventory
# Pull all task definitions
aws ecs list-task-definitions --status ACTIVE --query 'taskDefinitionArns' --output text
# For each: extract vCPU, memory, and CPU architecture
for td in $(aws ecs list-task-definitions --status ACTIVE --query 'taskDefinitionArns' --output text); do
  aws ecs describe-task-definition --task-definition "$td" \
    --query 'taskDefinition.{cpu:cpu,memory:memory,arch:runtimePlatform.cpuArchitecture}'
done
Sort by cost (Cost Explorer, grouped by a cost allocation tag that identifies the ECS service) to identify the top 20 cost drivers.
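The objects printed by the loop above can be converted into an hourly rate and an architecture flag. A sketch, assuming task-level `cpu` and `memory` come back as numeric strings in CPU units (1024 per vCPU) and MiB, and applying this post's x86 rates to every task as a ranking aid; the sample records are hypothetical.

```python
VCPU_HOUR = 0.04048   # assumed us-east-1 Linux/x86 on-demand rates
GB_HOUR = 0.004445


def summarize(task_def: dict) -> dict:
    """Turn {cpu, memory, arch} from describe-task-definition into sizing facts."""
    vcpu = int(task_def["cpu"]) / 1024          # CPU units -> vCPU
    memory_gb = int(task_def["memory"]) / 1024  # MiB -> GB
    arch = task_def.get("arch") or "X86_64"     # unset means x86 on Fargate
    return {
        "vcpu": vcpu,
        "memory_gb": memory_gb,
        "arch": arch,
        "arm_candidate": arch != "ARM64",
        # x86 rate used for all tasks, so ARM tasks are slightly overstated
        "hourly_cost": vcpu * VCPU_HOUR + memory_gb * GB_HOUR,
    }


# Hypothetical records in the shape produced by the --query above
samples = [
    {"cpu": "1024", "memory": "4096", "arch": None},
    {"cpu": "512", "memory": "1024", "arch": "ARM64"},
]
for row in sorted(map(summarize, samples), key=lambda r: -r["hourly_cost"]):
    print(row)
```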
Day 2: Right-Size
For each top-20 task:
- Pull CPU and memory utilization from Container Insights (last 7 days)
- Calculate over-allocation ratio
- Apply right-sizing in IaC. Test in staging.
- Verify task definitions conform to allowed vCPU/memory combinations
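The over-allocation ratio can be computed directly from the Container Insights numbers. A minimal sketch using the rule of thumb from Pattern 1 (2x average vCPU, 1.5x peak memory); the function names are illustrative, and the targets still need snapping to an allowed vCPU/memory combination.

```python
def right_size(avg_vcpu: float, peak_memory_gb: float) -> tuple[float, float]:
    """Targets from this post's rule of thumb: 2x average vCPU, 1.5x peak memory."""
    return 2 * avg_vcpu, 1.5 * peak_memory_gb


def over_allocation(allocated_vcpu: float, allocated_gb: float,
                    avg_vcpu: float, peak_memory_gb: float) -> dict:
    """Compare what a task definition allocates with what the rule recommends."""
    target_vcpu, target_gb = right_size(avg_vcpu, peak_memory_gb)
    return {
        "target_vcpu": target_vcpu,
        "target_memory_gb": target_gb,
        "vcpu_ratio": allocated_vcpu / target_vcpu,
        "memory_ratio": allocated_gb / target_gb,
    }


# A 1 vCPU / 4GB task averaging 0.25 vCPU with a 1GB memory peak
print(over_allocation(1, 4, avg_vcpu=0.25, peak_memory_gb=1.0))
```

A ratio near 1 means the task is already close to the rule; ratios of 2 or more are the candidates worth testing in staging first.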
Day 3: ARM Migration
For each task using x86:
- Verify container image supports ARM (or add multi-arch build)
- Update task definition: "runtimePlatform": { "cpuArchitecture": "ARM64" }
- Deploy to staging, validate, then production
Day 4: Spot + Capacity Providers
- Identify which services tolerate interruption
- Configure capacity provider strategy with Fargate Spot
- Implement graceful shutdown handlers (handle SIGTERM)
- Set the container stopTimeout to 120 seconds (the Fargate maximum) to match the 2-minute Spot interruption warning
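A graceful shutdown handler can be as small as a flag set on SIGTERM. This is a sketch for a polling worker, using only the Python standard library; `worker_loop` and `handle_one` are hypothetical names, and a web server would normally rely on its framework's own shutdown hook instead.

```python
import signal
import threading

shutdown = threading.Event()


def _on_sigterm(signum, frame):
    """ECS sends SIGTERM when a task is stopped or a Spot task is reclaimed."""
    shutdown.set()


signal.signal(signal.SIGTERM, _on_sigterm)


def worker_loop(handle_one, poll_seconds: float = 1.0) -> None:
    """Process work until SIGTERM arrives, then return so the process exits cleanly."""
    while not shutdown.is_set():
        handle_one()                 # finish the current unit of work
        shutdown.wait(poll_seconds)  # sleeps, but wakes early on SIGTERM
```

Keep each unit of work well under the stop timeout, because the container is killed once that window expires.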
Day 5: Compute Savings Plans + Cleanup
- Calculate steady-state baseline after right-sizing
- Buy 1-year Compute SP for the baseline
- Set up auto-scaling schedules for dev/test environments
- Document changes and lock in baseline
After 5 days, monitor for 30 days. The cost reduction shows up immediately on the next bill.
When To NOT Optimize Fargate (And Use Something Else Instead)
If you're hitting Fargate limitations, the right answer is a different compute option, not more Fargate tuning.
Switch to EC2 ECS when:
- Steady 24/7 workloads with predictable scale
- Need cost optimization through aggressive RI/SP
- Multiple containers per host for better packing
- Workloads over 16 vCPU (Fargate max)
Stay on Fargate when:
- Variable scale (auto-scaling between 5 and 200 tasks daily)
- No platform team to manage EC2 capacity
- Container-native development workflow
- Bursty traffic patterns
For workloads that don't fit Fargate well, evaluate ECS on EC2 or Cloud Run alternatives.
The Bottom Line
The average Fargate bill in 2026 is 50% higher than it should be due to a small set of recurring waste patterns. Right-sizing alone accounts for 25-40% of typical waste. ARM migration adds another 20%. Fargate Spot saves 50-70% on tolerable workloads. Compute Savings Plans add another 15-30%. None of these require application code changes — they're configuration and capacity-planning fixes.
The discipline most teams skip: treating Fargate configuration as a continuous optimization, not a one-time deployment decision. Task sizes change as code evolves. ARM compatibility improves. Spot tolerance varies by workload. Audit Fargate costs every quarter.
If your Fargate bill is over $10,000/month and you haven't audited task sizing, ARM, or Spot in the last 6 months, you are very likely overpaying by 50%+. Our cloud cost optimization team runs free Fargate audits and typically captures 45-65% savings within 1 week. Run a free Cloud Waste Scorecard to find your biggest serverless cost leaks first.
Further reading:
- 12 Ways Teams Overpay On AWS Lambda
- AWS ECS Fargate Pricing Deep Dive 2026
- AWS Lambda vs Fargate Cost Breakeven 2026
- Cloud Run vs Fargate vs Lambda Serverless Decision 2026
- ECS vs EKS: $400K Decision Most AWS Teams Get Wrong
- AWS Spot Instances: When To Use, When Not To
- AWS Network Cost: NAT Gateway vs VPC Endpoints vs PrivateLink
- Serverless Cost Optimization Autoscaling Guide 2026
- Cloud Cost Optimization FinOps Service
- AWS Fargate Pricing
- AWS Compute Savings Plans



