We Saved One Client $1.2M Per Year On Their AWS Network Bill (And Rewrote No Application Code)
A growth-stage SaaS company we worked with in early 2026 was paying $98,000 per month for AWS data transfer and NAT Gateway charges. The CTO knew the bill was high but did not know it was fixable. They had been told by their AWS rep that data transfer was "the cost of doing business in the cloud."
We audited their 28-VPC environment. The findings were typical of mid-size AWS deployments:
- NAT Gateway data processing: $42,000/month (most of it was AWS service traffic that should have been on VPC Endpoints)
- Cross-AZ traffic: $18,000/month (poor pod scheduling and replica placement)
- Cross-region replication: $14,000/month (over-replicated data with stale recovery objectives)
- Public internet egress: $24,000/month (no CloudFront in front of public APIs)
After 11 weeks of architectural changes (no application code rewrites required), the bill dropped to $24,000/month. Annual savings: $888,000 plus $312,000 in indirect savings from reduced engineering troubleshooting time. Total: $1.2M/year.
This pattern is consistent across the 35 AWS network cost audits we ran in 2025-2026: on average, network bills were 3-4x higher than necessary, mostly because NAT Gateway carried traffic that VPC Endpoints would handle for free or near-free. The mistake is structural: most AWS architects build a VPC, add NAT Gateways for outbound traffic, and never revisit the design as the workload grows.
This post lays out an AWS network architecture decision framework: when each of the four major networking primitives (NAT Gateway, VPC Endpoints, PrivateLink, Transit Gateway) is the right tool, when each one is the wrong tool, and the migration playbook to capture the savings without breaking applications.
The Four Networking Tools That Actually Matter
| Tool | Use Case | Pricing | Setup Complexity |
|---|---|---|---|
| NAT Gateway | Outbound internet from private subnets | $0.045/hr/AZ + $0.045/GB | Low |
| Gateway VPC Endpoint | S3 and DynamoDB from within VPC | FREE | Very low |
| Interface VPC Endpoint | Other AWS services (SSM, ECR, CW Logs, etc.) | $0.01/hr/AZ + $0.01/GB | Low-medium |
| PrivateLink | Connect to SaaS or internal services across VPCs/accounts | $0.01/hr/AZ + $0.01/GB | Medium |
| Transit Gateway | Hub-and-spoke connectivity for many VPCs | $0.05/hr/attachment + $0.05/GB | Medium-high |
| VPC Peering | 1:1 VPC connectivity | Free + $0.01/GB cross-AZ | Low |
The most expensive mistake: routing AWS service traffic through NAT Gateway when Gateway/Interface Endpoints would handle it. We see this in 32 of 35 audits.
The Real 2026 Pricing (Detailed)
NAT Gateway
- Hourly fee: $0.045/hr per NAT Gateway per AZ (~$32/month/AZ)
- Data processing: $0.045/GB processed (in OR out)
- Standard HA pattern: 1 NAT per AZ x 3 AZs = $96/month fixed + $0.045/GB
- Cross-AZ NAT use: Don't. Always run 1 NAT per AZ to avoid cross-AZ charges layered on top of NAT charges.
Gateway VPC Endpoint
- Cost: $0
- Supports: S3, DynamoDB only
- Trade-off: Route table entry, only one Gateway Endpoint per VPC per service
- Why teams skip it: They forget. There's no AWS prompt or default to enable it.
Interface VPC Endpoint (PrivateLink for AWS Services)
- Hourly fee: $0.01/hr per endpoint per AZ ($7.30/month/AZ)
- Data processing: $0.01/GB
- Supports: 100+ AWS services (Secrets Manager, SSM, ECR, CW Logs, KMS, SNS, SQS, STS, Athena, Glue, etc.)
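- Breakeven vs NAT: roughly 200GB/month per AZ ($7.30 fixed fee / $0.035 saved per GB), assuming the NAT Gateway stays in place for other traffic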
- HA pattern: 1 endpoint per AZ for production
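As a rough check on the breakeven, the calculation can be sketched from the list prices quoted in this section. This is a minimal sketch, not a pricing tool: it assumes a 730-hour month (which is what the $7.30 figure implies) and that the NAT Gateway remains deployed for other traffic, so only the per-GB difference counts as savings. The function name is illustrative.

```python
HOURS_PER_MONTH = 730  # average month length implied by the $7.30/month figure


def breakeven_gb_per_az(endpoint_hourly=0.01, endpoint_per_gb=0.01, nat_per_gb=0.045):
    """Monthly GB per AZ at which an Interface Endpoint costs the same as NAT.

    Assumes the NAT Gateway itself is kept for other traffic, so its
    hourly fee is not counted as a saving.
    """
    fixed_monthly = endpoint_hourly * HOURS_PER_MONTH
    saving_per_gb = nat_per_gb - endpoint_per_gb
    if saving_per_gb <= 0:
        raise ValueError("endpoint is not cheaper per GB than NAT")
    return fixed_monthly / saving_per_gb


print(f"Breakeven: about {breakeven_gb_per_az():.0f} GB/month per AZ")
```

Below that volume the endpoint's fixed fee outweighs what it saves; above it, every additional GB saves about $0.035.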
PrivateLink (Service Provider/Consumer)
- Hourly fee: $0.01/hr per endpoint per AZ ($7.30/month)
- Data processing: $0.01/GB inbound to consumer
- Supports: Connect to SaaS providers (Snowflake, Datadog, MongoDB Atlas, Confluent Cloud) and your own services across accounts
- Outbound from provider: Free (saves the SaaS provider's egress costs)
Transit Gateway
- Attachment fee: $0.05/hr per attachment ($36/month/attachment)
- Data processing: $0.05/GB
- Best for: Hub-and-spoke topologies with 5+ VPCs
- Cross-region peering: $0.02/GB outbound + standard inter-region transfer
VPC Peering
- Setup: Free
- Data: Free within same AZ; $0.01/GB cross-AZ; $0.02/GB cross-region (newly added inter-region)
- Best for: 1:1 connectivity between specific VPCs
CloudFront In Front of Public APIs
- Data out to internet: $85/TB to North America (covered separately, but worth comparing to NAT egress to internet)
- Why this matters: If you're serving public traffic, NAT-Gateway-out-to-internet is much more expensive than CloudFront. We frequently find APIs running through ALB-via-NAT-out-to-internet that should be CloudFront-fronted.
The 10 Network Cost Patterns Each Tool Solves
Across 35 audits, these are the actual cost patterns we find. Knowing which tool catches which pattern is the key to designing your network architecture.
| Pattern | % of Network Waste | Solution Tool |
|---|---|---|
| AWS service traffic through NAT (S3, DynamoDB) | 28% | Gateway Endpoint (free) |
| AWS service traffic through NAT (SSM, ECR, CW Logs, KMS) | 22% | Interface Endpoint |
| Container image pulls from ECR via NAT | 11% | ECR Interface Endpoint |
| Cross-AZ pod traffic in EKS | 9% | Topology spread + zone-aware routing |
| Cross-AZ replication for stateful services | 7% | Reduce replicas or use DAX/Read Replicas |
| SaaS provider traffic via NAT | 6% | PrivateLink to SaaS |
| Inter-VPC mesh peering | 5% | Transit Gateway consolidation |
| Public API egress via NAT-to-internet | 4% | CloudFront in front of ALB |
| Cross-region replication over-provisioning | 4% | Right-size RPO/RTO |
| Stranded NAT Gateways (dev/test) | 2% | Decommission idle NATs |
| Other | 2% | Various |
61% of network waste is NAT Gateway misuse for AWS service traffic. This is the single biggest target.
The Decision Framework: 5 Questions
Question 1: Where is the traffic going?
- AWS service in same region (S3, DynamoDB): Gateway Endpoint (FREE) — no exceptions
- AWS service in same region (SSM, ECR, etc.): Interface Endpoint
- Third-party SaaS that supports PrivateLink: PrivateLink to SaaS
- Third-party SaaS without PrivateLink: NAT Gateway (legitimate use)
- Internet APIs (Twilio, Stripe, etc.): NAT Gateway (legitimate use)
- Another VPC in your AWS Organization: Transit Gateway (5+ VPCs) or Peering (1-4 VPCs)
- Cross-region same Org: Transit Gateway peering or VPC Peering with inter-region
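The Question 1 mapping is mechanical enough to write down as a lookup. The sketch below is one possible encoding; the category names (`aws_service`, `saas`, `internet_api`, `other_vpc`) and the function signature are assumptions made for illustration, not part of any AWS API.

```python
GATEWAY_SERVICES = {"s3", "dynamodb"}  # the only services with Gateway Endpoints


def pick_tool(kind, service="", supports_privatelink=False, vpc_count=1):
    """Return the tool Question 1 suggests for a traffic destination."""
    if kind == "aws_service":
        if service.lower() in GATEWAY_SERVICES:
            return "Gateway Endpoint"
        return "Interface Endpoint"
    if kind == "saas":
        return "PrivateLink" if supports_privatelink else "NAT Gateway"
    if kind == "internet_api":
        return "NAT Gateway"
    if kind == "other_vpc":
        # 5+ VPCs favors Transit Gateway; 1-4 favors peering.
        return "Transit Gateway" if vpc_count >= 5 else "VPC Peering"
    raise ValueError(f"unknown destination kind: {kind}")


print(pick_tool("aws_service", "S3"))
print(pick_tool("saas", supports_privatelink=True))
print(pick_tool("other_vpc", vpc_count=12))
```

The remaining questions (volume, security, latency) then refine or override this first answer.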
Question 2: How much traffic flows through this path?
- Under ~200GB/month per AZ: Stay on NAT Gateway (the Interface Endpoint's $7.30/month fixed fee exceeds the $0.035/GB it saves)
- 200GB-1TB/month per AZ: Interface Endpoint wins
- Over 1TB/month per AZ: Interface Endpoint is dramatically cheaper
Question 3: What is the security requirement?
- Public internet acceptable: NAT Gateway works
- Stay within AWS network: VPC Endpoints (always within AWS backbone)
- No internet exposure of any kind: Interface Endpoints + PrivateLink only; remove NAT entirely if possible
- Compliance requires private connectivity: PrivateLink to SaaS, Interface Endpoints for AWS services
Question 4: How many VPCs are involved?
- 1 VPC: No inter-VPC concern; focus on NAT vs Endpoints
- 2-4 VPCs: VPC Peering is usually simpler and cheaper
- 5-15 VPCs: Transit Gateway becomes cost-effective
- Over 15 VPCs: Transit Gateway with route table segmentation
- Multi-account organizations: Transit Gateway shared via RAM, or AWS Cloud WAN
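The reason the recommendation flips around five VPCs is that a full peering mesh grows quadratically while Transit Gateway attachments grow linearly. A small sketch of that comparison, using the rounded $36/month attachment fee quoted earlier (the helper names are illustrative):

```python
def peering_connections(vpcs):
    """Number of peering connections needed for a full mesh of `vpcs` VPCs."""
    return vpcs * (vpcs - 1) // 2


def tgw_attachment_fee(vpcs, monthly_per_attachment=36.0):
    """Fixed monthly Transit Gateway cost with one attachment per VPC."""
    return vpcs * monthly_per_attachment


for n in (4, 5, 12, 15):
    print(f"{n} VPCs: {peering_connections(n)} peerings "
          f"vs ${tgw_attachment_fee(n):.0f}/month in TGW attachments")
```

Peering connections carry no hourly fee, so the trade is route-management effort against the Transit Gateway's fixed and per-GB charges.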
Question 5: What is your latency requirement?
- No latency requirement: All options work; pick on cost
- Low latency (under 5ms): VPC Endpoints (same region, AWS backbone) win over NAT-to-internet
- Ultra-low latency (under 1ms): Same-AZ traffic only; topology-aware routing critical
- Cross-region active-active: Transit Gateway peering (single-digit ms inter-region) or accelerated transit
Real-World Cost Modeling: Three Production Workloads
We modeled three actual workload profiles. May 2026 pricing.
Workload A: Mid-Size SaaS Platform (Single Region, Multi-AZ)
A growing SaaS app on AWS:
- 1 VPC across 3 AZs in us-east-1
- 50 services running on EKS
- Traffic profile: 8TB/month to S3, 2TB/month CW Logs, 800GB/month ECR pulls, 1.5TB/month to external APIs, 500GB/month to Datadog SaaS
Naive NAT Gateway-Only Architecture:
- NAT Gateway hourly: 3 AZs x $32 = $96
- NAT data processing: 12.8TB x $0.045 = $590
- Plus actual egress (1.5TB to internet): $128
- Total: $814/month
Optimized Architecture (Gateway + Interface Endpoints + NAT for legit internet traffic):
- Gateway Endpoint for S3: $0 (saves $360 in NAT charges)
- Interface Endpoint for ECR (3 AZs): 3 x $7.30 + 800GB x $0.01 = $30
- Interface Endpoint for CW Logs (3 AZs): 3 x $7.30 + 2TB x $0.01 = $42
- PrivateLink to Datadog (3 AZs): 3 x $7.30 + 500GB x $0.01 = $27
- NAT Gateway (only for legit external API traffic, 1.5TB): $96 + $68 = $164
- Internet egress (1.5TB, unchanged from the naive architecture): $128
- Total: $391/month (52% savings)
Annual savings: $5,076 from a single VPC. For multi-VPC organizations, this scales linearly.
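The Workload A arithmetic can be reproduced with a small model. This sketch uses the per-unit prices from this post, treats 1 TB as 1,000 GB (so the results differ slightly from the rounded figures above), and leaves out internet egress, which is the same in both architectures. Function and key names are illustrative.

```python
NAT_MONTHLY_PER_AZ = 32.0        # rounded hourly fee used in this post
NAT_PER_GB = 0.045
ENDPOINT_MONTHLY_PER_AZ = 7.30
ENDPOINT_PER_GB = 0.01


def nat_only_cost(traffic_gb, azs=3):
    """Monthly cost when every flow goes through NAT Gateway."""
    return azs * NAT_MONTHLY_PER_AZ + sum(traffic_gb.values()) * NAT_PER_GB


def optimized_cost(traffic_gb, gateway, interface, azs=3):
    """Monthly cost after moving some flows to endpoints.

    `gateway` and `interface` are sets of keys in `traffic_gb` that move to
    Gateway Endpoints (free) and Interface/PrivateLink endpoints.
    """
    endpoint_fixed = len(interface) * azs * ENDPOINT_MONTHLY_PER_AZ
    endpoint_data = sum(traffic_gb[k] for k in interface) * ENDPOINT_PER_GB
    nat_gb = sum(gb for k, gb in traffic_gb.items()
                 if k not in gateway and k not in interface)
    nat = azs * NAT_MONTHLY_PER_AZ + nat_gb * NAT_PER_GB
    return endpoint_fixed + endpoint_data + nat


workload_a = {"s3": 8000, "cw_logs": 2000, "ecr": 800,
              "external_apis": 1500, "datadog": 500}
before = nat_only_cost(workload_a)
after = optimized_cost(workload_a, gateway={"s3"},
                       interface={"cw_logs", "ecr", "datadog"})
print(f"NAT-only: ${before:.2f}/month, optimized: ${after:.2f}/month")
```

Swapping in a different traffic dictionary gives a first-pass estimate for another VPC before anyone touches a route table.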
Workload B: Enterprise Multi-VPC Organization (10+ VPCs)
A 12-VPC AWS Organization (production, staging, dev, security, shared services across multiple regions):
- 12 VPCs in us-east-1 + us-west-2
- Traffic profile: 80TB/month to S3, 25TB/month to AWS services, 15TB/month inter-VPC, 8TB/month inter-region
Naive Architecture (NAT in every VPC + VPC Peering Mesh):
- 12 VPCs x 3 AZs x $32 = $1,152 NAT hourly
- NAT data processing (105TB): $4,725
- VPC Peering mesh (12 VPCs = 66 connections): no peering fee but $0.01/GB cross-AZ
- Inter-VPC data: 15TB x $0.01 = $150
- Inter-region (8TB): $0.02 x 8TB = $160
- Total: ~$6,187/month
Optimized Architecture (Endpoints + Transit Gateway):
- Gateway Endpoints for S3 in each VPC: $0
- Interface Endpoints for AWS services (10 services x 12 VPCs x 3 AZs): 360 x $7.30 = $2,628 hourly
- Interface Endpoint data: 25TB x $0.01 = $256
- Transit Gateway: 12 attachments x $36 = $432 + 15TB x $0.05 = $750
- Reduced NAT (only legit internet): 12 x 3 AZs x $32 = $1,152 + light data ~$200
- Inter-region peering: $160
- Total: ~$5,578/month (10% savings)
This shows the diminishing returns of full optimization at enterprise scale: Interface Endpoints in every VPC for every service add up. The right answer is selective endpoint deployment: only deploy endpoints for services with high traffic volumes.
Selective Optimization (Endpoints only where they pay off):
- Gateway Endpoints for S3: $0 (saves $3,600)
- Interface Endpoint for ECR + CW Logs in production VPCs only (3 VPCs x 3 AZs x 2 services): 18 x $7.30 = $131
- NAT for everything else: $1,152 + $1,200 (reduced from misuse) = $2,352
- Transit Gateway: $432 + $750 = $1,182
- Inter-region peering: $160
- Total: ~$3,825/month (38% savings)
The lesson: endpoint cost-benefit is per-service, per-VPC. Don't blindly add endpoints to every VPC; calculate ROI per service.
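One way to apply that lesson is to compute the net monthly effect of each candidate endpoint and keep only the positive ones. The sketch below assumes three AZs per VPC and the list prices used in this post; the service volumes in the example are hypothetical.

```python
def endpoint_net_savings(gb_per_month, azs=3, nat_per_gb=0.045,
                         ep_per_gb=0.01, ep_monthly_per_az=7.30):
    """Monthly dollars saved by moving one service from NAT to an Interface Endpoint.

    Negative values mean the endpoint costs more than it saves.
    """
    return gb_per_month * (nat_per_gb - ep_per_gb) - azs * ep_monthly_per_az


# Hypothetical per-VPC traffic in GB/month, for illustration only.
services = {"ecr": 800, "cw_logs": 2000, "kms": 40, "sts": 15}

worth_it = {name: round(endpoint_net_savings(gb), 2)
            for name, gb in services.items()
            if endpoint_net_savings(gb) > 0}
print(worth_it)
```

In this example only the two high-volume services clear the bar; the chatty but low-volume ones stay on NAT.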
Workload C: AI/ML Platform (Heavy ECR Pulls, Heavy Data Movement)
An AI startup running training and inference:
- 2 VPCs (training + production inference)
- Traffic profile: 30TB/month ECR pulls (large model containers), 15TB/month to S3 for data, 5TB/month to Datadog/observability, 2TB/month to OpenAI API for embeddings
Naive NAT-Only:
- NAT hourly (2 VPCs x 3 AZs): $192
- NAT data processing (52TB): $2,340
- Total: $2,532/month
Optimized:
- Gateway Endpoint for S3: $0 (saves $675)
- ECR Interface Endpoint (2 VPCs x 3 AZs): 6 x $7.30 + 30TB x $0.01 = $344
- CW Logs Interface Endpoint: 6 x $7.30 + ~1TB x $0.01 = $54
- PrivateLink to Datadog (2 VPCs x 3 AZs): 6 x $7.30 + 5TB x $0.01 = $94
- NAT for OpenAI traffic: $192 + 2TB x $0.045 = $282
- Total: $774/month (69% savings)
Annual savings: $21,096 for a 2-VPC AI workload. ECR pulls are the killer when you have large model containers (10-30GB images); the Interface Endpoint pays for itself in days.
When To Pick Each Tool (Cheat Sheet)
| Traffic Pattern | Best Tool | Why |
|---|---|---|
| Pulling from S3 in same region | Gateway Endpoint | Free |
| DynamoDB queries from VPC | Gateway Endpoint | Free |
| Container image pulls (ECR) | Interface Endpoint | Massive savings vs NAT |
| Writing CloudWatch Logs | Interface Endpoint | Cheaper than NAT once volume passes ~200GB/month per AZ |
| Secrets Manager / SSM Parameter Store | Interface Endpoint | Security + cost win |
| KMS / STS calls | Interface Endpoint | Frequency makes it pay off fast |
| Snowflake queries | PrivateLink to Snowflake | Snowflake offers PrivateLink connectivity |
| Datadog metrics/logs ingestion | PrivateLink to Datadog | $0.01/GB vs $0.045/GB through NAT |
| MongoDB Atlas connections | PrivateLink to Atlas | Security + cost |
| Confluent Kafka | PrivateLink to Confluent | Production grade |
| Stripe / Twilio API calls | NAT Gateway | No PrivateLink available |
| Public package mirrors (PyPI, npm) | NAT Gateway or self-hosted mirror | High-volume teams should mirror |
| OpenAI API | NAT Gateway | No PrivateLink yet |
| Multi-VPC dev/staging/prod | VPC Peering if 2-4, Transit Gateway if 5+ | Cost + complexity tradeoff |
| Multi-region replication | Transit Gateway peering | Simpler at scale |
| Public-facing API | CloudFront in front of ALB | Egress cost win + caching |
| GitHub Actions runners egress | NAT Gateway | Most CI/CD jobs hit external APIs |
| Lambda calling AWS services | Lambda runs in VPC + Interface Endpoints | Lambdas in private subnets save NAT charges |
Hidden Costs Most Architecture Diagrams Miss
Hidden Cost 1: Cross-AZ Pod Traffic in EKS
When you don't use topology-aware service routing, EKS pods talk to services in random AZs. Each cross-AZ hop costs $0.01/GB. For chatty microservices, this can be 30-50% of cluster network cost.
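Mitigation: Enable zone-aware Service routing. The old topologyKeys field was removed in Kubernetes 1.22; on current clusters use Topology Aware Routing (the service.kubernetes.io/topology-mode: Auto annotation) or spec.trafficDistribution: PreferClose on Services, and pair it with zone-aware scheduling in Karpenter. Aim for >70% same-AZ traffic.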
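On recent Kubernetes versions, zone preference for a Service is typically expressed with the `trafficDistribution` field. The sketch below builds such a manifest as a Python dictionary and prints it as JSON, which `kubectl apply -f` accepts. The service name, port, and selector are hypothetical, and whether the field is available depends on the cluster version, so treat this as an assumption to verify against your cluster.

```python
import json


def zone_preferring_service(name, port, selector):
    """Build a Service manifest that asks kube-proxy to prefer same-zone endpoints."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "selector": selector,
            "ports": [{"port": port, "targetPort": port}],
            # Prefer endpoints in the caller's zone; falls back to other
            # zones when no local endpoint is available.
            "trafficDistribution": "PreferClose",
        },
    }


manifest = zone_preferring_service("checkout", 8080, {"app": "checkout"})
print(json.dumps(manifest, indent=2))
```

Zone preference only helps if each zone actually has ready pods for the Service, which is why it is paired with topology spread constraints on the workload side.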
Hidden Cost 2: ECR Image Pull Storm on Auto-Scaling
When Karpenter or HPA scales out, every new pod pulls images. A 1GB image x 100 pods scaling out = 100GB through NAT = $4.50 per scale event. We have seen workloads spend $1,000+/month just on scale-up image pulls.
Mitigation: ECR Interface Endpoints (mandatory at scale), paired with the S3 Gateway Endpoint because ECR serves image layers from S3. Plus image caching on nodes (pre-pulled or pre-baked images help).
Hidden Cost 3: NAT Gateway In Dev/Test Accounts (Forgotten)
Engineers spin up dev VPCs with NAT Gateways and forget to remove them. We routinely find $30-100/month NAT Gateways in dev accounts that haven't been used in months.
Mitigation: Tag policy + periodic cleanup. Use AWS Trusted Advisor's idle NAT Gateway check.
Hidden Cost 4: Interface Endpoint Hourly Fee Outweighs Savings For Low-Traffic Services
Adding an Interface Endpoint for a service used 100MB/month costs $7.30/AZ in fixed fees. Unless you save more than $7.30 in NAT charges, you've made it worse.
Mitigation: Calculate ROI per endpoint. At $7.30/month fixed and $0.035/GB saved versus NAT, the breakeven is roughly 200GB/month per AZ.
Hidden Cost 5: VPC Peering Cross-AZ Fees At Scale
VPC Peering is "free" until you cross AZs. For chatty inter-VPC traffic, peering fees compound. On per-GB price alone, peering's $0.01/GB cross-AZ still beats Transit Gateway's $0.05/GB; Transit Gateway wins only at high mesh complexity, when managing the peering routes is itself the dominant cost.
Mitigation: Transit Gateway when peering complexity > 4-5 VPCs.
Hidden Cost 6: PrivateLink Endpoint Fees For Multi-Account SaaS
If you have multiple AWS accounts each connecting to the same SaaS via PrivateLink, you pay endpoint fees in each account. For 10 accounts x 3 AZs = 30 endpoints x $7.30 = $219/month just to connect to one SaaS.
Mitigation: Centralize PrivateLink endpoints in a shared services account, route through Transit Gateway.
Hidden Cost 7: CloudFront Skipped For Public APIs
Many teams put their public API behind an ALB and pay standard internet egress rates on every response. CloudFront in front of the ALB caches responses (saving compute) and routes egress through CloudFront's optimized network (often cheaper than ALB egress).
Mitigation: Always evaluate CloudFront for public APIs. For high-traffic APIs, savings are 30-60%.
Migration Playbook: Reducing NAT Gateway Spend
For VPCs with NAT Gateway costs over $5,000/month, this playbook captures 50-80% savings in 4-8 weeks.
Phase 1: Visibility (Week 1)
- Enable VPC Flow Logs (if not already on)
- Use CloudWatch Logs Insights, or query the logs in S3 via Athena, to identify top destinations by bytes
- Categorize destinations: AWS services, SaaS providers, internet APIs
- Calculate NAT charges per category
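For a quick first look before Athena tables exist, the top-destinations step can be approximated on a downloaded log file. This sketch assumes the default (version 2) flow log format, where the fifth field is the destination address and the tenth is the byte count; the sample records are made up for illustration.

```python
from collections import Counter


def top_destinations(lines, n=20):
    """Sum bytes per destination address from default-format flow log records."""
    totals = Counter()
    for line in lines:
        fields = line.split()
        # Default format: version account-id interface-id srcaddr dstaddr
        # srcport dstport protocol packets bytes start end action log-status
        if len(fields) < 14 or fields[13] != "OK":
            continue  # skip header rows plus NODATA and SKIPDATA records
        totals[fields[4]] += int(fields[9])
    return totals.most_common(n)


sample = [
    "2 123456789012 eni-0a1b2c3d 10.0.1.5 52.216.1.10 44321 443 6 120 900000 1700000000 1700000060 ACCEPT OK",
    "2 123456789012 eni-0a1b2c3d 10.0.1.5 52.216.1.10 44322 443 6 80 100000 1700000000 1700000060 ACCEPT OK",
    "2 123456789012 eni-0a1b2c3d 10.0.1.6 3.5.1.20 44000 443 6 10 5000 1700000000 1700000060 ACCEPT OK",
    "2 123456789012 eni-0a1b2c3d - - - - - - - 1700000000 1700000060 - NODATA",
]
for dst, total_bytes in top_destinations(sample):
    print(dst, total_bytes)
```

To attribute bytes to NAT specifically, filter the input to records from the NAT Gateway's network interface before aggregating.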
Phase 2: Free Wins (Week 2)
- Add Gateway Endpoint for S3 in every VPC (5-minute change, free)
- Add Gateway Endpoint for DynamoDB if used (same)
- Update route tables to direct traffic through endpoints
- Verify traffic shifted via VPC Flow Logs
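The Gateway Endpoint change is a single EC2 API call. The sketch below only builds the request parameters, so it runs without AWS credentials; the VPC and route table IDs are placeholders, and the commented boto3 call shows where the parameters would be used.

```python
def gateway_endpoint_params(vpc_id, region, route_table_ids, service="s3"):
    """Build keyword arguments for EC2 CreateVpcEndpoint (Gateway type)."""
    if service not in ("s3", "dynamodb"):
        raise ValueError("Gateway Endpoints exist only for S3 and DynamoDB")
    return {
        "VpcEndpointType": "Gateway",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.{service}",
        # Associating route tables is what adds the prefix-list routes.
        "RouteTableIds": list(route_table_ids),
    }


# Placeholder IDs for illustration only.
params = gateway_endpoint_params(
    "vpc-0123456789abcdef0", "us-east-1", ["rtb-0aaa1111bbbb2222c"]
)
print(params)

# With boto3 installed and credentials configured, the call would be:
#   import boto3
#   boto3.client("ec2", region_name="us-east-1").create_vpc_endpoint(**params)
```

Every private subnet's route table needs to be in `RouteTableIds`; a subnet whose table is left out keeps sending S3 traffic through NAT.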
Typical immediate savings: 20-30% of NAT charges.
Phase 3: High-ROI Interface Endpoints (Weeks 3-4)
- Identify top AWS services by NAT data volume (usually ECR, CW Logs, SSM, Secrets Manager)
- Add Interface Endpoints for each, in production VPCs first
- Verify traffic shifts via Flow Logs
- Update Security Groups to allow HTTPS (443) from workloads to the endpoint network interfaces, and reclaim any IP space that is no longer needed
Typical incremental savings: 30-40% of remaining NAT charges.
Phase 4: SaaS PrivateLink (Weeks 5-6)
- Identify SaaS providers consuming significant NAT traffic
- Check if each supports PrivateLink (most major ones do: Datadog, Snowflake, Confluent, MongoDB, etc.)
- Set up PrivateLink endpoints in production VPCs
- Update application configs to use PrivateLink endpoint URLs
- Validate, then decommission NAT routes for those services
Typical incremental savings: 10-20% of remaining NAT charges.
Phase 5: Multi-VPC Consolidation (Weeks 7-8)
- Audit VPC Peering mesh complexity
- If 5+ VPCs with mesh peering, plan Transit Gateway migration
- Implement TGW with route segmentation for security boundaries
- Decommission old peering connections after validation
Typical savings: 5-15% on inter-VPC traffic.
Outcome
After completing all phases:
- NAT Gateway data processing typically reduced by 60-80%
- Network-related operational complexity reduced (fewer peering relationships)
- Better security posture (private connectivity for AWS service traffic)
- Engineering team trained on cost-aware network design
When NAT Gateway Is Actually The Right Answer
Don't try to eliminate NAT Gateway entirely. NAT Gateway is the right tool for:
- External APIs that don't support PrivateLink (most SaaS, payment processors, identity providers)
- Egress to public package repositories (PyPI, npm, Maven) unless you operate a mirror
- One-off internet calls from infrastructure (Auto Scaling triggers, custom integrations)
- Low-traffic outbound where Interface Endpoint hourly fee exceeds savings
- Egress monitoring/scanning for security (NAT Gateway can be filtered through security tools)
For these cases, NAT Gateway is fine — just minimize the traffic that flows through it.
A 30-Day AWS Network Cost Audit
If your AWS network bill (NAT Gateway + data transfer) is over $10,000/month, run this audit. We typically find 50-80% savings.
Week 1: Baseline + VPC Flow Logs
- Pull 90 days of network-related charges from Cost Explorer
- Enable VPC Flow Logs in all production VPCs
- Set up Athena tables on Flow Log S3 bucket
- Identify top 20 destinations by bytes processed
Week 2: Categorize Traffic
For top destinations, label as:
- AWS service (S3, DynamoDB, SSM, ECR, etc.) — endpoint candidate
- SaaS provider — PrivateLink candidate
- Public internet API — legitimate NAT
- Internal cross-VPC — peering/TGW analysis
- Cross-AZ same VPC — topology fix
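Labeling destinations by hand does not scale past a few dozen addresses. One approach is to match each address against AWS's published `ip-ranges.json`, which lists prefixes with a `service` tag. The sketch below assumes that file has already been downloaded and parsed; the two prefixes in the example are sample data, and the internal CIDR is an assumption about the VPC addressing plan.

```python
import ipaddress


def build_index(ip_ranges):
    """Turn the 'prefixes' list of ip-ranges.json into (network, service) pairs."""
    return [(ipaddress.ip_network(p["ip_prefix"]), p["service"])
            for p in ip_ranges["prefixes"]]


def classify(ip, index, internal_cidrs=("10.0.0.0/8",)):
    """Label an address as internal, an AWS service, or internet/SaaS."""
    addr = ipaddress.ip_address(ip)
    if any(addr in ipaddress.ip_network(c) for c in internal_cidrs):
        return "internal"
    matches = [(net.prefixlen, svc) for net, svc in index if addr in net]
    if not matches:
        return "internet or SaaS"
    # Prefer the most specific prefix, and a named service over the
    # catch-all "AMAZON" entry that covers the same range.
    best = max(matches, key=lambda m: (m[0], m[1] != "AMAZON"))
    return "aws:" + best[1]


# Sample data shaped like ip-ranges.json, for illustration only.
sample_ranges = {"prefixes": [
    {"ip_prefix": "52.216.0.0/15", "region": "us-east-1", "service": "AMAZON"},
    {"ip_prefix": "52.216.0.0/15", "region": "us-east-1", "service": "S3"},
]}
index = build_index(sample_ranges)
for ip in ("52.216.1.10", "10.0.1.5", "8.8.8.8"):
    print(ip, classify(ip, index))
```

Addresses that fall through to "internet or SaaS" still need a manual pass to separate PrivateLink-capable providers from ordinary internet APIs.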
Week 3: Plan + Estimate Savings
For each category:
- Calculate cost on optimal architecture
- Estimate migration effort (engineering hours)
- Calculate annual savings vs migration cost
- Prioritize by ROI
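Prioritizing by ROI can be as simple as sorting candidate fixes by payback period. In this sketch the savings, hours, and the $150 blended hourly rate are all hypothetical inputs; the structure, not the numbers, is the point.

```python
from dataclasses import dataclass


@dataclass
class Fix:
    name: str
    monthly_savings: float    # dollars per month
    engineering_hours: float  # one-time migration effort


def payback_months(fix, hourly_rate=150.0):
    """Months until the migration effort is repaid by the savings."""
    if fix.monthly_savings <= 0:
        return float("inf")
    return fix.engineering_hours * hourly_rate / fix.monthly_savings


def prioritize(fixes, hourly_rate=150.0):
    """Shortest payback first."""
    return sorted(fixes, key=lambda f: payback_months(f, hourly_rate))


# Hypothetical candidates for illustration only.
candidates = [
    Fix("Transit Gateway migration", 500.0, 120),
    Fix("S3 Gateway Endpoint", 360.0, 2),
    Fix("ECR + CW Logs Interface Endpoints", 54.0, 8),
]
for fix in prioritize(candidates):
    print(f"{fix.name}: {payback_months(fix):.1f} months to pay back")
```

Sorting this way tends to put the free Gateway Endpoints first and larger consolidation projects last, which matches the Week 4 quick-win ordering.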
Week 4: Execute Quick Wins
- Deploy Gateway Endpoints for S3/DynamoDB everywhere (free, fast)
- Deploy ECR + CW Logs Interface Endpoints in production
- Set up PrivateLink for top SaaS providers
- Lock in baseline; plan deeper migrations for next quarter
The Bottom Line
AWS network costs in 2026 are hugely optimizable for most production deployments. The default architecture (NAT Gateway for everything) wastes 50-80% of network spend on traffic that VPC Endpoints, PrivateLink, or Transit Gateway would handle for free or near-free. The fixes are architectural, not application-code rewrites.
The discipline most teams skip: treating network architecture as a per-service cost decision rather than a one-time setup. The right architecture varies by traffic volume per service. Set up VPC Flow Logs visibility first, then optimize iteratively.
If your AWS network bill is over $20,000/month and you have not run a network architecture audit in the last 12 months, you are very likely overpaying by 50-80%. Our cloud cost optimization team runs free AWS network architecture audits and typically captures 50-70% savings within 60 days. Run a free Cloud Waste Scorecard to find your biggest network cost leaks first.