How do I reduce my Google Cloud costs in 2026?

The fastest GCP savings come from four changes: (1) Buy Committed Use Discounts (CUDs) for steady-state Compute Engine and GKE workloads, which cuts compute cost 17-29% with no operational changes. (2) Move BigQuery from on-demand ($6.25/TB scanned) to flat-rate slots for predictable workloads, saving 30-50%. (3) Set Cloud Run min-instances to zero where cold-start latency is acceptable. (4) Switch from Premium (Tier 1) to Standard (Tier 2) network tier for non-latency-critical egress. Combined, these cut typical GCP bills 40-50% within a week.

What are GCP Committed Use Discounts and how do they work?

Committed Use Discounts (CUDs) are GCP's equivalent of AWS Reserved Instances and Savings Plans, but more flexible. There are two types: Resource-based CUDs (commit to specific machine types) save 25-29% for 1-year and 52-57% for 3-year commitments. Spend-based CUDs (commit to a dollar amount of compute usage) save 17% for 1-year and 28% for 3-year, applying flexibly across machine types. Spend-based CUDs are usually the better choice because they don't lock you to a specific machine family. CUDs apply automatically once purchased.

Is BigQuery on-demand or flat-rate pricing better?

It depends on workload predictability. BigQuery on-demand at $6.25/TB scanned is best for unpredictable ad-hoc queries where you pay zero when idle. BigQuery flat-rate slots (or BigQuery Editions) are better when you have steady continuous query workloads. The breakeven is roughly 100 slots (about $2,880/month on Standard Edition at $0.04/slot-hour) running 16+ hours/day worth of queries. For mixed workloads, use a combination: flat-rate for production dashboards, on-demand for analyst queries. Most teams using only on-demand at scale leave 30-50% savings on the table.

Should I use GKE Standard or GKE Autopilot?

GKE Standard charges $0.10/cluster/hour (the monthly free-tier credit covers roughly one zonal cluster per billing account) plus you manage and pay for the underlying nodes. GKE Autopilot abstracts node management and charges per-pod resources at a 30% premium over Standard. For most production workloads with a dedicated platform team, Standard is cheaper. Autopilot wins when you have small clusters, lack platform engineering capacity, or want to eliminate node-level management entirely. The 30% premium is rarely worth it once you have 10+ services.

Why are my GCP egress costs so high?

GCP egress is expensive for three reasons. First, GCP charges $0.08-$0.12/GB for general internet egress, comparable to AWS but often surprising teams who picked GCP for cost reasons. Second, the default 'Premium Tier' (Tier 1) network charges 2-3x what 'Standard Tier' (Tier 2) costs, but many teams use Premium because it's the default. Third, cross-region traffic in GCP is $0.01-$0.08/GB depending on regions. Use Standard Tier for non-latency-critical workloads, route through Cloudflare CDN for free egress on cached content, and consolidate traffic to single regions where possible.

Cloud Cost Optimization

May 18, 2026

By Ravi Kanani

11 GCP Cost Levers Most Teams Miss in 2026 (And How To Fix Each One This Week)

Key Takeaway

GCP teams typically overpay by 40-60% due to: missing CUDs (Committed Use Discounts) on compute (12% savings), defaulting to per-TB BigQuery on-demand instead of slot reservations (35% savings on heavy queries), unused Cloud Run min-instances (8%), Tier 1 networking when Tier 2 would suffice (15-25% network savings), and 7 other recurring patterns. Fixing the top 4 typically cuts GCP costs 40-50% within a week. Fixing all 11 cuts costs 55-70%.

We Audited 38 GCP Accounts. Average Bill Was 55% Higher Than Needed.

A growth-stage AI startup we worked with in early 2026 was running their entire stack on Google Cloud. Their CTO had picked GCP three years earlier "because it was cheaper than AWS." Their monthly GCP bill: $67,000. A comparable workload on AWS would have cost roughly $58,000, meaning the "cheaper" cloud was actually 16% more expensive.

We audited their setup. The findings were typical:

No Committed Use Discounts (paying full on-demand for predictable steady-state workloads worth $34K/month)
BigQuery on-demand for everything including 200+ hours/day of dashboard refreshes
Premium Tier networking by default for all egress, much of which was internal/non-latency-critical
GKE clusters running all on-demand with no preemptible/Spot configuration
Cloud Run services running 24/7 min-instances for traffic that was sporadic and tolerant of cold starts
Cross-region traffic patterns that paid $0.08/GB unnecessarily
Stale dev/test projects running compute that nobody had touched in 6+ months
GKE Autopilot for production workloads where Standard would have been 30% cheaper

After 8 weeks of changes, their bill dropped to $28,000/month. Annual savings: $468,000. No application code rewrites. Just configuration and commitment optimization.

This pattern is consistent across 38 GCP audits we ran in 2025-2026: the average GCP bill is 55% higher than it should be. The reasons are structurally different from AWS: GCP's commitment system requires proactive purchase (no commitment discount applies until you buy one), BigQuery's two pricing models confuse most teams, and GCP's network tier system isn't well documented for cost-conscious users.

GCP is not always cheaper than AWS. It can be when configured well. Most teams configure it badly. This post is the actual fix list — 11 specific levers, with real cost math and concrete fixes you can apply this week.

The 11 Hidden GCP Cost Levers

Across the 38 audits, these are the patterns we find. Numbers show how often each pattern occurred and the typical savings.

#	Lever	Found in	Typical Savings
1	Missing Committed Use Discounts	84% of accounts	17-29% on compute
2	BigQuery on-demand for steady workloads	71% of accounts	30-50% on BQ
3	Premium (Tier 1) network when Standard suffices	66% of accounts	15-25% on egress
4	Cloud Run min-instances misconfigured	58% of accounts	8-15% on Cloud Run
5	GKE on-demand without preemptible/Spot	53% of accounts	30-60% on GKE compute
6	Cross-region traffic that could be co-located	47% of accounts	10-25% on network
7	Cloud Storage class misuse (Standard for cold)	42% of accounts	30-70% on cold storage
8	Dataflow shuffle disk over-provisioning	37% of accounts	15-30% on Dataflow
9	Stale dev/test projects accumulating cost	34% of accounts	5-15% overall
10	Persistent Disk type mismatch (SSD where Standard works)	32% of accounts	40-60% on PD
11	GKE Autopilot for workloads better on Standard	24% of accounts	25-30% on GKE

The numbers don't add to 100% because they overlap. Fixing the top 4 typically cuts costs 40-50%.

Lever 1: Missing Committed Use Discounts (Highest-Impact Lever)

The trap: GCP offers Committed Use Discounts (CUDs) that save 17-57% on compute, but no commitment discount applies until you buy one. You must proactively purchase CUDs, and most teams never do.

Why teams overpay: GCP's billing console buries CUDs under a separate "Commitments" section. The default experience is full on-demand pricing. Engineering teams assume "we'll handle commitments later" and never do.

The fix: Purchase Spend-based CUDs for your steady-state compute baseline.

1-year Spend-based CUD: 17% discount across all eligible compute
3-year Spend-based CUD: 28% discount
1-year Resource-based CUD (specific machine types): 25-29% discount
3-year Resource-based CUD: 52-57% discount

Real cost math:

$20,000/month steady-state Compute Engine spend, no CUDs
After 1-year Spend-based CUD: $20,000 × 0.83 = $16,600/month
Savings: $3,400/month, $40,800/year, no operational changes

For larger spend ($100K+/month), 3-year Resource-based CUDs on stable workloads save 52-57%. The math is unambiguous: if you can predict your baseline compute usage 12 months out (most production teams can), buy CUDs.

How To Calculate The Right Commitment Size

Pull last 90 days of compute spend, broken down by machine family
Identify the usage level you're at or above 75% of the time (strictly speaking the 25th percentile of the usage distribution; this post calls it the 75th percentile rule)
Commit to that usage level via 1-year Spend-based CUD
Re-evaluate every 6 months and add commitment as baseline grows

The risk is over-committing and paying for capacity you don't use. The 75th percentile rule keeps you at ~95% utilization of commitments.
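The sizing steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a billing integration: the function names `commitment_level` and `annual_cud_savings` and the daily spend figures are hypothetical, and the 17% default is the 1-year Spend-based discount quoted earlier in this post.

```python
import math


def commitment_level(usage_samples, coverage=0.75):
    """Return the spend level that usage meets or exceeds `coverage` of the time."""
    if not usage_samples:
        raise ValueError("need at least one usage sample")
    ordered = sorted(usage_samples, reverse=True)  # highest spend first
    # The k-th highest sample is met or exceeded by k of the n samples.
    k = max(1, math.ceil(coverage * len(ordered)))
    return ordered[k - 1]


def annual_cud_savings(monthly_commitment, discount=0.17):
    """Yearly savings if the commitment is fully used (discount is an assumption)."""
    return monthly_commitment * discount * 12


# Hypothetical daily compute spend in dollars (illustrative, not audit data).
daily_spend = [640, 655, 700, 690, 720, 610, 600, 665, 710, 730, 695, 650]
level = commitment_level(daily_spend)
print(f"Commit to about ${level}/day of compute spend")
print(f"Annual savings on $20,000/month: ${annual_cud_savings(20_000):,.0f}")
```

Raising `coverage` makes the commitment more conservative (smaller commitment, higher utilization); lowering it captures more discount at the cost of more idle commitment.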

Lever 2: BigQuery On-Demand for Steady Workloads

The trap: BigQuery has two completely different pricing models. On-demand bills $6.25 per TB scanned (only when you query). Flat-rate bills per slot-hour ($0.04-$0.10/slot-hour depending on Edition). Most teams use only on-demand and overpay heavily on production dashboards.

Why teams overpay: On-demand is the default, and pricing pages emphasize its simplicity. Teams don't realize that 100 slots running 16 hours/day (100 × 16 × $0.04 = $64/day) cost less than on-demand once the workload scans more than about 10TB/day ($64 ÷ $6.25/TB).

The fix: Audit BigQuery query patterns. If you have steady continuous queries (production dashboards, ETL pipelines, scheduled reports), move them to flat-rate (BigQuery Editions). Keep ad-hoc analyst queries on on-demand.

Real cost math:

Production dashboards scan 800TB/month at on-demand: $5,000/month
Same workload on Standard Edition with 100 slots: $0.04 × 100 × 720 = $2,880/month
42% savings

For larger workloads ($20K+/month BigQuery), Enterprise Edition with autoscaling slots delivers further savings via predictable pricing for most queries plus on-demand for spikes.
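The on-demand versus slots comparison above is easy to rerun with your own numbers. The sketch below uses only the two prices quoted in this section ($6.25/TB and $0.04/slot-hour); the function names are hypothetical, and it assumes the reserved slots are actually sufficient to run the workload, which you would need to verify separately.

```python
ON_DEMAND_PER_TB = 6.25  # on-demand price per TB scanned, as quoted above


def on_demand_cost(tb_scanned):
    """Monthly on-demand cost for a given number of TB scanned."""
    return tb_scanned * ON_DEMAND_PER_TB


def slot_cost(slots, hours, rate_per_slot_hour=0.04):
    """Cost of holding `slots` for `hours` at a per-slot-hour rate."""
    return slots * hours * rate_per_slot_hour


def breakeven_tb(slots, hours, rate_per_slot_hour=0.04):
    """TB scanned at which on-demand costs the same as the slot reservation."""
    return slot_cost(slots, hours, rate_per_slot_hour) / ON_DEMAND_PER_TB


monthly_on_demand = on_demand_cost(800)   # 800TB/month of dashboard scans
monthly_slots = slot_cost(100, 720)       # 100 slots for a 720-hour month
savings = 1 - monthly_slots / monthly_on_demand

print(f"On-demand: ${monthly_on_demand:,.0f}  Slots: ${monthly_slots:,.0f}")
print(f"Savings: {savings:.0%}")
print(f"Breakeven: {breakeven_tb(100, 720):.1f} TB/month")
```

If your monthly scan volume sits below the breakeven figure, on-demand is likely the cheaper option for that workload.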

Mixed-Mode BigQuery Architecture

The best architecture for teams over $5K/month BigQuery spend:

Production dashboards and ETL: Flat-rate slots (Standard or Enterprise Edition)
Ad-hoc analyst queries: On-demand pricing
Heavy ML training queries: Auto-scale slots for the duration

This pattern cuts BigQuery costs 30-50% vs all-on-demand at scale.

Lever 3: Premium Network Tier When Standard Suffices

The trap: GCP's "Premium Tier" (Tier 1) network is the default and uses Google's high-performance backbone. "Standard Tier" (Tier 2) uses public internet routes between regions and costs roughly one-half to one-third as much. Most teams pay Premium without measuring whether they need it.

Why teams overpay: Premium is default. Documentation emphasizes performance benefits without explaining the cost difference. Teams don't realize Standard exists until they look at network bills closely.

The fix: Identify which traffic actually needs Premium-level inter-region latency. Most internal traffic (database replication, log shipping, backup transfers) does not. Configure Standard Tier for those workloads via VPC routing.

Real cost math:

50TB/month egress on Premium Tier: $0.12/GB × 50,000GB = $6,000
Same egress on Standard Tier: $0.04/GB × 50,000GB = $2,000
Savings: $4,000/month, 67% on egress

Applied this way, Standard Tier doesn't compromise customer-facing performance: it only carries inter-region traffic where end-user latency isn't relevant.

Lever 4: Cloud Run Min-Instances Misconfigured

The trap: Cloud Run's min-instances setting keeps containers warm to eliminate cold starts. Each min-instance bills 24/7 at the always-on rate ($0.0648/vCPU-hour, i.e. $0.000018/vCPU-second). Most teams set min-instances "for safety" without measuring whether traffic justifies it.

Why teams overpay: "Just in case" defensive configuration. min-instances=1 looks small but costs about $47/month per service per vCPU. Across 40 services, that's roughly $1,866/month for warm capacity that's often unused.

The fix: For each Cloud Run service:

Check actual traffic patterns
Measure cold-start frequency and impact
If cold-start latency under 1 second is acceptable, set min-instances=0
If you genuinely need warm capacity, set min-instances based on p95 traffic, not p99

Real cost math:

40 services × min-instances=1 × 1 vCPU × $0.0648/vCPU-hour × 720 hours = $1,866/month
Reduce to min-instances=0 on 30 sporadic services: saves $1,400/month
Keep min-instances=1 on 10 latency-critical services: $466/month

For most teams, min-instances=0 with traffic-based autoscaling is the right default.
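To see what warm capacity costs across a fleet, the arithmetic above can be wrapped in a small helper. This is a sketch under stated assumptions: `min_instance_cost` is a hypothetical name, the $0.0648 per vCPU-hour default is the effective rate implied by the roughly $46-47/month per vCPU figure above, and memory charges are ignored.

```python
HOURS_PER_MONTH = 720  # the 30-day month used in this post


def min_instance_cost(services, min_instances=1, vcpus=1.0,
                      rate_per_vcpu_hour=0.0648):
    """Monthly CPU cost of keeping min-instances warm (memory is ignored)."""
    return (services * min_instances * vcpus
            * rate_per_vcpu_hour * HOURS_PER_MONTH)


before = min_instance_cost(40)   # every service keeps one warm instance
after = min_instance_cost(10)    # only latency-critical services stay warm

print(f"Before: ${before:,.2f}/month")
print(f"After:  ${after:,.2f}/month")
print(f"Saved:  ${before - after:,.2f}/month")
```

Passing a larger `vcpus` or `min_instances` shows how quickly the warm-capacity bill grows for heavier services.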

Lever 5: GKE On-Demand Without Preemptible/Spot

The trap: GKE clusters default to standard on-demand VM nodes. GCP offers Preemptible VMs (now called Spot VMs in newer GKE versions) at 60-91% discount, but you must explicitly configure node pools to use them.

Why teams overpay: Spot VMs require workload tolerance for interruption (30-second termination notice, plus a 24-hour maximum lifetime on legacy Preemptible VMs). Teams skip this complexity and run all on-demand.

The fix: Add Spot VM node pools to GKE clusters and route appropriate workloads:

Stateless web tier: 70% Spot
Batch jobs: 90% Spot
CI/CD runners: 95% Spot
Stateful services (databases, queues): 0% Spot

Real cost math:

$15,000/month GKE compute, all on-demand
Refactor: 60% on Spot at 80% discount = $9,000 × 0.20 = $1,800
Remaining 40% on-demand with CUD = $6,000 × 0.83 = $4,980
New total: $6,780/month vs $15,000 = 55% savings

(See our Spot Instances Decision Framework for which workloads tolerate Spot.)
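The Spot-plus-CUD blend can be modeled with one function so you can test different Spot shares before changing node pools. This is an illustrative sketch: `blended_compute_cost` is a hypothetical name, and the discounts are inputs you would replace with your own observed rates.

```python
def blended_compute_cost(total, spot_share, spot_discount, cud_discount):
    """Monthly cost after moving `spot_share` of spend to Spot and
    covering the remainder with a committed use discount."""
    if not 0 <= spot_share <= 1:
        raise ValueError("spot_share must be between 0 and 1")
    spot = total * spot_share * (1 - spot_discount)
    committed = total * (1 - spot_share) * (1 - cud_discount)
    return spot + committed


baseline = 15_000
new_total = blended_compute_cost(baseline, spot_share=0.60,
                                 spot_discount=0.80, cud_discount=0.17)
print(f"New total: ${new_total:,.0f}/month")
print(f"Savings: {1 - new_total / baseline:.0%}")
```

Setting `spot_share=0` isolates the effect of the commitment alone, which is a useful lower bound for clusters that cannot tolerate interruption.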

Lever 6: Cross-Region Traffic That Could Be Co-Located

The trap: Multi-region GCP deployments accumulate inter-region traffic charges ($0.01-$0.08/GB depending on regions). For chatty microservices spread across regions, this can be 30-40% of network spend.

Why teams overpay: Multi-region was set up for HA but actual traffic patterns weren't analyzed. Services that don't need geographic distribution end up communicating cross-region for no benefit.

The fix:

Use VPC Flow Logs to identify cross-region traffic
Co-locate chatty service pairs in the same region
Use multi-region only for genuinely region-redundant services (databases with replicas, primary/standby topologies)

Real cost math:

25TB/month cross-region traffic at $0.04/GB: $1,000/month
After consolidating chatty services: 5TB cross-region = $200/month
Savings: $800/month

Lever 7: Cloud Storage Class Misuse

The trap: Cloud Storage has 4 classes: Standard, Nearline, Coldline, Archive. Most teams put everything in Standard ($0.020/GB/month) when older data could live in Nearline ($0.010/GB/month), Coldline ($0.004/GB/month), or Archive ($0.0012/GB/month).

Why teams overpay: Default is Standard. Teams don't set up lifecycle rules to transition older data.

The fix: Add object lifecycle rules:

30 days → Nearline
90 days → Coldline
365 days → Archive

Real cost math:

100TB stored in Standard: $2,000/month
After lifecycle rules (assuming 30% Standard, 30% Nearline, 30% Coldline, 10% Archive):
- Standard 30TB × $20/TB = $600
- Nearline 30TB × $10/TB = $300
- Coldline 30TB × $4/TB = $120
- Archive 10TB × $1.20/TB = $12
- Total: $1,032/month, 48% savings

This pattern is identical to S3 lifecycle rules, but most GCP teams never configure it. GCS's closest equivalent to S3 Intelligent-Tiering, Autoclass, is also opt-in per bucket.
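The 30/90/365-day transitions listed above can be generated as a lifecycle configuration. The sketch below builds the JSON structure Cloud Storage uses for lifecycle rules (`rule`, `action`, `condition`); the helper name `lifecycle_config` is hypothetical, and you should check the output against the current lifecycle documentation before applying it to a real bucket.

```python
import json


def lifecycle_config(transitions):
    """Build a GCS lifecycle config from (age_in_days, storage_class) pairs."""
    rules = [
        {
            "action": {"type": "SetStorageClass", "storageClass": storage_class},
            "condition": {"age": age_days},
        }
        for age_days, storage_class in sorted(transitions)
    ]
    return {"rule": rules}


config = lifecycle_config([
    (30, "NEARLINE"),
    (90, "COLDLINE"),
    (365, "ARCHIVE"),
])
print(json.dumps(config, indent=2))
```

The printed JSON can be saved to a file and applied with a command such as `gsutil lifecycle set FILE gs://YOUR_BUCKET`, where the file and bucket names are placeholders.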

Lever 8: Dataflow Shuffle Disk Over-Provisioning

The trap: Dataflow jobs default to large shuffle disk allocations regardless of actual job needs. For many ETL workloads, the default 400GB shuffle disk per worker is 5-10x what's needed.

Why teams overpay: Defaults are conservative, and teams don't tune the --diskSizeGb parameter.

The fix: Profile actual shuffle requirements per job and reduce disk size accordingly. For many simple transformations, 50-100GB shuffle disk is sufficient. Use Dataflow's Streaming Engine to externalize shuffle entirely (eliminating worker disk altogether).

Real cost math:

100 worker job, 400GB shuffle disk each, running 8 hours/day
40TB total disk × $0.04/GB-month / 30 days × 8 hours / 24 hours = $17.78/day = $533/month
Same workload with 100GB disk: $133/month
Savings: 75%

For frequently-run jobs, this adds up fast.
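The prorated disk math above generalizes to any worker count and disk size. This is a minimal sketch: `prorated_disk_cost` is a hypothetical helper, and it assumes disks are billed only while workers run, at the $0.04/GB-month rate used in the example.

```python
def prorated_disk_cost(workers, gb_per_worker, hours_per_day,
                       price_per_gb_month=0.04):
    """Monthly disk cost for workers that run only part of each day."""
    full_time = workers * gb_per_worker * price_per_gb_month
    return full_time * hours_per_day / 24


default_cost = prorated_disk_cost(100, 400, 8)   # 400GB per worker
tuned_cost = prorated_disk_cost(100, 100, 8)     # 100GB per worker

print(f"Default: ${default_cost:,.0f}/month")
print(f"Tuned:   ${tuned_cost:,.0f}/month")
print(f"Savings: {1 - tuned_cost / default_cost:.0%}")
```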

Lever 9: Stale Dev/Test Projects Accumulating Cost

The trap: GCP's project structure makes it easy to spin up dev/test environments and forget them. Engineers create projects, deploy resources, and move on. The compute keeps running and billing.

Why teams overpay: Lack of project lifecycle governance. No automated cleanup. Cost ownership unclear.

The fix:

Tag every project with owner, purpose, expected lifetime
Set up monthly review of zero-traffic resources
Implement auto-shutdown for environments tagged as dev/test
Use GCP's Cost Anomaly Detection to flag unused projects

Real cost math:

12 dev/test projects averaging $300/month each = $3,600/month
After cleanup: 4 active projects at $200/month = $800/month
Savings: $2,800/month, 78%

This is pure waste — services nobody uses, generating bills.

Lever 10: Persistent Disk Type Mismatch

The trap: GCP Persistent Disks come in types: Standard ($0.04/GB/month), Balanced SSD ($0.10/GB/month), SSD ($0.17/GB/month), Extreme SSD (much higher). Most teams use SSD by default when Standard or Balanced would suffice for the workload.

Why teams overpay: SSD is fast and "feels professional." Teams don't measure whether their workload is actually IOPS-bound (where SSD matters) or throughput-bound (where Standard is fine).

The fix: Profile actual disk I/O per workload:

Throughput-bound (logs, backups, batch reads): Standard
Mixed read/write with moderate IOPS: Balanced SSD
High IOPS database storage: SSD
Real-time gaming, financial trading: Extreme SSD

Real cost math:

50TB on SSD: $8,500/month
After analysis: 30TB on Standard ($1,200) + 20TB on Balanced SSD ($2,000) = $3,200/month
Savings: $5,300/month, 62%

Lever 11: GKE Autopilot for Workloads Better on Standard

The trap: GKE Autopilot abstracts node management at a 30% premium. For small teams without platform engineering capacity, this is worth the cost. For teams running 50+ services with experienced platform engineers, GKE Standard is dramatically cheaper.

Why teams overpay: Autopilot is marketed as "easier" without clear cost comparison. Teams pick it for new clusters and never reconsider.

The fix: For each GKE cluster, evaluate:

Cluster size and complexity (Standard wins above ~30 services)
Platform team capacity (Autopilot wins if you have less than 1 FTE)
Workload variability (Standard wins for steady, Autopilot for sporadic)

Real cost math:

50-service production cluster on Autopilot: $12,000/month effective
Same cluster on Standard with proper node pool config: $9,000/month
Savings: $3,000/month, 25%

The migration takes 2-4 weeks but the savings are permanent.

The Decision Framework: 5 Questions Before Adding GCP Resources

When deploying new resources on GCP, ask:

Question 1: What is the steady-state baseline?

If the workload runs continuously, calculate the 75th percentile usage and commit via Spend-based CUD before deployment. CUDs apply to existing usage too — you don't need to wait.

Question 2: Does this workload need Premium Tier networking?

Default to Standard Tier unless you have a measured latency requirement. Most internal traffic, database replication, and backup transfers work fine on Standard.

Question 3: Is this workload truly stateless and interruption-tolerant?

If yes, configure Spot VMs (60-91% savings). If you're running stateless workers on standard VMs without exploring Spot, you're leaving major savings on the table.

Question 4: What is the real BigQuery query pattern?

For predictable production workloads, calculate flat-rate slot cost vs on-demand. Teams running 4+ hours/day of consistent queries should be on flat-rate.

Question 5: What storage class and disk type are appropriate?

For Cloud Storage, set lifecycle rules from day one. For Persistent Disk, measure IOPS requirements before defaulting to SSD.

The 7-Day GCP Cost Audit

If your GCP bill is over $10,000/month, run this audit:

Day 1: Visibility

# List existing commitments and their status
gcloud compute commitments list

# Show the project's default network tier (empty output typically means the default, Premium)
gcloud compute project-info describe --format="value(defaultNetworkTier)"

# List BigQuery slot reservations in the US location
bq ls --reservation --project_id=YOUR_PROJECT --location=US

Identify spend by service via Cost Tools dashboard. Find the top 5 cost drivers.
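Once you have spend by service, ranking the top cost drivers is a short script. The sketch below assumes you have already exported per-service monthly spend into a dictionary; the function name `top_cost_drivers` and every figure in `spend` are hypothetical placeholders, not audit data.

```python
def top_cost_drivers(spend_by_service, n=5):
    """Return the n largest services as (name, cost, share_of_total) tuples."""
    total = sum(spend_by_service.values())
    if total <= 0:
        return []
    ranked = sorted(spend_by_service.items(),
                    key=lambda item: item[1], reverse=True)
    return [(name, cost, cost / total) for name, cost in ranked[:n]]


# Hypothetical monthly spend in dollars, for illustration only.
spend = {
    "Compute Engine": 21_000,
    "BigQuery": 14_000,
    "Kubernetes Engine": 9_000,
    "Networking": 6_500,
    "Cloud Storage": 3_000,
    "Cloud Run": 2_500,
    "Dataflow": 1_000,
}

for name, cost, share in top_cost_drivers(spend):
    print(f"{name:<20} ${cost:>8,.0f}  {share:.0%}")
```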

Day 2-3: Commitment Buy

Calculate baseline compute spend (75th percentile of last 90 days)
Buy 1-year Spend-based CUD for that level
Document expected savings

Day 4: Network Tier Audit

Identify Premium Tier traffic via VPC Flow Logs
Configure Standard Tier for non-latency-critical workloads
Update VPC routing accordingly

Day 5: BigQuery Optimization

Audit query patterns by hour and day of week
Identify candidates for flat-rate slots
Purchase appropriate Editions/slots

Day 6: Compute Right-Sizing

Check Recommender for VM right-sizing recommendations
Apply rightsizing for top spenders
Configure Spot VMs for stateless workloads

Day 7: Storage and Disk Optimization

Add Cloud Storage lifecycle rules
Audit Persistent Disk types vs actual usage
Schedule monthly review going forward

After this 7-day audit, expect 35-55% cost reduction within the first billing cycle.

When GCP Is The Wrong Choice (Be Honest)

GCP is excellent for some workloads and a poor fit for others. Be honest:

GCP wins for:

BigQuery-heavy data workloads (no AWS equivalent at the price/performance)
AI/ML workloads using Vertex AI and TPUs
Cloud Run-based serverless containers
Teams already on Google Workspace

GCP loses to AWS or Cloudflare for:

General object storage and CDN (Cloudflare R2 + CDN dominates on cost)
Container orchestration at scale (EKS has more mature ecosystem despite higher cost)
Mature compliance frameworks (AWS has more certifications, more regions)
Multi-region deployments outside North America/Western Europe

If your only reason to use GCP is "it's cheaper," you may be on the wrong cloud. Match cloud to workload, not to perception.

The Bottom Line

GCP is a competitive cloud when configured well and an expensive cloud when configured badly. The default experience leans toward overspending: no automatic commitment application, Premium Tier networking by default, BigQuery on-demand by default, and Standard storage class for everything.

The discipline most GCP teams skip: treating commitment management, network tier selection, and BigQuery optimization as continuous practices rather than one-time setup decisions. Run quarterly reviews. Re-evaluate commitments as your baseline grows. Move BigQuery workloads between on-demand and slots as patterns evolve.

If your GCP bill is over $20,000/month and you haven't audited commitments, network tiers, and BigQuery patterns in the last 6 months, you are very likely overpaying by 40-60%. Our cloud cost optimization team runs free GCP audits and typically captures 40-55% savings within 60 days. Run a free Cloud Waste Scorecard to find your biggest GCP cost leaks first.
