
GCP Cost Optimization: CUD vs SUD Strategy 2026

Google Cloud's automatic discounting model — Sustained Use Discounts that require zero commitment — creates a unique cost baseline that changes the calculus for Committed Use Discounts. This guide explains when each applies, and how to build the optimal GCP pricing strategy.

Key figures at a glance:

  • 3-year CUD discount: 55%
  • 1-year CUD discount: 37%
  • Sustained Use Discount: up to 30%
  • Spot VM savings: up to 91%

This guide is part of the Cloud Cost Optimization: Enterprise FinOps Guide series, focusing on Google Cloud Platform (GCP). GCP's pricing model differs from AWS and Azure in important ways: automatic Sustained Use Discounts (SUDs) apply without any commitment, creating a built-in cost floor that changes how you should think about Committed Use Discounts (CUDs). For AWS and Azure tactics, see our AWS Cost Optimization and Azure Cost Management guides.

GCP's Automatic Discounting Model

Google Cloud applies Sustained Use Discounts (SUDs) automatically to Compute Engine N1, N2, and N2D machine types that run for more than 25% of a calendar month — with no action required from the customer. The discount increases progressively: resources running 25–50% of the month receive a 10% discount; 50–75% receives 20%; above 75% receives up to 30%. For most production workloads running continuously, SUDs deliver 20–30% savings off on-demand rates without any commitment purchase.
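A simplified model of those tiers, using the flat thresholds described above (Google's billing actually applies the discount incrementally within the month, so treat this as an approximation for planning, not invoicing):

```python
# Approximate SUD discount for eligible (N1/N2/N2D) machine types, using
# the flat tier thresholds described above. Real invoices apply the
# discount incrementally per usage tier, so they will differ slightly.

def sud_discount(fraction_of_month: float) -> float:
    """Discount rate for a VM running this fraction of a calendar month."""
    if fraction_of_month > 0.75:
        return 0.30
    if fraction_of_month > 0.50:
        return 0.20
    if fraction_of_month > 0.25:
        return 0.10
    return 0.0

def effective_monthly_cost(on_demand_monthly: float, fraction: float) -> float:
    """Partial-month cost after the approximate SUD."""
    return on_demand_monthly * fraction * (1 - sud_discount(fraction))

# A $100/month (on-demand) VM running the full month:
print(f"{effective_monthly_cost(100.0, 1.0):.2f}")  # 70.00
```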

SUD Eligibility

Not all GCP machine types receive automatic SUDs. E2 instances (cost-optimised), N4, C3, and C4 machine families, as well as sole-tenant nodes and GPUs, have different discount structures. Always verify SUD eligibility for your specific machine type in the GCP pricing documentation before modelling savings. The general-purpose N1 and N2 families — which represent the majority of most enterprises' Compute Engine spend — receive full SUD treatment.

SUDs create an important baseline: even without purchasing CUDs, your production GCP compute workloads running 24/7 will receive approximately 30% automatic discount. This means the incremental value of CUDs is calculated against the SUD price, not the on-demand price — a distinction that changes the ROI analysis for commitment purchases.
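The arithmetic is worth making explicit. A minimal sketch, using a hypothetical $10,000 monthly spend and the headline rates from this guide:

```python
# Incremental value of a 3-year CUD measured against the SUD baseline,
# not the on-demand price. The dollar figure is illustrative.

on_demand = 10_000.0                     # monthly on-demand compute cost
sud_price = on_demand * (1 - 0.30)       # 24/7 workload: ~30% automatic SUD
cud_3yr_price = on_demand * (1 - 0.55)   # 3-year CUD: 55% off on-demand

# Headline view: "55% saving". Incremental view against what you'd pay anyway:
incremental_saving = (sud_price - cud_3yr_price) / sud_price
print(f"{incremental_saving:.1%}")  # 35.7%
```

The ROI of the commitment is the roughly 36% incremental saving, not the 55% headline rate.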

CUD vs SUD: Which to Use When

Committed Use Discounts (CUDs) provide deeper discounts (37% for 1-year, 55% for 3-year) in exchange for a commitment to a specific amount of compute resources (vCPU and memory) in a specific region. The key decision is whether the incremental discount above your automatic SUD is worth the inflexibility of a multi-year commitment.


| Scenario | SUD Only | 1-Year CUD | 3-Year CUD | Recommendation |
|---|---|---|---|---|
| Stable production, long-term | ~30% | 37% | 55% | 3-year CUD |
| Growing production workload | ~30% | 37% | 55% (risk of over-commit) | 1-year CUD, conservative size |
| Variable / autoscaling workload | ~20–30% | Higher cost risk | Higher cost risk | SUD only + Spot for variable |
| Short-lived project workload | Pro-rated SUD | Wasted commitment | Wasted commitment | On-demand or Spot |

The GCP CUD strategy principle: commit only to the resource volume you are confident will run for the full commitment period. For stable production environments running N1 or N2 compute for 2+ years, 3-year CUDs deliver 25 percentage points more savings than SUDs alone — worth the commitment. For growing workloads, under-commit and use SUDs for the growth layer. CUDs cannot be cancelled or resized, and unused committed resources are billed regardless of utilisation.

CUD Over-Commitment Risk

Unlike AWS Savings Plans (which flex across instance types) or Azure RIs (which have instance size flexibility), GCP CUDs commit to a specific machine type, vCPU, and memory configuration in a specific region. If you downsize or decommission the committed resource, you continue to pay the CUD charge until the commitment expires. Model your CUD purchases against conservative (P10) usage projections, not P50 or peak usage estimates.
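One way to operationalise the P10 guidance above is to size the commitment from historical usage. The vCPU telemetry values and the percentile helper here are illustrative; in practice you would pull hourly usage for the target region and machine family from billing export:

```python
# Size a CUD purchase against a conservative P10 usage projection.
# The telemetry values below are hypothetical example data.

def percentile(values, q):
    """Linear-interpolated percentile, q in 0-100 (numpy-style default)."""
    s = sorted(values)
    idx = (len(s) - 1) * q / 100
    lo = int(idx)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (idx - lo) * (s[hi] - s[lo])

hourly_vcpus = [80, 92, 88, 75, 110, 95, 84, 79, 101, 90]  # example data

p10 = percentile(hourly_vcpus, 10)   # near-guaranteed baseline (78.6)
p50 = percentile(hourly_vcpus, 50)   # median (89.0): too aggressive to commit

commit_vcpus = int(p10)  # commit to the floor; SUDs and Spot cover the rest
print(commit_vcpus)      # 78
```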

Spot VMs and Preemptible Instances

GCP Spot VMs (the current product; originally Preemptible VMs) offer discounts of 60–91% compared to on-demand pricing, at the cost of potential preemption with 30-second shutdown notice. Spot pricing is dynamic — it varies by machine type and region — but is typically 60–80% below on-demand for N2 and E2 families.

Spot VMs are well-suited to: batch data processing (Dataflow, Spark on Dataproc), ML training jobs (GPUs on Spot deliver exceptional value), CI/CD build infrastructure, rendering workloads, and any batch task that can checkpoint state and restart after preemption. Google's Dataproc and GKE services have native Spot VM integration, making Spot adoption straightforward for organisations already using these managed services.

Mixing GKE node pools with Spot VMs is a powerful pattern: configure a primary node pool with CUD-covered on-demand nodes for stable baseline workloads, and a secondary Spot node pool with autoscaling for burst and batch workloads. This hybrid approach can reduce GKE compute costs by 40–60% compared to all-on-demand configurations. See our dedicated guide to Kubernetes Cost Optimisation.

BigQuery Cost Optimisation

BigQuery is one of the largest cost drivers in GCP-heavy enterprises — particularly those running analytics, data warehouse, or machine learning workloads. BigQuery offers two pricing models: on-demand (pay per TB of data scanned) and flat-rate (pay per reservation slot-hour, regardless of bytes scanned). Choosing the right model — or combining them — is one of the highest-impact GCP cost decisions.


BigQuery: On-Demand vs Flat-Rate
When Flat-Rate Pricing Becomes the Better Choice
BigQuery on-demand charges $6.25/TB scanned (as of 2026, varies by region). For organisations running exploratory queries over large datasets, on-demand is typically cost-effective. The break-even point for flat-rate (100 slot reservation = $2,000/month) is approximately 320 TB of monthly data scanning at on-demand rates. Organisations scanning more than 400 TB/month consistently benefit from flat-rate pricing. Additionally, flat-rate enables better cost predictability and removes the risk of runaway query costs from accidental full-table scans.
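The break-even arithmetic is simple to sanity-check (rates as quoted above; actual figures vary by region):

```python
# Break-even between BigQuery on-demand ($6.25/TB scanned, per this guide)
# and a 100-slot flat-rate reservation ($2,000/month).

ON_DEMAND_PER_TB = 6.25
FLAT_RATE_MONTHLY = 2_000.0

break_even_tb = FLAT_RATE_MONTHLY / ON_DEMAND_PER_TB
print(break_even_tb)  # 320.0 TB/month

def cheaper_model(monthly_tb_scanned: float) -> str:
    """Which pricing model is cheaper for a given monthly scan volume."""
    on_demand_cost = monthly_tb_scanned * ON_DEMAND_PER_TB
    return "flat-rate" if on_demand_cost > FLAT_RATE_MONTHLY else "on-demand"

print(cheaper_model(150))  # on-demand
print(cheaper_model(450))  # flat-rate
```

Scanning above break-even but below ~400 TB/month, the models are close enough that predictability, not price, should decide.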

Key BigQuery cost reduction tactics beyond pricing model selection:

  • Partitioning and clustering: Partitioned tables (by date, ingestion time, or integer range) dramatically reduce bytes scanned per query — typically by 80–95% for time-series queries. Clustering on frequently filtered columns reduces bytes scanned further. These are architectural improvements that pay dividends indefinitely.
  • Column selection: BigQuery charges by bytes scanned; selecting only required columns (avoiding SELECT *) directly reduces query cost in proportion to the columns excluded.
  • Materialised views and caching: BigQuery caches identical query results for 24 hours at no charge. Materialised views pre-compute and store query results, reducing downstream query scan volumes for dashboards and reports.
  • BigQuery BI Engine: For interactive dashboard queries (Looker, Looker Studio), BI Engine provides in-memory analysis that bypasses standard slot consumption, dramatically reducing slot costs for high-frequency, low-latency query workloads.

GKE and Kubernetes Cost Management

Google Kubernetes Engine is increasingly the default compute platform for cloud-native workloads on GCP. GKE-specific cost optimisation encompasses cluster configuration, node pool design, and workload right-sizing:

GKE Autopilot: Autopilot mode eliminates node management overhead entirely — you pay per Pod resource request (CPU and memory) rather than per node. For teams with heterogeneous workloads, Autopilot typically reduces compute costs by 20–35% compared to self-managed node pools because idle node capacity is eliminated. For workloads requiring specific hardware or operating system configuration, Standard mode with optimised node pools remains more appropriate.

Cluster autoscaling and node auto-provisioning: Standard mode GKE with cluster autoscaler and node auto-provisioning (NAP) dynamically adjusts node counts and machine types to fit workload demands. NAP selects the most cost-efficient machine type for each workload based on resource requests, often choosing E2 cost-optimised instances for less demanding workloads automatically.

Vertical Pod Autoscaler (VPA): VPA analyses historical resource utilisation and adjusts Pod CPU/memory requests to right-size containers. Combined with Cluster Autoscaler, VPA drives node consolidation by eliminating over-provisioned Pods that hold excess node capacity. Many enterprises find VPA identifies 30–50% of containers as significantly over-requested.

Cloud SQL and Database Services

Cloud SQL instances are often over-provisioned for peak load and run at low average utilisation — a common pattern from teams that size for peak rather than average demand. Cloud SQL CUDs are available for 1- or 3-year terms and provide 25–52% savings over on-demand for stable production databases. Database right-sizing should precede any CUD purchase for Cloud SQL.

For dev and staging databases with intermittent usage, use scheduled start/stop via Cloud Scheduler to shut instances down outside business hours — Cloud SQL charges for compute even when idle unless the instance is stopped. AlloyDB, Google's PostgreSQL-compatible managed database for high-performance workloads, has its own CUD programme that provides similar discount structures for committed use.
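The scheduling saving follows directly from the hours. A sketch assuming a hypothetical 12-hour weekday schedule (compute charges only; storage continues to accrue while the instance is stopped):

```python
# Compute saving from stopping a dev/staging Cloud SQL instance outside
# business hours versus running 24/7. The schedule is an assumption:
# 12 hours/day, weekdays only.

HOURS_PER_WEEK = 24 * 7       # 168
running_hours = 12 * 5        # 60 hours/week on the assumed schedule

compute_saving = 1 - running_hours / HOURS_PER_WEEK
print(f"{compute_saving:.0%}")  # 64%, in line with the 60-70% range cited
```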

GCP CUD Agreement Negotiation

At enterprise scale ($1M+ annual GCP spend), Google offers committed use agreements that bundle CUDs with negotiated rates, support tier credits, migration assistance, and Google Workspace or Google Cloud AI/ML credits. These agreements — negotiated directly with Google's enterprise sales team — can provide additional discount headroom beyond the standard published CUD rates.


GCP enterprise commitment negotiation tactics:

  • Use AWS and Azure competition: Google is highly motivated to win and retain enterprise cloud spend. Demonstrating credible alternatives — and quantifying the migration cost to GCP as an investment Google is making — can unlock additional credits and discount headroom.
  • Bundle AI/ML commitments: Google's competitive differentiation is increasingly around Vertex AI, BigQuery ML, and Gemini. Committing to AI/ML workloads on GCP gives Google an additional business reason to offer better commercial terms on Compute Engine commitments.
  • Request migration credits: For workloads migrating from AWS or on-premises, Google offers migration assistance programmes that include engineering support, partner credits, and consumption credits to offset migration costs.
  • Negotiate support tier inclusion: Google's Premium Support (a monthly minimum plus a percentage of monthly spend) is expensive at scale. Securing Premium Support inclusion or credit within a CUD agreement is a significant incremental value for enterprises that need it.

See our comparative guide to Negotiating Cloud Enterprise Discount Programs for a side-by-side comparison of AWS EDP, Azure MACC, and GCP CUD agreement structures and benchmark discount rates.

GCP Savings Summary Table

| Optimisation Lever | Savings Potential | Effort | Key Consideration |
|---|---|---|---|
| Sustained Use Discounts (automatic) | Up to 30% | None required | N1, N2, N2D families only |
| 1-Year CUDs | 37% | Medium | Inflexible — model conservatively |
| 3-Year CUDs | 55% | Medium | Highest risk; for stable workloads only |
| Spot VMs | 60–91% | Medium | Interruptible; batch/CI/CD workloads |
| BigQuery partitioning | 80–95% scan reduction | Medium | Requires table redesign |
| BigQuery flat-rate pricing | Variable | Low | Break-even ~320 TB/month scanned |
| GKE Autopilot | 20–35% compute | Medium | Migration effort for existing clusters |
| VPA rightsizing | 30–50% on over-requested Pods | Medium | Requires testing for stateful workloads |
| Cloud SQL scheduling | 60–70% dev/test SQL | Low | Non-production only |
| GCP enterprise CUD agreement | 10–20% beyond CUD rates | High | $1M+ annual spend required |

Frequently Asked Questions

Should I buy GCP CUDs if I already get Sustained Use Discounts?
Yes, for stable production workloads — but only after confirming your workload stability. The incremental saving from CUDs over SUDs is 7–25 percentage points (depending on 1 vs 3-year term), which is significant for large compute spend. The key risk is over-commitment: GCP CUDs cannot be resized or cancelled, and unused CUD capacity is billed regardless. Commit conservatively against your stable baseline, let SUDs cover the variable growth layer, and use Spot VMs for burst and batch workloads.
What GCP machine types are most cost-efficient for general workloads?
E2 machine types are the most cost-efficient for CPU-bound and general-purpose workloads — they offer similar performance to N2 at 20–30% lower price, though without SUD eligibility. N2 and N2D are preferred when SUD eligibility is important for your pricing model. C3 and C4 are the newest compute-optimised families with higher per-core performance but different pricing. For memory-intensive workloads, M2 and M3 provide the highest memory-to-vCPU ratios. Always model total cost including sustained use or CUD discounts, not just the on-demand rate.
How do I control BigQuery costs for ad-hoc query users?
BigQuery's on-demand model creates cost exposure when analysts run unoptimised queries against large tables. Key controls include: (1) Column-level security and row-level security to limit what data users can access; (2) Partitioning requirement enforcement — query jobs can be rejected if they don't include partition filters; (3) Custom quotas per user or per project limiting daily scan volume; (4) Cost-based query controls using Information Schema to alert when high-cost queries are submitted. For teams with heavy BI usage, migrating interactive queries to BigQuery BI Engine or flat-rate reservations eliminates the per-scan cost risk entirely.
How does GCP SAP pricing work, and how does it relate to cost optimisation?
GCP has a specific certified SAP HANA instance catalogue (n1-highmem-96, m2-ultramem series, etc.) with SAP-certified configurations. These instances are generally eligible for CUDs, making long-running SAP HANA production databases strong CUD candidates. Google also offers RISE with SAP on GCP as a deployment option — see our guide to SAP on Google Cloud Negotiation for the commercial dynamics of combining GCP infrastructure commitments with SAP RISE contract negotiations.

Ready to Reduce Your GCP Bill?

Connect with an independent Google Cloud cost expert who can benchmark your CUD strategy, optimise BigQuery costs, and negotiate your enterprise CUD agreement.