Your Observability Bill Is Probably Your Third-Largest Cloud Cost. Let Us Fix That.
Here is something we hear in almost every cloud cost assessment: "We had no idea observability was this expensive."
At 200 hosts, a fully-featured Datadog deployment (infrastructure, APM, logs, synthetics) commonly costs $20,000-30,000 per month. That is not a typo. For many mid-stage startups, observability is the third-largest line item on their cloud bill after compute and databases.
Grafana Cloud has positioned itself as the cost-effective alternative, and the pricing difference is real. But "cheaper" does not always mean "better value," and switching observability platforms mid-production is not a weekend project.
This post gives you the actual numbers for both platforms in 2026, modeled at realistic scales, so you can make an informed decision before you sign a contract or commit engineering time to a migration. We will be honest about where each platform wins and where it falls short, because the right choice depends entirely on your team, your scale, and what you actually need from observability.
If your observability spend is already out of control, our cloud cost optimization team regularly helps teams reduce monitoring costs by 40-60% without sacrificing visibility.
Datadog Pricing in 2026: The Full Breakdown
Datadog uses a per-host pricing model for infrastructure and APM, combined with usage-based pricing for logs, metrics, and traces. The challenge is that almost every feature is a separate line item, and costs compound quickly when you enable multiple products.
Infrastructure Monitoring
| Plan | Monthly (Annual Billing) | Monthly (On-Demand) | Included |
|---|---|---|---|
| Pro | $15/host | $18/host | 100 custom metrics/host |
| Enterprise | $23/host | $27/host | 200 custom metrics/host |
APM (Application Performance Monitoring)
| Plan | Monthly (Annual Billing) | Monthly (On-Demand) | Included Spans |
|---|---|---|---|
| APM Pro | $31/host | $36/host | 150GB/month indexed spans |
| APM Enterprise | $40/host | $46/host | 200GB/month indexed spans |
Log Management
This is where Datadog gets expensive fast. Log pricing has three separate dimensions:
| Component | Rate |
|---|---|
| Log Ingestion | $0.10 per GB |
| 15-day Indexed Retention | $1.70 per million log events |
| 30-day Indexed Retention | $2.50 per million log events |
| Rehydration (from archive) | $0.10 per GB |
Why this matters: A moderately verbose application generating 100GB of logs per day costs $300/month just in ingestion. Add 15-day retention for indexed search, and you are looking at another $500-2,000/month depending on event density. Logs are consistently the biggest surprise on Datadog bills.
Additional Products (Each Billed Separately)
| Product | Starting Price |
|---|---|
| Synthetics (API tests) | $5.00 per 10K test runs |
| Synthetics (Browser tests) | $12.00 per 1K test runs |
| Real User Monitoring (RUM) | $1.50 per 1K sessions |
| Database Monitoring | $70/host/month |
| Network Performance | $5/host/month |
| Security Monitoring | $0.20 per GB analyzed |
| CI Visibility | $13/committer/month |
| Custom Metrics (beyond included) | $0.05 per custom metric per month |
Each of these is billed independently. A team that enables infrastructure, APM, logs, RUM, and synthetics can easily spend $80-120 per host per month before accounting for log volume.
Grafana Cloud Pricing in 2026: The Full Breakdown
Grafana Cloud takes a fundamentally different approach. It bundles the open-source Grafana stack (Grafana, Mimir for metrics, Loki for logs, Tempo for traces) into a managed service with usage-based pricing and a genuinely useful free tier.
Free Tier (No Credit Card Required)
| Component | Free Allowance |
|---|---|
| Metrics | 10,000 active series |
| Logs | 50 GB/month |
| Traces | 50 GB/month |
| Profiles | 50 GB/month |
| Users | 3 active users |
| Alerting | Included |
| Dashboards | Unlimited |
| Retention | 14 days (metrics), 30 days (logs/traces) |
This free tier is not a marketing gimmick. For a small team running 5-15 hosts with moderate logging, it genuinely covers basic observability needs.
Pro Tier
| Component | Rate |
|---|---|
| Metrics | $8.00 per 1,000 active series/month |
| Logs | $0.50 per GB ingested |
| Traces | $0.50 per GB ingested |
| Profiles | $0.25 per GB ingested |
| Users | Included with subscription |
| Base platform fee | $29/month |
Advanced/Enterprise Features
| Feature | Rate |
|---|---|
| Grafana Cloud Kubernetes Monitoring | $0.01 per pod-hour |
| Synthetic Monitoring | $3.00 per 1K checks |
| Frontend Observability (RUM equivalent) | $3.00 per 1K sessions |
| Grafana OnCall | Included in Pro |
| Adaptive Metrics (series reduction) | $0.20 per 1K active series reduced |
The Key Difference in Log Pricing
Grafana Cloud (powered by Loki) charges $0.50 per GB ingested, and that includes 30 days of retention. There is no separate indexing fee. Loki uses label-based indexing rather than full-text indexing, which makes storage dramatically cheaper at the infrastructure level, and Grafana passes those savings to customers.
Compare that to Datadog: $0.10/GB ingestion + $1.70-2.50 per million indexed events for retention. At high log volumes with millions of events, Datadog's effective per-GB cost for searchable logs can reach $2-5/GB. Grafana Cloud stays flat at $0.50/GB.
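To see where a figure like $2-5/GB comes from, here is a minimal sketch that converts Datadog's per-event indexing fee into an effective per-GB rate. The average event size and the fraction of events you index are assumptions you would measure on your own logs, 1 GB is treated as 10^9 bytes, and the function names are illustrative.

```python
# Rates quoted in this post (USD).
DD_INGEST_PER_GB = 0.10
DD_INDEX_PER_MILLION_15D = 1.70
GRAFANA_LOGS_PER_GB = 0.50


def datadog_effective_per_gb(avg_event_bytes: float, indexed_fraction: float = 1.0) -> float:
    """Effective Datadog cost per ingested GB with 15-day indexed retention.

    avg_event_bytes and indexed_fraction are assumptions supplied by the caller.
    """
    millions_of_events_per_gb = (1e9 / avg_event_bytes) / 1e6
    indexing = millions_of_events_per_gb * indexed_fraction * DD_INDEX_PER_MILLION_15D
    return DD_INGEST_PER_GB + indexing


def break_even_indexed_fraction(avg_event_bytes: float) -> float:
    """Indexed fraction at which Datadog matches Grafana Cloud's flat per-GB rate."""
    millions_of_events_per_gb = (1e9 / avg_event_bytes) / 1e6
    return (GRAFANA_LOGS_PER_GB - DD_INGEST_PER_GB) / (
        millions_of_events_per_gb * DD_INDEX_PER_MILLION_15D
    )


for size in (350, 500, 1000):
    print(
        f"{size} B/event: ${datadog_effective_per_gb(size):.2f}/GB fully indexed, "
        f"break-even at {break_even_indexed_fraction(size):.0%} indexed"
    )
```

With 500-byte events and everything indexed, the effective rate is $3.50/GB. The break-even function shows the other side of the comparison: if you index only a small share of events, Datadog's log line can come in under Grafana Cloud's flat rate.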
Head-to-Head Cost Modeling: 50, 200, and 500 Hosts
Abstract pricing tables are nice, but what actually matters is the total monthly bill at your scale. Let us model three realistic scenarios.
Scenario Assumptions (Applied to All Three)
- Infrastructure monitoring on all hosts
- APM/tracing on 60% of hosts (application servers)
- Log ingestion: 5GB/host/day average (moderate verbosity)
- 100 custom metrics per host beyond defaults
- Standard retention (15 days Datadog, 30 days Grafana)
- Annual billing (most favorable Datadog pricing)
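A minimal sketch of how these assumptions turn into a monthly bill, so you can plug in your own host count. Three constants are not stated assumptions: 1,000 active series per host, 5 GB of traces per APM host per month, and roughly $0.27 per ingested GB for Datadog indexed retention are back-derived from the scenario tables in this post. `Scenario`, `monthly_bills`, and the two `*_extras` fields (per-scenario add-ons such as custom metrics overage, synthetics, RUM, and database monitoring) are illustrative names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    hosts: int
    dd_extras: float = 0.0  # Datadog add-ons for this scenario, USD/month
    gc_extras: float = 0.0  # Grafana Cloud add-ons for this scenario, USD/month


APM_SHARE = 0.60            # APM/tracing on 60% of hosts
LOG_GB_PER_HOST_DAY = 5     # moderate verbosity
DAYS_PER_MONTH = 30

# Back-derived from the scenario tables (assumptions, not vendor figures).
SERIES_PER_HOST = 1_000
TRACE_GB_PER_APM_HOST = 5
DD_RETENTION_PER_GB = 2_000 / 7_500


def monthly_bills(s: Scenario) -> tuple[int, int]:
    """Return (datadog_usd, grafana_cloud_usd) per month, rounded to dollars."""
    apm_hosts = round(s.hosts * APM_SHARE)
    log_gb = s.hosts * LOG_GB_PER_HOST_DAY * DAYS_PER_MONTH

    datadog = (
        s.hosts * 23                      # Infrastructure Enterprise, annual billing
        + apm_hosts * 40                  # APM Enterprise, annual billing
        + log_gb * 0.10                   # log ingestion
        + log_gb * DD_RETENTION_PER_GB    # estimated 15-day indexed retention
        + s.dd_extras
    )
    grafana = (
        29                                            # base platform fee
        + s.hosts * SERIES_PER_HOST / 1_000 * 8.00    # metrics
        + apm_hosts * TRACE_GB_PER_APM_HOST * 0.50    # traces
        + log_gb * 0.50                               # logs, 30-day retention included
        + s.gc_extras
    )
    return round(datadog), round(grafana)


dd, gc = monthly_bills(Scenario(hosts=50))
print(f"50 hosts: Datadog ${dd:,}/mo, Grafana Cloud ${gc:,}/mo, saving {(dd - gc) / dd:.0%}")
```

`monthly_bills(Scenario(50))` returns `(5100, 4254)`, matching the 50-host table. The larger scenarios match once their add-on rows are passed in as extras.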
50 Hosts
| Component | Datadog | Grafana Cloud |
|---|---|---|
| Infrastructure | 50 x $23 = $1,150 | $29 base + 50K series x $8/1K = $429 |
| APM/Traces | 30 x $40 = $1,200 | 150GB traces x $0.50 = $75 |
| Logs (250GB/day = 7.5TB/mo) | 7,500GB x $0.10 = $750 + retention ~$2,000 | 7,500GB x $0.50 = $3,750 |
| Custom Metrics | Included in Enterprise | Included in series count |
| Total | ~$5,100/month | ~$4,254/month |
| Annual | ~$61,200 | ~$51,048 |
At 50 hosts, the gap is modest: about 17% savings with Grafana Cloud. Datadog's higher per-host cost is partially offset by Grafana's higher per-GB log pricing at this volume. The real value proposition of Grafana Cloud shows up at larger scale.
200 Hosts
| Component | Datadog | Grafana Cloud |
|---|---|---|
| Infrastructure | 200 x $23 = $4,600 | $29 base + 200K series x $8/1K = $1,629 |
| APM/Traces | 120 x $40 = $4,800 | 600GB traces x $0.50 = $300 |
| Logs (1TB/day = 30TB/mo) | 30,000GB x $0.10 = $3,000 + retention ~$8,000 | 30,000GB x $0.50 = $15,000 |
| Custom Metrics | 20K extra x $0.05 = $1,000 | Included in series count |
| Synthetics (basic) | $200 | $150 |
| Total | ~$21,600/month | ~$17,079/month |
| Annual | ~$259,200 | ~$204,948 |
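Look closely at the log row, though. In this model Grafana Cloud's log line is actually the larger one: $15,000/month against roughly $11,000 for Datadog ($3,000 ingestion plus an estimated $8,000 of indexed retention), because Grafana's flat $0.50/GB rate is well above Datadog's $0.10/GB ingestion fee. The savings come from infrastructure, APM, and custom metrics, where Datadog's per-host pricing adds up. The net result: Grafana Cloud saves about 21%, or roughly $54,000 per year.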
Let us be honest, though: at this log volume, the smartest move is not choosing between these two platforms. It is reducing your log volume. If your 200 hosts generate 1TB of logs daily, 50-70% of that volume almost certainly provides zero operational value: debug logs in production, duplicate access logs, health check noise. Fix the source before optimizing the sink.
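To put a number on that claim, here is a small sketch using the 200-host figures from the table above. It assumes Datadog's log cost scales linearly with volume, which is a simplification, and `monthly_saving_from_log_cut` is an illustrative helper.

```python
# 200-host figures from the table above (USD per month).
DD_LOGS = 3_000 + 8_000   # ingestion + estimated indexed retention
DD_TOTAL = 21_600
GC_TOTAL = 17_079


def monthly_saving_from_log_cut(log_cost: float, cut: float) -> float:
    """Saving from cutting log volume by `cut`, assuming cost is linear in volume."""
    return log_cost * cut


switch_saving = DD_TOTAL - GC_TOTAL
for cut in (0.5, 0.7):
    stay_saving = monthly_saving_from_log_cut(DD_LOGS, cut)
    print(
        f"cut logs {cut:.0%}: staying on Datadog saves ${stay_saving:,.0f}/mo; "
        f"switching at unchanged volume saves ${switch_saving:,}/mo"
    )
```

Under that assumption, halving log volume while staying on Datadog saves $5,500/month, more than the $4,521/month gained by switching platforms at unchanged volume.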
500 Hosts
| Component | Datadog | Grafana Cloud |
|---|---|---|
| Infrastructure | 500 x $23 = $11,500 | $29 base + 500K series x $8/1K = $4,029 |
| APM/Traces | 300 x $40 = $12,000 | 1.5TB traces x $0.50 = $750 |
| Logs (2.5TB/day = 75TB/mo) | 75,000GB x $0.10 = $7,500 + retention ~$20,000 | 75,000GB x $0.50 = $37,500 |
| Custom Metrics | 50K extra x $0.05 = $2,500 | Included |
| RUM (500K sessions) | $750 | $1,500 |
| Database Monitoring (20 hosts) | 20 x $70 = $1,400 | OSS Postgres exporter (free) |
| Total | ~$55,650/month | ~$43,779/month |
| Annual | ~$667,800 | ~$525,348 |
At 500 hosts, the annual savings with Grafana Cloud is approximately $142,000. That is a meaningful number, enough to fund an entire engineer.
But notice something important: at this scale, logs dominate the bill on both platforms. On Grafana Cloud, log ingestion alone is $37,500/month. This is where teams need to get serious about log pipeline optimization: sampling, filtering at the collector level, routing low-value logs to cold storage instead of a full observability platform.
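The routing decision described above can be sketched as a small function. In production this logic would live in your collector's own filtering and sampling stages rather than in application code. The health-check paths, the 10% sampling rate, and the three destinations are hypothetical values chosen for illustration.

```python
import zlib

HEALTH_PATHS = ("/healthz", "/ready", "/ping")  # hypothetical endpoints
INFO_SAMPLE_PERCENT = 10                         # hypothetical sampling rate


def route(record: dict) -> str:
    """Return 'drop', 'archive' (cold storage), or 'index' for one parsed log record."""
    level = str(record.get("level", "INFO")).upper()
    message = str(record.get("message", ""))

    # Health-check noise carries no operational value.
    if any(path in message for path in HEALTH_PATHS):
        return "drop"

    # Debug output is kept cheaply in case it is needed later.
    if level == "DEBUG":
        return "archive"

    # Deterministic sampling: the same message always lands in the same bucket.
    if level == "INFO":
        bucket = zlib.crc32(message.encode("utf-8")) % 100
        return "index" if bucket < INFO_SAMPLE_PERCENT else "archive"

    # Warnings and errors always go to the observability platform.
    return "index"


for rec in (
    {"level": "INFO", "message": "GET /healthz 200"},
    {"level": "debug", "message": "cache miss for key user:42"},
    {"level": "ERROR", "message": "db connection timeout"},
):
    print(route(rec), "<-", rec["message"])
```

Hash-based sampling is used here instead of random sampling so that reruns and replicas make the same decision for the same message.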
Where Datadog Wins (And Is Worth the Premium)
We are not going to pretend this is purely a cost decision. Datadog is more expensive for good reasons, and for some teams those reasons justify the cost.
Out-of-Box Experience
Datadog has 750+ integrations that work with minimal configuration. Install the agent, enable an integration, and you get pre-built dashboards, alerts, and correlation. Grafana Cloud has excellent integrations too, but you will spend more time configuring dashboards and alert rules yourself.
Unified Platform
Everything in Datadog lives in one interface: metrics, logs, traces, synthetics, RUM, security, CI visibility. Cross-correlation between logs and traces happens automatically. Grafana Cloud achieves this through multiple tools (Grafana + Loki + Tempo + Mimir), and while the experience is increasingly unified, it still requires more configuration to connect the dots.
Enterprise Features
Datadog's enterprise capabilities (RBAC, audit logging, compliance certifications, fine-grained access controls, custom retention policies) are mature and well-tested at Fortune 500 scale. Grafana Cloud is catching up but has not been in the enterprise market as long.
AI/ML Features
Datadog's Watchdog (anomaly detection) and AI-powered root cause analysis are genuinely useful for large, complex environments. These features are included in Enterprise plans and reduce mean-time-to-resolution for on-call engineers. Grafana Cloud has ML-based alerting but it is less mature.
When to Pay the Datadog Premium
- Your team is small and engineering time is more expensive than tooling costs
- You need a broad set of integrations to work out of the box without custom configuration
- Compliance requirements mandate specific vendor certifications (SOC 2 Type II, HIPAA BAA, FedRAMP)
- You value a single vendor relationship for all observability needs
- Your primary constraint is mean-time-to-resolution, not cost
Where Grafana Cloud Wins (And Why Teams Are Switching)
Cost at Scale
The numbers above tell the story. In the 200- and 500-host models, Grafana Cloud comes out about 21% cheaper, which is roughly $54,000 to $142,000 per year. For companies where observability is a top-5 line item, that is tens or hundreds of thousands of dollars per year redirected to product engineering.
No Vendor Lock-In
This is Grafana's most strategic advantage. The entire stack is open source: Grafana, Mimir (Prometheus-compatible), Loki (LogQL), and Tempo (OpenTelemetry-native). If you ever want to leave Grafana Cloud, you can self-host the same tools on your own infrastructure. Try doing that with Datadog.
OpenTelemetry Native
Grafana Cloud is built around open standards. You instrument once with OpenTelemetry, and you can send that telemetry to any backend. Datadog supports OpenTelemetry too, but its native agent and proprietary instrumentation still provide a better experience within the Datadog ecosystem, which reinforces lock-in.
Flexible Data Tier Architecture
Grafana Loki (the log engine behind Grafana Cloud) does not index log content. It indexes labels only, which makes storage dramatically cheaper. For teams that do not need full-text search across every log line (and honestly, most teams do not), this architecture delivers 80% of the value at 20% of the cost.
When to Choose Grafana Cloud
- You are spending more than $10,000/month on observability and cost reduction is a priority
- Your team has platform engineering capacity to build and maintain dashboards
- You want to avoid vendor lock-in and value open standards (OpenTelemetry, PromQL, LogQL)
- You already use Prometheus, Grafana, or Loki in some capacity
- You have high log volumes (500GB+/day) where Datadog's indexing costs become punishing
The Hidden Costs Nobody Mentions
Both platforms have costs that do not appear in the headline pricing. Knowing these before you commit saves real money and real frustration.
Datadog Hidden Costs
- Custom metrics overage: Each host includes 100-200 custom metrics. Kubernetes environments with service meshes routinely generate 500-1,000 metrics per pod. The overage fee of $0.05/metric/month adds up silently.
- Container billing: Short-lived containers (CI jobs, cron tasks, batch processors) each count as a billable host for the fraction of the hour they run. Teams using Kubernetes with aggressive autoscaling often see 2-3x more "hosts" billed than physical nodes.
- Log indexing surprises: Ingesting logs at $0.10/GB sounds cheap. But without indexed retention, those logs are not searchable. Adding 15-day indexed retention at $1.70/million events makes the effective log cost much higher than the ingestion price suggests.
- Committed spend traps: Datadog offers discounts for annual commitments, but those commitments are use-it-or-lose-it. If your infrastructure scales down (cost optimization project, anyone?), you still pay the committed amount.
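The custom metrics overage in the first bullet is easy to estimate ahead of time. This sketch assumes the included allotment is pooled across all hosts and that every metric a pod emits counts as a custom metric. Both are simplifications, and the cluster size is hypothetical.

```python
OVERAGE_PER_METRIC = 0.05  # USD per custom metric per month, from the pricing table


def custom_metric_overage(hosts: int, included_per_host: int, metrics_emitted: int) -> float:
    """Monthly overage in USD once the pooled allotment is used up."""
    included = hosts * included_per_host
    return max(0, metrics_emitted - included) * OVERAGE_PER_METRIC


# Hypothetical cluster: 20 Enterprise hosts (200 included each), 300 pods, 500 metrics per pod.
emitted = 300 * 500
print(f"${custom_metric_overage(20, 200, emitted):,.0f}/month in overage")
```

In this hypothetical cluster the allotment covers 4,000 metrics against 150,000 emitted, so the overage is about $7,300/month, more than the host fees themselves.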
Grafana Cloud Hidden Costs
- Engineering time: Grafana Cloud requires more upfront configuration than Datadog: building dashboards, setting up alert rules, and configuring recording rules for Mimir. Budget 2-4 weeks of platform engineering time for initial setup at scale.
- Cardinality explosions: Mimir (the metrics backend) bills by active series count. A misconfigured label (like a request ID or timestamp in a metric label) can create millions of series overnight. Grafana provides cardinality management tools, but you need to proactively use them.
- Plugin ecosystem gaps: While Grafana's plugin ecosystem is large, some Datadog integrations have no direct equivalent. You may need to build custom data source plugins or use Grafana Alloy (Grafana's collector, the successor to Grafana Agent) with manual configuration.
- Log query performance at extreme scale: Loki's label-based indexing is cheap but slower for full-text searches across billions of log lines. If your team relies on searching arbitrary strings across all logs, you may need to configure chunk caching or accept slower query times.
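The cardinality bullet is worth a worked example, because the growth is multiplicative. This sketch treats the series count as the product of each label's distinct values, which is an upper bound since not every combination occurs in practice. The label names and counts are hypothetical.

```python
from math import prod

PRICE_PER_1K_SERIES = 8.00  # USD per 1,000 active series per month (Pro tier rate)


def active_series(label_cardinalities: dict[str, int]) -> int:
    """Upper bound on series for one metric: one per combination of label values."""
    return prod(label_cardinalities.values())


def monthly_cost(series: int) -> float:
    return series / 1_000 * PRICE_PER_1K_SERIES


healthy = {"service": 20, "endpoint": 15, "status_code": 5}
broken = {**healthy, "request_id": 10_000}  # unbounded label added by mistake

for name, labels in (("healthy", healthy), ("broken", broken)):
    series = active_series(labels)
    print(f"{name}: {series:,} series, up to ${monthly_cost(series):,.0f}/month")
```

One metric goes from 1,500 series ($12/month) to a ceiling of 15 million series ($120,000/month) by adding a single unbounded label, which is why cardinality limits belong in review before a label ships.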
Migration Strategies: How to Switch Without Breaking Production
If you are on Datadog and considering a move to Grafana Cloud (or vice versa), here is what the migration realistically looks like.
Phase 1: Dual-Ship (Weeks 1-2)
Run both platforms in parallel. Use the OpenTelemetry Collector as a routing layer that sends identical telemetry to both Datadog and Grafana Cloud simultaneously. This lets you validate data parity without touching any application instrumentation.
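# OpenTelemetry Collector config - dual export
# The datadog exporter ships in the Collector contrib distribution.
receivers:
  otlp:                       # applications send OTLP to the collector
    protocols:
      grpc:
      http:
exporters:
  datadog:
    api:
      key: ${env:DD_API_KEY}
  otlphttp/grafana:
    endpoint: https://otlp-gateway-prod-us-central-0.grafana.net/otlp
    headers:
      # Basic auth expects the base64-encoded "<instance-id>:<token>" pair
      Authorization: "Basic ${env:GRAFANA_CLOUD_TOKEN}"
service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [datadog, otlphttp/grafana]
    traces:
      receivers: [otlp]
      exporters: [datadog, otlphttp/grafana]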
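Validating data parity can start as a simple comparison of the same series pulled from each backend over the same time window. This sketch assumes you have already exported those values into two dictionaries keyed by a normalized series name. How you export them, the 2% tolerance, and the sample values are all hypothetical.

```python
import math


def parity_report(
    datadog: dict[str, float],
    grafana: dict[str, float],
    rel_tol: float = 0.02,
) -> dict[str, list[str]]:
    """Compare per-series values from each backend for the same time window."""
    shared = datadog.keys() & grafana.keys()
    return {
        "missing_in_grafana": sorted(datadog.keys() - grafana.keys()),
        "missing_in_datadog": sorted(grafana.keys() - datadog.keys()),
        "mismatched": sorted(
            name
            for name in shared
            if not math.isclose(datadog[name], grafana[name], rel_tol=rel_tol)
        ),
    }


# Hypothetical hourly averages exported from each platform.
dd = {"checkout.latency_ms": 182.0, "checkout.error_rate": 0.012, "cart.qps": 340.0}
gc = {"checkout.latency_ms": 184.1, "checkout.error_rate": 0.019}

for problem, names in parity_report(dd, gc).items():
    print(problem, names)
```

Small differences are expected because the two platforms aggregate and roll up data differently, so a relative tolerance is more useful than exact equality.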
Phase 2: Dashboard Recreation (Weeks 2-4)
Recreate your most critical dashboards in Grafana. Start with the dashboards your on-call team actually uses daily (usually 5-10 dashboards cover 80% of incident response). Do not try to migrate every dashboard. Many of them were created once and never looked at again.
Phase 3: Alert Migration (Weeks 3-5)
Migrate alerting rules one service at a time. Keep Datadog alerts active as a safety net until Grafana alerts have proven reliable for at least one full on-call rotation.
Phase 4: Cutover and Decommission (Week 6+)
Once confidence is high, disable the Datadog exporter, tear down the Datadog agents, and cancel the contract. Budget one month of overlap for safety.
Total migration time: 4-8 weeks for a 200-host environment with a dedicated platform engineer. Larger environments or teams with heavy Datadog customization should budget 8-12 weeks.
Our Recommendation: The Decision Framework
After helping dozens of teams optimize their observability costs, here is the framework we use at LeanOps to recommend the right platform:
Choose Datadog if:
- You spend less than $5,000/month on observability (the cost difference is not worth the migration effort)
- Your team has fewer than 3 platform engineers (Datadog's out-of-box value saves engineering time)
- You need compliance certifications that Grafana Cloud does not yet offer
- You heavily use Datadog-specific features like Watchdog AI or notebook investigations
Choose Grafana Cloud if:
- You spend more than $10,000/month on observability and want to reduce that by 20-40%
- You have platform engineering capacity to configure and maintain the stack
- You want to avoid vendor lock-in and are investing in OpenTelemetry
- Your log volume exceeds 500GB/day (where Datadog's indexing costs become punishing)
- You already run Prometheus, Grafana, or Loki internally
Choose a hybrid approach if:
- You want Datadog APM for critical services but Grafana Cloud for infrastructure metrics and logs
- You are migrating gradually and want to reduce costs without a big-bang cutover
The Bottom Line
Observability is a genuine necessity, not a place to cut corners. But there is a wide gap between "adequate observability" and "we are spending $50,000/month on Datadog because nobody audited what we actually use."
At 200 hosts, switching from Datadog to Grafana Cloud saves roughly $54,000/year. At 500 hosts, the savings exceed $140,000/year. Those are real numbers that fund real engineering headcount.
But the biggest savings often come not from switching platforms but from reducing what you send to either one: filtering noisy logs at the collector, reducing metric cardinality, and sampling traces for high-volume services. We have seen teams cut their observability bill by 50% without changing vendors, simply by being intentional about what data they actually need.
If your observability costs have grown beyond what feels reasonable, start with a free Cloud Waste Assessment. We will look at your full cloud bill, including monitoring spend, and show you exactly where the money goes and what you can realistically save within 90 days.