Apr 12, 2026
By Ravi Kanani

Weaviate vs Pinecone vs Qdrant 2026: Price, Latency, and the Built-In Vectorizer Edge

Key Takeaway

Weaviate Cloud Serverless charges $0.095 per AU-hour (Activity Unit) in 2026, with storage at $0.035/GB/month for hot and $0.0125/GB/month for warm. For 1 million vectors at 1536 dimensions, Weaviate Serverless costs roughly $45-80/month depending on query volume. At 10M vectors with moderate traffic, expect $200-400/month. The free Sandbox is unlimited duration but limited to 50K objects. Weaviate is cheapest for multimodal workloads that use built-in vectorizers, most expensive for pure vector similarity search at high QPS.

The Vector Database That Does Everything Charges Like It, Too

Weaviate is the Swiss Army knife of vector databases. It vectorizes your data (text, images, audio), stores the vectors, does hybrid BM25+vector search, and even runs generative queries (RAG) natively. No separate embedding API. No external reranker. No Elasticsearch sidecar for keyword search.

That is genuinely impressive. But there is a catch: all that functionality means pricing is more complex than Pinecone's "pay per read unit" or Qdrant's "pay per GB of RAM." Weaviate Cloud charges in Activity Units (AU-hours), storage tiers (hot vs warm), and module usage. Predicting your monthly bill requires understanding how these pieces interact.

We have deployed Weaviate across several client environments at LeanOps, particularly for teams building RAG applications that need hybrid search. The typical pattern: a team evaluates Pinecone for pure vector search, realizes they also need keyword filtering and built-in embeddings, and switches to Weaviate. The question is always "what will this actually cost at our scale?"

This post answers that question with real numbers, honest comparisons, and a clear framework for deciding whether Weaviate Cloud, self-hosted Weaviate, or a competitor is the right choice for your workload.


Weaviate Cloud Pricing in 2026: Complete Breakdown

Weaviate Cloud uses a consumption-based model with three pricing components: compute (AU-hours), storage (GB/month), and optional add-ons.

Deployment Options

| Plan | Target | Key Features | Starting Cost |
| --- | --- | --- | --- |
| Sandbox (Free) | Prototyping | 50K objects, 1 node, all modules | $0 |
| Serverless | Production | Auto-scaling, multi-tenant, managed | $0.095/AU-hour |
| Enterprise Dedicated | High-scale | Isolated infra, SLA, custom config | Custom pricing |
| Bring Your Own Cloud (BYOC) | Compliance | Runs in your VPC, Weaviate-managed | Custom pricing |

Serverless Compute Pricing

| Component | Rate | Notes |
| --- | --- | --- |
| Activity Units (compute) | $0.095/AU-hour | Scales with query complexity and volume |
| Minimum AU | 0 (scales to near-zero when idle) | You pay only for actual compute used |
| Vectorization (built-in) | Included in AU cost | text2vec, img2vec, multi2vec modules |
| Generative (RAG) | Included in AU cost | Plus downstream LLM API costs |

How Activity Units work: An AU measures compute consumption. A simple vector similarity search on 1M vectors might consume 0.001 AU. A complex hybrid query with filtering, reranking, and generative output on 50M vectors might consume 0.05 AU. The rate is $0.095 per AU-hour, meaning you pay for sustained compute capacity, not per-query.

The practical implication: idle clusters cost very little. Bursty workloads (dev environments, periodic batch jobs) are much cheaper than steady high-QPS production workloads.

Storage Pricing

| Storage Tier | Rate | Use Case | Retrieval Speed |
| --- | --- | --- | --- |
| Hot storage | $0.035/GB/month | Frequently accessed collections | Instant |
| Warm storage | $0.0125/GB/month | Infrequently accessed data | Slightly delayed first access |
| Backups | $0.02/GB/month | Automated daily backups | Restore within minutes |

Storage math for vectors:

  • 1M vectors at 1536 dimensions (float32): roughly 6.1 GB
  • Hot storage cost for 1M vectors: 6.1 GB x $0.035 = $0.21/month
  • Warm storage cost for 1M vectors: 6.1 GB x $0.0125 = $0.08/month
  • 10M vectors at 1536 dimensions: roughly 61 GB
  • Hot storage cost for 10M vectors: 61 GB x $0.035 = $2.14/month

Storage is not the expensive part of Weaviate. Compute (AU-hours) is.
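The storage arithmetic above is easy to reproduce. A minimal Python sketch (the function names are mine; the rates are the Weaviate Cloud list prices quoted above, and the footprint excludes HNSW index and metadata overhead):

```python
HOT, WARM = 0.035, 0.0125  # $/GB/month, hot vs warm tier (rates quoted above)

def vector_storage_gb(num_vectors: int, dims: int = 1536) -> float:
    """Raw float32 vector footprint in GB: 4 bytes per dimension."""
    return num_vectors * dims * 4 / 1e9

def monthly_storage_cost(gb: float, rate_per_gb: float) -> float:
    """Monthly storage bill for a given tier rate."""
    return gb * rate_per_gb

gb = vector_storage_gb(1_000_000)  # ~6.1 GB for 1M x 1536-dim vectors
print(f"{gb:.1f} GB -> hot: ${monthly_storage_cost(gb, HOT):.2f}/mo, "
      f"warm: ${monthly_storage_cost(gb, WARM):.2f}/mo")
```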

Free Sandbox Details

| Feature | Limit |
| --- | --- |
| Objects (vectors) | 50,000 |
| Nodes | 1 |
| Collections | Unlimited |
| Modules | All (vectorizers, generative, rerankers) |
| Duration | Permanent (no expiration) |
| Rate limit | Moderate (suitable for dev, not production) |
| Regions | 1 |
| Backups | Not included |
| Multi-tenancy | Supported |

The Sandbox is genuinely useful for development. Unlike Pinecone's free tier (which gives you 2GB of storage and real production capacity), Weaviate's Sandbox is more like a persistent dev environment. It lets you validate your schema, test queries, and prototype RAG flows. For small production apps with under 50K objects and low traffic, it works as a free production tier too.

Additional Costs

| Feature | Cost |
| --- | --- |
| Cross-region replication | 2x base cost |
| Custom modules | Included (deploy your own vectorizer) |
| Authentication (API key, OIDC) | Included |
| Monitoring (built-in metrics) | Included |
| Priority support | Enterprise plan only |
| VPC peering | BYOC/Enterprise only |

Real-World Cost Modeling: What Weaviate Actually Costs

Let us model costs at three realistic scales. These assume 1536-dimension vectors (OpenAI embedding size), moderate query complexity, and hot storage.

Scenario 1: Small RAG App (1M Vectors, Low Traffic)

A typical setup: a documentation search or internal knowledge base with 1M chunks, queried a few hundred times per day.

| Component | Calculation | Monthly Cost |
| --- | --- | --- |
| Storage (hot) | 6.1 GB x $0.035 | $0.21 |
| Compute (low AU usage) | ~5 AU-hours/day x 30 x $0.095 | $14.25 |
| Vectorization (if using built-in) | Included | $0 |
| Backups | 6.1 GB x $0.02 | $0.12 |
| Total | | ~$15-45/month |

The range is $15-45 because compute scales with actual query volume and complexity. A few hundred simple searches per day sits at the low end. If you add generative queries (RAG with LLM calls), the compute increases, plus you pay the downstream LLM provider.
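To sanity-check these scenario totals, here is the same model as a throwaway Python sketch (my own helper, using the AU-hour, hot storage, and backup rates listed earlier):

```python
AU_RATE, HOT_GB, BACKUP_GB = 0.095, 0.035, 0.02  # 2026 list rates quoted above

def monthly_estimate(storage_gb: float, au_hours_per_day: float) -> float:
    """Rough Weaviate Serverless monthly bill: hot storage + compute + backups."""
    return (storage_gb * HOT_GB
            + au_hours_per_day * 30 * AU_RATE
            + storage_gb * BACKUP_GB)

print(f"Scenario 1: ${monthly_estimate(6.1, 5):.2f}")   # low end of the $15-45 range
print(f"Scenario 2: ${monthly_estimate(61, 75):.2f}")   # mid-range of $150-290
```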

Comparison at this scale:

  • Pinecone Serverless: $12-35/month (cheaper for pure vector search)
  • Qdrant Cloud: $25-45/month (1GB free tier covers small workloads)
  • Self-hosted Weaviate: $20-40/month (on a small VM, but you manage it)

Scenario 2: Production Search (10M Vectors, Moderate Traffic)

A product catalog search, content recommendation engine, or customer support RAG with 10M vectors, queried 10,000-50,000 times per day.

| Component | Calculation | Monthly Cost |
| --- | --- | --- |
| Storage (hot) | 61 GB x $0.035 | $2.14 |
| Compute (moderate AU) | ~50-100 AU-hours/day x 30 x $0.095 | $142-285 |
| Backups | 61 GB x $0.02 | $1.22 |
| Total | | ~$150-290/month |

At this scale, compute dominates. Storage is essentially free. The question becomes: how compute-intensive are your queries?

Comparison at this scale:

  • Pinecone Serverless: $170-370/month (storage $120 + queries $50-250)
  • Qdrant Cloud: $120-180/month (Medium or Large cluster)
  • Self-hosted Weaviate: $80-150/month (on a $120 VM with 32GB RAM)

Scenario 3: Large-Scale Multimodal (50M Vectors, High Traffic)

An e-commerce visual search, enterprise document intelligence, or multi-tenant SaaS with 50M vectors and 100,000+ queries per day.

| Component | Calculation | Monthly Cost |
| --- | --- | --- |
| Storage (hot) | 305 GB x $0.035 | $10.68 |
| Compute (high AU) | ~300-600 AU-hours/day x 30 x $0.095 | $855-1,710 |
| Backups | 305 GB x $0.02 | $6.10 |
| Total | | ~$870-1,730/month |

Comparison at this scale:

  • Pinecone Serverless: $800-1,500/month (storage + high read unit volume)
  • Pinecone Pods (p2): $2,400-4,800/month (dedicated, low latency)
  • Qdrant Cloud: $500-1,000/month (XLarge clusters or custom)
  • Self-hosted Weaviate: $300-600/month (on 3-node cluster, 96GB+ RAM)

At 50M+ vectors, the self-hosted option saves 60-70%. Whether that savings justifies the DevOps investment depends on your team.


Weaviate vs Pinecone vs Qdrant: The Real Cost Comparison

Everyone wants a simple answer: "which vector database is cheapest?" The honest answer is: it depends on what you are doing with it.

Cost Comparison at 10M Vectors (1536-dim, Moderate Traffic)

| Factor | Weaviate Cloud | Pinecone Serverless | Qdrant Cloud |
| --- | --- | --- | --- |
| Monthly cost | $150-290 | $170-370 | $120-180 |
| Storage model | $0.035/GB hot | $2.00/GB | Bundled with cluster |
| Compute model | AU-hours | Per read/write unit | Fixed cluster size |
| Built-in vectorization | Yes (free) | No | No |
| Hybrid search (BM25+vector) | Yes (native) | No (metadata filters only) | No (requires workaround) |
| Generative/RAG | Native module | No | No |
| Scale-to-zero | Near-zero AU when idle | Yes (truly zero) | No (cluster always runs) |
| Free tier | 50K objects, permanent | 2GB storage, permanent | 1GB cluster, permanent |

When Weaviate Wins on Total Cost

  1. Multimodal workloads: If you need text + image + audio search in one database, Weaviate's built-in vectorizers save you $50-200/month in separate embedding API costs.

  2. RAG applications: Weaviate's generative module means you don't need a separate orchestration layer (LangChain, LlamaIndex) for simple RAG. Fewer moving parts = lower ops cost.

  3. Hybrid search requirements: If you need BM25 keyword search alongside vector similarity (most real-world search does), Weaviate is one system instead of two. No Elasticsearch sidecar.

  4. Multi-tenant SaaS: Weaviate's native multi-tenancy means one cluster serves many customers with data isolation. Pinecone requires separate namespaces or indexes per tenant.

When Weaviate Loses on Cost

  1. Pure vector similarity at high QPS: If all you need is "find the 10 nearest vectors" at 10,000+ QPS, Qdrant and Pinecone are both cheaper and faster.

  2. Small, stable workloads: For under 1M vectors with steady traffic, Pinecone Serverless' pay-per-query model is often cheapest.

  3. Cost predictability: AU-hour pricing makes it harder to predict exact monthly costs. Qdrant's fixed cluster pricing and Pinecone's per-unit pricing are easier to forecast.

Vectorizer Cost Savings: The Math

Earlier in this post we cited $50-200/month in savings from Weaviate's built-in vectorizers. Here is the full calculation so you can model it for your own workload.

Without built-in vectorizers (Pinecone/Qdrant path):

You need an external embedding API for both ingestion and real-time query embedding.

  • Ingestion (one-time): 10M documents x avg 500 tokens/doc x $0.13/1M tokens (OpenAI text-embedding-3-large) = $650 one-time
  • Query embedding (ongoing): 500K queries/day x 30 days x 100 tokens avg x $0.13/1M tokens = $195/month
  • Total embedding API cost: $195/month ongoing (plus $650 every time you re-embed your corpus)

With Weaviate built-in vectorizers:

  • All vectorization happens on-cluster hardware (included in your AU-hour cost)
  • Additional embedding API cost: $0/month
  • Monthly savings at 500K queries/day: $195/month

At higher query volumes:

| Daily Query Volume | Monthly Embedding API Cost (External) | Savings with Weaviate Built-In |
| --- | --- | --- |
| 100K queries/day | $39/month | $39/month |
| 500K queries/day | $195/month | $195/month |
| 2M queries/day | $780/month | $780/month |
| 5M queries/day | $1,950/month | $1,950/month |
| 10M queries/day | $3,900/month | $3,900/month |

The vectorizer edge scales linearly with query volume. At 5M+ queries/day, it saves $1,950+/month — exceeding the cluster cost itself. For high-traffic applications like e-commerce search or customer-facing AI assistants, the embedding savings alone justify choosing Weaviate over competitors that require external vectorization.
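The table above is straight multiplication; this sketch lets you plug in your own numbers (the helper name and the 100-token query average are assumptions carried over from the example above):

```python
def monthly_embedding_cost(queries_per_day: int, tokens_per_query: int = 100,
                           usd_per_million_tokens: float = 0.13) -> float:
    """External embedding API spend that on-cluster vectorization avoids."""
    monthly_tokens = queries_per_day * 30 * tokens_per_query
    return monthly_tokens / 1e6 * usd_per_million_tokens

for qpd in (100_000, 500_000, 2_000_000, 5_000_000, 10_000_000):
    print(f"{qpd:>10,} queries/day -> ${monthly_embedding_cost(qpd):,.0f}/month saved")
```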

Note: if you use Weaviate's API-based vectorizer modules (text2vec-openai, text2vec-cohere), you still pay the external API. The savings only apply when using on-cluster modules like text2vec-transformers or text2vec-contextionary.

When Weaviate Cloud Is the Wrong Choice

No database wins every scenario. Here are cases where Weaviate Cloud is genuinely the wrong pick — choosing it anyway wastes money.

1. Pure vector similarity with < 5M vectors and < 100K queries/day → Pinecone Serverless is 40% cheaper.

If your application is strictly "embed query, find K nearest vectors, return results" with no hybrid search, no generative features, and no multimodal requirements, Pinecone Serverless' pay-per-read-unit model is leaner. At 5M vectors with 100K queries/day, expect $70-120/month on Pinecone vs $100-200/month on Weaviate. The 40% gap exists because Weaviate's AU-hour pricing includes capacity for features you are not using.

2. Team already running Elasticsearch for keyword search → adding Qdrant alongside is simpler than replacing both with Weaviate.

Weaviate's hybrid search value proposition assumes you are choosing a single system for both keyword and vector search. If you already have a tuned Elasticsearch cluster handling BM25 search with custom analyzers, synonyms, and business rules, replacing it with Weaviate's BM25 implementation means re-building that configuration. Adding Qdrant as a dedicated vector index alongside your existing Elasticsearch is less disruptive and often cheaper in total (Qdrant at $120-180/month + your existing ES cluster vs migrating everything to Weaviate).

3. Latency-critical (< 5ms p99) at high QPS → Qdrant's HNSW implementation is faster for pure vector ops.

Weaviate's query path involves more processing stages (module resolution, schema validation, filter application) even for simple vector queries. At high QPS where every millisecond matters, Qdrant's streamlined HNSW implementation delivers 2-4ms p99 latency for pure vector search vs Weaviate's 8-15ms p99 for equivalent workloads. If your SLA demands sub-5ms p99 at 10K+ QPS, Qdrant is the better engine.

4. Budget under $100/month → Self-hosted Qdrant on a $40 VM beats any managed option.

For early-stage projects, prototypes, or internal tools where the vector count stays under 2-3M and query volume is low, a $40/month VM (4GB RAM, 2 vCPU) running Qdrant handles the workload comfortably. Weaviate's minimum practical Cloud cost ($45-80/month for 1M vectors) approaches this, but self-hosted Qdrant on minimal hardware is genuinely cheaper and simpler when you do not need Weaviate's advanced features.


Weaviate Cloud vs Self-Hosted: The Break-Even Analysis

Self-hosted Weaviate is the same software (it is open source under BSD-3). The question is whether managed convenience is worth the markup.

Self-Hosted Cost Model

| Scale | Infrastructure | Monthly Cost | Compared to Cloud |
| --- | --- | --- | --- |
| 1-5M vectors | Single VM (16GB RAM, 4 vCPU) | $60-80/month | Cloud: $45-150 |
| 5-20M vectors | Single VM (32-64GB RAM) | $120-250/month | Cloud: $150-500 |
| 20-100M vectors | 3-node cluster (32GB+ each) | $350-750/month | Cloud: $500-1,700 |
| 100M+ vectors | 5+ node cluster | $800-2,000/month | Cloud: $1,500-4,000+ |

What Self-Hosting Requires

The infrastructure cost looks attractive, but self-hosting adds operational overhead:

  • Upgrades: Weaviate releases monthly. Staying current matters for performance and security.
  • Backups: You configure and monitor backup jobs. A corrupted index without backup = data loss.
  • Scaling: Adding nodes to a running cluster requires rebalancing. Not trivial.
  • Monitoring: You set up Prometheus + Grafana (or similar) for query latency, memory pressure, and disk usage.
  • Security: API key management, network policies, TLS certificates.
  • High availability: Multi-node setup with replication factor 2+ for production.

The break-even formula: If your engineering time costs $150/hour and self-hosting requires 4-8 hours/month of maintenance, the operational cost is $600-1,200/month. Add that to infrastructure and compare against Weaviate Cloud.
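That formula, as a small sketch (the rate and hour figures are the illustrative numbers above, not universal constants):

```python
def self_host_tco(infra_per_month: float, maint_hours_per_month: float,
                  eng_rate_per_hour: float = 150.0) -> float:
    """Self-hosted total cost of ownership: infrastructure + engineering time."""
    return infra_per_month + maint_hours_per_month * eng_rate_per_hour

# 20-100M vector self-hosted cluster: ~$500 infra, 6 maintenance hours/month
print(self_host_tco(500, 6))  # compare against the $500-1,700 Cloud range above
```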

For teams with existing Kubernetes clusters and DevOps capacity, the marginal cost of running Weaviate is minimal. For teams without that infrastructure, Weaviate Cloud is almost always cheaper when you factor in people cost.


Weaviate Cloud Cost Optimization: 7 Strategies

1. Use Warm Storage for Archival Collections

If you have collections that are rarely queried (older documents, historical data), move them to warm storage at $0.0125/GB vs $0.035/GB hot. That is a 64% reduction in storage costs.
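A quick check of that 64% figure, applied to a hypothetical 100 GB archival collection:

```python
HOT, WARM = 0.035, 0.0125  # $/GB/month

def warm_savings_pct(hot: float = HOT, warm: float = WARM) -> float:
    """Percent saved per GB by moving a collection from hot to warm storage."""
    return (hot - warm) / hot * 100

archive_gb = 100  # hypothetical rarely-queried collection
print(f"{warm_savings_pct():.0f}% cheaper: "
      f"${archive_gb * HOT:.2f} -> ${archive_gb * WARM:.2f}/month")
```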

2. Choose the Right Vectorizer Module

Weaviate includes several built-in vectorizer options:

| Module | Speed | Quality | Cost Impact |
| --- | --- | --- | --- |
| text2vec-contextionary | Fast | Good | Low AU consumption |
| text2vec-transformers | Slow | Excellent | Higher AU on ingestion |
| text2vec-openai | Fast (API) | Excellent | AU + external API cost |
| text2vec-cohere | Fast (API) | Excellent | AU + external API cost |

If you use an external API vectorizer (OpenAI, Cohere), you pay both Weaviate AU-hours AND the external API. For cost optimization, consider running a local transformer model if your quality requirements allow it.

3. Optimize Query Patterns

Reduce AU consumption by:

  • Using limit to cap result counts (don't fetch 100 results if you display 10)
  • Adding filters before vector search (reduces search space)
  • Using nearText instead of nearVector when possible (avoids client-side embedding)
  • Batching writes (single batch insert vs many individual inserts)

4. Enable Compression (PQ/SQ)

Weaviate supports Product Quantization (PQ) and Scalar Quantization (SQ) to reduce memory footprint:

  • Scalar Quantization: Reduces memory by 4x with minimal recall loss (~1%)
  • Product Quantization: Reduces memory by 8-32x with moderate recall loss (2-5%)

At 10M vectors, enabling SQ can reduce your effective storage and compute needs by 60-75%.
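The memory arithmetic behind those compression claims, as a sketch (the recall figures are the ones quoted above; actual ratios depend on your PQ/SQ configuration):

```python
def index_memory_gb(num_vectors: int, dims: int = 1536,
                    compression: float = 1.0) -> float:
    """Approximate in-memory vector footprint (float32 = 4 bytes/dim),
    divided by a compression ratio: 4x models SQ, 8-32x models PQ."""
    return num_vectors * dims * 4 / compression / 1e9

raw = index_memory_gb(10_000_000)                   # ~61.4 GB uncompressed
sq  = index_memory_gb(10_000_000, compression=4)    # ~15.4 GB with SQ
pq  = index_memory_gb(10_000_000, compression=16)   # ~3.8 GB with aggressive PQ
print(f"raw {raw:.1f} GB -> SQ {sq:.1f} GB -> PQ {pq:.1f} GB")
```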

5. Use Multi-Tenancy for SaaS Workloads

If you serve multiple customers from one Weaviate instance, native multi-tenancy:

  • Isolates data per tenant (security)
  • Allows tenant-level offloading to warm storage (cost)
  • Avoids provisioning separate clusters per customer

A single Weaviate Cloud cluster serving 100 tenants is dramatically cheaper than 100 separate deployments.

6. Right-Size Your Sandbox for Development

The free Sandbox handles 50K objects. If your dev dataset is larger, consider:

  • Sampling a representative subset for development
  • Using the Sandbox for schema and query testing, then deploying to Serverless
  • Deleting and recreating Sandboxes when testing different schemas

7. Monitor AU Consumption Actively

Weaviate Cloud provides usage metrics. Set up alerts when AU consumption exceeds your budget threshold. Common sources of unexpected AU spikes:

  • Batch re-indexing (rebuilding HNSW index)
  • Unfiltered searches on large collections
  • Generative queries with large context windows
  • Frequent schema changes triggering background operations

When to Choose Weaviate Cloud (Decision Framework)

Choose Weaviate Cloud Serverless If:

  • You need hybrid search (keyword + vector) in one database
  • You are building a RAG application and want native generative support
  • Your workload is bursty (scales to near-zero when idle)
  • You need multimodal search (text + images)
  • You want multi-tenancy for a SaaS product
  • Your team does not have dedicated DevOps for database management

Choose Self-Hosted Weaviate If:

  • You have an existing Kubernetes cluster with available capacity
  • Your vector count exceeds 20M and cost is the primary concern
  • You need to run in a specific VPC for compliance (or cannot afford BYOC pricing)
  • Your team has DevOps expertise and capacity for ongoing maintenance
  • You want maximum control over performance tuning

Choose Pinecone Instead If:

  • You only need pure vector similarity search (no hybrid, no generative)
  • Your traffic is extremely variable and you want true scale-to-zero
  • You value the simplest possible API and pricing model
  • Your vectors are pre-computed externally and you just need storage + retrieval

Choose Qdrant Instead If:

  • Cost is your primary concern and you want open source with cheaper managed cloud
  • You need high QPS at low latency for pure vector search
  • You plan to self-host eventually and want an easy migration path
  • Your workload is read-heavy with infrequent writes

For more on vector database cost comparisons, see our Pinecone pricing breakdown and Qdrant Cloud pricing guide.


The Bottom Line

Weaviate Cloud is not the cheapest vector database. It is not trying to be. It is the most feature-complete, and its pricing reflects the value of having hybrid search, built-in vectorization, generative RAG, and multi-tenancy in a single managed service.

For teams building production RAG applications, the total cost of ownership often favors Weaviate despite higher per-unit compute pricing. The alternative (Pinecone + Elasticsearch + embedding API + orchestration layer) has more moving parts, more failure modes, and frequently costs more in aggregate.

For teams that only need "store vectors, find similar vectors," Qdrant Cloud at $120-180/month for 10M vectors is hard to beat.

If your AI infrastructure costs are growing faster than your usage, our team at LeanOps specializes in vector database cost optimization and AI infrastructure right-sizing. We typically cut AI infra costs by 40-60% within 60 days. Get a free Cloud Waste Assessment to see where your money is going.

