Vector databases have become a central component of AI application stacks, particularly for Retrieval-Augmented Generation (RAG). Here is what they actually are and the situations where they genuinely help.
What a Vector Database Is
A vector database stores and searches data represented as numerical vectors (embeddings) in high-dimensional space. The core operation is not “find this exact record” (like a relational database) or “find records containing this keyword” (like a full-text search engine) but “find the records most similar to this query vector.”
How vectors get created: an embedding model (OpenAI text-embedding-3-small, Cohere embed-v3, Google text-embedding-004) converts text (or images, audio) into a numerical vector of typically 384–3072 dimensions. Semantically similar content produces similar vectors: “automobile” and “car” produce vectors close in space, even though they share no characters.
The similarity search: vector databases use approximate nearest-neighbour (ANN) algorithms, of which HNSW (Hierarchical Navigable Small World) is the most common, to find the K vectors closest to a query vector. The “approximate” is important: exact nearest-neighbour search requires comparing every stored vector to the query vector (O(n) time), which is too slow for large datasets. ANN trades a small accuracy loss for orders-of-magnitude speed improvement.
Why this matters for AI: language models have fixed context windows, so you cannot stuff an entire knowledge base into the prompt. RAG solves this by storing knowledge as embeddings, retrieving the most relevant chunks at query time, and including only those in the prompt. The vector database is the retrieval engine.
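To make the retrieval step concrete, here is a minimal sketch of the exact (brute-force) search that ANN indexes approximate, followed by the prompt assembly that RAG performs. Everything in it is an assumption for illustration: the three-dimensional vectors are hand-written stand-ins for real embeddings, and the names `STORE`, `top_k`, and `build_prompt` are hypothetical, not part of any library.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, store, k):
    """Exact search: score every stored vector against the query (O(n) comparisons)."""
    scored = [(cosine_similarity(query, vec), text) for text, vec in store]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Hypothetical toy "embeddings"; a real model would emit hundreds of dimensions.
STORE = [
    ("Cars need regular oil changes.", [0.9, 0.1, 0.0]),
    ("Automobile tyres wear unevenly.", [0.8, 0.2, 0.1]),
    ("Sourdough needs a long proof.", [0.0, 0.1, 0.9]),
]


def build_prompt(question, query_vec, k=2):
    """Retrieve the k most similar chunks and include only those in the prompt."""
    chunks = [text for _, text in top_k(query_vec, STORE, k)]
    context = "\n".join(f"- {chunk}" for chunk in chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


if __name__ == "__main__":
    # The query vector would normally come from the same embedding model.
    print(build_prompt("How do I maintain my car?", [0.85, 0.15, 0.05]))
```

A vector database replaces the linear scan in `top_k` with an ANN index such as HNSW, so the query no longer touches every stored vector; the rest of the flow stays the same.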
The Major Options
Pinecone: managed cloud vector database with no infrastructure to manage; a strong managed offering, but proprietary.
Weaviate: open-source, can be self-hosted or cloud; built-in hybrid search (vector + BM25 keyword).
Qdrant: open-source, Rust-based, fast; strong filtering capabilities; can be self-hosted or cloud.
Chroma: open-source, Python-native, designed for embedding into applications; popular for prototyping and smaller deployments.
Milvus: open-source, designed for large-scale production; can handle billions of vectors.
pgvector: a PostgreSQL extension that adds vector similarity search; the practical choice when you already use PostgreSQL and don’t want to add another infrastructure component.
Redis vector search (previously packaged as part of Redis Stack): adds vector search to Redis; appropriate if Redis is already in your stack.
The pgvector case: for most small-to-medium RAG applications, pgvector is the pragmatic choice. You get vector search in your existing Postgres database without a new service, with full SQL filtering, and at no additional infrastructure cost.
The dedicated vector database case: a dedicated system makes sense when you need billion-scale vector storage, sub-millisecond latency at very high query volumes, managed infrastructure without self-hosting Postgres, or advanced ANN tuning options that pgvector doesn’t expose.
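As a rough illustration of the “full SQL filtering” point, the sketch below shows what a filtered similarity query against pgvector might look like when issued from Python. The table `doc_chunks`, its columns, and the helper names are hypothetical. The `%(name)s` placeholders assume a psycopg-style driver, and the three-dimensional column is a toy size that would need to match your embedding model. The code only builds the SQL and parameters; it does not connect to a database.

```python
def to_vector_literal(embedding):
    """Format a Python sequence as pgvector's text form, e.g. '[0.1,0.2,1.0]'."""
    return "[" + ",".join(str(float(x)) for x in embedding) + "]"


# Hypothetical schema: table and column names are illustrative only.
SCHEMA_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS doc_chunks (
    id bigserial PRIMARY KEY,
    team text NOT NULL,
    body text NOT NULL,
    embedding vector(3)  -- dimension must match the embedding model
);
"""

# <=> is pgvector's cosine-distance operator; smaller means more similar.
# The WHERE clause is ordinary SQL, so metadata filters sit next to the vector search.
SEARCH_SQL = """
SELECT id, body
FROM doc_chunks
WHERE team = %(team)s
ORDER BY embedding <=> %(query)s::vector
LIMIT %(k)s;
"""


def search_params(embedding, team, k=5):
    """Build the parameter dict for SEARCH_SQL (assumes named-placeholder support)."""
    return {"query": to_vector_literal(embedding), "team": team, "k": k}


if __name__ == "__main__":
    print(SEARCH_SQL)
    print(search_params([0.1, 0.2, 1], team="support", k=3))
```

With a driver such as psycopg, the pair would typically be passed as `cursor.execute(SEARCH_SQL, search_params(...))`. Adding an index on the `embedding` column is a separate step that this sketch leaves out.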
When a Vector Database Actually Helps (and When It Doesn’t)
Clear wins: semantic search over large document collections (internal wikis, customer support knowledge bases, legal documents); product recommendation based on semantic similarity rather than collaborative filtering; de-duplication of content at scale (finding near-duplicate documents); and the retrieval component of RAG systems.
Where it adds complexity without benefit: if your dataset has fewer than 100,000 records, a full-text search engine (Elasticsearch, Postgres full-text search) or even a simple LIKE query may be entirely sufficient; if your search is primarily keyword-based and not semantic, inverted index search is faster and more precise than vector search; if you just need to chat with a single document, chunking and stuffing it into context is simpler than building a retrieval pipeline.
The common mistake: adding a vector database because it’s modern, not because the problem requires semantic similarity search. Semantic similarity is genuinely powerful for the right problems; it is worse than keyword search for exact lookups and structured queries.
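For contrast, here is a minimal sketch of the inverted-index lookup that keyword search relies on, which is why exact lookups such as an error code are better served without embeddings. The documents, the tokenizer, and the function names are all hypothetical; production engines add ranking (for example BM25), stemming, and much more.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase and split into alphanumeric tokens, keeping hyphenated codes whole."""
    return re.findall(r"[a-z0-9-]+", text.lower())


def build_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def keyword_search(index, query):
    """Return ids of documents containing every query token (boolean AND)."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Hypothetical support articles.
DOCS = {
    1: "Error ERR-4012 occurs when the payment gateway times out.",
    2: "Error ERR-4021 occurs when the card is declined.",
    3: "Gateway timeouts are usually transient.",
}

if __name__ == "__main__":
    index = build_index(DOCS)
    print(keyword_search(index, "ERR-4012"))
    print(keyword_search(index, "error gateway"))
```

The lookup for `ERR-4012` matches only the document that contains that exact token. A similarity search could plausibly rank the near-identical `ERR-4021` article just as highly, which is the failure mode the paragraph above warns about.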




