Infrastructure

Vector database

Specialized database that stores vector embeddings and queries them by semantic similarity; the backbone of modern retrieval-augmented generation.

A vector database is a system for storing and querying high-dimensional vectors — typically embeddings produced by a neural network from text, images, audio, or code. Instead of classical exact-match or keyword search, a vector database returns items that are semantically similar to the query under metrics such as cosine similarity or Euclidean distance.
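
To make the two metrics concrete, here is a minimal sketch in plain Python. The three-dimensional vectors and their names are made up for illustration; real embeddings typically have hundreds or thousands of dimensions and come from a model.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    """Cosine of the angle between a and b; 1.0 means same direction."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def euclidean_distance(a, b):
    """Straight-line distance between a and b; 0.0 means identical vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Toy "embeddings" (illustrative values, not produced by any model).
query = [1.0, 0.0, 0.0]
items = {
    "same_direction": [2.0, 0.0, 0.0],
    "orthogonal": [0.0, 3.0, 0.0],
    "opposite": [-1.0, 0.0, 0.0],
}

# Higher cosine similarity = more similar; lower Euclidean distance = more similar.
by_cosine = sorted(items, key=lambda n: cosine_similarity(query, items[n]), reverse=True)
by_euclidean = sorted(items, key=lambda n: euclidean_distance(query, items[n]))
```

The two rankings differ on this data: cosine similarity ignores vector length and places "opposite" last, while Euclidean distance places "orthogonal" last because it is farther away in space. Which metric is appropriate usually depends on how the embedding model was trained.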

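Technically, the core is approximate nearest neighbor (ANN) search backed by index structures and algorithms such as HNSW, IVF, ScaNN, or DiskANN. Exact search compares the query against every stored vector, so its cost grows linearly with the size of the collection. ANN indexes give up a small amount of recall in exchange for much lower latency, which is what makes interactive search across millions of vectors practical.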
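
The idea behind an IVF-style index can be sketched in a few lines. This is a toy illustration, not how any particular product implements it: the class name `ToyIVF`, the hand-picked centroids, and the sample points are all assumptions (real IVF indexes usually learn centroids with k-means and add further compression).

```python
import math

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def exact_search(vectors, query, k):
    """Brute force: compare the query with every vector (linear in collection size)."""
    return sorted(range(len(vectors)), key=lambda i: dist(vectors[i], query))[:k]

class ToyIVF:
    """Inverted-file sketch: bucket vectors by nearest centroid, scan only a few buckets."""

    def __init__(self, centroids):
        self.centroids = centroids
        self.buckets = [[] for _ in centroids]
        self.vectors = []

    def add(self, vector):
        idx = len(self.vectors)
        self.vectors.append(vector)
        nearest = min(range(len(self.centroids)),
                      key=lambda c: dist(self.centroids[c], vector))
        self.buckets[nearest].append(idx)

    def search(self, query, k, nprobe=1):
        # Visit only the nprobe buckets whose centroids are closest to the query.
        order = sorted(range(len(self.centroids)),
                       key=lambda c: dist(self.centroids[c], query))
        candidates = [i for c in order[:nprobe] for i in self.buckets[c]]
        return sorted(candidates, key=lambda i: dist(self.vectors[i], query))[:k]

# Hand-picked centroids and points (illustrative only).
points = [(0.5, 0.5), (1, 0), (4.9, 0), (5.2, 0), (9, 1), (0, 9), (1, 10)]
index = ToyIVF(centroids=[(0, 0), (10, 0), (0, 10)])
for p in points:
    index.add(p)

query = (5.1, 0)
exact = exact_search(points, query, k=2)     # scans all 7 points
fast = index.search(query, k=2, nprobe=1)    # scans one bucket, misses a true neighbor
wider = index.search(query, k=2, nprobe=2)   # scans two buckets, matches exact search
```

With `nprobe=1` the search skips the bucket holding one of the two true nearest neighbors, because that point sits just across a bucket boundary. Raising `nprobe` recovers it at the cost of scanning more candidates, which is the recall versus latency trade-off in miniature.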

Why it matters in 2024–2026:

  • Vector databases are a core building block of retrieval-augmented generation (RAG): the application retrieves relevant documents from the store and passes them to the LLM as context before it generates an answer
  • They power semantic search, recommendation systems, duplicate detection, and “search by meaning” over images
  • The market includes dedicated systems (Pinecone, Weaviate, Qdrant, Milvus, Chroma) and vector extensions of existing databases (pgvector for PostgreSQL, MongoDB Atlas Vector Search, vector search in Redis)
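
The retrieval step of RAG mentioned in the first bullet can be sketched as follows. Everything here is a stand-in: `toy_embed` is a hypothetical word-count "embedding" over a tiny fixed vocabulary (a real system would call an embedding model), the in-memory `store` list plays the role of the vector database, and the prompt template is only one possible wording. No LLM is called; the sketch stops at the prompt that would be sent.

```python
import math

VOCAB = ["refund", "return", "shipping", "delivery", "password", "login"]

def toy_embed(text):
    """Hypothetical stand-in for an embedding model: word counts over VOCAB."""
    words = text.lower().split()
    return [float(words.count(term)) for term in VOCAB]

def cosine(a, b):
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0

documents = [
    "refund requests are processed within five days of a return",
    "shipping is free and delivery takes two days",
    "reset your password from the login page",
]
# The "vector database": each document stored next to its embedding.
store = [(doc, toy_embed(doc)) for doc in documents]

def retrieve(question, k=1):
    """Return the k stored documents most similar to the question."""
    q = toy_embed(question)
    ranked = sorted(store, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

def build_prompt(question, k=1):
    """Place the retrieved documents in front of the question as context."""
    context = "\n".join(retrieve(question, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

prompt = build_prompt("how do i get a refund")
```

In a production pipeline the `retrieve` function would be a query against the vector database, and `prompt` would be passed to the LLM.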

Key trade-offs when choosing one: managed vs. self-hosted, query performance vs. cost, hybrid search (vector + keyword), metadata filtering, and update/delete support. For most RAG projects pgvector is a solid first choice; specialized vector stores pay off at tens of millions of vectors and tight latency budgets.
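
Two of these features, metadata filtering and hybrid search, can be illustrated with a small sketch. The records, the `lang` field, the crude `keyword_score`, and the `alpha` weighting are all assumptions chosen for illustration; real systems commonly use BM25 for the keyword side and may fuse rankings differently.

```python
import math

def cosine(a, b):
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0

# Illustrative records: text, a metadata field, and a made-up 2-d embedding.
records = [
    {"id": "a", "text": "reset password via email", "lang": "en", "vec": [0.9, 0.1]},
    {"id": "b", "text": "change account password", "lang": "en", "vec": [0.8, 0.3]},
    {"id": "c", "text": "passwort zurücksetzen", "lang": "de", "vec": [0.9, 0.1]},
    {"id": "d", "text": "shipping times", "lang": "en", "vec": [0.1, 0.9]},
]

def keyword_score(text, query):
    """Fraction of query words that appear in the text (a crude keyword signal)."""
    query_words = query.lower().split()
    words = set(text.lower().split())
    return sum(w in words for w in query_words) / len(query_words)

def hybrid_search(query_text, query_vec, where, k=2, alpha=0.5):
    # Step 1: metadata filter keeps only records matching every field in `where`.
    candidates = [r for r in records
                  if all(r.get(field) == value for field, value in where.items())]
    # Step 2: blend vector similarity and keyword overlap; alpha=1.0 is pure vector search.
    scored = [
        (alpha * cosine(query_vec, r["vec"])
         + (1 - alpha) * keyword_score(r["text"], query_text), r["id"])
        for r in candidates
    ]
    return [rid for _, rid in sorted(scored, reverse=True)[:k]]

hybrid = hybrid_search("change password", [1.0, 0.0], {"lang": "en"})
vector_only = hybrid_search("change password", [1.0, 0.0], {"lang": "en"}, alpha=1.0)
```

Here pure vector search ranks record `a` first, while the hybrid score promotes `b` because it matches both query words exactly. Record `c` never appears because the `lang` filter removes it before scoring.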
