Foundations
Embedding
A vector representation of a word, sentence, or document in a high-dimensional space where semantically similar items have nearby vectors.
An embedding is a dense floating-point vector (typically 256 to 4096 dimensions) that represents the meaning of a word, sentence, paragraph, image, or any other input. The key property: items with similar meaning sit close to each other in that space, while unrelated items are far apart. Proximity is usually measured with cosine similarity or Euclidean distance.
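As a toy illustration (the three-dimensional vectors below are made up; real embeddings have hundreds or thousands of dimensions), the following Python sketch computes cosine similarity between vectors:

```python
# Toy cosine-similarity sketch: the 3-dimensional vectors are invented for
# illustration; real embeddings have hundreds or thousands of dimensions.
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # 1.0 means same direction (very similar meaning); values near 0 mean unrelated.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

cat    = np.array([0.80, 0.10, 0.30])   # toy vector for "cat"
kitten = np.array([0.75, 0.15, 0.35])   # toy vector for "kitten", nearly parallel
stocks = np.array([0.10, 0.90, -0.20])  # toy vector for "stock market"

print(cosine_similarity(cat, kitten))  # ~0.99 -> semantically close
print(cosine_similarity(cat, stocks))  # ~0.14 -> unrelated
```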
Embeddings are produced by purpose-trained models — for example, OpenAI text-embedding-3-large, Cohere Embed v3, or open-weight models like bge-m3 and nomic-embed. Many large language models use embeddings internally as their first layer after tokenization — each token is mapped to a learned vector before entering the transformer layers.
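As a rough sketch of how such a model is called in practice, the snippet below uses the OpenAI Python SDK (v1+) with text-embedding-3-large; the input texts are illustrative and an OPENAI_API_KEY is assumed to be set in the environment. Other providers and open-weight models expose an equivalent call through their own SDKs.

```python
# Sketch of requesting embeddings via the OpenAI Python SDK (v1+); assumes
# OPENAI_API_KEY is set in the environment. The input texts are illustrative.
from openai import OpenAI

client = OpenAI()

texts = ["How do I reset my password?", "Steps to recover account access"]
response = client.embeddings.create(model="text-embedding-3-large", input=texts)

vectors = [item.embedding for item in response.data]
print(len(vectors), "vectors of dimension", len(vectors[0]))  # 3072 dimensions for this model
```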
Main applications:
- Semantic search: instead of matching exact words, the system finds documents with similar meaning (see the sketch after this list)
- RAG pipelines: retrieving relevant documents from a vector database before generating an answer
- Classification and clustering: grouping similar content without manual labeling
- Recommendations: “users who watched X may want Y”
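As a minimal sketch of the semantic-search case from the first bullet, the snippet below ranks a handful of documents against a query using the open-weight all-MiniLM-L6-v2 model via the sentence-transformers library; the model choice and example texts are assumptions for illustration.

```python
# Semantic-search sketch: embed documents and a query, then rank documents by
# cosine similarity to the query. Model name and texts are example choices.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # small open-weight embedding model

documents = [
    "Resetting your password from the login screen",
    "Quarterly revenue grew by 12 percent",
    "How to recover a locked user account",
]
query = "I forgot my password"

doc_vecs = model.encode(documents, normalize_embeddings=True)  # one vector per document
query_vec = model.encode(query, normalize_embeddings=True)

# With unit-normalized vectors, cosine similarity reduces to a dot product.
scores = doc_vecs @ query_vec
for score, doc in sorted(zip(scores, documents), reverse=True):
    print(f"{score:.3f}  {doc}")
```

At larger scale this ranking step is usually offloaded to a vector database with an approximate nearest-neighbor index, which is the retrieval step in the RAG pipelines mentioned above.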
Embeddings are the foundation of modern semantic search systems and RAG architectures: without them, AI assistants could not efficiently access their own documentation or knowledge beyond the conversation context.