Foundations
Embedding
A vector representation of a word, sentence, or document in a high-dimensional space where semantically similar items have nearby vectors.
An embedding is a dense floating-point vector (typically 256 to 4096 dimensions) that represents the meaning of a word, sentence, paragraph, image, or any other input. The key property: items with similar meaning sit close to each other in that space, while unrelated items are far apart. Proximity is usually measured with cosine similarity or Euclidean distance.
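As a toy illustration (the three-dimensional vectors below are made up; real embeddings have hundreds or thousands of dimensions), the following Python sketch computes cosine similarity between vectors:

```python
# Toy cosine-similarity sketch: the 3-dimensional vectors are invented for
# illustration; real embeddings have hundreds or thousands of dimensions.
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # 1.0 means same direction (very similar meaning); values near 0 mean unrelated.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

cat    = np.array([0.80, 0.10, 0.30])   # toy vector for "cat"
kitten = np.array([0.75, 0.15, 0.35])   # toy vector for "kitten", nearly parallel
stocks = np.array([0.10, 0.90, -0.20])  # toy vector for "stock market"

print(cosine_similarity(cat, kitten))  # ~0.99 -> semantically close
print(cosine_similarity(cat, stocks))  # ~0.14 -> unrelated
```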
Embeddings are produced by purpose-trained models — for example, OpenAI text-embedding-3-large, Cohere Embed v3, or open-weight models like bge-m3 and nomic-embed. Many large language models use embeddings internally as their first layer after tokenization — each token is mapped to a learned vector before entering the transformer layers.
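As a rough sketch of how such a model is called in practice, the snippet below uses the OpenAI Python SDK (v1+) with text-embedding-3-large; the input texts are illustrative and an OPENAI_API_KEY is assumed to be set in the environment. Other providers and open-weight models expose an equivalent call through their own SDKs.

```python
# Sketch of requesting embeddings via the OpenAI Python SDK (v1+); assumes
# OPENAI_API_KEY is set in the environment. The input texts are illustrative.
from openai import OpenAI

client = OpenAI()

texts = ["How do I reset my password?", "Steps to recover account access"]
response = client.embeddings.create(model="text-embedding-3-large", input=texts)

vectors = [item.embedding for item in response.data]
print(len(vectors), "vectors of dimension", len(vectors[0]))  # 3072 dimensions for this model
```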
Main applications:
- Semantic search: instead of matching exact words, the system finds documents with similar meaning (see the sketch after this list)
- RAG pipelines: retrieving relevant documents from a vector database before generating an answer
- Classification and clustering: grouping similar content without manual labeling
- Recommendations: “users who watched X may want Y”
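As a minimal sketch of the semantic-search case from the first bullet, the snippet below ranks a handful of documents against a query using the open-weight all-MiniLM-L6-v2 model via the sentence-transformers library; the model choice and example texts are assumptions for illustration.

```python
# Semantic-search sketch: embed documents and a query, then rank documents by
# cosine similarity to the query. Model name and texts are example choices.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # small open-weight embedding model

documents = [
    "Resetting your password from the login screen",
    "Quarterly revenue grew by 12 percent",
    "How to recover a locked user account",
]
query = "I forgot my password"

doc_vecs = model.encode(documents, normalize_embeddings=True)  # one vector per document
query_vec = model.encode(query, normalize_embeddings=True)

# With unit-normalized vectors, cosine similarity reduces to a dot product.
scores = doc_vecs @ query_vec
for score, doc in sorted(zip(scores, documents), reverse=True):
    print(f"{score:.3f}  {doc}")
```

At larger scale this ranking step is usually offloaded to a vector database with an approximate nearest-neighbor index, which is the retrieval step in the RAG pipelines mentioned above.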
Embeddings are the foundation of modern semantic search systems and RAG architectures: without them, AI assistants could not efficiently access their own documentation or knowledge beyond the conversation context.