Glossary
AI terminology, defined concisely.
73 terms
A
- A2A protocol (Agent2Agent) Open protocol introduced by Google in 2025 for interoperability and communication between AI agents built on different frameworks and by different vendors.
- Agent orchestration Coordinating multiple AI agents, tools, and steps into one workflow — via planners, routers, and frameworks like LangGraph — to solve complex tasks reliably.
- Agentic AI AI systems that autonomously plan and execute multi-step tasks using tools, memory, and loops, going beyond a single-turn chat response.
- AI accelerator (NPU/TPU) Specialized chip for AI workloads, such as NPUs in phones, Google TPUs, and AWS Trainium; often more cost-efficient than general-purpose GPUs for the workloads it targets.
- AI Act Article 50 (transparency) Article 50 of the EU AI Act sets transparency rules — chatbots, deepfakes, and AI-generated content must be clearly disclosed; enforcement begins Aug 2026.
- AI Agent An LLM-powered system that pursues a goal autonomously by planning, calling tools, and iterating on its own output until the task is complete.
- AI alignment The research field that aims to ensure AI systems follow human intent, values, and safety goals rather than pursuing unintended objectives.
- AI Evaluation The discipline of measuring an AI model's capability, safety, and alignment through benchmarks, human review, and red-teaming before and after release.
- AI safety A broad field covering technical, organizational, and policy risks of AI systems, from mistakes and misuse to longer-term existential concerns.
- Artificial General Intelligence Hypothetical AI that matches or surpasses human capabilities across virtually all cognitive tasks, in contrast to today's narrow, task-specific AI.
- Attention mechanism A neural network technique that lets a model weigh the relevance of each input token to every other, forming the core of modern transformers.
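The Attention mechanism entry above can be made concrete with a minimal sketch of scaled dot-product attention in plain Python. The function names and the list-of-lists representation are illustrative assumptions; production code would use a tensor library and learned projection matrices.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1 (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors.

    Each query is compared with every key; the resulting weights mix the values.
    """
    dim = len(keys[0])
    outputs = []
    for q in queries:
        # Relevance of this query to every key, scaled by sqrt(dim).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys
        ]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        mixed = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(mixed)
    return outputs
```

When two keys are identical, the query weighs them equally and the output is the plain average of their values.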
B
C
- Chain-of-Thought A technique where a language model writes out a series of intermediate reasoning steps before its final answer, sharply improving accuracy on complex, multi-step tasks.
- Chatbot A software agent that holds a conversation with a user via text or voice; modern chatbots are powered by large language models and tool integrations.
- Claude A family of large language models built by Anthropic with a focus on safety, long context, and tool use; powers Claude.ai and the Claude Code agent.
- Constitutional AI Anthropic's method for aligning models using a written set of principles (a constitution) plus AI feedback (RLAIF), rather than human labels of harmful outputs.
- Context window The maximum number of tokens an LLM can process at once — including prompt, documents, and answer; today ranges from 8K to 2 million tokens.
D
- Deep learning A branch of machine learning that uses multi-layered neural networks to learn complex patterns; powers modern vision, speech, and language AI systems.
- Deepfake Synthetic media — fake video, audio, or images of a person generated with deep learning; a key misinformation, fraud, and safety concern in 2025-2026.
- Diffusion model A class of generative models that learn to reverse a gradual noising process; the dominant approach for AI-generated images, video, and audio today.
E
- Embedding A vector representation of a word, sentence, or document in a high-dimensional space where semantically similar items have nearby vectors.
- Emergent abilities Capabilities absent in smaller models that appear abruptly at scale; a contested claim — critics attribute the sharp jumps to the choice of nonlinear metrics.
- EU AI Act The European Union's regulation governing AI systems by risk tier (unacceptable, high, limited, minimal); the world's first comprehensive AI law, with enforcement phasing in 2024-2027.
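To illustrate the Embedding entry above, here is a small sketch of cosine similarity, a common way to measure how "nearby" two vectors are. The three-dimensional vectors are made-up toy values, not the output of any real embedding model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy embeddings; real models produce hundreds of dimensions.
toy_embeddings = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "car": [0.0, 0.1, 0.9],
}

cat_vs_kitten = cosine_similarity(toy_embeddings["cat"], toy_embeddings["kitten"])
cat_vs_car = cosine_similarity(toy_embeddings["cat"], toy_embeddings["car"])
```

With these toy values, "cat" scores closer to "kitten" than to "car", which is the behavior semantic search relies on.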
F
- Fine-tuning The process of further training a pre-trained language model on a smaller, task-specific dataset to specialize its behavior or domain knowledge.
- Foundation model A large model trained on broad data that can be adapted to many tasks; a term coined by Stanford's CRFM in 2021, covering LLMs, vision models, and multimodal systems.
- Frontier model The largest, most capable general-purpose AI models at the cutting edge (GPT-5, Claude Opus, Gemini); the focus of frontier safety and regulation debates.
- Function calling A structured mechanism by which a large language model returns a call to a developer-defined function with arguments instead of text, which the application then executes.
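The Function calling entry above can be sketched as follows. The JSON shape (`name` plus `arguments`), the `get_weather` function, and the `TOOLS` registry are assumptions made for illustration; each vendor's API defines its own wire format.

```python
import json


def get_weather(city: str) -> str:
    """Hypothetical developer-defined function; returns a canned answer."""
    return f"Sunny in {city}"


# Registry of functions the application exposes to the model (illustrative).
TOOLS = {"get_weather": get_weather}


def dispatch(model_output: str) -> str:
    """Parse a structured call emitted by the model and execute it.

    Assumes the model returned JSON like
    {"name": "...", "arguments": {...}} instead of free text.
    """
    call = json.loads(model_output)
    func = TOOLS[call["name"]]  # KeyError if the model names an unknown tool
    return func(**call["arguments"])


result = dispatch('{"name": "get_weather", "arguments": {"city": "Paris"}}')
```

The model only proposes the call; the application decides whether to run it and then feeds the result back to the model.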
G
- Generative Pre-trained Transformer (GPT) A family of decoder-only transformer language models pretrained on vast text and fine-tuned to follow instructions; the architecture behind ChatGPT and peers.
- Google Gemini A family of multimodal foundation models from Google DeepMind handling text, images, audio, and video; powers the Gemini app, Workspace, and Vertex AI.
- GPAI (General-Purpose AI) EU AI Act category for general-purpose AI models (e.g. GPT, Claude, Gemini) with broad capabilities; documentation and transparency duties apply from August 2025.
- Graphics Processing Unit (GPU) A processor with thousands of parallel cores, originally designed for rendering graphics; today the dominant hardware for training and serving AI models, led by NVIDIA H100/B200.
- Guardrails Safety controls and filters that constrain an AI model's inputs and outputs — content classifiers, policy filters, and attack detectors placed around the model.
H
I
- In-Context Learning A language model's ability to pick up a new task from instructions or examples given in the prompt, without any weight updates; few-shot prompting supplies demonstrations, while zero-shot prompting supplies only the instruction.
- Inference (model serving) The phase where a trained model produces outputs for new inputs; consumes GPU/TPU resources and drives cost, latency, and throughput of AI services.
- Interpretability The research field that seeks to understand the internal mechanisms of AI models — features and circuits — to explain why a model produces a given output.
J
K
- Knowledge distillation Compression technique where a smaller student model learns to mimic the outputs of a larger teacher model, shrinking size while preserving most of the teacher's accuracy.
- KV Cache Cached key/value attention tensors that are reused across decoding steps to speed up inference in large language models.
L
- Large Language Model A neural network trained on vast text corpora to predict and generate human language; the foundation of modern AI assistants like ChatGPT, Claude, and Gemini.
- Llama (Meta) A family of open-weight large language models released by Meta, widely used as a base for fine-tuning and on-device deployment by the open-source community.
- LoRA A parameter-efficient fine-tuning technique that freezes the base model's weights and trains small low-rank adapter matrices instead of all parameters.
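The LoRA entry above can be sketched numerically: the frozen weight matrix W is combined with a low-rank update B·A, scaled by alpha / r. The helper names and the tiny matrices are illustrative assumptions, written in plain Python rather than a tensor library.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def lora_effective_weight(w, lora_a, lora_b, alpha):
    """Return W + (alpha / r) * B @ A without modifying the frozen W.

    w:      d_out x d_in frozen base weights
    lora_a: r x d_in trainable adapter
    lora_b: d_out x r trainable adapter
    """
    rank = len(lora_a)
    delta = matmul(lora_b, lora_a)
    scale = alpha / rank
    return [
        [w[i][j] + scale * delta[i][j] for j in range(len(w[0]))]
        for i in range(len(w))
    ]


def trainable_params(d_out, d_in, rank):
    """Adapter parameter count, versus d_out * d_in for full fine-tuning."""
    return rank * (d_in + d_out)
```

For a 4096 x 4096 layer with rank 8, the adapters hold 65,536 trainable values instead of roughly 16.8 million, which is where the efficiency comes from.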
M
- Mixture of Experts (MoE) A neural network architecture that activates only a subset of its parameters for each input, providing the capability of a much larger model at a fraction of the inference cost.
- Model Context Protocol (MCP) An open protocol introduced by Anthropic in 2024 that standardizes how AI assistants connect to external tools and data sources, similar to how USB-C standardizes physical connections.
- Multi-agent system An AI architecture in which several specialized agents collaborate, delegate, or compete to solve a task more reliably than a single monolithic model could.
- Multimodal model An AI system that, within one model, processes and/or generates multiple modalities — text, images, audio, and video — not just a single data type.
N
O
P
- Prompt engineering The practice of designing inputs to language models so they reliably produce the desired output; covers wording, structure, examples, and system prompts.
- Prompt Injection An attack where untrusted text in an LLM's input causes the model to follow attacker instructions rather than the developer's, ranked #1 in the OWASP Top 10 for LLM applications.
Q
R
- Reasoning Model An LLM trained to produce a long, deliberate chain of thought before its final answer, trading inference time for accuracy on complex problems.
- Red team (AI) Structured adversarial testing of AI systems — prompt injection, jailbreaks, misuse — designed to surface vulnerabilities before production launch.
- Reinforcement Learning A training paradigm where an agent learns to make decisions by interacting with an environment, guided by reward signals; the basis of RLHF and reasoning-model training.
- Reinforcement Learning from Human Feedback (RLHF) A training technique in which human raters rank model responses; those rankings typically train a reward model, which is then used to fine-tune an LLM toward helpfulness and safety.
- Reranking A second retrieval pass that reorders fetched candidates by relevance, typically with a cross-encoder, to sharpen RAG and search results.
- Retrieval-Augmented Generation (RAG) A pattern that combines a search/retrieval system with a language model: the system retrieves relevant documents from a knowledge source and supplies them to the model before it answers, grounding output in real data.
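The Retrieval-Augmented Generation entry above can be sketched as a retrieve-then-prompt step. Word overlap stands in here for a real embedding search, and the function names and prompt wording are assumptions for illustration; the final call to a language model is left out.

```python
def retrieve(query, documents, k=2):
    """Rank documents by how many words they share with the query.

    A toy stand-in for vector search; real systems compare embeddings.
    """
    query_words = set(query.lower().split())
    ranked = sorted(
        documents,
        key=lambda doc: len(query_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, documents, k=2):
    """Place the retrieved passages in the prompt so the answer is grounded."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


docs = [
    "The Eiffel Tower is in Paris",
    "Bananas are yellow",
    "Paris is the capital of France",
]
prompt = build_prompt("Where is the Eiffel Tower", docs, k=1)
```

A reranker, as described in the Reranking entry, would sit between `retrieve` and `build_prompt` and reorder the candidates with a stronger relevance model.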
S
- Scaling laws Empirical power-law relationships linking model size, training data, and compute to performance; the foundation for planning how large models are trained.
- Self-supervised learning Training approach where a model learns from unlabeled data by creating its own targets, such as predicting hidden tokens within a sentence.
- Speculative Decoding An inference speedup where a small draft model proposes several tokens at once and the large model verifies them in parallel, without changing what the large model would have produced on its own.
- Stable Diffusion An open-weight latent diffusion model released by Stability AI in 2022; the first widely available text-to-image generator runnable on consumer GPUs.
- Sycophancy An AI model's tendency to agree with and flatter the user — telling them what they want to hear rather than what is accurate or warranted.
- Synthetic data Artificially generated data — from a model or simulation — used to augment or replace human-collected data when training and evaluating AI models.
- System Prompt The initial instructions, persona, and policies steering an assistant across a conversation — separate from user messages and given higher priority.
T
- Test-Time Compute Spending extra compute during inference — having the model reason longer before answering — to improve accuracy; the basis of modern reasoning models.
- Tokenization The process of splitting text into smaller units called tokens — words, subwords, or characters — that a language model can process numerically.
- Tool Use A large language model's ability to call external tools, functions, and APIs to act beyond text generation; the core capability underlying AI agents.
- TPU (Tensor Processing Unit) Google's custom ASIC for accelerating machine learning, optimized for the matrix operations behind training and running neural networks.
- Transformer The neural network architecture, introduced in 2017 and built around the self-attention mechanism, that powers virtually every modern large language model.
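The Tokenization entry above can be illustrated with a toy greedy longest-match subword tokenizer. The vocabulary is invented for the example, and production tokenizers typically use learned schemes such as byte-pair encoding rather than this simplification.

```python
def tokenize(text, vocab):
    """Split text into the longest vocabulary pieces, left to right.

    Characters not covered by the vocabulary become single-character tokens.
    """
    tokens = []
    start = 0
    while start < len(text):
        # Try the longest candidate first, then shrink.
        for end in range(len(text), start, -1):
            piece = text[start:end]
            if piece in vocab:
                tokens.append(piece)
                start = end
                break
        else:
            # No vocabulary entry matched: fall back to one character.
            tokens.append(text[start])
            start += 1
    return tokens


# Hypothetical toy vocabulary.
toy_vocab = {"un", "happi", "ness", "happy"}
pieces = tokenize("unhappiness", toy_vocab)
```

A model never sees the strings themselves; each piece is mapped to an integer ID, and those IDs are what the network processes.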
V
- Vector database Specialized database that stores and queries vector embeddings using semantic similarity; the backbone of modern retrieval-augmented generation.
- Vision-language model An AI model jointly trained on images and text that can "see" an image and reason about it in natural language; the basis of vision in GPT-4o, Claude, and Gemini.
W
- Watermarking (AI) Embedding hidden, machine-readable signals in AI-generated text, images, or audio to help verify content provenance; Google's SynthID is one example.
- World model A learned internal representation of an environment's dynamics that an AI system uses to predict future states and plan actions without constant real-world trial and error.