Foundations
Large Language Model
A neural network trained on vast text corpora to predict and generate human language; the foundation of modern AI assistants like ChatGPT, Claude, and Gemini.
A Large Language Model (LLM) is a deep neural network, almost always based on the transformer architecture, trained on hundreds of billions to trillions of tokens drawn from books, articles, web pages, and code. After training, the model encodes statistical patterns of human language and can generate coherent text in response to a prompt, answer questions, summarize documents, translate, and write code.
LLMs do not “understand” in a human sense. They predict the most likely next token given the preceding context, then append it and repeat, one token at a time, until a full response is composed. The illusion of reasoning emerges from scale and the diversity of patterns absorbed during training.
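A minimal sketch of that loop, assuming Hugging Face's transformers library and the small GPT-2 model (chosen only because it runs locally; production systems add sampling, batching, and KV caching, but the core idea is the same):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# GPT-2 is an assumption here, picked because it is small enough to run
# on a laptop; modern LLMs execute the same loop at far larger scale.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "The transformer architecture works by"
ids = tokenizer(prompt, return_tensors="pt").input_ids

with torch.no_grad():
    for _ in range(20):                          # emit 20 tokens, one at a time
        logits = model(ids).logits               # a score for every vocabulary token
        next_id = logits[0, -1].argmax()         # greedy: take the most likely token
        ids = torch.cat([ids, next_id.view(1, 1)], dim=-1)  # append it and repeat

print(tokenizer.decode(ids[0]))
```

Greedy decoding is the simplest choice; real assistants typically sample from the probability distribution instead, which is why the same prompt can yield different answers.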
The term gained mainstream traction in 2022 with the release of ChatGPT. Today, “LLM” describes models with anywhere from a few billion to over a trillion parameters, deployed via API (GPT-5, Claude, Gemini), open weights (Llama, Mistral, DeepSeek), or local runtimes (Ollama, llama.cpp).
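To make the local-runtime option concrete, here is a hedged sketch of querying a running Ollama server over its documented /api/generate HTTP endpoint; the llama3 model name is an assumption, so substitute any model you have pulled:

```python
import json
import urllib.request

# Assumes an Ollama server is running on its default port and that the
# "llama3" model has been pulled; any locally available model works.
req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps({
        "model": "llama3",
        "prompt": "Explain what an LLM is in one sentence.",
        "stream": False,  # return a single JSON object instead of a token stream
    }).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.load(resp)["response"])
```

Hosted APIs work the same way in spirit: send a prompt over HTTP, receive generated text, with the model weights living on someone else's hardware.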
LLMs are the substrate beneath nearly every AI product covered on this site — agents, chat assistants, RAG pipelines, and reasoning systems all build on top of an LLM.