Models
Llama (Meta)
A family of open-weight large language models released by Meta, widely used as a base for fine-tuning and on-device deployment by the open-source community.
Llama is Meta’s family of open-weight large language models, launched as LLaMA in February 2023 and continued through Llama 2 (mid-2023), Llama 3 (2024), and Llama 4 (2025). Each release ships in several sizes (typically 7B or 8B, 70B, and a much larger flagship), together with instruction-tuned variants intended for chat and assistant use cases.
What sets Llama apart is the licence: the weights are downloadable for free, under terms permissive enough to cover most commercial use (with restrictions on very large-scale services). That has made Llama the default starting point for community fine-tuning, domain adaptation, on-device inference, and academic research. A large ecosystem of tools (Hugging Face, Ollama, llama.cpp, vLLM, LM Studio) exists primarily to serve and customise Llama-style models.
Architecturally, Llama is a decoder-only transformer similar to other modern LLMs, with refinements such as RMSNorm, rotary position embeddings, grouped-query attention, and SwiGLU activations. Llama 4 introduced native multimodality and a mixture-of-experts architecture. Together with Mistral, DeepSeek, and Qwen, Llama defines the open-weight frontier and is the reason much of today’s AI tooling can run outside hyperscaler clouds.
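Two of the refinements named above, RMSNorm and the SwiGLU feed-forward block, are simple enough to sketch directly. The following is an illustrative NumPy sketch, not Meta's implementation; the function names, dimensions, and random weights are assumptions chosen for clarity.

```python
import numpy as np

def rms_norm(x, weight, eps=1e-6):
    # RMSNorm rescales activations by their root-mean-square.
    # Unlike LayerNorm, it does not subtract the mean, which is
    # cheaper and works well in practice for large transformers.
    rms = np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)
    return (x / rms) * weight

def swiglu_ffn(x, w_gate, w_up, w_down):
    # SwiGLU feed-forward block: a SiLU-activated gate projection is
    # multiplied elementwise with an "up" projection, then the result
    # is projected back down to the model dimension.
    gate = x @ w_gate
    silu = gate / (1.0 + np.exp(-gate))  # SiLU (swish) activation
    return (silu * (x @ w_up)) @ w_down

# Toy dimensions and random weights, purely for illustration.
rng = np.random.default_rng(0)
d_model, d_ff = 16, 64
x = rng.normal(size=(2, d_model))

y = rms_norm(x, np.ones(d_model))          # normalised activations
h = swiglu_ffn(y,
               rng.normal(size=(d_model, d_ff)),
               rng.normal(size=(d_model, d_ff)),
               rng.normal(size=(d_ff, d_model)))
```

With a unit `weight`, each row of `y` has a root-mean-square close to 1, which is the invariant RMSNorm enforces; the feed-forward block maps back to the model dimension, so it can be dropped into a residual stream.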