Foundations
Neural network
A computing system of layered artificial neurons that learns patterns from data; the foundation of nearly all modern machine learning, including LLMs.
A neural network is a computational model loosely inspired by the brain. It consists of layers of simple units — artificial neurons — that take weighted inputs, apply a non-linear activation function, and pass the result to the next layer. By adjusting the weights through training on labelled or self-supervised data, the network learns to map inputs (pixels, audio, text tokens) to outputs (classes, predictions, embeddings).
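The behaviour of a single artificial neuron described above can be sketched in a few lines. This is a minimal illustration, not a reference implementation; the input values, weights, and bias are made-up numbers, and the sigmoid is just one common choice of non-linear activation.

```python
import math

def sigmoid(z):
    # Non-linear activation: squashes any real number into (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias):
    # Weighted sum of inputs plus bias, passed through the activation.
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(z)

# Illustrative values only: three inputs, three weights, one bias.
output = neuron([0.5, -1.0, 2.0], [0.8, 0.2, -0.4], bias=0.1)
```

A layer is many such neurons applied to the same inputs in parallel, and a network stacks layers so that each layer's outputs become the next layer's inputs.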
A typical network has an input layer, one or more hidden layers, and an output layer. When the hidden stack is deep — dozens to hundreds of layers — the field is called deep learning. Training relies on backpropagation: the loss at the output is differentiated backwards through the network via the chain rule, and stochastic gradient descent nudges each weight against its gradient to reduce the error.
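The training procedure above can be sketched end to end on a toy problem. This is a hedged example, assuming a tiny two-layer network learning XOR; the layer sizes, learning rate, and epoch count are illustrative choices, not from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)  # XOR inputs
y = np.array([[0], [1], [1], [0]], dtype=float)              # XOR targets

# Weights for input -> hidden and hidden -> output layers.
W1 = rng.normal(0.0, 1.0, (2, 4)); b1 = np.zeros((1, 4))
W2 = rng.normal(0.0, 1.0, (4, 1)); b2 = np.zeros((1, 1))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(X):
    h = sigmoid(X @ W1 + b1)           # hidden activations
    return h, sigmoid(h @ W2 + b2)     # network output

lr = 0.5
_, out = forward(X)
initial_loss = np.mean((out - y) ** 2)

for _ in range(5000):
    h, out = forward(X)
    # Backward pass: the chain rule pushes the output error through each layer.
    d_out = 2 * (out - y) / len(X) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    # Gradient descent: nudge every weight against its gradient.
    W2 -= lr * (h.T @ d_out); b2 -= lr * d_out.sum(0, keepdims=True)
    W1 -= lr * (X.T @ d_h);   b1 -= lr * d_h.sum(0, keepdims=True)

_, out = forward(X)
final_loss = np.mean((out - y) ** 2)
```

The forward pass and the hand-derived gradients mirror each other layer by layer; in practice, frameworks compute these derivatives automatically.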
Modern neural networks come in many shapes: convolutional networks for images, recurrent networks for time series, graph networks for relational data, and, most importantly, the transformer, which underlies today’s large language models. The same core idea — composing simple differentiable units into a learnable pipeline — drives systems for vision, speech, robotics, drug discovery, and code generation.