Hardware
Graphics Processing Unit (GPU)
A processor with thousands of parallel cores; today the dominant hardware for training and serving AI models, led by NVIDIA’s H100 and B200.
A Graphics Processing Unit (GPU) was originally designed for rendering 3D graphics but turned out to be remarkably well suited to training neural networks. The reason: deep learning reduces to massively parallel matrix operations, and a GPU has thousands of smaller cores built for exactly that kind of work, in contrast to a CPU’s handful of powerful sequential cores.
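To make the contrast concrete, here is a minimal CUDA sketch (illustrative only, not drawn from any particular library): a naive matrix multiply that assigns one thread per output element, so a single 1024×1024 product launches over a million threads at once.

```cuda
#include <cuda_runtime.h>

// Naive matrix multiply C = A * B for square N x N matrices.
// One thread computes one element of C; the GPU keeps tens of
// thousands of these threads in flight across its cores.
__global__ void matmul(const float *A, const float *B, float *C, int N) {
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < N && col < N) {
        float acc = 0.0f;
        for (int k = 0; k < N; ++k)
            acc += A[row * N + k] * B[k * N + col];
        C[row * N + col] = acc;
    }
}

// Example launch for N = 1024: 64 x 64 blocks of 16 x 16 threads,
// i.e. roughly a million threads, one per output element.
// dim3 block(16, 16);
// dim3 grid((N + block.x - 1) / block.x, (N + block.y - 1) / block.y);
// matmul<<<grid, block>>>(dA, dB, dC, N);
```

A CPU would walk these N² dot products with a few dozen threads at best; the GPU’s throughput comes from running nearly all of them simultaneously.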
The components that matter for AI:
- Tensor/matrix cores – specialized units for FP16/FP8/INT8 matrix multiplication (NVIDIA Volta and later, AMD CDNA); see the tensor-core sketch after this list
- HBM – High Bandwidth Memory with far more throughput than standard GDDR; the H100 ships with 80 GB of HBM3, the B200 with 192 GB of HBM3e (a worked bandwidth example also follows the list)
- Interconnect – NVLink and NVSwitch let 8 to 72 GPUs operate as a single logical system for training
- The CUDA ecosystem – NVIDIA’s software moat; alternatives (ROCm, oneAPI, Triton) are still catching up
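To show what programming the tensor cores looks like, here is a minimal sketch using NVIDIA’s WMMA intrinsics (available since Volta). The kernel name and tiling are illustrative assumptions; a production GEMM would add shared-memory staging and bounds handling.

```cuda
#include <mma.h>
#include <cuda_fp16.h>
using namespace nvcuda;

// Each warp computes one 16x16 tile of C = A * B on the tensor cores.
// A and B are FP16, accumulation is FP32; M, N, K are assumed to be
// multiples of 16 and all matrices row-major.
__global__ void wmma_gemm(const half *A, const half *B, float *C,
                          int M, int N, int K) {
    // Which 16x16 output tile this warp owns.
    int warpM = (blockIdx.x * blockDim.x + threadIdx.x) / warpSize;
    int warpN = blockIdx.y * blockDim.y + threadIdx.y;

    wmma::fragment<wmma::matrix_a, 16, 16, 16, half, wmma::row_major> aFrag;
    wmma::fragment<wmma::matrix_b, 16, 16, 16, half, wmma::row_major> bFrag;
    wmma::fragment<wmma::accumulator, 16, 16, 16, float> accFrag;
    wmma::fill_fragment(accFrag, 0.0f);

    // Walk the shared K dimension 16 columns at a time; each mma_sync
    // is a complete 16x16x16 multiply-accumulate executed in hardware.
    for (int k = 0; k < K; k += 16) {
        wmma::load_matrix_sync(aFrag, A + warpM * 16 * K + k, K);
        wmma::load_matrix_sync(bFrag, B + k * N + warpN * 16, N);
        wmma::mma_sync(accFrag, aFrag, bFrag, accFrag);
    }
    wmma::store_matrix_sync(C + warpM * 16 * N + warpN * 16, accFrag,
                            N, wmma::mem_row_major);
}
```

A single mma_sync issued by one warp replaces 16×16×16 = 4096 scalar multiply-adds, which is where tensor cores get their throughput advantage over scalar CUDA cores.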
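And a rough roofline estimate of why HBM bandwidth matters for serving (the 70 B-parameter model is a hypothetical example; the bandwidth figure is the published H100 SXM spec): if every FP16 weight must be read once per generated token and there is no batching, memory bandwidth alone caps single-stream decoding speed, regardless of how fast the compute units are.

$$\text{tokens/s} \lesssim \frac{\text{HBM bandwidth}}{\text{bytes read per token}} \approx \frac{3.35\ \text{TB/s}}{70 \times 10^9 \times 2\ \text{B}} \approx 24$$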
Today’s AI economy is deeply tied to GPUs. NVIDIA became the most valuable company in the world during 2024–2025 on the back of H100/B200 shipments. A single cluster of GPUs sized for frontier-model training costs hundreds of millions of dollars.
AI accelerators such as Google’s TPUs and AWS’s Trainium are chipping away at this dominance, especially for large-language-model inference, but for frontier-model training GPUs still dominate in 2026.