Models

Google Gemini

A family of multimodal foundation models from Google DeepMind handling text, images, audio, and video; powers Gemini app, Workspace, and Vertex AI.

Google Gemini is the family of multimodal foundation models built by Google DeepMind, announced in December 2023 as the successor to the earlier PaLM and LaMDA lines. Gemini was designed from the ground up to be natively multimodal, handling text, images, audio, video, and code within a single large language model backbone rather than bolting modalities onto a text model.

The line includes several tiers: Gemini Nano runs on devices, Flash targets high-throughput cloud inference, Pro is the everyday workhorse, and Ultra/Advanced sits at the frontier alongside GPT and Claude. Subsequent versions — Gemini 1.5, 2.0, 2.5 — pushed context windows to a million tokens and beyond, added native tool use, and matured into AI agents capable of operating browsers and codebases.

Gemini is the engine behind the Gemini consumer app, Google Workspace AI features (Docs, Gmail, Sheets, Meet), Android assistant features, and the Vertex AI platform for developers. Google also offers Gemma — open-weight derivatives that share architectural ideas with Gemini but are released for the open-source community.

Sources

See also