🤖 24 AI
📦 Open Source · Saturday, April 18, 2026 · 3 min read

Google Gemma 4: four open models, 31B dense ranks third on the Arena leaderboard, Apache 2.0 license

Why it matters

Gemma 4 is Google's new generation of open models in four variants: E2B for mobile devices, E4B for edge devices, 26B MoE with 3.8 billion active parameters, and 31B dense. The 31B holds third place on the Arena open-model leaderboard and reportedly outperforms models 20 times its size, while the 26B MoE ranks sixth. All models are multimodal (text, image, video, audio), support 140 languages, offer up to 256K token context, and are released under the Apache 2.0 license.

Google DeepMind has announced the new generation of open models called Gemma 4, split into four variants covering the range from mobile devices to high-quality server deployments. The announcement was authored by Clement Farabet (VP of Research) and Olivier Lacombe (Group Product Manager), with the official tagline “Byte for byte, the most capable open models”.

Four variants, one license

Rather than a single flagship model, Google has chosen to cover the entire spectrum of use cases:

  • E2B (Effective 2B) — lightweight model for mobile devices and IoT
  • E4B (Effective 4B) — enhanced edge variant for on-device tasks
  • 26B Mixture of Experts (MoE) — optimised for latency, activates only 3.8 billion parameters during inference
  • 31B dense — highest-quality variant, ideal for fine-tuning

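The MoE design in the 26B variant is what lets it evaluate only a small fraction of its weights per token. A minimal sketch of top-k expert routing illustrates the idea; the expert count and k below are illustrative assumptions, not Gemma 4's actual configuration:

```python
# Minimal sketch of Mixture-of-Experts routing: for each token, a gating
# network scores all experts and only the top-k highest-scoring ones are
# evaluated, so the active parameter count is a small fraction of the total.
# NUM_EXPERTS and TOP_K are assumptions for illustration only.
import random

NUM_EXPERTS = 16   # total experts in the layer (illustrative)
TOP_K = 2          # experts actually evaluated per token (illustrative)

def route(token_scores, k=TOP_K):
    """Return indices of the k highest-scoring experts for one token."""
    return sorted(range(len(token_scores)), key=lambda i: -token_scores[i])[:k]

random.seed(0)
scores = [random.random() for _ in range(NUM_EXPERTS)]  # stand-in gate scores
active = route(scores)
print(f"active experts: {active} ({TOP_K} of {NUM_EXPERTS} evaluated)")
```

The same principle, scaled up, is how a 26B-parameter model can run inference with only 3.8 billion parameters active at a time.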
All variants carry the Apache 2.0 license, a permissive license that allows unrestricted commercial use, distinguishing them from some other “open” models released under more restrictive terms.

Arena rankings and performance

On the Arena AI open-model leaderboard, Gemma 4 holds impressive positions:

  • 31B dense: #3 globally among open models
  • 26B MoE: #6 globally

Google particularly highlights that the 31B model “outperforms models 20 times larger in parameter count”, a claim suggesting Gemma 4 31B is competitive with closed models in the 600B+ parameter range. The framing is promotional, but the Arena ranking rests on blind user voting, which gives the claim some independent support.

What’s new: genuine multimodality

Gemma 4 is fully multimodal from the ground up, not a later add-on:

  • Native video and image processing at variable resolutions
  • OCR and chart understanding for analytical tasks
  • Audio support in E2B and E4B variants (speech recognition)
  • Support for 140 languages, notably more than most open models

Context windows differ by variant:

  • Edge models (E2B, E4B): 128K tokens
  • Larger variants (26B, 31B): up to 256K tokens

Additional capabilities include advanced reasoning with multi-step planning, native function-calling for agentic scenarios, and structured JSON output.
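The function-calling pattern can be sketched as a simple loop: the model emits a structured JSON tool call, the host parses it, dispatches to the named function, and feeds the result back. The tool name, JSON schema, and stubbed model below are assumptions for illustration, not Gemma 4's actual interface:

```python
# Hedged sketch of an agentic function-calling round trip. A real
# deployment would call the model; here a stub returns a canned
# structured tool call so the parse/dispatch pattern is visible.
import json

def get_weather(city: str) -> str:
    """Stand-in for a real weather API (hypothetical tool)."""
    return f"18°C and clear in {city}"

TOOLS = {"get_weather": get_weather}  # registry of callable tools

def fake_model(prompt: str) -> str:
    # Stub: emits the kind of structured JSON a function-calling model
    # would produce in response to the prompt.
    return json.dumps({"tool": "get_weather", "args": {"city": "Zagreb"}})

def run_tool_call(model_output: str) -> str:
    call = json.loads(model_output)   # structured JSON output from the model
    fn = TOOLS[call["tool"]]          # dispatch to the named tool
    return fn(**call["args"])         # execute with the model's arguments

print(run_tool_call(fake_model("What's the weather in Zagreb?")))
```

In a full agentic loop the tool result would be appended to the conversation and the model called again until it produces a final answer.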

Deployment options

Google has built out a deployment ecosystem spanning the smallest devices to full cloud infrastructure:

On-device:

  • Android phones
  • Raspberry Pi
  • NVIDIA Jetson Orin Nano

Cloud:

  • Google Cloud Vertex AI
  • Google Kubernetes Engine (GKE)

Hardware optimisation:

  • NVIDIA GPU (CUDA stack)
  • AMD (ROCm stack)
  • Google TPU (native)

Covering all three major acceleration platforms — including AMD ROCm — means Gemma 4 is not tied to a specific hardware ecosystem, which is important for enterprises seeking deployment flexibility.

Why this matters

Open models have improved sharply in quality in recent months; DeepSeek, Qwen, Llama and Mistral together form a highly competitive field. Google had until now been a follower in that trend, but Gemma 4 31B at #3 on the Arena signals that Google is now competing at the front of the open-model segment.

The combination of performance, Apache 2.0 license, multimodality and broad hardware support makes Gemma 4 a serious choice for all use cases where a closed API is not acceptable — from regulated enterprise to on-device mobile applications. The entire line represents Google’s most ambitious move in the open AI stack to date.

🤖

This article was generated using artificial intelligence from primary sources.