🤖 24 AI
📦 Open Source · Saturday, April 18, 2026 · 3 min read

Google Gemma 4: four open models, 31B dense ranks third on the Arena leaderboard, Apache 2.0 license

Why it matters

Gemma 4 is Google's new generation of open models in four variants: E2B for mobile devices, E4B for edge devices, 26B MoE with 3.8 billion active parameters, and 31B dense. The 31B holds third place on the Arena open-model leaderboard and reportedly outperforms models 20 times its size, while the 26B MoE ranks sixth. All models are multimodal (text, image, video, audio), support 140 languages, offer up to 256K token context, and are released under the Apache 2.0 license.

Google DeepMind has announced the new generation of open models called Gemma 4, split into four variants covering the range from mobile devices to high-quality server deployments. The announcement was authored by Clement Farabet (VP of Research) and Olivier Lacombe (Group Product Manager), with the official tagline “Byte for byte, the most capable open models”.

Four variants, one license

Rather than a single flagship model, Google has chosen to cover the entire spectrum of use cases:

  • E2B (Effective 2B) — lightweight model for mobile devices and IoT
  • E4B (Effective 4B) — enhanced edge variant for on-device tasks
  • 26B Mixture of Experts (MoE) — optimised for latency, activates only 3.8 billion parameters during inference
  • 31B dense — highest-quality variant, ideal for fine-tuning

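The MoE design in the 26B variant is what lets it evaluate only a small fraction of its weights per token. A minimal sketch of top-k expert routing illustrates the idea; the expert count and k below are illustrative assumptions, not Gemma 4's actual configuration:

```python
# Minimal sketch of Mixture-of-Experts routing: for each token, a gating
# network scores all experts and only the top-k highest-scoring ones are
# evaluated, so the active parameter count is a small fraction of the total.
# NUM_EXPERTS and TOP_K are assumptions for illustration only.
import random

NUM_EXPERTS = 16   # total experts in the layer (illustrative)
TOP_K = 2          # experts actually evaluated per token (illustrative)

def route(token_scores, k=TOP_K):
    """Return indices of the k highest-scoring experts for one token."""
    return sorted(range(len(token_scores)), key=lambda i: -token_scores[i])[:k]

random.seed(0)
scores = [random.random() for _ in range(NUM_EXPERTS)]  # stand-in gate scores
active = route(scores)
print(f"active experts: {active} ({TOP_K} of {NUM_EXPERTS} evaluated)")
```

The same principle, scaled up, is how a 26B-parameter model can run inference with only 3.8 billion parameters active at a time.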
All variants carry the Apache 2.0 license, a permissive license that allows unrestricted commercial use, distinguishing them from some other “open” models released under more restrictive terms.

Arena rankings and performance

On the Arena AI open-model leaderboard, Gemma 4 holds impressive positions:

  • 31B dense: #3 globally among open models
  • 26B MoE: #6 globally

Google particularly highlights that the 31B model “outperforms models 20 times larger in parameter count”, a claim suggesting Gemma 4 31B is competitive with closed models in the 600B+ parameter range. The framing is promotional, but the Arena ranking rests on blind user voting, which gives the claim some independent support.

What’s new: genuine multimodality

Gemma 4 is fully multimodal from the ground up, not a later add-on:

  • Native video and image processing at variable resolutions
  • OCR and chart understanding for analytical tasks
  • Audio support in E2B and E4B variants (speech recognition)
  • Support for 140 languages, notably more than most open models

Context windows differ by variant:

  • Edge models (E2B, E4B): 128K tokens
  • Larger variants (26B, 31B): up to 256K tokens

Additional capabilities include advanced reasoning with multi-step planning, native function-calling for agentic scenarios, and structured JSON output.
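The function-calling pattern can be sketched as a simple loop: the model emits a structured JSON tool call, the host parses it, dispatches to the named function, and feeds the result back. The tool name, JSON schema, and stubbed model below are assumptions for illustration, not Gemma 4's actual interface:

```python
# Hedged sketch of an agentic function-calling round trip. A real
# deployment would call the model; here a stub returns a canned
# structured tool call so the parse/dispatch pattern is visible.
import json

def get_weather(city: str) -> str:
    """Stand-in for a real weather API (hypothetical tool)."""
    return f"18°C and clear in {city}"

TOOLS = {"get_weather": get_weather}  # registry of callable tools

def fake_model(prompt: str) -> str:
    # Stub: emits the kind of structured JSON a function-calling model
    # would produce in response to the prompt.
    return json.dumps({"tool": "get_weather", "args": {"city": "Zagreb"}})

def run_tool_call(model_output: str) -> str:
    call = json.loads(model_output)   # structured JSON output from the model
    fn = TOOLS[call["tool"]]          # dispatch to the named tool
    return fn(**call["args"])         # execute with the model's arguments

print(run_tool_call(fake_model("What's the weather in Zagreb?")))
```

In a full agentic loop the tool result would be appended to the conversation and the model called again until it produces a final answer.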

Deployment options

Google has built out a deployment ecosystem spanning the smallest devices to full cloud infrastructure:

On-device:

  • Android phones
  • Raspberry Pi
  • NVIDIA Jetson Orin Nano

Cloud:

  • Google Cloud Vertex AI
  • Google Kubernetes Engine (GKE)

Hardware optimisation:

  • NVIDIA GPU (CUDA stack)
  • AMD (ROCm stack)
  • Google TPU (native)

Covering all three major acceleration platforms — including AMD ROCm — means Gemma 4 is not tied to a specific hardware ecosystem, which is important for enterprises seeking deployment flexibility.

Why this matters

Open models have improved sharply in quality in recent months; DeepSeek, Qwen, Llama and Mistral together form a highly competitive field. Google had until now been a follower in that trend, but Gemma 4 31B at #3 on the Arena signals that Google is now competing at the front of the open-model segment.

The combination of performance, Apache 2.0 license, multimodality and broad hardware support makes Gemma 4 a serious choice for all use cases where a closed API is not acceptable — from regulated enterprise to on-device mobile applications. The entire line represents Google’s most ambitious move in the open AI stack to date.

🤖

This article was generated using artificial intelligence from primary sources.