NVIDIA Nemotron 3 Nano Omni: an open 30B-A3B MoE multimodal model with 256K context and 9× higher throughput than other open omni models
Nemotron 3 Nano Omni is NVIDIA's new open multimodal model that unifies vision, speech, and language in a single 30B-A3B hybrid mixture-of-experts (MoE) system with a 256K-token context window. Following the usual naming convention, 30B-A3B indicates roughly 30B total parameters with about 3B active per token. The model achieves top accuracy on six leaderboards for document intelligence and audio-video understanding, and delivers 9× higher throughput than other open omni models at the same interactivity level. It is available immediately on Hugging Face, OpenRouter, NVIDIA NIM, and 25+ partner platforms; Foxconn, Palantir, and six other companies are already using it in production.