ONNX v1.22.0 Brings Native Attention Operators for LLMs and WebAssembly Support
The LF AI & Data Foundation has released ONNX v1.22.0 with native attention operators for transformer architectures and LLMs, WebAssembly support for in-browser model inspection, and SLSA Level 2 cryptographic attestations. Twenty-seven contributors participated, 16 of them for the first time.
This article was generated using artificial intelligence from primary sources.
On June 30, 2026, the LF AI & Data Foundation released ONNX v1.22.0 — a new version of the open standard for exchanging AI models between frameworks and hardware runtimes. This release brings three key additions: native support for attention operators, WebAssembly integration for in-browser model inspection, and enhanced supply-chain security.
Native Attention Operators for Modern LLMs
The most important technical change in ONNX v1.22.0 is the introduction of native attention operators — primitive operators that directly describe the attention mechanisms used in transformer architectures and LLMs. Until now, attention layers in ONNX models were expressed as compositions of lower-level operators — matrix multiplication, softmax, and reshape operations — which made it harder for hardware runtimes to apply specialized optimizations.
With the new operators, hardware vendors can implement kernels tailored to attention mechanisms, which can improve throughput for long sequences and streaming applications. This is especially relevant for LLMs with context windows of hundreds of thousands of tokens, where attention computation dominates total inference cost.
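To make the "composition of lower-level operators" concrete, here is a minimal sketch of single-head scaled dot-product attention, softmax(Q K^T / sqrt(d)) V, written as the separate generic steps a runtime would otherwise see. It uses plain Python lists rather than any ONNX API, the function names are illustrative, and multi-head attention would add the reshape steps mentioned above.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]


def softmax_rows(m):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    out = []
    for row in m:
        peak = max(row)
        exps = [math.exp(value - peak) for value in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def decomposed_attention(q, k, v):
    """Single-head attention built from generic steps (no fused kernel)."""
    d = len(q[0])                                   # key/query width
    scores = matmul(q, transpose(k))                # Transpose + MatMul
    scaled = [[s / math.sqrt(d) for s in row] for row in scores]  # Div
    weights = softmax_rows(scaled)                  # Softmax
    return matmul(weights, v)                       # MatMul


q = [[1.0, 0.0], [0.0, 1.0]]
k = [[1.0, 0.0], [0.0, 1.0]]
v = [[1.0, 2.0], [3.0, 4.0]]
print(decomposed_attention(q, k, v))
```

Each commented step corresponds to a separate node in a decomposed graph. A runtime has to recognize that this chain is attention before it can substitute a specialized kernel, which is the pattern-matching burden a single native operator removes.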
Why Does Interoperability Matter for Generative AI?
ONNX exists so that a model trained in one framework can run in another environment without retraining or hand-porting. A PyTorch model is exported as an ONNX artifact, and runtimes can then execute it on CPUs (Intel OpenVINO), GPUs (NVIDIA TensorRT), mobile chips (Qualcomm QNN), or specialized accelerators, all without changing the model.
Up to this version, attention layers had no native representation in ONNX. That was a structural gap between how modern LLMs are described internally and how the ONNX schema could express them. Without native attention operators, a runtime could not recognize the pattern and exploit a specialized hardware path — it had to process attention as a series of generic operations.
ONNX v1.22.0 closes that gap. Modern transformer architectures are now first-class citizens of the ONNX ecosystem, meaning frameworks such as PyTorch and TensorFlow can express LLMs in ONNX format without losing information about key computational patterns.
WebAssembly and Supply-Chain Security
Version 1.22.0 introduces WebAssembly support via Pyodide integration. ONNX models can now be inspected and validated directly in the browser, without a local Python or ONNX library installation. Tools for model graph inspection, shape-inference verification, and operator compatibility checking are available to anyone with a URL — no development environment setup required.
On the security side, every ONNX release from this version onward carries SLSA Level 2 cryptographic attestations of provenance: a verifiable record of where and how each artifact was built. In addition, each package now includes an embedded SBOM (Software Bill of Materials) listing all dependencies, versions, and licenses. This is a direct response to growing regulatory and business requirements for supply-chain transparency in open-source AI tooling.
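One part of consuming a provenance attestation is checking that the file you downloaded is the file the attestation describes. The sketch below shows only that digest comparison, using the Python standard library. It assumes a statement shaped like an in-toto statement with a `subject` list carrying `sha256` digests. The actual layout of the ONNX release attestations is not described in the announcement, the file name is hypothetical, and verifying the attestation's signature is a separate step not shown here.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def artifact_matches_statement(artifact: bytes, statement: dict) -> bool:
    """True if the artifact's digest appears among the statement's subjects.

    This checks only the digest binding. It does NOT verify who signed the
    statement, which a real verifier must also do.
    """
    digest = sha256_of(artifact)
    return any(
        subject.get("digest", {}).get("sha256") == digest
        for subject in statement.get("subject", [])
    )


# Stand-in bytes for a downloaded package; a real check would read the file.
artifact = b"pretend wheel contents"

# Assumed statement layout (in-toto style); values here are illustrative.
statement = {
    "_type": "https://in-toto.io/Statement/v1",
    "subject": [
        {"name": "example-package.whl", "digest": {"sha256": sha256_of(artifact)}}
    ],
    "predicateType": "https://slsa.dev/provenance/v1",
    "predicate": {},
}

print(artifact_matches_statement(artifact, statement))
print(artifact_matches_statement(b"tampered contents", statement))
```

Comparing digests ties the provenance record to one exact set of bytes, so a tampered or substituted package fails the check even if it carries the same file name.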
The modernized build system ensures reproducible builds across all three platforms: Linux, macOS, and Windows. For teams that automate CI/CD pipelines with ONNX conversions, build reproducibility means predictable results without dependency on build environment state.
Community, Bug Fixes, and Roadmap
ONNX v1.22.0 brought together 27 contributors, 16 of whom were contributing for the first time. The release improves the shape-inference helper functions and the version converter, which frameworks use to convert models between ONNX opset versions. Alongside the attention operators, it fixes the correctness of several key operators across a wider range of inputs, reducing discrepancies between the specification and actual behavior on edge cases.
The roadmap for future versions announces support for probabilistic and Bayesian inference, extended quantization, and further improvements to shape inference.
ONNX v1.22.0 is available from the GitHub repository at github.com/onnx/onnx and through standard package managers.
Frequently Asked Questions
- What are the attention operators in ONNX v1.22.0?
  - They are native operators that directly describe attention mechanisms in transformer architectures and LLMs, enabling hardware runtimes to implement specialized optimizations for long sequences and streaming applications.
- How does ONNX v1.22.0 improve supply-chain security?
  - Every release now includes SLSA Level 2 cryptographic attestations about code provenance and an embedded Software Bill of Materials listing all dependencies, versions, and licenses.
- What does WebAssembly support in ONNX enable?
  - ONNX models can now be inspected and validated directly in the browser without a local installation, thanks to integration with Pyodide.