AMD: ROCm 7.13 brings MI350P GPU support, multi-VF virtualisation and TheRock packaging

AMD released ROCm 7.13 on 20 May 2026. The new version of its open-source AI compute stack introduces support for the MI350P GPU, virtualisation of up to 8 isolated vGPUs per MI300X accelerator, an open-source ROCprof Trace decoder for transparent performance analysis, and modular TheRock packaging with domain-specific SDKs. The release is validated on Ubuntu 26.04 and RHEL 9.6, and includes VMware ESXi 9.1 support for MI350X and MI355X.

This article was generated using artificial intelligence from primary sources.

ROCm is AMD's open-source AI compute stack and serves as the primary alternative to the NVIDIA CUDA ecosystem. Version 7.13, released on 20 May 2026, adds support for the MI350P GPU, multi-VF virtualisation, an open-source ROCprof Trace decoder, and TheRock modular packaging.

What does ROCm 7.13 bring to enterprise virtualisation?

The biggest enterprise news is multi-VF (Virtual Function) support: up to 8 isolated vGPUs per physical MI300X accelerator. Multiple tenants or multiple models can share the same hardware with memory isolation, a critical requirement for multi-tenant cloud providers and on-prem AI platforms that want better utilisation of expensive accelerators. The release also adds VMware ESXi 9.1 support for the MI350X and MI355X.
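
As a rough illustration of what an eight-way split means for capacity, the sketch below divides the MI300X's 192 GB of HBM3 evenly across virtual functions. The even split and the helper name `per_vf_memory_gb` are assumptions made for illustration; nothing in the release summary says how memory is actually apportioned between VFs.

```python
MI300X_HBM_GB = 192  # published HBM3 capacity of one MI300X
MAX_VFS = 8          # upper limit stated for ROCm 7.13 multi-VF support


def per_vf_memory_gb(total_gb: float, num_vfs: int) -> float:
    """Hypothetical helper: memory per VF, ASSUMING an even split."""
    if not 1 <= num_vfs <= MAX_VFS:
        raise ValueError(f"num_vfs must be between 1 and {MAX_VFS}")
    return total_gb / num_vfs


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        share = per_vf_memory_gb(MI300X_HBM_GB, n)
        print(f"{n} VF(s): {share:.0f} GB each")
```

Under that assumption, eight VFs would receive 24 GB each, which is enough for the FP16 weights of a 7-billion-parameter model (about 14 GB) with some room left for activations and cache.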

Why does the open-source ROCprof Trace decoder matter?

Performance profiling has long been an area where AMD's tooling lagged behind NVIDIA's Nsight Systems. ROCm 7.13 introduces an open-source ROCprof Trace decoder that provides transparent visibility into GPU instructions, memory traffic, and kernel latency. Because the decoder is open source, third parties (e.g. Hugging Face, MosaicML, the vLLM team) can build specialised analysis tools on top of it, which should accelerate ecosystem development.
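
To show the kind of third-party analysis an open decoder makes possible, here is a minimal sketch that aggregates kernel latency from decoded trace records. The record layout (`kernel`, `start_ns`, `end_ns`) and the function name are hypothetical; they are not the decoder's actual output format, which this article does not describe.

```python
from collections import defaultdict

# Hypothetical decoded records. NOT the real ROCprof Trace decoder schema.
SAMPLE_RECORDS = [
    {"kernel": "gemm_fp16", "start_ns": 0, "end_ns": 1200},
    {"kernel": "softmax", "start_ns": 1200, "end_ns": 1500},
    {"kernel": "gemm_fp16", "start_ns": 1500, "end_ns": 2500},
]


def summarise_kernel_latency(records):
    """Return call count, total and mean duration (ns) per kernel name."""
    totals = defaultdict(lambda: {"calls": 0, "total_ns": 0})
    for record in records:
        entry = totals[record["kernel"]]
        entry["calls"] += 1
        entry["total_ns"] += record["end_ns"] - record["start_ns"]
    return {
        name: {**stats, "mean_ns": stats["total_ns"] / stats["calls"]}
        for name, stats in totals.items()
    }


if __name__ == "__main__":
    summary = summarise_kernel_latency(SAMPLE_RECORDS)
    # List the kernels that consumed the most time first.
    for name, stats in sorted(summary.items(), key=lambda kv: -kv[1]["total_ns"]):
        print(name, stats)
```

A real tool would read records from the decoder rather than a hard-coded list, but the aggregation step would look much the same.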

What is TheRock and how does it change deployment?

TheRock is AMD's new modular packaging format for ROCm. Previously, ROCm arrived as a monolithic stack that took up around 12 GB once installed. TheRock provides separate domain-specific SDKs for HPC, computer vision, data science, and life sciences, so users install only the components they need. This reduces installation size, speeds up patch cycles, and narrows the attack surface.
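
The effect of modular packaging on install size can be sketched as a dependency resolution followed by a sum. Every component name, dependency, and size below is an invented placeholder, chosen only so that the full set adds up to the roughly 12 GB mentioned above; none of it reflects TheRock's real package layout.

```python
# Invented placeholder manifest. NOT TheRock's actual components or sizes.
COMPONENTS = {
    "core-runtime": {"size_gb": 2.0, "requires": []},
    "math-libs": {"size_gb": 3.0, "requires": ["core-runtime"]},
    "hpc-sdk": {"size_gb": 2.5, "requires": ["math-libs"]},
    "vision-sdk": {"size_gb": 1.5, "requires": ["core-runtime"]},
    "data-science-sdk": {"size_gb": 1.5, "requires": ["math-libs"]},
    "life-sciences-sdk": {"size_gb": 1.5, "requires": ["math-libs"]},
}


def resolve(selected):
    """Return the selected components plus all transitive dependencies."""
    needed = set()
    stack = list(selected)
    while stack:
        name = stack.pop()
        if name not in needed:
            needed.add(name)
            stack.extend(COMPONENTS[name]["requires"])
    return needed


def install_size_gb(selected):
    """Total size of the selected components and their dependencies."""
    return sum(COMPONENTS[name]["size_gb"] for name in resolve(selected))


if __name__ == "__main__":
    print("everything:", install_size_gb(list(COMPONENTS)), "GB")
    print("vision only:", install_size_gb(["vision-sdk"]), "GB")
```

With these placeholder numbers, a vision-only install pulls in just the runtime and one SDK, and a patch to an unrelated SDK would not touch that machine at all.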

Validation and support

ROCm 7.13 is validated on Ubuntu 26.04 and Red Hat Enterprise Linux 9.6, two widely used enterprise Linux distributions. AMD simultaneously released an update to the QuickReduce library featuring FP4 quantisation for MI355 GPUs, claiming a 4.1× speed-up over standard RCCL for multi-GPU communication of larger messages. The complementary release is aimed at further narrowing the performance gap with CUDA.
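
To see why 4-bit quantisation shrinks communication traffic, the sketch below quantises a block of values to the FP4 (E2M1) grid with one shared scale per block, then compares payload sizes. It is not QuickReduce's implementation: the block size of 32, the one-byte scale, and all function names are assumptions, and the sketch says nothing about how AMD arrives at its 4.1× figure.

```python
# Magnitudes representable in FP4 E2M1 (1 sign, 2 exponent, 1 mantissa bit).
E2M1_MAGNITUDES = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)


def quantise_block(values):
    """Map a block to (scale, 4-bit codes); bit 3 is sign, bits 0-2 index the grid."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 6.0 if max_abs > 0 else 1.0  # largest value maps to 6.0
    codes = []
    for v in values:
        target = abs(v) / scale
        idx = min(range(8), key=lambda i: abs(E2M1_MAGNITUDES[i] - target))
        codes.append((8 if v < 0 else 0) | idx)
    return scale, codes


def dequantise_block(scale, codes):
    """Inverse of quantise_block, up to rounding error."""
    return [(-1.0 if c & 8 else 1.0) * E2M1_MAGNITUDES[c & 7] * scale for c in codes]


def payload_bytes(num_values, bits_per_value, block_size=None, scale_bytes=0):
    """Bytes on the wire: packed values plus one scale per block (if any)."""
    data = num_values * bits_per_value // 8
    scales = (num_values // block_size) * scale_bytes if block_size else 0
    return data + scales


if __name__ == "__main__":
    scale, codes = quantise_block([6.0, -3.0, 0.5, 0.0])
    print(dequantise_block(scale, codes))
    n = 1024 * 1024
    fp16 = payload_bytes(n, 16)
    fp4 = payload_bytes(n, 4, block_size=32, scale_bytes=1)  # assumed layout
    print(f"FP16: {fp16} bytes, FP4: {fp4} bytes, ratio {fp16 / fp4:.2f}x")
```

Under these assumptions, the FP4 payload for about a million values is roughly 3.8 times smaller than the FP16 one, at the cost of rounding each value to one of eight magnitudes per block.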

Frequently Asked Questions

What is new in the MI350P GPU?
The MI350P is a new AMD AI accelerator in the Instinct line, optimised for inference and fine-tuning workloads. ROCm 7.13 supports it for bare-metal and Kubernetes deployment. The VMware ESXi 9.1 integration in the same release, which covers the MI350X and MI355X, enables enterprise virtualisation.

What does multi-VF virtualisation specifically enable?
Multi-VF (Virtual Function) allows up to 8 isolated virtual GPUs per physical MI300X accelerator. Multiple tenants or multiple models can share the same hardware with memory isolation, which is critical for multi-tenant cloud and on-prem AI platforms.

What is TheRock and why does it matter?
TheRock is AMD's new modular packaging format for ROCm, with optional domain-specific SDKs packaged separately for HPC, computer vision, data science, and life sciences. This reduces installation size and administrative overhead because users install only what they need.