AMD: Alibaba's ROLL framework runs natively on Instinct GPUs

AMD announced that ROLL, Alibaba's open-source reinforcement-learning framework, now runs natively on AMD Instinct GPUs under the ROCm software stack, without code changes, custom patches or non-standard builds. The collaboration covers vLLM compatibility, fixes for Ray and support for distributed RL training of large language models.

This article was generated using artificial intelligence from primary sources.

On its ROCm blog, AMD described a collaboration with Alibaba that enabled the open-source reinforcement-learning framework ROLL to run natively on AMD Instinct GPUs with the ROCm software stack. The key message is that the framework runs “out-of-the-box,” with no code changes, custom patches or non-standard builds.

What is ROLL?

ROLL is an open-source framework developed by Alibaba for large, distributed reinforcement-learning workflows on large language models (LLMs). Reinforcement learning is a training method in which a model learns from rewards given for desirable behavior. ROLL supports methods such as PPO, GRPO, DPO and RLHF, along with asynchronous execution and native agentic training.
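To make "learning through rewards" concrete, here is a minimal sketch of the group-relative advantage used in GRPO-style training: each sampled response is scored against the mean and spread of its own group. This is an illustration only, not ROLL's implementation. The function name and the reward values are hypothetical, and the choice of population standard deviation is an assumption, since implementations differ.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """Normalize each reward against its own group of sampled responses."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population std; some code uses sample std
    # eps keeps the division safe when every reward in the group is identical
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four responses sampled from the same prompt.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Responses that score above the group mean get a positive advantage and are reinforced, while those below the mean get a negative one.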

How was vLLM and Ray compatibility addressed?

AMD added support for both generations of the vLLM engine: the legacy v0 and the newer v1, which offers better throughput. vLLM is a library for fast inference of language models. “Sleep mode” is fully supported on vLLM 0.11.0 and newer, while older versions require a special ROCm branch. AMD also contributed fixes for Ray (version 2.48 and newer) that resolve mismatches in GPU device visibility, that is, inconsistencies between the HIP_VISIBLE_DEVICES and CUDA_VISIBLE_DEVICES environment variables.
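The version thresholds and the visibility mismatch described above can be sketched as a small pre-flight check. All helper names here (`parse_version`, `sleep_mode_fully_supported`, `ray_has_visibility_fixes`, `visibility_mismatch`) are hypothetical and are not part of ROLL, vLLM or Ray. Only the version numbers and the two environment variable names come from AMD's description.

```python
import re


def parse_version(version: str) -> tuple[int, ...]:
    """Turn '0.11.0' or '2.48.0rc1' into a tuple of at least three integers."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component
        parts.append(int(match.group()))
    # Pad so that '2.48' compares equal to '2.48.0'
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def sleep_mode_fully_supported(vllm_version: str) -> bool:
    """Per the article: vLLM 0.11.0 and newer fully support sleep mode."""
    return parse_version(vllm_version) >= (0, 11, 0)


def ray_has_visibility_fixes(ray_version: str) -> bool:
    """Per the article: the device-visibility fixes apply to Ray 2.48 and newer."""
    return parse_version(ray_version) >= (2, 48, 0)


def visibility_mismatch(env: dict) -> bool:
    """Flag the case where both variables are set but disagree."""
    hip = env.get("HIP_VISIBLE_DEVICES")
    cuda = env.get("CUDA_VISIBLE_DEVICES")
    return hip is not None and cuda is not None and hip != cuda
```

In practice one might pass `dict(os.environ)` to `visibility_mismatch` and the installed package versions to the other two helpers before launching a job.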

What does this enable?

The framework supports single-node training and distributed training across multiple nodes, with example configurations for models such as Qwen2.5-7B and tunable GPU memory utilization parameters. For users of AMD hardware, this means they can run demanding RL training of language models without depending on another vendor's hardware ecosystem.
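As a rough illustration of what a GPU memory utilization parameter controls, the sketch below assumes it is a fraction of device memory reserved for the inference engine, as with vLLM's `gpu_memory_utilization` option. The function name and the numbers are hypothetical and do not come from ROLL's example configurations.

```python
def split_device_memory(total_gib: float, utilization: float) -> tuple[float, float]:
    """Return (GiB reserved for inference, GiB left for everything else)."""
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in the range (0, 1]")
    reserved = total_gib * utilization
    return reserved, total_gib - reserved


# Hypothetical device with 100 GiB of memory and a 0.6 utilization setting.
inference_gib, remaining_gib = split_device_memory(100.0, 0.6)
```

Lowering the fraction leaves more headroom for training state that shares the same GPU, at the cost of a smaller cache for generation.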

Frequently Asked Questions

What is ROLL?
ROLL is an open-source reinforcement-learning framework developed by Alibaba for large, distributed RL workloads on language models, with support for PPO, GRPO, DPO and RLHF.
Do you need to change code to run it on AMD GPUs?
No. AMD states that ROLL runs out-of-the-box on Instinct GPUs with ROCm, with no code changes, custom patches or non-standard builds.