RecursiveMAS: multi-agent systems with +8.3% accuracy and up to 75% fewer tokens

RecursiveMAS is a new framework for multi-agent systems that extends recursive computation (looped LLMs) from a single model to multiple collaborating agents connected through a lightweight RecursiveLink module. Across 9 benchmarks spanning math, science, medicine, code, and search, the authors report an average accuracy improvement of 8.3%, 1.2-2.4× faster inference, and 34.6-75.6% lower token consumption.

A team of twelve researchers (Yang, Zou, Pan, Qiu, Lu, Diao, Jiang, Tong, Zhang, Buehler, He, Zou) published the preprint on RecursiveMAS (Recursive Multi-Agent Systems) on April 28, 2026. The framework extends the paradigm of recursive computation from a single “looped” LLM to the collaboration of multiple agents. The paper is 36 pages long and is accompanied by a code and data release.

What Is New in the Architecture?

Standard multi-agent systems connect agents through text messages: agent A writes a prompt, agent B responds, and agent A reads the response. This approach is expensive, because every hand-off costs tokens, and lossy, because natural language acts as a bottleneck for what an agent holds internally. RecursiveMAS introduces a lightweight RecursiveLink module that transfers latent states directly between heterogeneous agents, without passing through natural language.
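As a rough illustration of what a latent hand-off between agents of different hidden widths can look like, the sketch below projects one agent's hidden vector into another agent's width with a plain linear map. This is not the paper's RecursiveLink, whose internals are not described here. The names `make_link` and `apply_link`, the widths 8 and 4, and the choice of a single random linear map are all assumptions made for illustration.

```python
import random


def make_link(d_in, d_out, seed=0):
    """Build a d_out x d_in weight matrix (hypothetical stand-in for a learned link)."""
    rng = random.Random(seed)
    scale = 1.0 / d_in ** 0.5  # keep projected values at a similar magnitude
    return [[rng.gauss(0.0, scale) for _ in range(d_in)] for _ in range(d_out)]


def apply_link(weights, hidden):
    """Project a hidden vector through the link: one dot product per output unit."""
    return [sum(w * h for w, h in zip(row, hidden)) for row in weights]


# Agent A has hidden width 8, agent B has hidden width 4 (assumed sizes).
hidden_a = [0.1 * i for i in range(8)]
link_ab = make_link(d_in=8, d_out=4)

# Agent B receives a vector in its own width; no text is generated in between.
hidden_for_b = apply_link(link_ab, hidden_a)
print(len(hidden_for_b))
```

The point of the sketch is only the shape of the interface: a vector goes in, a vector of the receiving agent's width comes out, and no tokens are generated along the way.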

Training uses an inner-outer loop algorithm that jointly optimizes the agents through “shared gradient-based credit assignment across recursion rounds.” According to the authors' theoretical analysis, this yields more stable gradient dynamics than classic multi-agent schemes.
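A toy sketch can make the idea of credit assignment across recursion rounds more concrete. It is not the authors' algorithm. It assumes that the inner loop runs the recursion rounds and the outer loop runs the training steps, a mapping the article does not spell out. Each agent is reduced to one scalar weight, and finite differences stand in for backpropagation. All names and constants are hypothetical.

```python
import math


def run_rounds(a, b, x0, rounds):
    """Inner loop (assumed): agents A and B take turns updating a shared latent state."""
    x = x0
    for _ in range(rounds):
        x = math.tanh(a * x)  # agent A, reduced to one scalar weight
        x = math.tanh(b * x)  # agent B, reduced to one scalar weight
    return x


def loss(a, b, x0=0.5, target=0.8, rounds=3):
    """A single loss on the final state, computed after all recursion rounds."""
    return (run_rounds(a, b, x0, rounds) - target) ** 2


def grad(a, b, eps=1e-5):
    """Central finite differences, standing in for backpropagation."""
    da = (loss(a + eps, b) - loss(a - eps, b)) / (2 * eps)
    db = (loss(a, b + eps) - loss(a, b - eps)) / (2 * eps)
    return da, db


a, b = 1.0, 1.0
initial_loss = loss(a, b)
for _ in range(200):  # outer loop (assumed): training steps
    da, db = grad(a, b)
    a -= 0.1 * da  # both agents are updated from the same end-of-recursion loss
    b -= 0.1 * db
final_loss = loss(a, b)
print(initial_loss, final_loss)
```

Because the only loss sits at the end of the recursion, each agent's update reflects its contribution in every round rather than a per-message reward. That is the sense in which credit is shared in this sketch.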

Numbers That Justify the Paper

The evaluation covers 9 benchmarks spanning math, natural sciences, medicine, code generation, and information retrieval, across four collaboration patterns among agents. Reported results:

+8.3% accuracy improvement over baseline multi-agent systems
1.2× to 2.4× faster end-to-end inference
34.6% to 75.6% reduction in token consumption
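To make the ranges above tangible, the short calculation below applies them to a made-up workload. The baseline of 10,000 tokens and 12 seconds per task is hypothetical. Pairing the low ends and the high ends of the two ranges is also an assumption, since the figures are not reported as occurring together.

```python
def apply_reduction(baseline, reduction_pct):
    """Value remaining after a percentage reduction."""
    return baseline * (1 - reduction_pct / 100)


def apply_speedup(baseline_seconds, speedup):
    """Latency after an N-times speedup."""
    return baseline_seconds / speedup


baseline_tokens = 10_000   # hypothetical tokens per task
baseline_latency = 12.0    # hypothetical seconds per task

# Low end and high end of the reported ranges (pairing is an assumption).
for reduction, speedup in [(34.6, 1.2), (75.6, 2.4)]:
    tokens = apply_reduction(baseline_tokens, reduction)
    latency = apply_speedup(baseline_latency, speedup)
    print(f"{reduction}% fewer tokens, {speedup}x faster: "
          f"{tokens:.0f} tokens, {latency:.1f} s")
```

Under these assumed baselines, the task would use roughly 6,540 tokens and 10 seconds at the low end, and roughly 2,440 tokens and 5 seconds at the high end.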

Why This Matters for Enterprise

Two major barriers to industry adoption of multi-agent architectures are cost (every agent generates tokens) and latency (sequential text hand-offs). RecursiveMAS attempts to address both at once. The combination of fewer tokens (up to 75.6% fewer in the best case) and faster inference (up to 2.4×) suggests the approach could move multi-agent systems closer to production deployment.

Code and datasets are announced alongside the preprint, enabling independent replication.

Frequently Asked Questions

What does RecursiveMAS add to existing multi-agent systems?

The RecursiveLink module transfers latent states between agents without text-based communication. Combined with the inner-outer loop training algorithm, this enables shared gradient-based credit assignment across recursion rounds and, according to the authors' analysis, more stable gradient dynamics.

How large is the performance gain?

An average of +8.3% accuracy, 1.2-2.4× faster end-to-end inference, and 34.6-75.6% fewer tokens across 9 diverse benchmarks covering math, science, medicine, code, and search.

Who are the authors?

A team of twelve researchers including Xiyuan Yang, Jiaru Zou, Markus J. Buehler, Hanghang Tong, Jindong Jiang, and James Zou. The paper was published on April 28, 2026 as a 36-page preprint.
