arXiv:2605.06623: MASPO — automatic prompt optimization for multi-agent LLM systems, ICML 2026
MASPO is a framework for joint prompt optimization in multi-agent LLM systems using evolutionary beam search. It achieves an average accuracy improvement of 2.9 percentage points over the best existing prompt optimization methods across six tasks and has been accepted at ICML 2026.
This article was generated using artificial intelligence from primary sources.
A research team led by Zhexuan Wang and Xuebo Liu published a paper on arXiv on May 7, 2026 describing MASPO (Multi-Agent System Prompt Optimization), a framework for joint prompt optimization in multi-agent LLM systems. The paper was accepted at ICML 2026, and the code is available on GitHub under the CC BY 4.0 license.
What problem does MASPO solve?
In systems with multiple collaborating LLM agents, each agent has its own prompt, but jointly optimizing all of these prompts remains hard because the local objective of each agent is misaligned with the holistic system goal. Traditional approaches evaluate prompts in isolation, so they miss inter-agent interactions and can lead to suboptimal global outcomes.
How does joint evaluation work?
MASPO does not measure a prompt by its isolated outcome, but rather by its “capacity to facilitate the success of downstream agents.” When one agent generates output, MASPO assesses how much that output helps subsequent agents in the chain, which links local interactions to global system metrics without requiring labeled data. The optimization uses evolutionary beam search to navigate the high-dimensional prompt space in a data-efficient way.
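The idea of scoring prompts jointly and searching with a beam can be illustrated with a small sketch. This is not the authors' implementation: the two-agent pipeline (planner, then solver), the `EDITS` list, the `joint_score` rule, and all function names are assumptions made up for illustration. In a real system an LLM would propose prompt rewrites and the score would come from running the agents; here the mutation step is replaced by enumerating a few fixed edits so the example stays deterministic.

```python
from typing import Callable, List, Tuple

PromptSet = Tuple[str, ...]  # one prompt per agent, in pipeline order

# Hypothetical edit phrases; a real optimizer would ask an LLM for rewrites.
EDITS = ["be concise", "show your steps", "cite the input", "verify the result"]


def neighbors(prompts: PromptSet) -> List[PromptSet]:
    """All prompt sets reachable by appending one edit to one agent's prompt."""
    out = []
    for i, prompt in enumerate(prompts):
        for edit in EDITS:
            if edit in prompt:
                continue  # skip edits this prompt already contains
            child = list(prompts)
            child[i] = f"{prompt}; {edit}"
            out.append(tuple(child))
    return out


def joint_score(prompts: PromptSet) -> float:
    """Toy stand-in for joint evaluation of a planner -> solver pipeline.

    The score describes the whole chain: full credit requires that the
    upstream output is usable AND that the downstream agent exploits it.
    """
    planner, solver = prompts
    plan_has_steps = "show your steps" in planner  # simulated upstream output quality
    solver_checks = "verify the result" in solver  # simulated downstream behaviour
    if plan_has_steps and solver_checks:
        return 1.0
    if plan_has_steps or solver_checks:
        return 0.5
    return 0.0


def beam_search(
    seed: PromptSet,
    score: Callable[[PromptSet], float],
    beam_width: int = 3,
    generations: int = 4,
) -> PromptSet:
    """Keep the `beam_width` best prompt sets per generation; return the best."""
    beam = [seed]
    for _ in range(generations):
        candidates = set(beam)  # parents stay in the pool, so the best score never drops
        for parent in beam:
            candidates.update(neighbors(parent))
        # Rank by score (descending), then prefer shorter prompts, then
        # lexicographic order so that ties are broken deterministically.
        ranked = sorted(
            candidates,
            key=lambda p: (-score(p), sum(len(s) for s in p), p),
        )
        beam = ranked[:beam_width]
    return beam[0]


best = beam_search(("Plan the task", "Solve the task"), joint_score)
print(best)
# ('Plan the task; show your steps', 'Solve the task; verify the result')
```

Because the candidate pool always includes the current beam, the best score cannot decrease between generations. The length tie-breaker is a design choice of this sketch, not something the source describes; it simply keeps the search from accumulating edits that do not change the score.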
How large is the improvement in practice?
Across six different evaluation tasks, MASPO achieves an average accuracy improvement of 2.9 percentage points over the best-performing existing prompt optimization methods. The authors emphasize that results are consistent across tasks, suggesting the approach is not sensitive to the specific application domain.
What is publicly available?
Alongside the arXiv preprint, the authors (Zhexuan Wang, Xuebo Liu, Li Wang, Zifei Shan, Yutong Wang, Zhenxi Song, Min Zhang) released source code on GitHub, enabling reproduction of experiments and application to new multi-agent configurations.
Frequently Asked Questions
- What is MASPO?
  - MASPO (Multi-Agent System Prompt Optimization) is a framework that automatically and iteratively refines prompts throughout an entire multi-agent LLM system, evaluating each prompt based on the success of downstream agents.
- What is the main methodological innovation?
  - Instead of evaluating prompts in isolation, MASPO measures how each prompt affects the success of downstream agents, linking local objectives to the holistic system goal without labeled data.
- How large is the performance gain?
  - Across six evaluation tasks, MASPO achieves an average improvement of 2.9 percentage points in accuracy compared to the best existing prompt optimization methods.