🤖 24 AI
🟢 🤝 Agents Thursday, April 16, 2026 · 2 min read

ArXiv: TREX — Two AI Agents Automate the Entire LLM Fine-Tuning Process

Why it matters

TREX is a new multi-agent system that automates the complete fine-tuning pipeline for large language models — from requirements analysis and literature search to data preparation and results evaluation. The system models the experimental process as a search tree, and on the FT-Bench benchmark with 10 real-world tasks, it consistently optimizes model performance.

The Problem: Fine-Tuning Requires Too Much Human Effort

Fine-tuning large language models — the process of adapting a pre-trained model to a specific task — currently requires significant human expertise. A researcher must analyze requirements, search the relevant literature, prepare data, select hyperparameters, run experiments, and evaluate results. Each of these steps involves a series of decisions that rely on experience and intuition.

Researchers Zerun Ma, Guoqiang Wang, and Xinchen Xie propose TREX — a system that automates this entire process using two coordinated AI agents.

How Does TREX Work?

The system is built on two modules. The Researcher takes on the tasks of requirements analysis, literature and data source search, and training strategy formulation. The Executor implements concrete experiments — from preparing data recipes to running training and evaluating results.

The key innovation is modeling the experimental process as a search tree. Each node in the tree represents a specific training configuration, and branches lead to variations. The system can efficiently plan exploration paths, reuse results from previous experiments, and draw conclusions from iterative attempts — rather than starting each experiment from scratch.

Results on the FT-Bench Benchmark

For evaluation, the researchers developed FT-Bench — a benchmark with 10 real-world tasks covering a range from optimizing foundational capabilities to improving domain-specific performance. Results show that the TREX agent “consistently optimizes model performance on target tasks.”

For teams that regularly fine-tune models, TREX promises a significant reduction in time and experimentation costs — by automating the routine steps currently performed by expensive ML engineers.

🤖

This article was generated using artificial intelligence from primary sources.