ArXiv AC/DC: automatic discovery of specialised LLMs through model and task coevolution
Why it matters
AC/DC is a new framework, presented at ICLR 2026, that simultaneously evolves LLMs through model merging and tasks through synthetic data generation. The discovered model populations show broader expertise coverage than manually curated models, without explicit benchmark optimization. The discovered models also outperform larger counterparts while using less GPU memory, which the authors present as a new paradigm for continuous LLM development.
A team of authors (Andrew Dai, Boris Meinardus, Ciaran Regan, Yingtao Tian and Yujin Tang) has published a new framework called AC/DC in a paper titled “Discovering Novel LLM Experts via Task-Capability Coevolution”. The paper was accepted at ICLR 2026 and proposes replacing separate per-domain training runs with a single open-ended run.
Problem it solves
Traditionally, when a team wants to expand the capabilities of an LLM, it must launch a separate training run for each new domain. Want a medical expert? Separate run. Legal model? Another one. Financial? A third. Each run needs its own hyperparameters, data, evaluation, and regression tests.
AC/DC aims to remove that series of manual interventions. The authors claim that “open-endedness, through model and task coevolution, can discover models with ever-novel capabilities in a single run”.
How the framework works
AC/DC simultaneously evolves two components:
Models — through model merging techniques. Instead of training an individual model from scratch, multiple existing models are merged (through various weight combinations) and the resulting population is tested.
Tasks — through synthetic data generation. Each task itself evolves — new, more complex, more nuanced variations of old tasks are generated, and the entire task population pushes models into new niches.
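The model side of this can be pictured with a minimal sketch. The article does not say which merging operator AC/DC uses, so plain linear interpolation of weights is swapped in here as the simplest possible example. The `Weights` type, the `merge_linear` function, and the two toy "experts" are all hypothetical names introduced for illustration.

```python
from typing import Dict, List

# Hypothetical stand-in for a model's parameters: name -> flat list of values.
Weights = Dict[str, List[float]]


def merge_linear(parent_a: Weights, parent_b: Weights, alpha: float) -> Weights:
    """Interpolate two parents parameter by parameter: alpha * a + (1 - alpha) * b."""
    if parent_a.keys() != parent_b.keys():
        raise ValueError("parents must share the same parameter names")
    child: Weights = {}
    for name in parent_a:
        a, b = parent_a[name], parent_b[name]
        if len(a) != len(b):
            raise ValueError(f"shape mismatch for {name}")
        child[name] = [alpha * x + (1.0 - alpha) * y for x, y in zip(a, b)]
    return child


# Two toy "experts" with identical architecture (assumed for illustration).
math_expert = {"layer0.weight": [1.0, 0.0], "layer0.bias": [0.5]}
code_expert = {"layer0.weight": [0.0, 2.0], "layer0.bias": [-0.5]}

# alpha=0.25 keeps 25% of math_expert and 75% of code_expert.
child = merge_linear(math_expert, code_expert, alpha=0.25)
```

Varying `alpha` (or using a different value per layer) is one way to produce a whole population of candidate models from a handful of parents without any gradient training.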
The key is that the two processes are coupled: models compete on synthetic tasks, and tasks adapt so that some models succeed where others fail. The loop then repeats without manual intervention.
Results
The authors report several significant findings:
- Discovered populations demonstrate broader expertise coverage than manually curated models
- Discovered models outperform larger counterparts while using less GPU memory
- Continuous innovation shown in both task design and model capabilities
- Improved performance in multi-agent best-of-N selection scenarios
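Two of the ideas in this list, expertise coverage and best-of-N selection, can be made concrete with a small sketch. The article does not describe how the paper measures either one, so the definitions below are assumptions: coverage is taken as the fraction of tasks solved by at least one model, and best-of-N picks the answer that a hypothetical scoring function rates highest. All model names and pass/fail values are invented for illustration and are not results from the paper.

```python
from typing import Callable, Dict, List


def coverage(results: Dict[str, List[bool]]) -> float:
    """Fraction of tasks solved by at least one model in the population."""
    per_task = zip(*results.values())          # one column of outcomes per task
    solved = [any(column) for column in per_task]
    return sum(solved) / len(solved)


def best_of_n(answers: Dict[str, str], score: Callable[[str], float]) -> str:
    """Return the name of the model whose answer the scorer rates highest."""
    return max(answers, key=lambda name: score(answers[name]))


# Invented pass/fail outcomes on four tasks (not data from the paper).
population = {
    "merged_a": [True, False, False, True],
    "merged_b": [False, True, False, False],
}
single_large = {"large_single": [True, True, False, False]}

population_coverage = coverage(population)
single_coverage = coverage(single_large)

# Hypothetical scorer: answer length stands in for a real verifier or judge.
picked = best_of_n({"merged_a": "4", "merged_b": "four is the answer"}, score=len)
```

In this toy setup the two small models together cover more tasks than the single larger one, which is the kind of effect a population of complementary specialists is meant to deliver when a selector can choose among their answers.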
It is important to note what is not in the results: there are no claims of dominance on specific benchmarks. The authors explicitly do not target SOTA. Instead, they show that the AC/DC model population has richer functional diversity.
New development paradigm
The authors position AC/DC as a “profoundly new paradigm of LLM development”. Instead of the cycle:
- Identify use case
- Curate data
- Run training
- Evaluate
- Iterate
You have:
- Start the framework
- Let it discover niches on its own
That is bold positioning. The question remains how robust it is in practice for enterprise use: open-endedness sounds romantic, but production teams typically need predictability.
ICLR 2026 context
AC/DC is one of several 2026 papers from the same research tradition: automatic discovery of AI capabilities through evolution. This line of work builds on earlier “Novelty Search” approaches from evolutionary computation, adapted for AI.
The authors do not cite specific companies or enterprise deployments, which signals this is primarily a research paper in a pre-production phase. Nevertheless, the framework is fascinating because it opens the path toward AI systems that themselves explore the space of possible expertise — without engineers having to define in advance what they want to obtain.
For those tracking long-term AI development trends, AC/DC is a significant data point. For current production systems it is almost certainly not directly applicable, but it represents a direction in which the field may move over the next several years.
This article was generated using artificial intelligence from primary sources.