How does the AC/DC framework simultaneously evolve models and tasks?

Through open-endedness principles: models are created by merging the weights of existing models, while tasks are generated as synthetic data that pushes models into new niches. The two processes are coupled within a single run, with no manual intervention.

Why do the models do well on benchmarks without being optimized for them?

Instead of being trained against fixed benchmarks, models compete against each other on synthetic tasks. The result is a population of experts that covers a broader capability space, and individual specialists turn out to be strong on standard tests without ever targeting them directly.

ArXiv AC/DC: automatic discovery of specialised LLMs through model and task coevolution

A team of authors (Andrew Dai, Boris Meinardus, Ciaran Regan, Yingtao Tian and Yujin Tang) has published a new framework called AC/DC in a paper titled “Discovering Novel LLM Experts via Task-Capability Coevolution”. The paper was accepted at ICLR 2026 and presents an approach to LLM development that abandons the classic practice of a separate training run for each new capability.

Problem it solves

Traditionally, when a team wants to expand the capabilities of an LLM, it must run a new separate training run for each new domain. Want a medical expert? Separate run. Legal model? Another one. Financial? A third. Each requires hyperparameters, data, evaluation, regression tests.

AC/DC eliminates this series of manual interventions. The authors claim that open-endedness, through model and task coevolution, “can discover models with ever-novel capabilities in a single run”.

How the framework works

AC/DC simultaneously evolves two components:

Models: through model merging techniques. Instead of training each model from scratch, several existing models are merged (using various weight combinations) and the resulting population is tested.

Tasks: through synthetic data generation. The tasks themselves evolve: new, more complex, more nuanced variations of old tasks are generated, and the task population as a whole pushes models into new niches.
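To make the model-merging half concrete, here is a minimal sketch of one common merging operator, plain linear interpolation of weights. This is an assumption for illustration: the article does not say which merging operators AC/DC actually uses, and the names merge_linear, math_expert and code_expert as well as all numbers are hypothetical.

```python
from typing import Dict, List

Weights = Dict[str, List[float]]  # parameter name -> flat list of values


def merge_linear(parent_a: Weights, parent_b: Weights, alpha: float) -> Weights:
    """Return alpha * parent_a + (1 - alpha) * parent_b, parameter by parameter."""
    if parent_a.keys() != parent_b.keys():
        raise ValueError("parents must share the same parameter names")
    child: Weights = {}
    for name in parent_a:
        a, b = parent_a[name], parent_b[name]
        if len(a) != len(b):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        child[name] = [alpha * x + (1.0 - alpha) * y for x, y in zip(a, b)]
    return child


# Two toy "models" with identical architecture (made-up numbers, not real weights).
math_expert = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
code_expert = {"layer.weight": [3.0, 6.0], "layer.bias": [1.0]}

child = merge_linear(math_expert, code_expert, alpha=0.5)
print(child)  # {'layer.weight': [2.0, 4.0], 'layer.bias': [0.5]}
```

Merging only works this simply when both parents share an architecture, which is why the sketch checks parameter names and lengths before combining anything.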

The key is that the two processes are coupled. Models compete on synthetic tasks. Tasks adapt so that some models succeed where others fail. The cycle then repeats indefinitely, without manual intervention.
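The coupling can be illustrated with a toy loop. This is a sketch under heavy assumptions, not the authors' algorithm: models are reduced to small skill vectors, tasks to a (capability axis, difficulty) pair, and the selection rule (a bonus for solving tasks that few others solve) is a generic niche-style heuristic chosen for illustration. Every function name here is hypothetical.

```python
import random
from typing import List, Tuple

Model = List[float]       # toy stand-in: one skill level per capability axis
Task = Tuple[int, float]  # (capability axis, difficulty threshold)


def solves(model: Model, task: Task) -> bool:
    axis, difficulty = task
    return model[axis] >= difficulty


def merge(a: Model, b: Model, rng: random.Random) -> Model:
    # Per-axis interpolation plus small noise, standing in for weight merging.
    alpha = rng.random()
    return [alpha * x + (1 - alpha) * y + rng.gauss(0, 0.05) for x, y in zip(a, b)]


def mutate_task(task: Task, n_axes: int, rng: random.Random) -> Task:
    # Vary an existing task: sometimes switch axis, always nudge difficulty up.
    axis, difficulty = task
    if rng.random() < 0.3:
        axis = rng.randrange(n_axes)
    return axis, difficulty + rng.uniform(0.0, 0.2)


def is_discriminative(task: Task, models: List[Model]) -> bool:
    # Keep a task only if some models solve it and others do not.
    passed = sum(solves(m, task) for m in models)
    return 0 < passed < len(models)


def fitness(model: Model, models: List[Model], tasks: List[Task]) -> float:
    # Reward solving tasks that few other models solve (a niche bonus).
    score = 0.0
    for t in tasks:
        if solves(model, t):
            score += 1.0 / sum(solves(m, t) for m in models)
    return score


def coevolve(generations: int, pop_size: int = 6, n_axes: int = 3, seed: int = 0):
    rng = random.Random(seed)
    models = [[rng.random() for _ in range(n_axes)] for _ in range(pop_size)]
    tasks = [(axis, 0.5) for axis in range(n_axes)]
    for _ in range(generations):
        # Model step: add merged children, then keep the fittest pop_size models.
        children = [merge(*rng.sample(models, 2), rng) for _ in range(pop_size)]
        candidates = models + children
        ranked = sorted(candidates, key=lambda m: fitness(m, candidates, tasks), reverse=True)
        models = ranked[:pop_size]
        # Task step: propose variations, keep only those that separate the models.
        proposals = [mutate_task(rng.choice(tasks), n_axes, rng) for _ in range(pop_size)]
        tasks += [t for t in proposals if is_discriminative(t, models)]
    return models, tasks


models, tasks = coevolve(generations=20)
print(len(models), "models,", len(tasks), "tasks")
```

The point of the sketch is the feedback between the two steps: new tasks are admitted only if they split the current population, and models are ranked partly by how rare their successes are, so neither side has a fixed target.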

Results

The authors report several significant findings:

Discovered populations demonstrate broader expertise coverage than manually curated models
Discovered models outperform larger counterparts while using less GPU memory
Continuous innovation shown in both task design and model capabilities
Improved performance in multi-agent best-of-N selection scenarios
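The coverage and best-of-N findings above can be made concrete with a small worked example. It is a sketch with made-up scores and hypothetical model names, and it assumes a perfect (oracle) selector that always picks the best answer per task, so the best-of-N number is an upper bound rather than what a real selector would achieve. The article does not describe how the paper measures either quantity.

```python
from typing import Dict, List

# Hypothetical per-task scores in [0, 1]; all numbers are made up for illustration.
scores: Dict[str, List[float]] = {
    "expert_a": [0.9, 0.2, 0.1, 0.8],
    "expert_b": [0.1, 0.9, 0.2, 0.3],
    "expert_c": [0.2, 0.1, 0.9, 0.4],
}


def mean(values: List[float]) -> float:
    return sum(values) / len(values)


def best_single_model(scores: Dict[str, List[float]]) -> str:
    # The one model with the highest average score across all tasks.
    return max(scores, key=lambda name: mean(scores[name]))


def best_of_n(scores: Dict[str, List[float]]) -> float:
    # Oracle selection: take the best score per task across the population.
    per_task_best = [max(column) for column in zip(*scores.values())]
    return mean(per_task_best)


def coverage(scores: Dict[str, List[float]], threshold: float = 0.5) -> float:
    # Fraction of tasks on which at least one model reaches the threshold.
    per_task_best = [max(column) for column in zip(*scores.values())]
    return sum(best >= threshold for best in per_task_best) / len(per_task_best)


champion = best_single_model(scores)
print("best single model:", champion, round(mean(scores[champion]), 3))
print("best-of-N (oracle):", round(best_of_n(scores), 3))
print("population coverage:", coverage(scores))
```

In this toy data no single specialist averages well, yet the population as a whole reaches the threshold on every task, which is the kind of gap a diverse population is meant to open up.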

It is important to note what is not in the results: there are no claims of dominance on specific benchmarks. The authors explicitly do not target SOTA. Instead, they show that the AC/DC model population has richer functional diversity.

New development paradigm

The authors position AC/DC as a “profoundly new paradigm of LLM development”. Instead of the cycle:

Identify use case
Curate data
Run training
Evaluate
Iterate

You have:

Start the framework
Let it discover niches on its own

That is bold positioning. The open question is how robust the approach is in enterprise practice: open-endedness sounds romantic, but production teams typically need predictability.

ICLR 2026 context

AC/DC is one of several 2026 papers from the same research tradition: the automatic discovery of AI capabilities through evolution. A related line of work connects to earlier “Novelty Search” approaches from evolutionary computing, adapted for AI.

The authors do not cite specific companies or enterprise deployments, which signals that this is primarily a research paper at a pre-production stage. Nevertheless, the framework is fascinating because it opens a path toward AI systems that explore the space of possible expertise themselves, without engineers having to define in advance what they want to obtain.

For those tracking long-term AI development trends, AC/DC is a significant data point. It is almost certainly not directly applicable to current production systems, but it represents a direction in which the field may move over the next several years.