LangChain RLM Agents: 79% Better on 128k Tokens Than Standard

LangChain has introduced Recursive Language Models (RLM) through its DeepAgents framework — an approach in which models call themselves over input slices instead of feeding the entire context into a single window. On the OOLONG benchmark task with 128k tokens, RLM agents scored 0.79 versus 0.44 for standard agents, an improvement of 79 percent.

LangChain has published a detailed guide and framework for Recursive Language Models (RLM) within its DeepAgents ecosystem — an approach that addresses one of the chronic problems of LLM agents: the performance degradation that occurs on long contexts, better known as “context rot.”

Why Do Standard Agents Lose the Battle on Long Contexts?

When an agent accumulates a large amount of information — prior messages, tool results, intermediate outputs — all of it ends up in a single context window. Models then start missing relevant details from earlier parts of the context, ignoring instructions, or mis-prioritizing information. On tasks with 128k tokens, standard agents in testing routinely fail or collapse entirely.

The RLM approach, developed by researchers Alex Zhang and MIT CSAIL, solves this problem through a structural change: instead of pushing the entire input into the context window, the model loads it as a variable in a REPL environment and calls itself recursively — on itself or on subagents — over smaller, manageable slices.

How RLM Orchestration Works

The core of the approach is code-driven orchestration through a lightweight code interpreter (QuickJS). The model writes code that decomposes the task and runs recursive calls over data segments. LangChain implements this through “dynamic subagents” — subagents dispatched programmatically through code, not through sequential tool calls.

The key advantage of this architecture is deterministic coverage: loops in the code guarantee that every element will be processed, unlike an approach where the model itself decides what to read. Pipelines can be branched, parallelized, or sequenced depending on task requirements. Additionally, mixing different models across orchestrator and subagent layers enables precise cost optimization — reserving expensive models for complex steps and cheaper ones for routine work.

Benchmark Results

LangChain tested the approach on the OOLONG benchmark task — classifying news from the AgNews dataset into four categories — at different context lengths:

Context length	Without REPL	With REPL (RLM)
64k tokens	0.58	0.67
128k tokens	0.44	0.79

At 128k tokens, RLM agents scored 0.79 versus 0.44 for standard agents — a relative improvement of 79 percent. At that context length, standard agents failed completely in a significant number of cases. RLM agents maintained high accuracy despite the higher latency inherent in the recursive approach.

Installation and Code Example

Setting up the DeepAgents framework with RLM support requires a single command:

pip install -U "deepagents[quickjs]"

Basic agent initialization:

from deepagents import create_deep_agent
from langchain_quickjs import CodeInterpreterMiddleware

agent = create_deep_agent(
    model="openai:gpt-5.5",
    middleware=[CodeInterpreterMiddleware()],
)

RLM orchestration is activated by including the keyword “workflow” in the prompt, signaling to the agent that it should use dynamic subagent dispatch. The framework supports model mixing across layers, meaning users can specify different LLMs for the orchestrator and for subagents.

The approach is compatible with existing LangChain ecosystem tools and requires no infrastructure changes — just a package upgrade and adding the middleware layer at agent initialization.

Frequently Asked Questions

What are Recursive Language Models (RLM) and why are they useful?

RLMs are models that load input as variables in a REPL environment and recursively call themselves or subagents over smaller slices. The goal is to avoid 'context rot' — the performance drop that occurs when an agent accumulates too much context in a single window.

How are RLM agents installed and activated?

Installation is done with `pip install -U 'deepagents[quickjs]'`, and RLM orchestration is activated by passing `CodeInterpreterMiddleware` when creating an agent with the `create_deep_agent` function.

What are the advantages of code-driven orchestration over standard LLM agents?

Code guarantees deterministic coverage of every element through loops — unlike a model deciding on its own what to process. Pipelines can be branched, parallelized, or sequenced as the task demands, and costs can be optimized by mixing different models across orchestrator and subagent layers.

LangChain Introduces RLM Agents: Recursive Models Achieve 79% Better Results on Long Contexts

Why Do Standard Agents Lose the Battle on Long Contexts?

How RLM Orchestration Works

Benchmark Results

Installation and Code Example

Frequently Asked Questions

Sources

Related news