Monday, May 4, 2026

9 articles: 🟡 6 important, 🟢 3 interesting

🤖 Models (2)

🤝 Agents (4)

🟡 🤝 Agents May 4, 2026 · 2 min read

ArXiv AEM: Adaptive Entropy Modulation for multi-turn RL agents achieves +1.4% on SWE-bench Verified

AEM (Adaptive Entropy Modulation) is a supervision-free training method that dynamically modulates entropy across multi-turn conversations to balance exploration and exploitation in RL-trained agentic LLMs. Tested on models from 1.5B to 32B parameters, it delivers a 1.4% improvement when integrated into a state-of-the-art baseline on SWE-bench Verified.

🟡 🤝 Agents May 4, 2026 · 2 min read

Position paper by 30 authors at ICML 2026: agentic AI orchestration must be Bayes-consistent

Thirty researchers from academic and industrial laboratories published a position paper accepted at ICML 2026 arguing that the control layer of agentic AI systems must respect Bayesian consistency. The authors hold that LLMs are unsuitable for decisions under uncertainty, but that an orchestrator above them can and must maintain calibrated beliefs and use utility-aware policies.

🟡 🤝 Agents May 4, 2026 · 3 min read

ArXiv 'To Call or Not to Call' framework reveals LLMs misjudge when they need external tools

Researchers from the Max Planck Institute for Software Systems and collaborators published a framework that evaluates the tool-calling decisions of LLM agents along three dimensions: necessity, benefit, and cost acceptability. Experiments on six models and three tasks reveal a significant gap between what a model thinks it needs and what actually increases accuracy, a gap that directly affects the cost and reliability of production agents.

🟢 🤝 Agents May 4, 2026 · 2 min read

ArXiv: the hidden cost of tools in LLM agents — 'tool-use tax' reduces accuracy even when tools help

Researchers have shown that tool calls in LLM agents carry a hidden cost, the 'tool-use tax', arising from call formatting and protocol overhead. Using a Factorized Intervention Framework, they isolate three cost components and introduce a G-STEP gate that partially mitigates the losses without changing the model.

🏥 In Practice (1)

🛡️ Security (2)
