🤖 24 AI

Thursday, April 16, 2026

17 articles — 🔴 2 critical, 🟡 10 important, 🟢 5 interesting

← Previous day Next day →

🤖 Models (2)

📦 Open Source (1)

⚖️ Regulation (1)

🤝 Agents (3)

🟡 🤝 Agents April 16, 2026 · 2 min read

OpenAI: Next-Generation Agents SDK Introduces Native Sandbox Execution for Reliable Agents

OpenAI has announced a significant upgrade to its Agents SDK, introducing native sandbox execution and a model-native harness for building more reliable long-running AI agents. The new release focuses on code execution safety and agent autonomy, enabling development teams to build agents that can operate for hours without human supervision while maintaining reliability.

🟢 🤝 Agents April 16, 2026 · 2 min read

ArXiv: TREX — Two AI Agents Automate the Entire LLM Fine-Tuning Process

TREX is a new multi-agent system that automates the complete fine-tuning pipeline for large language models, from requirements analysis and literature search to data preparation and results evaluation. The system models the experimental process as a search tree, and on FT-Bench, a benchmark of 10 real-world tasks, it consistently improves model performance.
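The summary gives no implementation details, but the search-tree framing can be sketched in a few lines. All names and fields below are illustrative assumptions, not TREX's actual API: each node holds one experiment configuration, branches into candidate follow-ups, and the best-scoring leaf wins.

```python
from dataclasses import dataclass, field

@dataclass
class ExperimentNode:
    """One node in a fine-tuning experiment search tree (illustrative)."""
    config: dict                      # hyperparameter / data choices tried here
    score: float = 0.0                # evaluation result, e.g. a benchmark metric
    children: list = field(default_factory=list)

    def expand(self, variants):
        """Branch into candidate follow-up experiments."""
        for cfg in variants:
            self.children.append(ExperimentNode(config=cfg))
        return self.children

def best_leaf(node):
    """Greedily return the highest-scoring leaf in the tree."""
    if not node.children:
        return node
    return max((best_leaf(c) for c in node.children), key=lambda n: n.score)

# Toy usage: a root experiment branches into two learning-rate variants.
root = ExperimentNode(config={"lr": 1e-4}, score=0.60)
for child, s in zip(root.expand([{"lr": 5e-5}, {"lr": 2e-4}]), [0.72, 0.65]):
    child.score = s
assert best_leaf(root).score == 0.72
```

In practice each `expand` call would be driven by an agent proposing variants and each `score` by an actual evaluation run; the tree structure just records which paths are worth deepening.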

🟢 🤝 Agents April 16, 2026 · 2 min read

IBM Research: VAKRA Benchmark Reveals AI Agents Fail on Complex Reasoning

IBM Research has published VAKRA, a new benchmark for evaluating AI agents in enterprise environments, comprising more than 8,000 local APIs, 62 domains, and 4,187 test instances. The key finding: models show surface-level competence on simple tasks but fail at compositional reasoning, their multi-hop reasoning degrades with depth, and adhering to external constraints causes a significant performance drop.

🔧 Hardware (2)

🏥 In Practice (2)

💬 Community (1)

🛡️ Security (5)

🔴 🛡️ Security April 16, 2026 · 3 min read

ArXiv: MemJack — Multi-Agent Attack Breaks Vision-Language Model Defenses with Up to 90% Success Rate

MemJack is a new jailbreak framework targeting vision-language models (VLMs) that uses coordinated multi-agent collaboration instead of classical pixel perturbations. Tested on unmodified COCO images, it achieves a 71.48% success rate on Qwen3-VL-Plus, rising to 90% with an expanded budget. Researchers plan to publicly release over 113,000 interactive attack trajectories to support defensive research.

🔴 🛡️ Security April 16, 2026 · 3 min read

OpenAI: Trusted Access for Cyber Program Brings $10 Million for Global Cybersecurity Defense

OpenAI has launched the Trusted Access for Cyber initiative, bringing together leading security organizations and enterprise users around the specialized GPT-5.4-Cyber model. The program includes $10 million in API grants aimed at strengthening global cyber defense, positioning OpenAI as an active participant in the security ecosystem.

🟡 🛡️ Security April 16, 2026 · 3 min read

EleutherAI: New Method Detects Reward Hacking Before It Becomes Visible

EleutherAI has published research on a 'reasoning interpolation' method that detects early signs of reward hacking in reinforcement learning systems. The technique uses importance sampling and fine-tuned donor models to predict future exploit patterns with an AUC of 1.00, while standard methods underestimate exploit rates by 2–5 orders of magnitude.
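The abstract does not describe EleutherAI's actual setup, but the core idea behind using importance sampling for rare events can be illustrated generically. In the toy model below (the distributions, threshold, and "donor" shift are all assumptions for illustration), an "exploit" fires only above a high threshold; naive sampling from the base distribution almost never sees it, while sampling from a shifted donor distribution and reweighting recovers the true rate.

```python
import math
import random

random.seed(0)

THRESHOLD = 4.0  # the "exploit" fires only above this rare threshold

def gauss_pdf(x, mu):
    """Density of a unit-variance Gaussian centered at mu."""
    return math.exp(-0.5 * (x - mu) ** 2) / math.sqrt(2 * math.pi)

N = 100_000

# Naive Monte Carlo under the base distribution N(0, 1): the event is so
# rare (true rate ~3.2e-5) that most runs see almost no hits.
naive = sum(random.gauss(0.0, 1.0) > THRESHOLD for _ in range(N)) / N

# Importance sampling: draw from a "donor" distribution N(4, 1) shifted
# toward the exploit region, and reweight each hit by the density ratio.
total = 0.0
for _ in range(N):
    x = random.gauss(4.0, 1.0)
    if x > THRESHOLD:
        total += gauss_pdf(x, 0.0) / gauss_pdf(x, 4.0)
is_estimate = total / N  # close to the true rate despite the event's rarity
```

This is why the orders-of-magnitude underestimation claim is plausible: an estimator that never observes the event reports a rate near zero, while a reweighted estimator centered on the event region stays accurate.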

🟡 🛡️ Security April 16, 2026 · 2 min read

ArXiv: MCPThreatHive — the First Automated Security Platform for the MCP Ecosystem

MCPThreatHive is a new open-source platform that automates the entire threat intelligence lifecycle for Model Context Protocol ecosystems. The platform operationalizes the MCP-38 taxonomy with 38 specific threat patterns, maps them to STRIDE and OWASP frameworks, and includes a system for quantitative risk ranking. It was presented at DEFCON SG 2026.
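As a sketch of what "operationalizing a taxonomy with quantitative risk ranking" can mean in practice (the pattern IDs, OWASP mappings, and the likelihood-times-impact scoring below are illustrative assumptions, not MCPThreatHive's actual schema):

```python
from dataclasses import dataclass

@dataclass
class ThreatPattern:
    """One taxonomy entry mapped to standard frameworks (illustrative)."""
    pattern_id: str   # hypothetical ID, not a real MCP-38 entry
    stride: str       # STRIDE category, e.g. "Tampering"
    owasp: str        # mapped OWASP category, e.g. "A08:2021"
    likelihood: int   # 1 (rare) .. 5 (frequent)
    impact: int       # 1 (minor) .. 5 (severe)

    @property
    def risk(self) -> int:
        # Simple likelihood x impact score; the paper's scheme may differ.
        return self.likelihood * self.impact

patterns = [
    ThreatPattern("TP-01", "Spoofing", "A07:2021", 4, 5),
    ThreatPattern("TP-02", "Tampering", "A08:2021", 2, 3),
]
ranked = sorted(patterns, key=lambda p: p.risk, reverse=True)
# ranked[0] is the highest-risk pattern (TP-01, risk 20)
```

The value of such a structure is that each of the 38 patterns becomes queryable and sortable rather than a prose entry in a taxonomy document.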

🟡 🛡️ Security April 16, 2026 · 2 min read

ArXiv: RePAIR Enables LLMs to 'Forget' Targeted Information Without Retraining

RePAIR is a new framework for interactive machine unlearning that enables users to instruct large language models to forget specific information in real time via natural language prompts. The key innovation, the STAMP method, redirects MLP activations toward the refusal subspace using a closed-form formula, without any model retraining, achieving near-zero forgetting scores while preserving model utility.
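The abstract does not give STAMP's formula, but "redirecting activations toward a subspace in closed form" follows a well-known activation-steering pattern, which can be sketched as follows (the direction, dimensions, and coefficient are toy values, not RePAIR's):

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def steer_toward(h, refusal_dir, alpha=1.0):
    """Closed-form redirection: remove h's component along refusal_dir,
    then set that component to alpha. No gradients, no retraining."""
    norm = math.sqrt(dot(refusal_dir, refusal_dir))
    d = [x / norm for x in refusal_dir]        # unit refusal direction
    proj = dot(h, d)                            # existing component along d
    return [hi - proj * di + alpha * di for hi, di in zip(h, d)]

# Toy example: a 4-dim "MLP activation" pushed toward a refusal direction.
h = [1.0, 2.0, 0.0, -1.0]
d = [0.0, 0.0, 1.0, 0.0]
steered = steer_toward(h, d, alpha=3.0)
assert abs(dot(steered, d) - 3.0) < 1e-9  # component along d is now alpha
```

Because the update is a single projection and rescale per activation, it can be applied at inference time in response to a user prompt, which is consistent with the "real time, no retraining" claim.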
