AI Agent

An AI agent is a system in which a large language model drives its own actions: deciding what to do next, calling tools (search, code execution, file access, APIs), observing the results, and looping until a goal is reached. Unlike a chat assistant that produces one response per prompt, an agent runs an autonomous loop.

A simple agent contains: a model, a set of tools (function definitions the model can invoke), a system prompt describing the goal and constraints, and a runtime loop that executes the tool calls and feeds results back to the model.

In 2025–2026 agents became the primary frontier: Anthropic’s Claude with computer use, OpenAI’s Operator and Agent SDK, Google’s Agent2Agent protocol, and frameworks like LangGraph, CrewAI, and AutoGen. Production deployments now include coding assistants (Cursor, Devin, Claude Code), customer support, research assistants, and SRE/DevOps automation.

Agent reliability remains the open problem: error compounds across loop iterations, tools can fail or return ambiguous data, and longer trajectories drift from the goal. Research focuses on better reasoning, structured planning, and improved evaluation.

Sources

See also