AI Agents — what they do and where they break

An agent is an LLM that plans, calls tools, and iterates until a goal is reached. Powerful for multi-step work but brittle — know when to trust one.

An agent is an LLM with tools (functions it can call) and autonomy (it decides which tools to use, and in what order, until the goal is complete). Instead of a one-shot request like 'summarize this', you give it a goal like 'find the 3 most-complained-about product features from the last 2 weeks': the agent searches tickets, groups them, counts them, and returns the result.

Concretely, an agent loop is: the LLM receives the goal → decides 'I need to call tool X with arguments Y' → your code executes the tool and returns the result → the LLM reads the result and decides the next step → repeat until the goal is reached or the run is aborted.
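The loop above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: `fake_llm`, `search_tickets`, the decision dictionary shape, and the `max_steps` budget are all hypothetical stand-ins, and a real agent would replace `fake_llm` with a call to an actual model.

```python
from typing import Callable

def search_tickets(query: str) -> list[str]:
    """Hypothetical tool: returns canned ticket subjects instead of querying a real system."""
    return ["export is slow", "export is slow", "login fails"]

# Hypothetical tool registry: plain functions keyed by the name the model uses.
TOOLS: dict[str, Callable[..., object]] = {"search_tickets": search_tickets}

def fake_llm(goal: str, history: list[dict]) -> dict:
    """Stand-in for a model call. It asks for one tool call, then answers."""
    if not history:
        return {"type": "tool_call", "tool": "search_tickets", "args": {"query": goal}}
    tickets = history[-1]["result"]
    most_common = max(set(tickets), key=tickets.count)
    return {"type": "final", "answer": most_common}

def run_agent(goal: str, llm=fake_llm, tools=TOOLS, max_steps: int = 10):
    """Reason, act, observe, repeat, until a final answer or the step budget runs out."""
    history: list[dict] = []
    for _ in range(max_steps):
        decision = llm(goal, history)                 # the model picks the next step
        if decision["type"] == "final":
            return decision["answer"]
        result = tools[decision["tool"]](**decision["args"])  # your code runs the tool
        history.append({"tool": decision["tool"], "args": decision["args"], "result": result})
    raise RuntimeError("step budget exhausted without a final answer")

print(run_agent("most common complaint"))
```

The step budget is the "abort" exit: without it, a model that never produces a final answer would loop forever.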

Where agents shine: research (aggregating info from multiple sources), troubleshooting (read logs, try a fix, check again), complex workflows (book the cheapest flight, then wait for confirmations), code (implement a task by reading code, modifying it, and running tests, as Claude Code, Cursor, and Devin do).

Where agents break: long chains with no checkpoints (after 15 steps without human validation, an agent can drift away from the original goal), critical irreversible decisions (sending a batch email, debiting an account, deleting files: always require human approval), tasks requiring absolute precision (bookkeeping to the cent, legal calculations).
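One way to avoid long unchecked chains is to pause for review at a fixed interval. The sketch below assumes each agent step is a plain callable and that `review` is some human-facing check; both names and the `every` parameter are hypothetical.

```python
from typing import Any, Callable

def run_with_checkpoints(
    steps: list[Callable[[], Any]],
    review: Callable[[list[Any]], bool],
    every: int = 5,
) -> dict:
    """Run steps in order, asking `review` to confirm after every `every` steps."""
    completed: list[Any] = []
    for i, step in enumerate(steps, start=1):
        completed.append(step())
        # At each checkpoint, a False from the reviewer stops the run.
        if i % every == 0 and not review(completed):
            return {"status": "stopped", "completed": completed}
    return {"status": "finished", "completed": completed}

# Seven trivial steps and a reviewer that rejects at the first checkpoint.
demo_steps = [lambda n=n: n for n in range(7)]
print(run_with_checkpoints(demo_steps, review=lambda done: False))
```

With a rejecting reviewer, the run stops after the fifth step instead of carrying on to the seventh.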

Golden rule before giving an agent autonomy: what's the cost of the worst possible action? If it's 'a slightly silly email', let it run. If it's 'the company's entire infra deleted', keep a human in the loop for every destructive step.
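The golden rule can be enforced in code by routing every tool call through a gate. The following is a sketch under assumptions: the set of destructive tool names, the `approve` callback, and the demo tools are all hypothetical, and a real system would decide for itself which tools count as destructive.

```python
from typing import Any, Callable

# Assumption: these tool names are the ones whose worst case is unacceptable.
DESTRUCTIVE_TOOLS = {"send_batch_email", "debit_account", "delete_files"}

def execute_tool(
    name: str,
    args: dict,
    tools: dict[str, Callable[..., Any]],
    approve: Callable[[str, dict], bool],
) -> dict:
    """Run a tool, but require human approval first if it is marked destructive."""
    if name in DESTRUCTIVE_TOOLS and not approve(name, args):
        return {"status": "blocked", "tool": name}
    return {"status": "ok", "result": tools[name](**args)}

# Demo tools that only record what they were asked to do.
deleted: list[str] = []

def delete_files(paths: list[str]) -> int:
    deleted.extend(paths)
    return len(paths)

def draft_email(to: str) -> str:
    return f"draft for {to}"

DEMO_TOOLS = {"delete_files": delete_files, "draft_email": draft_email}
deny_all = lambda name, args: False  # a reviewer who rejects everything

print(execute_tool("draft_email", {"to": "ops"}, DEMO_TOOLS, deny_all))
print(execute_tool("delete_files", {"paths": ["a.txt"]}, DEMO_TOOLS, deny_all))
```

The low-risk draft runs without asking anyone, while the delete is blocked until a human says yes.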

Simple agent patterns: ReAct (Reason + Act: think, act, observe, repeat), Plan-and-execute (make a plan of steps first, then execute each one), Reflexion (after a failure, self-critique and retry). For most business use cases, a simple tool loop with a good system prompt beats overengineered multi-agent setups.
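To make the contrast with a step-by-step tool loop concrete, here is a rough sketch of Plan-and-execute. The planner is a hard-coded stand-in for a model call, and the tool names, the plan format, and the convention of passing each result to the next step are assumptions made for illustration.

```python
from collections import Counter

def search_tickets(_previous, days: int) -> list[str]:
    """Hypothetical tool: returns the feature named in each recent ticket (canned data)."""
    return ["export", "export", "login", "search", "export", "login"]

def count_by_feature(previous: list[str]) -> Counter:
    return Counter(previous)

def top_n(previous: Counter, n: int) -> list[str]:
    return [feature for feature, _ in previous.most_common(n)]

TOOLS = {"search_tickets": search_tickets, "count_by_feature": count_by_feature, "top_n": top_n}

def fake_planner(goal: str) -> list[dict]:
    """Stand-in for a model call that returns the whole plan up front."""
    return [
        {"tool": "search_tickets", "args": {"days": 14}},
        {"tool": "count_by_feature", "args": {}},
        {"tool": "top_n", "args": {"n": 3}},
    ]

def plan_and_execute(goal: str, planner=fake_planner, tools=TOOLS):
    plan = planner(goal)      # the plan is fixed before any tool runs
    result = None
    for step in plan:         # each step receives the previous step's result
        result = tools[step["tool"]](result, **step["args"])
    return result

print(plan_and_execute("top 3 most-complained-about features"))
```

Because the plan is fixed up front, it can be shown to a human before anything runs, at the cost of not adapting to surprising tool results the way a ReAct-style loop does.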

Diagram: Goal from user → Reason (LLM picks next step) → Act (call a tool) → Observe (tool result) → Final answer or abort

Grounded on https://www.anthropic.com/research/building-effective-agents
