
How to Build an AI Agent (2026)

An AI agent is software that uses an LLM to plan, make decisions, and take actions to achieve a goal. Unlike a chatbot that just responds, an agent actually does things — browses the web, writes files, calls APIs, and iterates until the task is done. Here's how to build one.

What Makes an Agent

An AI agent has four components:

1. LLM (Brain)        — decides what to do
2. Tools (Hands)       — takes actions in the world
3. Memory (Context)    — remembers what happened
4. Loop (Persistence)  — keeps going until the goal is met

The Agent Loop

Goal received
    ↓
LLM decides next action
    ↓
Tool executes the action
    ↓
Result returned to LLM
    ↓
LLM evaluates: goal met?
    ↓
  No → decide next action (loop back)
  Yes → return final result

This loop is the fundamental difference between a chatbot (one response) and an agent (continuous action until done).
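The loop above can be sketched in a few lines of Python. `llm_decide` and `execute_tool` are assumed helpers you supply; this is the skeleton, not a full implementation:

```python
MAX_STEPS = 20  # guard against runaway loops

def agent_loop(goal, llm_decide, execute_tool):
    # llm_decide(history) returns {"type": "done", "result": ...} or a tool action;
    # execute_tool(action) returns a result string
    history = [goal]
    for _ in range(MAX_STEPS):
        action = llm_decide(history)       # LLM decides next action
        if action["type"] == "done":       # goal met -> return final result
            return action["result"]
        result = execute_tool(action)      # tool executes the action
        history.append(result)             # result returned to the LLM
    return "Stopped: step limit reached"
```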

Level 1: Simple Agent (30 minutes)

The ReAct Pattern

The simplest agent pattern: Reason, Act, Observe.

Architecture:

  1. Give the LLM a system prompt with available tools
  2. LLM reasons about what to do
  3. LLM chooses a tool and provides arguments
  4. Your code executes the tool
  5. Result is fed back to the LLM
  6. Repeat until the LLM says "done"

Example: A Research Agent

import anthropic

client = anthropic.Anthropic()

tools = [
    {
        "name": "web_search",
        "description": "Search the web for information",
        "input_schema": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "Search query"}
            },
            "required": ["query"]
        }
    },
    {
        "name": "save_note",
        "description": "Save a research finding",
        "input_schema": {
            "type": "object",
            "properties": {
                "title": {"type": "string"},
                "content": {"type": "string"}
            },
            "required": ["title", "content"]
        }
    }
]

def run_agent(goal, max_steps=20):
    messages = [{"role": "user", "content": goal}]

    for _ in range(max_steps):  # step cap prevents runaway loops
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=4096,
            system="You are a research agent. Use tools to research topics and save findings.",
            tools=tools,
            messages=messages
        )

        # Check if the agent wants to use a tool
        if response.stop_reason == "tool_use":
            # Execute the tool and feed the results back
            tool_results = execute_tools(response.content)
            messages.append({"role": "assistant", "content": response.content})
            messages.append({"role": "user", "content": tool_results})
        else:
            # Agent is done
            return response.content

    raise RuntimeError("Agent stopped: step limit reached")

def execute_tools(content):
    results = []
    for block in content:
        if block.type == "tool_use":
            if block.name == "web_search":
                result = search_web(block.input["query"])
            elif block.name == "save_note":
                result = save_note(block.input["title"], block.input["content"])
            else:
                result = f"Unknown tool: {block.name}"  # avoid an unbound result
            results.append({
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": result
            })
    return results

That's a working agent. It receives a goal, decides which tools to use, executes them, evaluates results, and continues until done.
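The example assumes two helpers, search_web and save_note. Minimal stand-ins look like this — the search stub just echoes the query, so swap in a real search API before relying on it:

```python
import json
import time

def search_web(query):
    # Stub: replace with a real search API (Brave, Tavily, etc.).
    # Returning a plain string keeps the tool_result content simple.
    return f"[stub] no live search configured; query was: {query}"

def save_note(title, content):
    # Append findings to a local JSONL file so notes persist across runs
    with open("notes.jsonl", "a") as f:
        f.write(json.dumps({"ts": time.time(), "title": title, "content": content}) + "\n")
    return f"Saved note: {title}"
```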

Level 2: Agent with Memory (1-2 hours)

Adding Persistent Memory

Level 1 agents forget everything between runs. Adding memory produces agents that learn and remember across sessions.

Types of memory:

  • Short-term (conversation): The current message history. Already present in Level 1.
  • Long-term (persistent): Saved to a file or database between runs. Agent remembers past interactions.
  • Working memory (scratchpad): Notes the agent writes to itself during a task.

Implementation:

import json

class AgentMemory:
    def __init__(self, path="memory.json"):
        self.path = path
        self.load()
    
    def load(self):
        try:
            with open(self.path) as f:
                self.data = json.load(f)
        except FileNotFoundError:
            self.data = {"facts": [], "preferences": [], "history": []}
    
    def save(self):
        with open(self.path, "w") as f:
            json.dump(self.data, f, indent=2)
    
    def add_fact(self, fact):
        self.data["facts"].append(fact)
        self.save()
    
    def get_context(self):
        return f"Known facts: {json.dumps(self.data['facts'][-20:])}"

Add memory.get_context() to your system prompt so the agent knows what it remembers.
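For example (a sketch; `memory` is an AgentMemory instance from the class above, and the prompt is rebuilt each run so it reflects the latest saved facts):

```python
def build_system_prompt(memory):
    # Inject long-term memory into the system prompt on every run
    return (
        "You are a research agent. Use tools to research topics "
        "and save findings.\n" + memory.get_context()
    )
```

After a run, call `memory.add_fact(...)` to persist anything worth remembering.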

Level 3: Multi-Tool Agent (2-4 hours)

Adding Real Tools

Real agents need real tools. Common tools to implement:

tools_registry = {
    "web_search": search_brave_api,        # Search the web
    "read_url": fetch_and_parse_url,       # Read a webpage
    "write_file": write_to_disk,           # Save files
    "read_file": read_from_disk,           # Read files
    "run_code": execute_python,            # Run Python code
    "send_email": send_via_api,            # Send emails
    "query_database": run_sql_query,       # Query a database
    "call_api": make_http_request,         # Call any API
}
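With a registry like this, tool execution becomes a generic lookup instead of an if/elif chain. A sketch, assuming each tool function takes keyword arguments matching its input schema:

```python
def dispatch(tools_registry, name, arguments):
    # Look up the tool by name and call it with the LLM-provided arguments
    fn = tools_registry.get(name)
    if fn is None:
        return f"Error: unknown tool '{name}'"
    try:
        return fn(**arguments)
    except Exception as e:  # tools fail; report rather than crash the loop
        return f"Error in {name}: {e}"
```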

Security consideration: Every tool is a potential attack surface. An agent with run_code can execute arbitrary code. An agent with send_email can email anyone. Implement guardrails:

import subprocess

ALLOWED_EMAIL_DOMAINS = {"yourcompany.com"}

def run_code(code):
    # Sandbox: ideally run inside a Docker container with no network.
    # Minimal guard shown here: a subprocess with a 30-second timeout.
    # Review: log every execution before running it.
    try:
        proc = subprocess.run(
            ["python", "-c", code],
            capture_output=True, text=True, timeout=30
        )
        return proc.stdout or proc.stderr
    except subprocess.TimeoutExpired:
        return "Error: execution timed out after 30 seconds"

def send_email(to, subject, body):
    # Allowlist: only send to approved domains
    if to.split("@")[-1] not in ALLOWED_EMAIL_DOMAINS:
        return f"Blocked: {to} is not on the allowlist"
    # Rate limit (max 5/hour) and logging go here; then send via your API
    ...

Level 4: Using a Framework (Fastest)

LangChain

The most popular agent framework. Handles the agent loop, tool management, and memory.

from langchain import hub
from langchain.agents import AgentExecutor, create_react_agent
from langchain_anthropic import ChatAnthropic
from langchain.tools import Tool

llm = ChatAnthropic(model="claude-sonnet-4-20250514")

tools = [
    Tool(name="search", func=search_web, description="Search the web"),
    Tool(name="calculator", func=calculate, description="Do math"),
]

prompt = hub.pull("hwchase17/react")  # standard ReAct prompt template
agent = create_react_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools)
result = executor.invoke({"input": "Research the market size for AI tools"})

Pros: Fast to build, large ecosystem, many pre-built tools. Cons: Abstraction can be opaque, debugging is harder, dependency heavy.

CrewAI

Multi-agent framework — multiple specialized agents working together.

from crewai import Agent, Task, Crew

researcher = Agent(
    role="Research Analyst",
    goal="Research market data and trends",
    tools=[search_tool, web_reader],
    llm=claude
)

writer = Agent(
    role="Report Writer", 
    goal="Write clear, actionable reports",
    tools=[file_writer],
    llm=claude
)

research_task = Task(
    description="Research the AI tools market size and growth",
    agent=researcher
)

report_task = Task(
    description="Write a market report from the research findings",
    agent=writer
)

crew = Crew(agents=[researcher, writer], tasks=[research_task, report_task])
result = crew.kickoff()

Best for: Complex workflows where different "experts" handle different parts.

Vercel AI SDK

Build AI agents in TypeScript/Next.js.

import { generateText, tool } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';
import { z } from 'zod';

const result = await generateText({
  model: anthropic('claude-sonnet-4-20250514'),
  tools: {
    search: tool({
      description: 'Search the web',
      parameters: z.object({ query: z.string() }),
      execute: async ({ query }) => searchWeb(query),
    }),
  },
  maxSteps: 10,
  prompt: 'Research the top AI coding tools and summarize findings',
});

Best for: TypeScript/Next.js developers building web-based agents.

Architecture Decisions

Which LLM?

Model               Best For                   Cost
Claude Sonnet       Best reasoning, tool use   $3 / $15 per 1M tokens
GPT-4o              Strong all-around          $2.50 / $10 per 1M tokens
Claude Haiku        Fast, cheap tasks          $0.25 / $1.25 per 1M tokens
GPT-4o Mini         Budget tasks               $0.15 / $0.60 per 1M tokens
Local (Llama 3.1)   Privacy, no API cost       Free (hardware cost)

Recommendation: Claude Sonnet for complex agents. GPT-4o Mini or Haiku for simple, high-volume tasks.

Framework vs Custom?

Approach                When to Use
Custom (no framework)   Simple agents, learning, full control
LangChain               Rapid prototyping, large tool ecosystem
CrewAI                  Multi-agent workflows
Vercel AI SDK           TypeScript web applications
Semantic Kernel         .NET/C# applications

Start custom. Build your first agent from scratch to understand the fundamentals. Use a framework for your second agent when you understand what the framework is abstracting.

Common Agent Patterns

Tool Selection Agent

Agent decides which tool to use based on the task. The most basic pattern.

Pipeline Agent

Agent executes steps in a fixed order: research → analyze → write → review. Each step uses different tools.
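A pipeline can be sketched as a simple fold over the stages, with a hypothetical `llm` helper that takes a prompt and returns text:

```python
def pipeline_agent(llm, stages, task):
    # Run fixed stages in order; each stage's output feeds the next
    output = task
    for stage in stages:
        output = llm(f"Stage: {stage}\nInput:\n{output}")
    return output

# e.g. pipeline_agent(llm, ["research", "analyze", "write", "review"], goal)
```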

Supervisor Agent

One agent coordinates multiple sub-agents. Assigns tasks, collects results, synthesizes output. Used in CrewAI and multi-agent systems.

Reflection Agent

Agent generates output, then critiques its own output, then improves it. Produces higher quality results at the cost of more LLM calls.
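A reflection loop can be sketched like this (again with a hypothetical `llm` helper; each round adds two extra LLM calls):

```python
def reflect_and_improve(llm, task, rounds=2):
    # Generate, critique, revise -- quality improves at the cost of more calls
    draft = llm(f"Complete this task: {task}")
    for _ in range(rounds):
        critique = llm(f"Critique this draft of '{task}':\n{draft}")
        draft = llm(
            f"Revise the draft to address the critique.\n"
            f"Draft:\n{draft}\nCritique:\n{critique}"
        )
    return draft
```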

Common Mistakes

  1. No error handling. Tools fail. APIs timeout. Files don't exist. Always handle errors gracefully and let the agent retry or adapt.
  2. Infinite loops. Set a maximum number of steps (10-20). If the agent hasn't completed the task, stop and report what happened.
  3. Too many tools. Each tool adds complexity and confusion for the LLM. Start with 3-5 tools. Add more only when needed.
  4. No logging. Log every LLM call, tool execution, and decision. When something goes wrong, you need the trace.
  5. Overly broad goals. "Improve our marketing" is too vague. "Research 5 competitors' pricing pages and create a comparison table" is specific enough for an agent.
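Mistakes 1 and 4 can both be addressed with a small wrapper around every tool call. A sketch; adjust retry count and logging to taste:

```python
import logging
import time

log = logging.getLogger("agent")

def safe_call(tool_fn, retries=2, **kwargs):
    # Retry transient failures with backoff and log every attempt; on final
    # failure, return the error as text so the agent can adapt, not crash
    for attempt in range(retries + 1):
        try:
            result = tool_fn(**kwargs)
            log.info("tool=%s args=%s ok", tool_fn.__name__, kwargs)
            return result
        except Exception as e:
            log.warning("tool=%s attempt=%d failed: %s", tool_fn.__name__, attempt, e)
            if attempt == retries:
                return f"Error after {retries + 1} attempts: {e}"
            time.sleep(2 ** attempt)  # simple exponential backoff
```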

FAQ

How much does it cost to run an agent?

A typical agent task (10-20 LLM calls with tool use) costs $0.10-0.50 with Claude Sonnet. Complex tasks with many iterations: $1-5. Budget $50-200/month for moderate agent usage.
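The per-task figure can be sanity-checked with rough numbers (the token counts below are assumptions, not measurements):

```python
# Rough per-task cost estimate for an agent run on Claude Sonnet
calls = 15                    # LLM calls in a typical agent task
input_tokens = 3_000          # avg input tokens per call (grows with history)
output_tokens = 500           # avg output tokens per call
in_price, out_price = 3, 15   # $ per 1M input / output tokens

cost = calls * (input_tokens * in_price + output_tokens * out_price) / 1_000_000
print(f"${cost:.2f}")  # -> $0.25, inside the quoted $0.10-0.50 range
```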

Can agents run unsupervised?

For low-risk tasks (research, file organization, data processing): yes, with proper guardrails. For high-risk tasks (sending emails, modifying production data, spending money): always require human approval.

Which language should I build agents in?

Python (most resources, LangChain ecosystem) or TypeScript (web integration, Vercel AI SDK). Both are well-supported by all LLM providers.

How do I debug agents?

Log everything. Print the LLM's reasoning at each step. Record tool inputs and outputs. When the agent goes wrong, read the trace to find where its reasoning diverged.

Can I build agents with local models?

Yes. Llama 3.1 70B handles tool use well. Smaller models (8B) struggle with complex multi-step reasoning. Use Ollama or vLLM to serve local models with an OpenAI-compatible API.

Bottom Line

Building an AI agent is simpler than it sounds. The core pattern — LLM decides, tool executes, loop until done — can be implemented in 50 lines of code.

Start here: Build a simple research agent (Level 1) using Claude's tool use API. Give it a web search tool and a note-saving tool. Have it research a topic and save findings. That's your first agent.

Scale from there: Add memory (Level 2), more tools (Level 3), and frameworks (Level 4) as your needs grow. The fundamentals don't change — everything is built on the same decide-act-observe loop.
