
Function Calling in LLMs Explained (2026)

Function calling (also called "tool use") is what turns a chatbot into an agent. Instead of just generating text, the LLM can decide to call your code — search a database, send an email, check the weather, or do anything your functions can do. Here's how it works.

The Core Idea

Without function calling:
  User: "What's the weather in Tokyo?"
  LLM:  "I don't have access to real-time weather data." ← useless

With function calling:
  User: "What's the weather in Tokyo?"
  LLM:  Decides to call: get_weather(city="Tokyo")
  Your code: Fetches weather → returns "Tokyo: 15°C, cloudy"
  LLM:  "It's currently 15°C and cloudy in Tokyo." ← actually helpful

The LLM doesn't execute code. It outputs a structured request ("please call this function with these arguments"), your application executes it, and you send the result back.
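That structured request is just data. In OpenAI's Chat Completions format, for example, a tool call arrives looking roughly like this (the `id` value here is illustrative):

```typescript
// Approximate shape of an OpenAI-style tool call (the id is made up).
// Note: the arguments arrive as a JSON *string*, not an object,
// so your code has to parse them before executing anything.
const toolCall = {
  id: 'call_abc123',
  type: 'function',
  function: {
    name: 'get_weather',
    arguments: '{"city":"Tokyo"}',
  },
}

const args = JSON.parse(toolCall.function.arguments)
// args.city === 'Tokyo'
```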

How It Works (Step by Step)

Step 1: You define available functions (tools)
  → "Here are functions the LLM can use: get_weather, search_db, send_email"
  → Each function has a name, description, and parameter schema

Step 2: User sends a message
  → "What's the weather in Tokyo?"

Step 3: LLM decides whether to use a function
  → Analyzes the message
  → Decides: "I need get_weather to answer this"
  → Returns: { function: "get_weather", args: { city: "Tokyo" } }
  → (This is NOT regular text — it's a structured tool call)

Step 4: Your code executes the function
  → You call your actual get_weather("Tokyo") function
  → Returns: "15°C, cloudy, humidity 72%"

Step 5: You send the result back to the LLM
  → LLM receives the function output
  → Generates a natural language response
  → "It's currently 15°C and cloudy in Tokyo with 72% humidity."

Practical Example (OpenAI)

import OpenAI from 'openai'

const openai = new OpenAI()

// Step 1: Define your tools
const tools = [
  {
    type: 'function',
    function: {
      name: 'get_weather',
      description: 'Get the current weather for a city',
      parameters: {
        type: 'object',
        properties: {
          city: {
            type: 'string',
            description: 'City name, e.g. "Tokyo" or "New York"'
          },
          units: {
            type: 'string',
            enum: ['celsius', 'fahrenheit'],
            description: 'Temperature units'
          }
        },
        required: ['city']
      }
    }
  }
]

// Step 2: Send message with tools
const response = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: "What's the weather in Tokyo?" }],
  tools,
})

// Step 3: Check if the LLM wants to call any functions
const message = response.choices[0].message

if (message.tool_calls) {
  const toolMessages = []

  for (const toolCall of message.tool_calls) {
    const args = JSON.parse(toolCall.function.arguments)

    // Step 4: Execute the function (fetchWeather is your own implementation)
    let result
    if (toolCall.function.name === 'get_weather') {
      result = await fetchWeather(args.city, args.units)
    }

    toolMessages.push({
      role: 'tool',
      tool_call_id: toolCall.id,
      content: JSON.stringify(result),
    })
  }

  // Step 5: Send the results back in one request. The API expects a
  // tool message for every tool_call in the assistant message.
  const finalResponse = await openai.chat.completions.create({
    model: 'gpt-4o',
    messages: [
      { role: 'user', content: "What's the weather in Tokyo?" },
      message, // the assistant message containing the tool calls
      ...toolMessages,
    ],
    tools,
  })

  console.log(finalResponse.choices[0].message.content)
  // "It's currently 15°C and cloudy in Tokyo."
}

Practical Example (Anthropic Claude)

import Anthropic from '@anthropic-ai/sdk'

const anthropic = new Anthropic()

const response = await anthropic.messages.create({
  model: 'claude-sonnet-4-20250514',
  max_tokens: 1024,
  tools: [
    {
      name: 'get_weather',
      description: 'Get the current weather for a city',
      input_schema: {
        type: 'object',
        properties: {
          city: { type: 'string', description: 'City name' },
        },
        required: ['city'],
      },
    },
  ],
  messages: [{ role: 'user', content: "What's the weather in Tokyo?" }],
})

// Claude returns tool_use blocks in response.content
for (const block of response.content) {
  if (block.type === 'tool_use') {
    // fetchWeather is your own implementation
    const result = await fetchWeather(block.input.city)

    // Send the result back: each tool_use block needs a matching tool_result
    const finalResponse = await anthropic.messages.create({
      model: 'claude-sonnet-4-20250514',
      max_tokens: 1024,
      tools: [/* same tools */],
      messages: [
        { role: 'user', content: "What's the weather in Tokyo?" },
        { role: 'assistant', content: response.content },
        {
          role: 'user',
          content: [{
            type: 'tool_result',
            tool_use_id: block.id,
            content: JSON.stringify(result),
          }],
        },
      ],
    })

    // Claude's final text reply arrives as a text block
    const textBlock = finalResponse.content.find((b) => b.type === 'text')
    console.log(textBlock?.text)
  }
}

Common Patterns

Pattern 1: Multiple Tool Calls

LLMs can call multiple functions in one turn:

User: "Compare the weather in Tokyo and New York"

LLM calls (parallel):
  → get_weather(city="Tokyo")
  → get_weather(city="New York")

Results:
  → Tokyo: 15°C, cloudy
  → New York: 22°C, sunny

LLM: "Tokyo is 15°C and cloudy while New York is warmer at 22°C and sunny."
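A sketch of how your application might execute those parallel calls concurrently, assuming an OpenAI-style `tool_calls` array. `executeTool` is a hypothetical stand-in for your own dispatcher, faked here with canned data so the example is self-contained:

```typescript
// Hypothetical dispatcher: a real one would switch on the tool name.
// Here it just returns canned weather data.
async function executeTool(name: string, args: { city: string }): Promise<string> {
  const fake: Record<string, string> = { 'Tokyo': '15°C, cloudy', 'New York': '22°C, sunny' }
  return `${args.city}: ${fake[args.city]}`
}

const toolCalls = [
  { id: 'call_1', function: { name: 'get_weather', arguments: '{"city":"Tokyo"}' } },
  { id: 'call_2', function: { name: 'get_weather', arguments: '{"city":"New York"}' } },
]

// Run every call concurrently and pair each result with its tool_call_id,
// since the API matches results to calls by id, not by order
const results = await Promise.all(
  toolCalls.map(async (tc) => ({
    tool_call_id: tc.id,
    content: await executeTool(tc.function.name, JSON.parse(tc.function.arguments)),
  }))
)
// results[0].content === 'Tokyo: 15°C, cloudy'
```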

Pattern 2: Chained Tool Calls

One tool's output informs the next call:

User: "Find our biggest customer and check their latest order status"

Turn 1 — LLM calls: get_top_customer()
  → Returns: { id: "cust_123", name: "Acme Corp" }

Turn 2 — LLM calls: get_latest_order(customer_id="cust_123")
  → Returns: { order_id: "ord_456", status: "shipped", tracking: "1Z999" }

LLM: "Your biggest customer is Acme Corp. Their latest order (ord_456)
has been shipped with tracking number 1Z999."
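Chaining falls out of a simple loop: call the model, execute any requested tools, append the results, and repeat until the model answers in plain text. A generic sketch, where `callModel` and `runTool` are hypothetical stand-ins for your API call and tool dispatcher:

```typescript
// Generic agent loop, independent of any particular provider SDK.
type ToolCall = { id: string; name: string; args: unknown }
type ModelReply = { text?: string; toolCalls: ToolCall[] }

async function agentLoop(
  callModel: (history: unknown[]) => Promise<ModelReply>,
  runTool: (call: ToolCall) => Promise<string>,
  history: unknown[],
  maxTurns = 5 // cap turns so a confused model can't loop forever
): Promise<string> {
  for (let turn = 0; turn < maxTurns; turn++) {
    const reply = await callModel(history)
    // No tool calls means the model has its final answer
    if (reply.toolCalls.length === 0) return reply.text ?? ''
    // Otherwise execute each call and feed the output back into the history
    for (const call of reply.toolCalls) {
      history.push({ role: 'tool', tool_call_id: call.id, content: await runTool(call) })
    }
  }
  throw new Error('Agent exceeded max turns')
}
```

The turn cap matters in production: without one, a model that keeps requesting tools can burn tokens indefinitely.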

Pattern 3: Conditional Logic

The LLM decides whether to use tools based on the question:

"What is 2+2?"         → No tool needed, LLM answers directly: "4"
"What's AAPL stock?"   → Calls get_stock_price(symbol="AAPL")
"Email Bob the report" → Calls send_email(to="bob@...", body="...")

Writing Good Tool Descriptions

The description is how the LLM decides when to use each tool:

// ❌ Bad — vague, LLM won't know when to use it
{
  name: 'lookup',
  description: 'Look up stuff'
}

// ✅ Good — clear when to use, what it returns
{
  name: 'search_products',
  description: 'Search the product catalog by name, category, or price range. Returns product names, prices, and availability. Use when the user asks about products, pricing, or inventory.'
}

// ✅ Great — includes examples
{
  name: 'query_sales',
  description: 'Query sales data from the database. Accepts date ranges and filters. Examples: total revenue this month, top selling products, sales by region. Returns structured data with amounts in USD.'
}

Production Best Practices

1. Validate Tool Arguments

// LLMs sometimes hallucinate arguments
if (toolCall.function.name === 'send_email') {
  const args = JSON.parse(toolCall.function.arguments)
  
  // Validate email format
  if (!isValidEmail(args.to)) {
    return { error: 'Invalid email address' }
  }
  
  // Require confirmation for destructive actions
  if (args.to.includes('@external.com')) {
    return { error: 'External emails require user confirmation' }
  }
}
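The snippet above assumes an `isValidEmail` helper exists; a minimal sketch (in production you would more likely validate the whole argument object against a schema):

```typescript
// Minimal email check: one @, no whitespace, a dot in the domain.
// Deliberately loose; thorough validation belongs in a schema validator.
function isValidEmail(value: unknown): value is string {
  return typeof value === 'string' && /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(value)
}

// isValidEmail('bob@example.com') === true
// isValidEmail('not-an-email')    === false
```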

2. Limit Available Tools

A rough rule of thumb:

5 tools:   the LLM picks the right one ~95% of the time
15 tools:  picks correctly ~85% of the time
50 tools:  the LLM gets confused, picks wrong tools, makes up arguments

Only expose tools relevant to the current context.

3. Handle Errors Gracefully

let result
try {
  result = await executeFunction(toolCall)
} catch (error) {
  // Return the error to the LLM — it can retry or explain the issue
  result = { error: `Function failed: ${error.message}` }
}

4. Set Timeouts

// Don't let tool execution hang forever
const result = await Promise.race([
  executeFunction(toolCall),
  new Promise((_, reject) =>
    setTimeout(() => reject(new Error('Tool execution timed out')), 30000)
  ),
])
// Note: Promise.race only bounds how long you wait. The underlying call
// keeps running unless it supports cancellation (e.g. via AbortController).

Supported Models

Model               Function Calling
GPT-4o              ✅ Excellent
Claude 3.5 Sonnet   ✅ Excellent
Gemini 1.5 Pro      ✅ Good
Llama 3.1 70B+      ✅ Good (⚠️ parallel tool calls limited)
Mistral Large       ✅ Good

FAQ

Is function calling the same as plugins?

Plugins (like ChatGPT plugins) are built on function calling. They're a specific implementation where tools are defined by third-party services.

Can function calling access the internet?

The LLM itself doesn't access anything. Your functions do. If you provide a web_search function, the LLM can request searches, but your code does the actual fetching.

Is function calling safe?

As safe as your implementation. The LLM suggests actions — your code executes them. Add validation, permissions, and confirmation steps for sensitive operations.

What's the difference between function calling and MCP?

MCP (Model Context Protocol) standardizes how tools are described and connected. Function calling is the LLM capability; MCP is the protocol for tool discovery and communication.

Bottom Line

Function calling turns LLMs from text generators into capable agents. Define tools, let the LLM decide when to use them, execute the calls, return results. Start with 2-3 simple tools and expand as you see what users need.

The pattern is simple: you provide the tools, the LLM provides the judgment.
