Tool / function calling: the model calling the world
How the model stops merely talking and starts ACTING — the basis of every agent.
6 min read
Until now the model only talks — it generates text, answers questions, summarizes documents. But real applications need it to act: query a database, call an API, run a calculation. Tool calling is the mechanism that turns language into action, and it is the foundation on which every AI agent is built.
The problem: the model lives inside a text box
An LLM, on its own, is a pure function: it receives tokens and returns tokens. It has no clock, no internet access, no connection to your database, no knowledge of what happened yesterday. Everything it "knows" was fixed during training — and training has a knowledge cutoff.
This creates a clear boundary: the model can reason very well about the world, but it cannot observe it in real time or modify it. For a production application — an assistant that checks inventory, an agent that creates tickets, a chatbot that queries customer history — this is a fundamental blocker.
The obvious escape would be to let the model run arbitrary code. But that is dangerous and uncontrollable. The elegant solution is different: you define a set of tools with explicit contracts, and the model learns to request their use — without executing anything directly. Your code does the executing. The model only declares the intent.
How tool calling works: intent, execution, and return
The flow has three actors: the model, the runtime (your code), and the tool (any external system). Here is the full cycle:
- You describe the tools in the system prompt — name, natural-language description, and a JSON schema for the parameters. The model sees no code; it sees a contract.
- The user asks a question that requires external data: "What is the current balance of account 42?"
- The model decides it needs a tool and emits a structured JSON block — not a prose response, but a call intent:
{ "tool": "get_account_balance", "parameters": { "account_id": "42" } }. - The runtime intercepts that intent, validates the parameters, calls the real system (database, API, service), and captures the result.
- The result is returned to the model as a new message in the context. The model reads the result and generates the final response for the user.
The critical point: the model never executes anything. It produces structured text that looks like a function call. Who decides whether to execute, with which permissions, and in which environment is always your runtime. This design is what makes the mechanism safe by construction — as long as you respect the principle of least privilege, which we cover in Lesson 10.
Full tool calling cycle
A single tool calling step — not an agent yet, just the atomic unit of action.
- System Prompt · + tool schemas
- LLM · inferência / inference
- Tool Intent · { JSON estruturado }
- Runtime · valida + despacha
- Autorização · menor privilégio
- API / Serviço · externo
- Banco de Dados · DB / Storage
- Resultado · da ferramenta
In practice, the most common mistake I see is treating the model's intent JSON as if it were a verified function call. It is not. The model can hallucinate parameters, pick the wrong tool, or be manipulated via prompt injection into calling something it shouldn't. Your runtime needs to validate the schema, check permissions, and — for destructive tools (DELETE, financial transfer) — require explicit confirmation before executing. The model suggests; the runtime decides.
Order one tool-calling step
How a single tool call happens.
- 1You describe the tools (name, description, schema)
- 2Your runtime executes the real tool
- 3The model decides to call one and emits the arguments as JSON
- 4The result returns to the model, which continues reasoning
The bridge between language and action — and what comes next
Tool calling solves an interface problem: how to make a system that reasons in natural language interact with systems that speak in APIs and schemas. The answer is elegant — you use the model itself to translate human intent into structured calls, without training a separate classifier for each tool.
This has deep implications. First, it is composable: you can add or remove tools without retraining the model — you only change the contract in the prompt. Second, it is auditable: every intent is an explicit JSON you can log, validate, and inspect. Third, it is the atomic unit of the agent: what we call an "agent" in Lesson 11 is basically a loop that repeats this cycle — the model calls a tool, receives the result, decides whether it needs more information or can already respond.
One important detail that separates tool calling from RAG (Lesson 06): RAG injects context before the model responds — it is a passive retrieval. Tool calling is an action during generation — the model pauses, requests data, and continues. They are complementary: many systems use RAG to retrieve documents and tool calling to fetch real-time data or execute actions with side effects.
The tool description matters as much as its code. If the description is vague, the model will choose the wrong tool or fill in incorrect parameters. Treat each tool's schema like a public API: name it well, document the parameters, and indicate what the tool does not do.
Key takeaways from this lesson
How to define a good tool
- 1
Unambiguous name
Use clear verbs and nouns:
get_order_status, notcheckorquery. The model uses the name to decide when to call. - 2
Honest scope description
Say what the tool does AND what it does not do. "Returns order status by ID. Does not return payment history." This prevents wrong calls.
- 3
Typed parameter schema
Use JSON Schema with types, required fields, and per-field descriptions. The model fills parameters better when it knows the expected type and what each field means.
- 4
Predictable and concise result
Return only what the model needs to continue reasoning. Large payloads consume context and increase cost. Filter in the runtime, not in the model.
- 5
Minimum permissions in the runtime
Each tool should have access only to what it needs. A read tool should not have write permission. This limits the blast radius if the model is manipulated.
Frequently asked questions
Can the model call multiple tools at the same time?
It depends on the model and implementation. Some models support parallel tool calls in a single response — the runtime executes all of them and returns the results together. Others call one at a time. Check the documentation for the model you are using.
What is the difference between tool calling and function calling?
They are the same concept with different names. OpenAI popularized "function calling"; the industry converged on "tool calling" because a "tool" can be more than a function — it can be a service, an API, a subordinate agent. In practice, the mechanism is identical.
What happens if the model fills in a wrong parameter?
Your runtime should validate the JSON against the schema before executing. If validation fails, return the error to the model as the tool result — it can usually self-correct on the next iteration. Never execute with invalid parameters.
How many tools can I define?
Technically many, but there is a cost: each tool schema consumes context tokens and increases the chance of the model choosing wrong. In systems with dozens of tools, consider loading only the ones relevant to the current context — a technique called tool routing or dynamic tool selection.
Quick check
1. In tool calling, who executes the tool?