Toolformer
A model learns to call tools through self-supervised API-use examples.
Toolformer teaches a model to decide when to call external tools by generating its own tool-use training examples.
It points toward LLMs as coordinators of calculators, search, translation, calendars, and APIs.
Toolformer teaches the model the decision to use a tool, not just the syntax of using one.
The quick digest
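The model starts with ordinary text and proposes API calls at candidate positions. Each call is executed, and it is kept only if seeing the call together with its result lowers the model's loss on the tokens that follow, compared with seeing no call or the call without its result. The model is then fine-tuned on the text with the surviving calls. Over many examples, it learns not just how to call a tool, but when the call is worth making.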
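The keep-or-discard test can be sketched in a few lines. This is a simplified illustration, not the paper's implementation: `keep_call`, the bracket format, the threshold value, and `toy_loss` (a stand-in for a real language model's loss) are all hypothetical names and choices made for this example.

```python
from typing import Callable

# Assumed interface: returns the model's loss on `continuation` given `prefix`.
LossFn = Callable[[str, str], float]


def keep_call(loss_fn: LossFn, prefix: str, continuation: str,
              call: str, result: str, threshold: float = 1.0) -> bool:
    """Keep a candidate call only if its result makes the continuation easier."""
    # Loss when the call and its result are visible before the continuation.
    with_result = loss_fn(f"{prefix} [{call} -> {result}]", continuation)
    # Two baselines: no call at all, and the call without its result.
    no_call = loss_fn(prefix, continuation)
    call_only = loss_fn(f"{prefix} [{call}]", continuation)
    # The call must beat the better baseline by at least `threshold`.
    return min(no_call, call_only) - with_result >= threshold


def toy_loss(prefix: str, continuation: str) -> float:
    """Toy stand-in for a model: low loss only if the answer is already in the prefix."""
    answer = continuation.split()[-1]
    return 0.5 if answer in prefix else 3.0


helpful = keep_call(toy_loss, "The total is 400 + 1035, which is", "1435",
                    "Calculator(400 + 1035)", "1435")
unhelpful = keep_call(toy_loss, "Paris is the capital of", "France",
                      "Calculator(1 + 1)", "2")
print(helpful, unhelpful)  # True False
```

With a real model, `loss_fn` would be the cross-entropy over the continuation tokens. Nothing else in the sketch needs to change.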
This is different from simply giving a model an API. Toolformer makes tool use part of the model’s learned behavior. It can learn that arithmetic should use a calculator, factual questions might use search, and some tasks need no tool at all.
The paper is an early bridge from language modeling to tool-using agents. Modern function calling is more structured, but the central question is the same: when should the model stop talking and use another system?
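To make "stop talking and use another system" concrete, here is a minimal sketch of what happens once a model has emitted an inline call: the call is detected, the tool runs, and the result is written back into the text. The `[Calculator(...)]` marker, the `->` separator, and the function names are assumptions for illustration, not the paper's exact format or a real library API.

```python
import ast
import operator
import re

_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}


def calculator(expression: str):
    """Evaluate +, -, *, / over numeric literals without using eval()."""
    def walk(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError(f"unsupported expression: {expression!r}")
    return walk(ast.parse(expression, mode="eval").body)


# Matches a hypothetical inline marker such as "[Calculator(400 + 1035)]".
CALL = re.compile(r"\[Calculator\(([^\]]*)\)\]")


def fill_calls(text: str) -> str:
    """Run every inline calculator call and append its result inside the marker."""
    def run(match: re.Match) -> str:
        expr = match.group(1)
        return f"[Calculator({expr}) -> {calculator(expr)}]"
    return CALL.sub(run, text)


print(fill_calls("The total is [Calculator(400 + 1035)] dollars."))
# The total is [Calculator(400 + 1035) -> 1435] dollars.
```

Modern function calling replaces the inline marker with a structured message, but the loop is the same: detect the call, execute it, and hand the result back to the model.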
What to remember
Read it like this
- First pass: Study the self-labeling pipeline.
- Second pass: Ask what makes a tool call useful enough to keep.
- Then build taste: Compare with ReAct and modern function calling.
Create examples where a model should call a calculator or retrieval tool, along with some where it should not, then count how many of its calls are useful and how many are unnecessary.
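One way to score such an exercise is to hand-label each prompt with whether a tool would help, record whether the model actually called one, and tally the outcomes. The `Example` record, the `call_metrics` helper, and the sample prompts below are hypothetical, and the labels are made up for illustration rather than taken from any experiment.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Example:
    prompt: str
    needs_tool: bool   # hand label: would a tool call help here?
    called_tool: bool  # observation: did the model emit a call?


def call_metrics(examples: List[Example]) -> Dict[str, float]:
    """Count useful, unnecessary, and missed calls, plus precision and recall."""
    useful = sum(e.needs_tool and e.called_tool for e in examples)
    unnecessary = sum(e.called_tool and not e.needs_tool for e in examples)
    missed = sum(e.needs_tool and not e.called_tool for e in examples)
    calls = useful + unnecessary
    needed = useful + missed
    return {
        "useful": useful,
        "unnecessary": unnecessary,
        "missed": missed,
        # Of the calls made, how many were warranted?
        "precision": useful / calls if calls else 0.0,
        # Of the calls that were warranted, how many were made?
        "recall": useful / needed if needed else 0.0,
    }


examples = [
    Example("What is 318 * 47?", needs_tool=True, called_tool=True),
    Example("Who wrote Dune?", needs_tool=True, called_tool=False),
    Example("Say hello politely.", needs_tool=False, called_tool=True),
    Example("Rewrite this sentence in the passive voice.", needs_tool=False, called_tool=False),
]
metrics = call_metrics(examples)
print(metrics["precision"], metrics["recall"])  # 0.5 0.5
```

Low precision means the model calls tools it does not need. Low recall means it keeps talking when it should have handed off.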