Using Tools
This guide explains how tools work from a user’s perspective: what you see, what you can do, and how to make the most of agent capabilities.
What tools look like in chat
When an agent decides to use a tool, you see it in the conversation:
- The agent announces what tool it’s using and why
- The tool runs (you might see a loading indicator)
- The result appears in the chat
- The agent uses the result to continue its response
Some tools are fast (web search), while others take longer (browsing a complex website).
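The four steps above can be pictured as an ordered sequence of chat events. The following is a minimal sketch, not the product's actual implementation: `ChatEvent`, `run_tool_call`, and `fake_web_search` are hypothetical names invented for illustration, and the message wording is assumed.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ChatEvent:
    kind: str  # "announce", "running", "result", or "reply"
    text: str


def run_tool_call(
    tools: Dict[str, Callable[[str], str]],
    tool_name: str,
    argument: str,
    reason: str,
) -> List[ChatEvent]:
    """Return the chat events a single tool call would produce, in order."""
    events = [
        # 1. The agent announces which tool it is using and why.
        ChatEvent("announce", f"Using {tool_name}: {reason}"),
        # 2. The tool runs (this is where a loading indicator would show).
        ChatEvent("running", f"{tool_name} is running..."),
    ]
    result = tools[tool_name](argument)
    # 3. The result appears in the chat.
    events.append(ChatEvent("result", result))
    # 4. The agent uses the result to continue its response.
    events.append(ChatEvent("reply", f"Based on the result: {result}"))
    return events


# Hypothetical stand-in for a real search tool.
def fake_web_search(query: str) -> str:
    return f"3 results for '{query}'"


events = run_tool_call(
    {"web_search": fake_web_search},
    "web_search",
    "solar panel prices",
    "you asked for current prices",
)
for event in events:
    print(f"[{event.kind}] {event.text}")
```

A slow tool, such as browsing a complex website, would simply spend longer between the "running" and "result" events.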
Browser tools in action
When an agent browses a website, you can follow along:
- You see which URLs it visits
- Screenshots show you what the page looks like
- If the agent gets stuck (CAPTCHA, login page), it can give you control
This is useful for tasks like:
- Researching a topic across multiple websites
- Filling out web forms
- Extracting data from websites
- Monitoring web pages
Asking agents to use specific tools
You don’t need to name tools directly. Just describe what you want, and the agent picks the right tool:
- “Search the web for…” triggers web_search
- “Go to this website and…” triggers browser tools
- “Remember that I…” triggers remember
- “Run this command…” triggers cli
- “Create a file with…” triggers produce_file
When tools need your help
There are two situations where you’ll be asked to intervene:
Taking over the browser
If a website requires something the agent can’t handle (CAPTCHA, complex 2FA, unusual interactions), the agent uses request_user_takeover. You get a link to a live browser session where you can interact with the page directly. When you’re done, let the agent know and it continues.
Answering questions
The agent might need your input to proceed. It presents options or asks a yes/no question. Pick your answer and the conversation continues.
Tool limitations
- Browser sessions are limited by the number of concurrent slots (default: 10)
- CLI commands run in a sandbox with filesystem and network restrictions
- Web search results depend on the configured search provider
- Voice calls require Twilio configuration
- Tool turns are capped per response (default: 200) to prevent runaway loops
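To see why a cap on tool turns prevents runaway loops, here is a rough sketch of a capped loop. It is an illustration under assumptions, not the actual agent runtime: `run_response` and the `next_tool_call` callback are hypothetical, and only the default value of 200 comes from the list above.

```python
from typing import Callable, List, Optional

MAX_TOOL_TURNS = 200  # default cap stated in the list above


def run_response(
    next_tool_call: Callable[[int], Optional[str]],
    max_tool_turns: int = MAX_TOOL_TURNS,
) -> List[str]:
    """Run tool calls until the agent is done or the cap is reached.

    `next_tool_call` is a hypothetical callback: given the turn index, it
    returns the name of the next tool to run, or None when the agent is done.
    """
    log: List[str] = []
    for turn in range(max_tool_turns):
        tool = next_tool_call(turn)
        if tool is None:
            log.append("done")
            return log
        log.append(f"turn {turn + 1}: {tool}")
    # The loop ran out of turns without the agent finishing.
    log.append("stopped: tool-turn cap reached")
    return log


# A well-behaved agent that finishes after two tool calls.
finite = run_response(lambda turn: "web_search" if turn < 2 else None)

# A runaway agent that never stops asking for tools.
# A small cap keeps the demonstration short.
runaway = run_response(lambda turn: "cli", max_tool_turns=5)

print(finite[-1])
print(runaway[-1])
```

In this sketch the well-behaved run ends on its own, while the runaway run is cut off once the cap is hit, so a response always terminates.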