
Streaming

Agent responses stream to the client in real time using Server-Sent Events (SSE). You see tokens appear as the agent generates them, and tool calls are visible as they happen.

When you send a message, the backend opens an SSE connection and starts streaming events. The frontend renders these events as they arrive, giving you a live view of the agent’s thought process.

The stream sends different event types as the agent works:

| Event | Description |
|---|---|
| token | A chunk of the agent’s text response |
| tool_call | The agent is calling a tool (includes tool name and arguments) |
| tool_result | The result of a tool execution |
| tool_message | An intermediate message during tool processing |
| tool_resolved | A pending tool interaction was resolved (e.g., user answered a question) |
| entity_updated | A data entity was modified by a tool |
| done | The agent finished its response |
| stream_end | The SSE connection is closing |
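
To illustrate how a client might split the stream into the event types above, here is a minimal sketch of an SSE parser. It assumes the event type travels in the standard SSE `event:` field and that each `data:` payload is JSON. Neither detail is confirmed here, and the `text` field in the sample is hypothetical.

```python
import json
from typing import Dict, Iterable, Iterator, List, Tuple

# Event names taken from the table above.
KNOWN_EVENTS = {
    "token", "tool_call", "tool_result", "tool_message",
    "tool_resolved", "entity_updated", "done", "stream_end",
}


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, Dict]]:
    """Yield (event_name, payload) pairs from raw SSE lines."""
    event_name = "message"  # SSE default when no "event:" field is sent
    data_lines: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the event collected so far.
            if data_lines:
                # Assumption: the payload is JSON.
                yield event_name, json.loads("\n".join(data_lines))
            event_name, data_lines = "message", []
            continue
        if line.startswith(":"):
            continue  # SSE comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # SSE strips one leading space from the value
        if field == "event":
            event_name = value
        elif field == "data":
            data_lines.append(value)


# Hypothetical wire data; the "text" field name is an assumption.
sample = [
    "event: token", 'data: {"text": "Hi"}', "",
    ": keep-alive comment",
    "event: done", "data: {}", "",
]
events = list(parse_sse(sample))
```

A client could check each parsed name against `KNOWN_EVENTS` and ignore anything it does not recognize, which keeps it tolerant of event types added later.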

Text responses arrive as a series of token events. Each event contains a small chunk of text (typically a few tokens). The frontend concatenates these to build the complete response.
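
The concatenation step can be sketched as follows. This is an illustration only: events are modeled as `(name, payload)` tuples, and the `text` field holding each chunk is an assumed name, not a documented one.

```python
from typing import Dict, Iterable, List, Tuple


def build_response(events: Iterable[Tuple[str, Dict]]) -> str:
    """Join the text of all token events up to the done event."""
    parts: List[str] = []
    for name, payload in events:
        if name == "token":
            parts.append(payload["text"])  # "text" is an assumed field name
        elif name == "done":
            break  # the agent finished its response
    return "".join(parts)


# Token events may be interleaved with other event types.
stream = [
    ("token", {"text": "Hel"}),
    ("token", {"text": "lo"}),
    ("tool_call", {"name": "search"}),
    ("token", {"text": ", world"}),
    ("done", {}),
]
full_text = build_response(stream)
```

Collecting chunks in a list and joining once at the end avoids repeated string copies on long responses.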

When an agent decides to use a tool, the stream follows this sequence:

  1. A tool_call event with the tool name and arguments
  2. The tool executes server-side
  3. A tool_result event with the output
  4. The agent continues generating its response using the tool result

If the agent calls multiple tools in sequence, you see each call/result pair before the next one starts.
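
Because each call/result pair completes before the next call starts, a client can match a tool_result to the most recent unfinished tool_call. The sketch below assumes hypothetical payload fields `name`, `arguments`, and `output`. A real payload may instead carry an explicit call identifier.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, List, Tuple


@dataclass
class ToolStep:
    name: str
    arguments: Dict[str, Any]
    output: Any = None
    finished: bool = False  # set once the matching tool_result arrives


def collect_tool_steps(events: Iterable[Tuple[str, Dict]]) -> List[ToolStep]:
    """Pair each tool_call with the tool_result that follows it."""
    steps: List[ToolStep] = []
    for name, payload in events:
        if name == "tool_call":
            steps.append(ToolStep(payload["name"], payload["arguments"]))
        elif name == "tool_result" and steps and not steps[-1].finished:
            steps[-1].output = payload["output"]
            steps[-1].finished = True
    return steps


# Two tools called in sequence; field names are assumptions.
stream = [
    ("tool_call", {"name": "search", "arguments": {"q": "sse"}}),
    ("tool_result", {"output": "3 hits"}),
    ("tool_call", {"name": "open_page", "arguments": {"index": 0}}),
    ("tool_result", {"output": "page text"}),
    ("done", {}),
]
steps = collect_tool_steps(stream)
```

A step whose `finished` flag is still false marks a tool that is currently executing, which a UI could render as a spinner.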

Some tools create interactive moments:

  • Human-in-the-loop (notify_human). The agent pauses and gives you a debugger URL to take manual action (e.g., solving a CAPTCHA). The tool_resolved event fires when you’re done.
  • Questions (ask_user_question). The agent asks you a question with options. Your answer triggers a tool_resolved event.
  • User takeover (request_user_takeover). You get direct control of a browser session.
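
A client might track these interactive moments with a small state holder, sketched below. The payload field `name` is hypothetical, and the sketch assumes that every interactive tool, including request_user_takeover, ends with a tool_resolved event. The list above states this only for the first two tools.

```python
from typing import Dict, Optional

# Tool names taken from the list above.
INTERACTIVE_TOOLS = {"notify_human", "ask_user_question", "request_user_takeover"}


class InteractionTracker:
    """Remember which interactive tool, if any, is waiting on the user."""

    def __init__(self) -> None:
        self.pending: Optional[str] = None

    def handle(self, name: str, payload: Dict) -> None:
        if name == "tool_call" and payload.get("name") in INTERACTIVE_TOOLS:
            self.pending = payload["name"]  # agent is paused on this tool
        elif name == "tool_resolved":
            self.pending = None  # the user finished the interaction

    @property
    def waiting_for_user(self) -> bool:
        return self.pending is not None


tracker = InteractionTracker()
tracker.handle("tool_call", {"name": "ask_user_question"})
waiting_before = tracker.waiting_for_user
tracker.handle("tool_resolved", {})
waiting_after = tracker.waiting_for_user
```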

The SSE connection stays open for the duration of the response. If the connection drops, the response continues server-side and is persisted to the database. You can reload the chat to see the complete response.
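
One way a client could handle a dropped connection is to fall back to the persisted response whenever the stream ends without a done event. The sketch below is an assumption-laden illustration: `fetch_persisted` is a hypothetical callback standing in for reloading the chat, and `text` is an assumed payload field.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def consume_with_recovery(
    stream: Iterable[Tuple[str, Dict]],
    fetch_persisted: Callable[[], str],
) -> str:
    """Return the streamed text, or the persisted copy if the stream breaks."""
    parts: List[str] = []
    try:
        for name, payload in stream:
            if name == "token":
                parts.append(payload["text"])  # assumed field name
            elif name == "done":
                return "".join(parts)  # response fully received
    except ConnectionError:
        pass  # connection dropped mid-response
    # No done event was seen, so the local text may be incomplete.
    return fetch_persisted()


def flaky_stream():
    """Simulated stream that drops after one token."""
    yield "token", {"text": "Par"}
    raise ConnectionError("connection dropped")


recovered = consume_with_recovery(
    flaky_stream(), lambda: "Partial answer, now complete."
)
```

Since the response keeps generating server-side after a drop, a real client would probably need to wait or retry before the persisted copy is complete. The sketch leaves that out.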

The agent can make up to max_tool_turns tool calls per response (default: 200). This prevents runaway tool loops. If the limit is reached, the agent wraps up its response with whatever information it has gathered.
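
The effect of the limit can be illustrated with a toy loop. This is not the backend's implementation: `next_tool` is a hypothetical stand-in for the model deciding which tool to call next, and tool execution is faked.

```python
from typing import Callable, List, Optional


def run_agent(
    next_tool: Callable[[List[str]], Optional[str]],
    max_tool_turns: int = 200,
) -> List[str]:
    """Call tools until the agent stops asking or the limit is reached."""
    results: List[str] = []
    while len(results) < max_tool_turns:
        tool = next_tool(results)  # None means the agent needs no more tools
        if tool is None:
            break
        results.append(f"{tool}: ok")  # placeholder for real tool execution
    # At this point the agent would write its final answer from `results`.
    return results


# An agent that never stops asking for tools is cut off at the limit.
runaway = run_agent(lambda results: "search", max_tool_turns=3)

# An agent that needs two calls finishes well under the limit.
normal = run_agent(lambda results: "search" if len(results) < 2 else None)
```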