
Agents can run shell commands in a sandboxed environment. This lets them install packages, process data, interact with APIs, manage files, and run any command-line tool available on the server.

What you can ask agents to do

  • "Run git log --oneline -10 and show me the recent commits"
  • "Install pandas and process this CSV file"
  • "Use curl to check if this API endpoint is responding"
  • "Compress all the files in my workspace into a zip"
  • "Run this build script and show me the output"

How it works

When an agent runs a shell command:

  1. The command executes in /bin/bash within the agent's workspace directory
  2. The output (stdout and stderr) is captured
  3. The result is returned to the agent, which uses it in its response
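The three steps above can be approximated in a few lines of shell. This is a minimal sketch under stated assumptions, not the product's actual implementation: the temporary directory stands in for the agent's workspace, and the variable names (workspace, cmd, output, status) are purely illustrative.

```shell
#!/bin/bash
# Hypothetical stand-in for the agent's workspace directory.
workspace="$(mktemp -d)"

# An example command that writes to both stdout and stderr.
cmd='echo "result on stdout"; echo "warning on stderr" >&2'

# Step 1: execute the command in /bin/bash from inside the workspace.
# Step 2: capture stdout and stderr together (2>&1 merges the two streams).
output="$(cd "$workspace" && /bin/bash -c "$cmd" 2>&1)"
status=$?

# Step 3: hand the result back to the caller, here simply by printing it.
printf 'exit status: %s\n' "$status"
printf '%s\n' "$output"

# Clean up the scratch directory.
rmdir "$workspace"
```

Merging the two streams keeps error messages next to the normal output that preceded them, which is usually what you want when reading a command's result in one place.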

You see the command and its output inline in the chat.

Sandbox

All shell commands run inside the agent's sandbox. The sandbox controls which files the agent can access, which network destinations it can reach, and how long commands can run. You configure these restrictions per agent in the agent settings.

When to use shell vs. Python vs. Node.js

Scenario                                Best tool
System commands (ls, curl, git, apt)    Shell
Installing packages                     Shell
Data processing and analysis            Python
File parsing (CSV, JSON, XML)           Python
Math and calculations                   Python
API integrations, async operations      Node.js
Web scraping with libraries             Python or Node.js

Tips

  • Be specific about what you want. "Run df -h and tell me how much disk space is free" is clearer than "check disk space."
  • Large outputs get truncated. If a command produces a lot of output, the agent sees only a truncated version. Ask it to redirect the output to a file so the full results are kept.
  • Agents chain commands. A single request like "install dependencies and run the tests" may result in multiple shell commands in sequence.
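The last two tips can be sketched in shell. The file name full-output.log, the seq command standing in for a noisy tool, and the install_deps and run_tests functions are all hypothetical placeholders. The actual truncation limit is not stated here, and an agent may also issue chained steps as separate commands rather than joining them with &&.

```shell
#!/bin/bash
# Work in a scratch directory (a hypothetical stand-in for the workspace).
cd "$(mktemp -d)" || exit 1

# Redirect a long output to a file instead of printing all of it.
# seq stands in for any command that produces many lines.
seq 1 5000 > full-output.log 2>&1

# Inspect the saved output in small pieces rather than all at once.
line_count="$(wc -l < full-output.log | tr -d ' ')"
first_lines="$(head -n 3 full-output.log)"
last_line="$(tail -n 1 full-output.log)"
printf 'lines: %s, last line: %s\n' "$line_count" "$last_line"

# Hypothetical placeholders for "install dependencies" and "run the tests".
install_deps() { echo "dependencies installed"; }
run_tests() { echo "tests passed"; }

# Chain the steps with && so the tests run only if the install succeeded.
chain_output="$(install_deps && run_tests)"
printf '%s\n' "$chain_output"
```

Because the full output lives in a file, follow-up requests such as "show me the last 20 lines" or "search the log for errors" can work from the complete result rather than the truncated view.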

Next steps