Agents can run shell commands in a sandboxed environment. This lets them install packages, process data, interact with APIs, manage files, and run any command-line tool available on the server.
What you can ask agents to do
- "Run `git log --oneline -10` and show me the recent commits"
- "Install pandas and process this CSV file"
- "Use curl to check if this API endpoint is responding"
- "Compress all the files in my workspace into a zip"
- "Run this build script and show me the output"
How it works
When an agent runs a shell command:
- The command executes in `/bin/bash` within the agent's workspace directory
- The output (stdout and stderr) is captured
- The result is returned to the agent, which uses it in its response
You see the command and its output inline in the chat.
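The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the platform's actual implementation: the helper name `run_shell` and the shape of the returned dictionary are assumptions made for the example, and a temporary directory stands in for the agent's workspace.

```python
import subprocess
import tempfile


def run_shell(command: str, workspace: str) -> dict:
    """Run `command` with /bin/bash inside `workspace` and capture its output.

    Hypothetical helper for illustration only.
    """
    completed = subprocess.run(
        ["/bin/bash", "-c", command],  # execute the command string in bash
        cwd=workspace,                 # working directory = the workspace
        capture_output=True,           # capture both stdout and stderr
        text=True,                     # decode bytes to str
    )
    # The captured result is what would be handed back to the agent.
    return {
        "command": command,
        "exit_code": completed.returncode,
        "stdout": completed.stdout,
        "stderr": completed.stderr,
    }


if __name__ == "__main__":
    # A temporary directory stands in for the agent's workspace.
    with tempfile.TemporaryDirectory() as workspace:
        result = run_shell("echo hello; echo oops >&2", workspace)
        print(result["stdout"], end="")
        print(result["stderr"], end="")
```

Keeping stdout and stderr separate, along with the exit code, lets the caller tell a successful command from one that failed while still producing output.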
Sandbox
All shell commands run inside the agent's sandbox. The sandbox controls which files the agent can access, which network destinations it can reach, and how long commands can run. You configure these restrictions per agent in the agent settings.
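To make the run-time limit concrete, here is a small Python sketch of a command being stopped when it exceeds a time budget. How the sandbox actually enforces its limits is not described here, so treat this purely as an illustration: the function name `run_with_limit` and the `timeout_seconds` parameter are hypothetical.

```python
import subprocess


def run_with_limit(command: str, timeout_seconds: float) -> str:
    """Run `command` in /bin/bash, giving up after `timeout_seconds`.

    Hypothetical helper; the real sandbox may enforce limits differently.
    """
    try:
        completed = subprocess.run(
            ["/bin/bash", "-c", command],
            capture_output=True,
            text=True,
            timeout=timeout_seconds,  # the child process is killed on expiry
        )
    except subprocess.TimeoutExpired:
        return f"command timed out after {timeout_seconds} seconds"
    return completed.stdout


if __name__ == "__main__":
    print(run_with_limit("echo done", 5))
    print(run_with_limit("sleep 5", 0.5))
```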
When to use shell vs. Python vs. Node.js
| Scenario | Best tool |
|---|---|
| System commands (ls, curl, git, apt) | Shell |
| Installing packages | Shell |
| Data processing and analysis | Python |
| File parsing (CSV, JSON, XML) | Python |
| Math and calculations | Python |
| API integrations, async operations | Node.js |
| Web scraping with libraries | Python or Node.js |
Tips
- Be specific about what you want. "Run `df -h` and tell me how much disk space is free" is clearer than "check disk space."
- Large outputs get truncated. If a command produces a lot of output, the agent sees a truncated version. Ask it to redirect output to a file for full results.
- Agents chain commands. A single request like "install dependencies and run the tests" may result in multiple shell commands in sequence.
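The tip about large outputs can be illustrated with a short Python sketch. The limit `MAX_CHARS`, the `truncate` helper, and the file name `full_output.txt` are all assumptions made for the example; the actual truncation threshold and format are not specified here.

```python
import subprocess
import tempfile
from pathlib import Path

MAX_CHARS = 200  # hypothetical limit; the real threshold is not documented here


def truncate(output: str, limit: int = MAX_CHARS) -> str:
    """Keep the first `limit` characters and note how many were dropped."""
    if len(output) <= limit:
        return output
    omitted = len(output) - limit
    return output[:limit] + f"\n[... {omitted} characters truncated ...]"


def demo() -> None:
    # A temporary directory stands in for the agent's workspace.
    with tempfile.TemporaryDirectory() as workspace:
        command = "seq 1 1000"  # produces far more than MAX_CHARS of output

        # Inline view: only a truncated version is shown.
        inline = subprocess.run(
            ["/bin/bash", "-c", command],
            cwd=workspace, capture_output=True, text=True,
        )
        print(truncate(inline.stdout))

        # Full view: redirect stdout and stderr to a file in the workspace.
        subprocess.run(
            ["/bin/bash", "-c", f"{command} > full_output.txt 2>&1"],
            cwd=workspace, check=True,
        )
        full = Path(workspace, "full_output.txt").read_text()
        print(f"full_output.txt has {len(full.splitlines())} lines")


if __name__ == "__main__":
    demo()
```

Writing the output to a file keeps the complete result in the workspace, where a later command can search or summarize it.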
Next steps
- Running Python Code. For data processing and scripting.
- Running Node.js Code. For JavaScript execution.
- Sandbox. Understand security restrictions.