Agents can browse websites using a headless Chrome browser. They can navigate pages, click buttons, fill out forms, extract content, and take screenshots, just like you would in a regular browser.
What you can ask agents to do
- "Go to example.com and find their pricing page"
- "Fill out the contact form on this website with my details"
- "Take a screenshot of this dashboard"
- "Log into my account on this site and check my order status"
- "Extract all product names and prices from this page"
How it works
When an agent browses a website, it:
1. Opens the page in a headless Chrome instance
2. Waits for the page to load, including JavaScript
3. Interacts with the page using actions like click, type, and extract
4. Returns what it found (text content, screenshots, or extracted data) to the conversation
You see each step happen in real time in the chat: which URL the agent visited, what it clicked, and what it extracted.
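The sequence above can be pictured as a small action loop that records a trace as it goes. This is a minimal sketch and not Frona's actual implementation: `Step`, `FakePage`, and `run_steps` are hypothetical names, and the page is an in-memory stand-in rather than a real Chrome instance.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    """One browser action: 'navigate', 'click', 'type', or 'extract'."""
    action: str
    target: str      # URL for navigate, element label otherwise
    value: str = ""  # text to type, if any


@dataclass
class FakePage:
    """In-memory stand-in for a loaded page (hypothetical, for illustration)."""
    url: str = ""
    fields: dict = field(default_factory=dict)
    content: dict = field(default_factory=dict)


def run_steps(page, steps):
    """Apply each step in order; return (trace for the chat, extracted data)."""
    trace, extracted = [], []
    for step in steps:
        if step.action == "navigate":
            page.url = step.target
            trace.append(f"visited {step.target}")
        elif step.action == "click":
            trace.append(f"clicked {step.target}")
        elif step.action == "type":
            page.fields[step.target] = step.value
            trace.append(f"typed into {step.target}")
        elif step.action == "extract":
            extracted.append(page.content.get(step.target, ""))
            trace.append(f"extracted {step.target}")
        else:
            raise ValueError(f"unknown action: {step.action}")
    return trace, extracted


page = FakePage(content={"plan names": "Free, Pro"})
trace, data = run_steps(page, [
    Step("navigate", "https://example.com/pricing"),
    Step("extract", "plan names"),
])
print(trace)
print(data)
```

The trace list plays the role of the step-by-step updates you see in the chat, while the extracted list is what gets returned to the conversation.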
Browser profiles and persistence
Frona uses browser profiles to persist cookies, local storage, and session data across conversations. If an agent logs into a website today, it can still be logged in next week.
If you don't specify a profile for an agent, the browser uses a default profile. You can also create multiple named profiles to keep different browsing contexts separate: for example, a "Google" profile for your Google account and a "LinkedIn" profile for LinkedIn. Each profile has its own cookies and session state, stored in its own directory on disk.
Profiles are managed as a credential type in the vault system. Create them from the vault management interface.
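To illustrate how per-profile directories keep sessions isolated, here is a minimal sketch of resolving a profile name to a directory, with a fallback to the default profile. The root path, the `default` name, and the `profile_dir` helper are assumptions made for illustration; the real on-disk layout may differ.

```python
from pathlib import Path
from typing import Optional

PROFILES_ROOT = Path("/data/browser-profiles")  # assumed path, not the real layout
DEFAULT_PROFILE = "default"                     # assumed name of the fallback profile


def profile_dir(name: Optional[str] = None) -> Path:
    """Return the directory holding one profile's cookies and session state."""
    chosen = name or DEFAULT_PROFILE
    # Reject names that would escape the profiles root.
    if "/" in chosen or "\\" in chosen or chosen in {".", ".."}:
        raise ValueError(f"invalid profile name: {chosen!r}")
    return PROFILES_ROOT / chosen


print(profile_dir("Google"))
print(profile_dir("LinkedIn"))
print(profile_dir())  # no profile specified, so the default is used
```

Because each name maps to a distinct directory, cookies saved under one profile are never visible to a session started with another.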
When the agent needs your help
Some situations require a human touch:
- CAPTCHAs. The agent can't solve them automatically.
- Two-factor authentication. The agent can't enter your 2FA code.
- Complex login flows. OAuth redirects, SSO screens, or unusual form layouts.
When this happens, the agent pauses and gives you a link to a browser debugger session. Click the link, handle the interaction in your browser, and the agent continues from where you left off.
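The pause-and-resume handoff can be thought of as a two-state machine. The sketch below is illustrative only: `BrowsingTask`, the event names, and the debugger link are hypothetical and do not describe Frona's internals.

```python
from enum import Enum


class State(Enum):
    RUNNING = "running"
    WAITING_FOR_USER = "waiting_for_user"


# Situations from the list above that the agent cannot handle alone.
BLOCKERS = {"captcha", "two_factor", "complex_login"}


class BrowsingTask:
    def __init__(self, debugger_url):
        self.state = State.RUNNING
        self.debugger_url = debugger_url  # placeholder link, not a real format
        self.messages = []

    def on_page_event(self, event):
        """Pause and ask for help when a blocker shows up."""
        if event in BLOCKERS and self.state is State.RUNNING:
            self.state = State.WAITING_FOR_USER
            self.messages.append(f"Need your help with {event}: {self.debugger_url}")

    def on_user_done(self):
        """Resume once the user has handled the interaction."""
        if self.state is State.WAITING_FOR_USER:
            self.state = State.RUNNING
            self.messages.append("Thanks, continuing from here.")


task = BrowsingTask("https://debugger.example/session/abc")
task.on_page_event("captcha")
print(task.state, task.messages[-1])
task.on_user_done()
print(task.state)
```

The browser session itself stays open while the task waits, which is why the agent can pick up from the page state you left behind.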
Browsing vs. fetching
| Scenario | Best tool |
|---|---|
| Read an article or documentation page | Web fetch (faster, simpler) |
| Fill out a form or click buttons | Browser automation |
| Navigate a JavaScript-heavy app | Browser automation |
| Extract structured data from a page | Browser automation |
| Quick content retrieval | Web fetch |
| Log into a website | Browser automation |
For simple "read this page" requests, agents typically use web fetch instead. It's faster and doesn't need a full browser. Browser automation is for when interaction is needed.
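The table's decision rule can be summarized in a few lines. This is a rough sketch of the heuristic, assuming a hypothetical `choose_tool` helper; it is not how agents actually select tools.

```python
def choose_tool(needs_interaction=False, needs_login=False,
                javascript_heavy=False, structured_extraction=False):
    """Pick a tool for a request, following the table above."""
    if needs_interaction or needs_login or javascript_heavy or structured_extraction:
        return "browser automation"
    # Plain "read this page" requests need no full browser.
    return "web fetch"


print(choose_tool())                        # reading an article
print(choose_tool(needs_interaction=True))  # filling out a form
print(choose_tool(needs_login=True))        # logging into a website
```

Any single requirement beyond plain reading is enough to justify the heavier browser path.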
Tips
- Be specific about what you want. "Go to example.com/pricing and extract the plan names, prices, and features into a table" gets better results than "check out their pricing."
- Mention login steps if needed. If the agent needs to log in first, say so: "Log into my account at example.com and then check order #12345."
- Use screenshots for verification. Ask the agent to take a screenshot if you want to see what it sees.
Requirements
Browser automation requires a running Browserless instance. See Deployment for setup.
Next steps
- Searching & Reading the Web. For simpler content retrieval.
- Sandbox. How network restrictions affect browser access.