Browser automation for Jido AI agents.
Jido.Browser is organized around three simple lanes:
web_fetch/2for stateless HTTP-first retrievalstart_session/1andend_session/1for browser-backed workflowsJido.Browser.Poolplusstart_session(pool: ...)as an optional acceleration layer
agent-browser remains the default adapter. Web also supports warm pools when
you want browser-backed sessions with lower cold-start overhead. Vibium
remains available without warm-pool support.
The Hex package and OTP app remain jido_browser, while the public Elixir namespace is Jido.Browser.*.
Add the dependency:
def deps do
[
{:jido_browser, "~> 2.0"}
]
endInstall the default browser backend:
mix jido_browser.installThat installs the pinned agent-browser binary for the current platform and runs agent-browser install to provision the browser runtime.
defp aliases do
[
setup: ["deps.get", "jido_browser.install --if-missing"],
test: ["jido_browser.install --if-missing", "test"]
]
endmix jido_browser.install agent_browser
mix jido_browser.install vibium
mix jido_browser.install web{:ok, session} = Jido.Browser.start_session()
{:ok, session, _} = Jido.Browser.navigate(session, "https://example.com")
{:ok, session, snapshot} = Jido.Browser.snapshot(session)
snapshot["snapshot"] || snapshot[:snapshot]
{:ok, session, _} = Jido.Browser.click(session, "@e1")
{:ok, _session, %{content: markdown}} = Jido.Browser.extract_content(session, format: :markdown)
:ok = Jido.Browser.end_session(session)Selectors remain supported, but ref-based interaction is the preferred 2.0 flow:
snapshot- act on
@eNrefs - re-snapshot
{:ok, result} =
Jido.Browser.web_fetch(
"https://example.com/docs",
format: :markdown,
allowed_domains: ["example.com"],
focus_terms: ["API", "authentication"],
citations: true
)
result.content
result.passages
result.metadata # present when extraction returns document metadataweb_fetch/2 keeps HTML handling native for selector extraction and markdown conversion, and uses extractous_ex for fetched binary documents such as PDFs, Word, Excel, PowerPoint, OpenDocument, EPUB, and common email formats. Binary document responses may also include result.metadata when extraction returns document metadata.
state_path = Path.expand("tmp/browser-state.json")
File.mkdir_p!(Path.dirname(state_path))
{:ok, session} = Jido.Browser.start_session()
{:ok, session, _} = Jido.Browser.navigate(session, "https://example.com")
{:ok, session, _} = Jido.Browser.save_state(session, state_path)
:ok = Jido.Browser.end_session(session)
{:ok, restored} = Jido.Browser.start_session()
{:ok, restored, _} = Jido.Browser.load_state(restored, state_path){:ok, session} = Jido.Browser.start_session()
{:ok, session, _} = Jido.Browser.navigate(session, "https://example.com")
{:ok, session, _} = Jido.Browser.new_tab(session, "https://example.org")
{:ok, session, tabs} = Jido.Browser.list_tabs(session)
{:ok, session, _} = Jido.Browser.switch_tab(session, 1)
{:ok, session, _} = Jido.Browser.close_tab(session, 1)Warm pools are explicit and optional. They speed up browser-backed workflows,
while web_fetch/2 stays stateless and never uses pools.
For OTP applications, prefer adding a named pool to your supervision tree:
defmodule MyApp.Application do
use Application
def start(_type, _args) do
children = [
{Jido.Browser.Pool,
name: :default,
size: 2,
headless: true,
startup_timeout: 60_000}
]
Supervisor.start_link(children, strategy: :one_for_one, name: MyApp.Supervisor)
end
endThen check out pooled sessions by name:
{:ok, session} =
Jido.Browser.start_session(
pool: :default,
checkout_timeout: 5_000
)
{:ok, session, _} = Jido.Browser.navigate(session, "https://example.com")
:ok = Jido.Browser.end_session(session)Use start_pool/1 for scripts, tests, or ad hoc startup:
{:ok, _pool} =
Jido.Browser.start_pool(
name: :default,
size: 2,
headless: true
)
{:ok, session} =
Jido.Browser.start_session(
pool: :default,
checkout_timeout: 5_000
)
{:ok, session, _} = Jido.Browser.navigate(session, "https://example.com")
:ok = Jido.Browser.end_session(session)Warm pools are currently supported by Jido.Browser.Adapters.AgentBrowser and
Jido.Browser.Adapters.Web.
- AgentBrowser pools keep full warm daemon-backed sessions ready for checkout.
- Web pools keep reserved warmed profiles ready for checkout.
end_session/1always recycles the checked-out worker and warms a replacement in the background.
For the Web adapter, pooled sessions are still browser sessions, not HTTP
fetches. Use web_fetch/2 when you want the simplest request/response API
without browser state.
defmodule MyBrowsingAgent do
use Jido.Agent,
name: "browser_agent",
plugins: [
{Jido.Browser.Plugin,
[
adapter: Jido.Browser.Adapters.AgentBrowser,
pool: :default,
checkout_timeout: 5_000,
headless: true,
timeout: 30_000
]}
]
endconfig :jido_browser,
adapter: Jido.Browser.Adapters.AgentBrowser
config :jido_browser, :agent_browser,
binary_path: "/usr/local/bin/agent-browser",
headed: falseOther adapters can still be configured explicitly:
config :jido_browser, :vibium,
binary_path: "/path/to/vibium"
config :jido_browser, :web,
binary_path: "/usr/local/bin/web",
profile: "default"Optional web fetch settings:
config :jido_browser, :web_fetch,
cache_ttl_ms: 300_000,
extractous: [
pdf: [extract_annotation_text: true],
office: [include_headers_and_footers: true]
]Configured extractous options are merged with any per-call extractous: keyword options passed to Jido.Browser.web_fetch/2.
- native snapshot support with refs
- supervised daemon per session
- optional warm session pools with explicit checkout
- direct JSON IPC from Elixir
- built-in state save/load and tab management support
- retained for transitional compatibility
- feature-frozen in 2.0
- retained for transitional compatibility
- feature-frozen in 2.0
Core operations:
start_pool/1stop_pool/1start_session/1end_session/1navigate/3click/3type/4screenshot/2extract_content/2web_fetch/2evaluate/3
Agent-browser-native operations:
snapshot/2wait_for_selector/3wait_for_navigation/2query/3get_text/3get_attribute/4is_visible/3save_state/3load_state/3list_tabs/2new_tab/3switch_tab/3close_tab/3console/2errors/2
StartSessionEndSessionGetStatusSaveStateLoadState
NavigateBackForwardReloadGetUrlGetTitle
ClickTypeHoverFocusScrollSelectOption
WaitWaitForSelectorWaitForNavigationQueryGetTextGetAttributeIsVisible
SnapshotScreenshotExtractContentConsoleErrors
ListTabsNewTabSwitchTabCloseTab
EvaluateReadPageSnapshotUrlSearchWebWebFetch
defmodule MyBrowsingAgent do
use Jido.Agent,
name: "web_browser",
description: "An agent that can browse the web",
plugins: [{Jido.Browser.Plugin, [headless: true]}]
endJido.Browser.Plugin now exposes 37 browser actions, including snapshot/refs workflows, browser state actions, diagnostics, and tab management.
Apache-2.0 - See LICENSE for details.