From 0d26a3fd43d3b2147ddf97afe8b7d00a4c1d9d80 Mon Sep 17 00:00:00 2001
From: Bruno Azoulay <info@fusengine.ch>
Date: Thu, 11 Jun 2026 13:13:05 +0200
Subject: [PATCH] docs: sync README + docs to the real surface (44 MCP tools,
 15 CLI commands)

Update the documented MCP tool count from 37 to 44 (matches registry.ts + mcp.test.ts); add browser_products + browser_autoscroll to mcp-tools.md; document FUSE_CAPS + FUSE_NETLOG_MAX; update the cli.md command count from 9 to 15.
---
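Reviewer note on the `FUSE_CAPS` row added to docs/configuration.md: the parsing rules it describes (comma-separated, case-insensitive, whitespace-tolerant, unknown names ignored, blank or only-unknown meaning every group) can be sketched as below. This is a minimal illustration, not the project's code; `parseCaps` and `KNOWN_GROUPS` are hypothetical names, and only the five group names come from the docs.

```typescript
// Hypothetical sketch of FUSE_CAPS parsing; not taken from the real registry code.
const KNOWN_GROUPS = ["core", "batch", "extract", "debug", "live"] as const;
type CapGroup = (typeof KNOWN_GROUPS)[number];

// Turn a FUSE_CAPS-style string into the set of tool groups to register.
function parseCaps(raw: string | undefined): Set<CapGroup> {
  const known = new Set<string>(KNOWN_GROUPS);
  const picked = (raw ?? "")
    .split(",")
    .map((name) => name.trim().toLowerCase()) // whitespace-tolerant, case-insensitive
    .filter((name): name is CapGroup => known.has(name)); // unknown names are dropped
  // Blank, unset, or only-unknown input falls back to every group.
  return new Set<CapGroup>(picked.length > 0 ? picked : KNOWN_GROUPS);
}
```

Under these assumptions, `parseCaps(" Core , EXTRACT ,bogus")` selects the `core` and `extract` groups, while `parseCaps(undefined)` selects all five.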
 README.md             | 15 ++++++++++-----
 docs/README.md        |  4 ++--
 docs/cli.md           |  2 +-
 docs/configuration.md |  2 ++
 docs/mcp-tools.md     | 44 +++++++++++++++++++++++++++++++++++++++----
 5 files changed, 55 insertions(+), 12 deletions(-)
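
Reviewer note on the `browser_products` docs: the "sort by price to find the cheapest" step they mention could look like the sketch below. The row shape follows the documented return value `{ title, price, currency, url? }`; treating `price` as a number and all rows as sharing one currency are assumptions, and `sortByPrice` is a hypothetical helper name.

```typescript
// Row shape from the documented return value; a numeric `price` is an assumption.
interface ProductRow {
  title: string;
  price: number;
  currency: string;
  url?: string;
}

// Return a copy sorted by ascending price; the input array is left untouched.
// Assumes every row uses the same currency, so prices are directly comparable.
function sortByPrice(rows: ProductRow[]): ProductRow[] {
  return [...rows].sort((a, b) => a.price - b.price);
}

// Made-up rows for illustration only; not real tool output.
const sampleRows: ProductRow[] = [
  { title: "Laptop A", price: 1299, currency: "CHF" },
  { title: "Laptop B", price: 999.5, currency: "CHF" },
  { title: "Laptop C", price: 1450, currency: "CHF" },
];

// The first row after sorting is the cheapest one.
const cheapest: ProductRow | undefined = sortByPrice(sampleRows)[0];
```

With the made-up rows above, `cheapest` is the "Laptop B" row. Pages that mix currencies would need a grouping or conversion step first.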

diff --git a/README.md b/README.md
index ca676fb..aeb5d35 100644
--- a/README.md
+++ b/README.md
@@ -10,7 +10,7 @@ Shadow DOM + iframes), multi-step plans, structured extraction, visual diff, and
 guardrails** for payments and bookings. It drives real Chromium, so it reads **Next.js / SPA**
 pages after hydration — not just static HTML.
 
-> 37 MCP tools · stealth + rotating proxies · HTTP fast-path (single, batch & crawl) · full-site content + screenshot snapshots · virtualized-list scraping · HAR record/replay · pixel visual-diff · human handoff + live view.
+> 44 MCP tools · stealth + rotating proxies · HTTP fast-path (single, batch & crawl) · full-site content + screenshot snapshots · structured per-card product extraction · virtualized-list scraping + autoscroll · tabs / dialogs / downloads · console + network logs · MCP screenshot resources · `FUSE_CAPS` tool-group filtering · named auth profiles · `blockResources` · HAR record/replay · pixel visual-diff · human handoff + live view.
 
 ## Install
 
@@ -37,13 +37,15 @@ Prefer a terminal? Install the CLI: `npm i -g @fusengine/browser-mcp`
 ```bash
 fuse-browser probe https://example.com --extract-prices
 fuse-browser fetch https://books.toscrape.com/ --extract-prices   # no browser, ~10× faster
+fuse-browser products "https://www.digitec.ch/en/search?q=macbook" --limit 20   # structured cards → sort to find the cheapest
 ```
 
 ## How it works
 
 An LLM runs a **perceive → decide → act** loop through the tools: `browser_open` →
 `browser_navigate` → `browser_snapshot` (indexed `ref`s + form state) → `browser_act`
-(click/fill/select/pick, returns a page diff) → `browser_wait_for` → `browser_extract` /
+(click/fill/select/pick, returns a page diff) → `browser_wait_for` → `browser_autoscroll`
+(drain lazy lists) → `browser_products` / `browser_collect` / `browser_extract` /
 `browser_screenshot`. Sensitive actions (pay / book / checkout) are **blocked** unless the
 agent passes `humanApproved`.
 
@@ -52,10 +54,13 @@ agent passes `humanApproved`.
 - **Stealth** — Patchright neutralizes the real automation signals; per-country identity + rotating proxy pool.
 - **Agentic targeting** — accessibility-style snapshot with stable refs, self-healing click/fill, multi-step plans.
 - **Vision (Set-of-Marks)** — `annotate:true` on `browser_snapshot`/`browser_act`/`browser_screenshot` draws numbered badges (= each `ref`) on the page, so vision models *see* it and target by ref.
-- **Sees everything** — open Shadow DOM, same/cross-origin iframes, and **virtualized/infinite lists** (`browser_collect`).
+- **Sees everything** — open Shadow DOM, same/cross-origin iframes, and **virtualized/infinite lists** (`browser_collect`, `browser_autoscroll` to drain lazy-loaded results first).
+- **Structured extraction** — `browser_products` pulls **per-card** rows (`{title, price, currency, url?}`, each price tied to its own title) by detecting repeated card containers — works on Digitec, Booking, Amazon… Sort by price to answer "which is the cheapest?". **Layout-agnostic prices**: prefix/suffix currency, thousands/decimal markup, CH/EU formats. Also exposed as the CLI `products` command.
+- **Full session control** — multi-tab (`browser_tabs`: list/new/select/close popups & OAuth windows), native dialog policy (`browser_dialog`), captured `browser_downloads`, plus `browser_console` / `browser_network` logs to debug why a page misbehaves.
 - **Fast-path** — `browser_fetch` impersonates a real Chrome TLS fingerprint for server-rendered HTML, no browser launch — returns clean **markdown** and optional **contacts** (`extractContacts`) at ~HTTP speed. **JSON APIs / plain text** come back verbatim (no HTML mangling). Opt-in **`browserFallback`** auto-renders client-side (SPA/CSR) pages in a real browser when the HTTP response is an empty shell (`escalated: true`). **`browser_fetch_batch`** fetches many URLs in parallel (bounded concurrency, errors isolated per URL). **`browser_crawl`** walks a whole site (bounded same-origin BFS, robots-honored) → clean markdown per page. **`browser_shots_batch`** captures responsive full-page screenshots of many URLs in parallel (see the design of a whole set of pages at once). **`browser_collect_batch`** exhausts the infinite-scroll list of many listing URLs at once (crawl finds the pages, collect drains them). **`browser_site_shots`** snapshots a whole site in one call — crawl + screenshot each page, returning content **and** responsive PNGs per page.
 - **Data out** — multi-currency prices, typed CSS extraction, **contact extraction** (emails/phones E.164, `fastPathFirst` cascade), a clean→validate→dedupe→emit pipeline, CSV export, Google SERP rank tracking.
-- **Ops** — persistent sessions, **auto crash recovery** (a crashed page is recreated in the same context and restored to its last URL between calls), opt-in **per-host circuit breaker** + **bounded probe queue/budget** + **`browser_metrics`** for mass scraping, **live view** (watch any session — even headless — in your browser), `storageState` auto-save, HAR record/replay, pixel `visual_diff`, human handoff for login/2FA.
+- **Ops** — persistent sessions, **auto crash recovery** (a crashed page is recreated in the same context and restored to its last URL between calls), opt-in **per-host circuit breaker** + **bounded probe queue/budget** + **`browser_metrics`** for mass scraping, **live view** (watch any session — even headless — in your browser), **`screenshot://{sessionId}/last` MCP resource** (read a session's current page as a JPEG on demand), `storageState` auto-save, **named auth profiles** (`profile`), **`blockResources`** to skip images/fonts/etc. on batch runs, HAR record/replay, pixel `visual_diff`, human handoff for login/2FA.
+- **Context control** — **`FUSE_CAPS`** registers only the tool groups you need (`core`/`batch`/`extract`/`debug`/`live`) for a lighter LLM context, and the batch tools emit MCP **progress notifications** when the client sends a `progressToken`.
 
 ## Documentation
 
@@ -63,7 +68,7 @@ Full reference in **[`docs/`](./docs/README.md)**:
 
 [Installation](./docs/installation.md) ·
 [CLI](./docs/cli.md) ·
-[MCP tools (37)](./docs/mcp-tools.md) ·
+[MCP tools (44)](./docs/mcp-tools.md) ·
 [Configuration](./docs/configuration.md) ·
 [Sessions](./docs/sessions.md) ·
 [Extraction](./docs/extraction.md) ·
diff --git a/docs/README.md b/docs/README.md
index db41914..0f476f4 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -6,8 +6,8 @@ New here? Start with the root [README](../README.md), then dive in:
 | Doc | What's inside |
 | --- | --- |
 | [Installation](./installation.md) | Requirements, install, Chromium, MCP registration, the three ways to get a browser |
-| [CLI](./cli.md) | `probe` / `fetch` / `fetch-batch` / `crawl` / `collect-batch` / `shots` / `shots-batch` / `site-shots` / `serp-batch` + every flag |
-| [MCP tools](./mcp-tools.md) | All 37 tools with parameters and examples |
+| [CLI](./cli.md) | `probe` / `fetch` / `fetch-batch` / `crawl` / `collect-batch` / `shots` / `shots-batch` / `site-shots` / `serp-batch` + one-shot page commands (`run` / `products` / `extract` / `snapshot` / `screenshot` / `inspect`) + every flag |
+| [MCP tools](./mcp-tools.md) | All 44 tools with parameters and examples |
 | [Configuration](./configuration.md) | `AgentOptions`, `FUSE_*` env vars, identity, retry, output location |
 | [Sessions](./sessions.md) | Session lifecycle, auto crash recovery, `storageState` auto-save, HAR record/replay, CDP attach |
 | [Extraction](./extraction.md) | `browser_extract` / `extract_schema` / `collect` + the clean→validate→dedupe→emit pipeline |
diff --git a/docs/cli.md b/docs/cli.md
index 9e9a88e..53366dc 100644
--- a/docs/cli.md
+++ b/docs/cli.md
@@ -1,6 +1,6 @@
 # CLI
 
-`fuse-browser` is a command-line front-end for the browser agent. It exposes nine one-shot subcommands (`probe`, `fetch`, `fetch-batch`, `crawl`, `collect-batch`, `serp-batch`, `shots`, `shots-batch`, `site-shots`) that all share a single flag parser (`node:util` `parseArgs`, strict mode), so any flag is accepted globally but only consumed by the subcommands documented below. Session-based interaction (open/navigate/click/products/autoscroll/…) is exposed through the MCP server (`browser-mcp`), not the CLI.
+`fuse-browser` is a command-line front-end for the browser agent. It exposes 15 one-shot subcommands — nine batch/fast-path commands (`probe`, `fetch`, `fetch-batch`, `crawl`, `collect-batch`, `serp-batch`, `shots`, `shots-batch`, `site-shots`) plus six [page commands](#page-commands-one-shot) (`run`, `products`, `extract`, `snapshot`, `screenshot`, `inspect`) — that all share a single flag parser (`node:util` `parseArgs`, strict mode), so any flag is accepted globally but only consumed by the subcommands documented below. Stateful, multi-turn session interaction (open → navigate → click → snapshot → …) is exposed through the MCP server (`browser-mcp`), not the CLI.
 
 ```
 fuse-browser probe <url> [flags]
diff --git a/docs/configuration.md b/docs/configuration.md
index f20cc13..7dd3463 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -78,6 +78,8 @@ Read by `envAgentDefaults` (`src/server/env-defaults.ts`) and the proxy loader (
 | `FUSE_STORAGE_STATE` | `storageStatePath` | Path to a storage-state JSON. |
 | `FUSE_OUTPUT_DIR` | `outputDir` | Override the artifact output directory. |
 | `FUSE_PROXIES` | proxy pool | Comma- or newline-separated proxy URLs; deduped, blanks dropped. Merged with `proxiesPath`. Treat as a secret. |
+| `FUSE_CAPS` | tool-group filter | Comma-separated [capability groups](./mcp-tools.md#capability-groups-fuse_caps) to register (`core`/`batch`/`extract`/`debug`/`live`). Case-insensitive, whitespace-tolerant; unknown names are ignored. If the value is blank, unset, or contains only unknown names, all 44 tools are registered. Server-only (no per-call/library equivalent). |
+| `FUSE_NETLOG_MAX` | network/console log cap | Max entries kept per session in `browser_console` / `browser_network` (oldest dropped). Positive integer; default `250`. |
 
 ### MCP config example
 
diff --git a/docs/mcp-tools.md b/docs/mcp-tools.md
index 7f332d4..322ee07 100644
--- a/docs/mcp-tools.md
+++ b/docs/mcp-tools.md
@@ -1,10 +1,11 @@
 # MCP tools
 
-Complete reference for the 37 `browser_*` tools exposed by the fuse-browser MCP server.
+Complete reference for the 44 `browser_*` tools exposed by the fuse-browser MCP server.
 
 Tools fall into two families:
 
 - **One-shot / fast-path** (`browser_probe`, `browser_probe_html`, `browser_fetch`, `browser_fetch_batch`, `browser_crawl`, `browser_collect_batch`, `browser_shots_batch`, `browser_site_shots`, `browser_serp_batch`) open a fresh browser (or do a pure HTTP fetch) per call and return a report. No session id needed.
+- **Structured extraction** (`browser_products`, `browser_collect`, `browser_extract`, `browser_extract_schema`) and `browser_autoscroll` (drain lazy lists) are session tools as well: they take a `sessionId` and run against a live session.
 - **Session tools** require a `sessionId` obtained from `browser_open` (or `browser_connect`). They drive one persistent, stateful page.
 
 Every field is optional unless **Required** says `yes`. Defaults shown below come from the tool itself; many can also be set globally via `FUSE_*` environment variables — see [configuration](./configuration.md). Per-call arguments always override env defaults.
@@ -13,13 +14,13 @@ The shared identity/profile options (the `agentOptionShape`) are listed once und
 
 ## Capability groups (`FUSE_CAPS`)
 
-By default all 37 tools are registered. Set the `FUSE_CAPS` env var (comma-separated group names) to expose fewer tools — a lighter context for the LLM client:
+By default all 44 tools are registered. Set the `FUSE_CAPS` env var (comma-separated group names) to expose fewer tools — a lighter context for the LLM client:
 
 | Group | Tools |
 | --- | --- |
-| `core` | Session lifecycle (`browser_open`/`browser_status`/`browser_close`/`browser_connect`), navigation (`browser_navigate`/`browser_back`/`browser_forward`), actions (`browser_click`/`browser_fill`/`browser_login`/`browser_scroll`/`browser_press`/`browser_select`), `browser_tabs`, `browser_dialog`/`browser_downloads`, `browser_snapshot`/`browser_act`, `browser_wait`/`browser_wait_for`, `browser_screenshot`. |
+| `core` | Session lifecycle (`browser_open`/`browser_status`/`browser_close`/`browser_connect`), navigation (`browser_navigate`/`browser_back`/`browser_forward`), actions (`browser_click`/`browser_fill`/`browser_login`/`browser_scroll`/`browser_press`/`browser_select`), `browser_tabs`, `browser_dialog`/`browser_downloads`, `browser_snapshot`/`browser_act`, `browser_wait`/`browser_wait_for`, `browser_screenshot`, `browser_autoscroll`. |
 | `batch` | `browser_probe`, `browser_probe_html`, `browser_fetch`, `browser_fetch_batch`, `browser_crawl`, `browser_collect_batch`, `browser_shots_batch`, `browser_site_shots`, `browser_serp_batch`. |
-| `extract` | `browser_collect`, `browser_run`, `browser_extract`, `browser_extract_schema`. |
+| `extract` | `browser_collect`, `browser_run`, `browser_extract`, `browser_extract_schema`, `browser_products`. |
 | `debug` | `browser_inspect`, `browser_console`, `browser_network`, `browser_visual_diff`, `browser_metrics`. |
 | `live` | `browser_handoff`, `browser_live_view`, `browser_live_view_stop`. |
 
@@ -588,6 +589,25 @@ The optional `pipeline` runs a declarative clean→validate→dedupe→emit pass
 { "sessionId": "s_abc123", "item": ".result-card", "extractPrices": true, "maxSteps": 20 }
 ```
 
+### browser_autoscroll
+
+Repeatedly scroll a long / infinite list to the bottom to trigger lazy-load until it stabilises — run it **before** `browser_extract` / `browser_collect` / `browser_products` on lazy-loaded result pages so every item is in the DOM. Stops after `idleRounds` rounds without growth, at `maxScrolls`, or once `untilSelector` reaches `minCount` elements.
+
+| Param | Type | Required | Description |
+| --- | --- | --- | --- |
+| `sessionId` | string | yes | Target session. |
+| `maxScrolls` | integer | no | Hard cap on scroll rounds. |
+| `idleRounds` | integer | no | Stop after this many rounds with no height growth. |
+| `untilSelector` | string | no | Stop once this selector reaches `minCount` matches. |
+| `minCount` | integer | no | Element count target for `untilSelector`. |
+| `delayMs` | integer | no | Pause between scroll rounds. |
+
+Returns `{ rounds, height, url }`.
+
+```json
+{ "sessionId": "s_abc123", "untilSelector": ".result-card", "minCount": 100 }
+```
+
 ---
 
 ## Extract
@@ -626,6 +646,22 @@ Extract typed data from the live page via a field map. Deterministic; reads the
 }
 ```
 
+### browser_products
+
+Extract structured **per-card** product rows from an e-commerce / search-results page: one `{title, price, currency, url?}` per card, each price tied to its own title (unlike flat price scraping). Generic — detects repeated card containers by structure, so it works on Digitec, Booking, Amazon… Prices are parsed **layout-agnostically** (prefix/suffix currency, thousands/decimal markup, CH/EU formats). Sort the rows by price to answer "which is the cheapest?". Also exposed as the CLI `products` command.
+
+| Param | Type | Required | Description |
+| --- | --- | --- | --- |
+| `sessionId` | string | yes | Target session. |
+| `limit` | integer | no | Cap the number of returned rows. |
+| `containerSelector` | string | no | Pin the card-container selector (auto-detected otherwise). |
+
+Returns `{ url, count, products: [{ title, price, currency, url? }] }`.
+
+```json
+{ "sessionId": "s_abc123", "limit": 20 }
+```
+
 ---
 
 ## SERP