fix(mcp): browser-use screenshot saving path and inline rendering#166

Open
Gucc111 wants to merge 1 commit into
OpenBMB:mainfrom
Gucc111:fix/browser-screenshot-render

Conversation

Collaborator

@Gucc111 Gucc111 commented Jun 5, 2026

Summary

  • Set cwd: outDir on the browser-use MCP spec so that screenshots (both auto-named ones and those with a user-specified filename) are saved to projectRoot/.pilotdeck/browser_screenshots/<sessionId>/ instead of an unpredictable inherited CWD.
  • Add a file-reading fallback in marshalMcpContent(): when @playwright/mcp returns only a text block with a Markdown image link (no base64 data, which happens when the LLM passes filename), read the referenced file from disk and inject it as an inline image block, so the existing toolResultImages rendering pipeline can display it.
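
A minimal sketch of the file-reading fallback described in the second bullet, for reviewers who want to see the shape of the idea. The block types, the `injectImageFromLink` name, the link regex, and the extension-to-MIME map are all assumptions made for illustration; the actual marshalMcpContent() code and the exact text @playwright/mcp emits may differ.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical block shapes; the real PilotDeck types may differ.
type TextBlock = { type: "text"; text: string };
type ImageBlock = { type: "image"; data: string; mimeType: string };
type ContentBlock = TextBlock | ImageBlock;

// Assumed mapping from file extension to MIME type.
const MIME_BY_EXT: Record<string, string> = {
  ".png": "image/png",
  ".jpg": "image/jpeg",
  ".jpeg": "image/jpeg",
};

// Assumed link format: a Markdown link or image such as [Screenshot](shot.png).
const IMAGE_LINK = /!?\[[^\]]*\]\(([^)\s]+\.(?:png|jpe?g))\)/i;

export function injectImageFromLink(
  blocks: ContentBlock[],
  baseDir: string,
): ContentBlock[] {
  // If the server already returned base64 image data, leave the result untouched.
  if (blocks.some((b) => b.type === "image")) return blocks;

  for (const block of blocks) {
    if (block.type !== "text") continue;
    const match = IMAGE_LINK.exec(block.text);
    const linked = match?.[1];
    if (!linked) continue;

    // Relative links are resolved against baseDir (the MCP process CWD).
    const filePath = path.resolve(baseDir, linked);
    const mimeType = MIME_BY_EXT[path.extname(filePath).toLowerCase()];
    if (!mimeType || !fs.existsSync(filePath)) continue;

    // Append an inline image block so the existing image rendering path can use it.
    const data = fs.readFileSync(filePath).toString("base64");
    return [...blocks, { type: "image", data, mimeType }];
  }
  return blocks;
}
```

In this sketch the original text block is kept and the image block is appended, so a response that already carries base64 data, or whose linked file is missing, passes through unchanged.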

Root cause

@playwright/mcp has two path-resolution strategies for browser_take_screenshot:

| LLM passes filename? | File saved to | Response contains |
| --- | --- | --- |
| No | --output-dir (absolute) | text + image (base64) blocks |
| Yes | workspaceFile(filename) → relative to process CWD | text block only (Markdown link) |

PilotDeck spawned the browser-use MCP process without setting cwd, so the process inherited process.cwd() from the PilotDeck parent, which is an unpredictable location. Screenshots taken with a filename argument were therefore written to the wrong directory (or silently failed), and even when the file existed, the response lacked the base64 data the UI needs to render it.
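
To make the cwd fix concrete, here is a rough sketch of a per-session spec that points both --output-dir and the process CWD at the same directory. The `McpServerSpec` interface, the `browserUseSpec` helper, and the `npx` invocation are hypothetical and stand in for whatever PilotDeck uses internally; only the directory layout and the cwd: outDir idea come from this PR.

```typescript
import * as path from "node:path";

// Hypothetical spec shape; PilotDeck's real MCP spec type may differ.
interface McpServerSpec {
  command: string;
  args: string[];
  cwd?: string;
}

export function browserUseSpec(
  projectRoot: string,
  sessionId: string,
): McpServerSpec {
  // Absolute per-session screenshot directory.
  const outDir = path.resolve(
    projectRoot,
    ".pilotdeck",
    "browser_screenshots",
    sessionId,
  );
  return {
    command: "npx", // assumed launcher
    args: ["@playwright/mcp", "--output-dir", outDir],
    // With cwd set, a relative filename resolves to the same directory as
    // auto-named screenshots. The directory must exist before spawning.
    cwd: outDir,
  };
}
```

With both values aligned, either path-resolution strategy in the table above ends up writing to projectRoot/.pilotdeck/browser_screenshots/<sessionId>/.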

Test plan

  • Start a session, ask the agent to take a screenshot (triggers browser_take_screenshot)
  • Verify the screenshot file appears in projectRoot/.pilotdeck/browser_screenshots/<sessionId>/
  • Verify the screenshot renders inline in the tool result area of the Web UI
  • Refresh the page and verify the screenshot is still visible (session history path)
  • Test with a model that does NOT pass filename — verify base64 inline rendering still works
  • Test with a model that passes filename — verify the file-reading fallback kicks in

Made with Cursor

… as inline images

The browser-use MCP process inherited an unpredictable CWD from the
parent PilotDeck process, so screenshots saved with a user-specified
`filename` parameter ended up in the wrong directory. Set `cwd: outDir`
on the per-session spec so both auto-generated and named screenshots
land in `projectRoot/.pilotdeck/browser_screenshots/<sid>/`.

When @playwright/mcp returns only a text block with a Markdown file
reference (no base64 image block — happens when `filename` is provided),
read the referenced image from disk and inject it as an inline image
block so it renders in the chat UI through the existing toolResultImages
pipeline.

Co-authored-by: Cursor <cursoragent@cursor.com>