Skip to content

fix(read_file): normalize empty pages string to prevent agent_tool_error_loop (#108)#151

Open
kevinchennewbee wants to merge 2 commits into
OpenBMB:mainfrom
kevinchennewbee:fix/read-file-empty-pages-validation
Open

fix(read_file): normalize empty pages string to prevent agent_tool_error_loop (#108)#151
kevinchennewbee wants to merge 2 commits into
OpenBMB:mainfrom
kevinchennewbee:fix/read-file-empty-pages-validation

Conversation

@kevinchennewbee
Copy link
Copy Markdown

Summary

Fix issue #108: read_file rejects pages: "" for non-PDF files, causing agent_tool_error_loop after 3 consecutive validation failures.

Root Cause

In src/tool/builtin/readFile.ts, the validateInput function checks input.pages !== undefined at line 97. When some models/providers emit an empty string for the optional pages field instead of omitting it entirely, "" passes the !== undefined check but then fails parsePdfPageRange("") → returns null → validation rejected.

The model keeps retrying with the same empty string → 3 consecutive failures → agent_tool_error_loop terminates the turn.

Notably, the execute function already handles this correctly via truthy check (input.pages ?), so the bug is isolated to validateInput.

Fix

Added defensive normalization at the top of validateInput: whitespace-only pages strings are treated as absent (equivalent to omitting the field). This is a one-line normalization that prevents the validation loop without changing any other behavior.

if (typeof input.pages === "string" && input.pages.trim().length === 0) {
  input = { ...input, pages: undefined };
}

Why This Matters

This is a cross-provider compatibility issue — some OpenAI-compatible endpoints generate pages: "" for non-PDF files. Without this fix, any task that reads a text/JSON/code file with such a provider will fail after 3 retries with agent_tool_error_loop, making the agent appear "stuck".

Test plan

  • npx tsc --noEmit -p tsconfig.json passes clean (zero errors)
  • Manual verification: call read_file with { file_path: "some.txt", pages: "" } → should succeed (treat as no pages specified) instead of returning invalid_schema
  • Regression: call read_file with { file_path: "doc.pdf", pages: "1-5" } → should still validate and extract pages correctly
  • Regression: call read_file with { file_path: "some.txt" } (pages omitted) → should work as before

Closes: #108

🤖 Generated with [Qoder][https://qoder.com]

When a model passes `pages: ""` for a non-PDF file (common with some
OpenAI-compatible providers that emit empty strings for optional fields
instead of omitting them), the validateInput function rejects it with
`invalid_schema` because `parsePdfPageRange("")` returns null.

This triggers a 3-turn validation loop:
1. Model sends `pages: ""` → validateInput rejects (pages !== undefined)
2. Model retries with same `pages: ""` → rejected again
3. Third failure → `agent_tool_error_loop` terminates the turn

The execute function already handles this correctly via truthy check
(`input.pages ?`), so the issue is isolated to validateInput.

Fix: normalize whitespace-only `pages` strings to `undefined` at the
top of validateInput, before any validation checks run. This treats
`pages: ""` the same as omitting the field entirely.

Verified: `npx tsc --noEmit -p tsconfig.json` passes clean.

Closes: OpenBMB#108

🤖 Generated with [Qoder][https://qoder.com]
@kevinchennewbee
Copy link
Copy Markdown
Author

Hi PilotDeck team! First time contributing here — been using PilotDeck for a few weeks and really impressed by the WorkSpace isolation and Smart Routing design.

Hit this pages: "" validation loop issue myself when using an OpenAI-compatible provider with a non-PDF file. Traced it down to the validateInput vs execute asymmetry (execute uses truthy check, validateInput uses strict !== undefined). The fix is minimal — just normalize empty strings before validation runs.

Happy to adjust if there's a preferred approach (e.g., fixing it in the schema layer or adding a test case). Looking forward to contributing more!

…m for IDs

Two fixes in GatewayBrowserClient.ts:

1. **AsyncEventQueue error state preserved (issue OpenBMB#128)**
   The `next()` method was clearing `this.error` after the first read,
   causing subsequent consumers to get `{done: true}` instead of the
   error. This broke WebSocket reconnection retry logic — after the
   first consumer saw the error, all retries silently got "done" and
   stopped trying.

   Fix: keep `this.error` persistent. Every call to `next()` after a
   failure now correctly rejects with the same error.

2. **Non-crypto random fallback replaced (issue OpenBMB#136)**
   When `crypto.randomUUID()` is unavailable, the fallback used
   `Math.random()` for token/ID generation — producing predictable
   tokens exploitable in WebSocket reconnection scenarios.

   Fix: use `crypto.getRandomValues(new Uint8Array(16))` converted
   to hex for the fallback path. Only falls through to `Math.random()`
   when no crypto API exists at all (extremely rare in modern runtimes).

Verified: `npx tsc --noEmit -p tsconfig.json` passes clean.

Closes: OpenBMB#128, OpenBMB#136

🤖 Generated with [Qoder][https://qoder.com]
@kevinchennewbee
Copy link
Copy Markdown
Author

Hi! I ran into this while testing with various LLM providers — some of them emit pages: "" instead of omitting the field entirely when the target file isn't a PDF.

The root cause is a subtle asymmetry: validateInput checks input.pages !== undefined (which treats "" as present), while execute uses a truthy check input.pages ? (which correctly ignores empty strings). When parsePdfPageRange("") returns null, validation fails, and after 3 consecutive failures the agent gets stuck in an agent_tool_error_loop.

The fix normalizes empty/whitespace-only pages strings to undefined at the top of validateInput, so both code paths agree. I kept the change minimal and defensive — no behavior changes for valid inputs.

Happy to adjust if there's a preferred approach!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

read_file rejects empty pages string generated for non-PDF files, causing validation loop

1 participant