
Add LP.setSubframeLoading + --disable-subframes opt-out for iframe loading #2401

Merged
karlseguin merged 2 commits into lightpanda-io:main from staylor:feat/cdp-disable-iframes
May 12, 2026

@staylor
Contributor

@staylor staylor commented May 8, 2026

Summary

Adds two ways to opt out of subframe loading entirely -- useful as a workaround for #2400 (child-iframe navigation invalidates the main frame's executionContextId):

  • CDP method LP.setSubframeLoading { enabled: bool } -- per-session toggle that the driver can flip at runtime.
  • CLI flag --disable-subframes -- process-wide default, applies to every session and to the fetch subcommand. Operators can flip it on without any driver changes.

When subframe loading is off, the HTML parser registers <iframe> elements in the DOM (they're still in the tree if the driver inspects via DOM.getDocument or LP.getMarkdown) but skips child Frame creation, document fetch, and the corresponding Page.frameAttached / Page.frameNavigated / Runtime.executionContextCreated events. The driver only sees the main frame's lifecycle.
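
A minimal sketch of the driver-side opt-out, assuming a Puppeteer-style CDP session object that exposes send(method, params). Only the method name LP.setSubframeLoading and its enabled parameter come from this PR; the helper disableSubframeLoading and the recording stub are illustrative names, there so the snippet runs without a browser:

```javascript
// Sends the Lightpanda-specific opt-out over an existing CDP session.
// `session` is assumed to expose send(method, params) returning a Promise,
// as Puppeteer's CDPSession does.
async function disableSubframeLoading(session) {
  return session.send('LP.setSubframeLoading', { enabled: false });
}

// Stand-in session that records calls (hypothetical, for illustration only).
function makeRecordingSession() {
  const calls = [];
  return {
    calls,
    async send(method, params) {
      calls.push({ method, params });
      return {}; // the trace in this PR shows an empty reply object
    },
  };
}
```

With puppeteer-core, the session would presumably come from page.createCDPSession(), and the call would be issued before page.goto(...) so that no child frame is created during the initial parse.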

Motivation

#2400 is the underlying issue: every child-iframe navigation in Lightpanda re-emits Runtime.executionContextCreated for the main frame's V8 contexts (because IsolatedWorld is shared per-BrowserContext and CONTEXT_GROUP_ID is a constant). V8's inspector treats that as a re-registration with a fresh executionContextId, invalidating the id the driver had pinned for the main frame's main world / utility world. Subsequent Runtime.evaluate calls fail with -32000 "Cannot find context with specified id", which Playwright surfaces as "Execution context was destroyed, most likely because of a navigation." and which Puppeteer surfaces as a hang in IsolatedWorld.evaluate.
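
The failure sequence can be illustrated with a toy model. This is not Lightpanda's or V8's code: ToyInspector and the context name are made up for the sketch, which only mirrors the "pinned id, re-registration, stale id" sequence described above:

```javascript
// Toy model of an inspector that hands out execution context ids.
class ToyInspector {
  constructor() {
    this.nextId = 1;
    this.contexts = new Map(); // executionContextId -> context name
    this.idByName = new Map(); // context name -> current id
  }

  // Registering an already-known context drops its old id and issues a new
  // one, i.e. a re-registration with a fresh executionContextId.
  contextCreated(name) {
    const oldId = this.idByName.get(name);
    if (oldId !== undefined) this.contexts.delete(oldId);
    const id = this.nextId++;
    this.contexts.set(id, name);
    this.idByName.set(name, id);
    return id;
  }

  // Mirrors the lookup-by-id that Runtime.evaluate performs.
  evaluate(contextId, fn) {
    if (!this.contexts.has(contextId)) {
      const err = new Error('Cannot find context with specified id');
      err.code = -32000;
      throw err;
    }
    return fn();
  }
}

const inspector = new ToyInspector();
// The driver pins this id for the main frame's main world.
const pinnedId = inspector.contextCreated('main-frame/main-world');
inspector.evaluate(pinnedId, () => 1 + 1); // fine before any child navigation

// A child iframe navigates and the main frame's context is re-announced.
const freshId = inspector.contextCreated('main-frame/main-world');

let failure = null;
try {
  inspector.evaluate(pinnedId, () => 1 + 1);
} catch (err) {
  failure = err; // the pinned id no longer resolves
}
```

In this model the driver would have to notice the new id and re-pin it; with subframe loading disabled, the re-announcement never happens in the first place.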

The proper fix needs per-frame V8 inspector context groups (or per-frame IsolatedWorld), discussed in #2400. That's a meaningful refactor. This PR is the workaround so users hitting page.title() / page.evaluate(...) failures on iframe-heavy pages (Shopify storefronts, ad-heavy news sites, anywhere with web-pixel sandboxes) have a clean opt-in escape today.

Implementation

Session.subframe_loading_enabled: bool = true -- default matches existing behavior.

Frame.iframeAddedCallback short-circuits when the flag is false, marking the iframe _executed = true so the parser doesn't re-deliver it:

if (!self._session.subframe_loading_enabled) {
    iframe._executed = true;
    return;
}

Two ways to flip the flag:

  1. LP.setSubframeLoading { enabled } (src/cdp/domains/lp.zig) -- CDP method on the existing LP domain. Sets bc.session.subframe_loading_enabled.

  2. --disable-subframes CLI flag (src/Config.zig) -- added to CommonOptions (so it applies to serve, fetch, mcp). New Config.disableSubframes() getter; Session.init reads it as the initial value. The CDP method can override per-session at runtime regardless of the CLI default.
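
The precedence between the two switches can be sketched as a small JavaScript model. The real code is Zig; the names below (Session, setSubframeLoading, iframeAdded, the executed field) are only loosely mapped onto it and should be read as assumptions, not as the PR's API:

```javascript
// JavaScript model of the flag's lifecycle (illustrative only).
class Session {
  constructor(config) {
    // Session.init reads the CLI default as the initial value.
    this.subframeLoadingEnabled = !config.disableSubframes;
  }
}

// Models the LP.setSubframeLoading handler: overrides the session's value
// regardless of the CLI default.
function setSubframeLoading(session, { enabled }) {
  session.subframeLoadingEnabled = enabled;
  return {};
}

// Models Frame.iframeAddedCallback: returns true if a child frame would load.
function iframeAdded(session, iframe) {
  if (!session.subframeLoadingEnabled) {
    iframe.executed = true; // mark handled so the parser does not re-deliver it
    return false;
  }
  return true; // child Frame creation and navigation would happen here (not modelled)
}

// Server started with --disable-subframes: iframes are skipped by default.
const session = new Session({ disableSubframes: true });
const skipped = iframeAdded(session, { src: 'https://example.com/pixel' });

// A driver re-enables loading for its own session at runtime.
setSubframeLoading(session, { enabled: true });
const loaded = iframeAdded(session, { src: 'https://example.com/pixel' });
```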

Total diff: +75 / 0 across 4 files (src/Config.zig, src/browser/Session.zig, src/browser/Frame.zig, src/cdp/domains/lp.zig).

Verification

Reproducer: puppeteer-core 24.42.0 against https://www.allbirds.com/products/mens-wool-runners (page instantiates ~11 web-pixel iframes during initial render).

Baseline (no fix) -- the page loads, but the worker re-entrancy bug from #2398 also triggers and the server segfaults on disconnect; the trace shows the iframe executionContextCreated churn.

With LP.setSubframeLoading({ enabled: false }):

[opt-out] LP.setSubframeLoading reply: {}
[ok] goto status=200 elapsed=6166ms
[stats] frame_attached_events_seen=0
[ok] page.title() = "Allbirds Wool Runners, Men's | Reviews, SIzing Info | Casual Walking, Running Shoes"
[ok] evaluate(1+1) = 2
[ok] evaluate(document.title) = "Allbirds Wool Runners, Men's | Reviews, SIzing Info | Casual Walking, Running Shoes"
[ok] body.innerHTML.length = 923161

With --disable-subframes CLI flag (no driver-side opt-in):

serve --disable-subframes + plain puppeteer-core goto
  [ok] goto status=200 elapsed=6354ms frameAttached=0

fetch --disable-subframes --dump html https://www.allbirds.com/products/mens-wool-runners
  exit=0
  html bytes: 1021562
  title: <title>Allbirds Wool Runners, Men's | ...</title>
  iframe count in dumped html: 2  (still in DOM, just not loaded)

521/521 unit tests pass.
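
The PR does not say how the "iframe count in dumped html" figure was obtained. One way to run a similar sanity check on a dump is sketched below; countIframeTags and the sample markup are illustrative, and the regex is a rough check rather than a real HTML parser:

```javascript
// Counts opening <iframe ...> tags in an HTML string.
// Closing tags (</iframe>) are not counted. Tags inside comments or
// scripts would be counted too, so treat the result as approximate.
function countIframeTags(html) {
  const matches = html.match(/<iframe[\s>\/]/gi);
  return matches ? matches.length : 0;
}

// Made-up sample standing in for a dumped page.
const sample = `
  <html><body>
    <iframe src="https://example.com/a"></iframe>
    <p>content</p>
    <IFRAME src="https://example.com/b"></IFRAME>
  </body></html>`;

const iframeCount = countIframeTags(sample);
```

A non-zero count alongside frameAttached=0 is the expected combination: the tags remain in the DOM, but no child frame is ever created for them.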

Notes

  • For playwright-core chromium.connectOverCDP, the CDP method path is awkward: both BrowserContext.newCDPSession(page) and Browser.newBrowserCDPSession() open a new CRSession that collides with #2399's STARTUP-session reuse and triggers Playwright's internal assert(!object.id) in crConnection.js. The --disable-subframes CLI flag is the recommended path for Playwright users for now.

  • This intentionally doesn't prevent iframes from existing in the DOM -- document.querySelectorAll('iframe') still returns them, LP.getMarkdown and LP.getSemanticTree still see them -- it just stops their content from being fetched and processed. That preserves any selector / scraping logic that relies on inspecting the iframe tags themselves.

  • Once #2400's underlying architectural fix lands (per-frame V8 inspector context groups or per-frame IsolatedWorld), this method becomes a niche performance / sandboxing tool rather than a correctness workaround. Worth keeping anyway: blocking analytics / pixel iframes is a reasonable thing to want to do.

Related

staylor added 2 commits May 8, 2026 17:05
Adds a Lightpanda-specific CDP method that lets drivers opt out of
subframe processing entirely:

  await client.send('LP.setSubframeLoading', { enabled: false });

When disabled, the HTML parser silently bypasses every <iframe> it
encounters: no child Frame is created, no document fetch is issued,
and no Page.frameAttached / Page.frameNavigated /
Runtime.executionContextCreated events are emitted. The driver only
sees the main frame's lifecycle.

Motivation: pages that load large numbers of analytics / pixel
iframes (Shopify storefronts, ad-heavy news sites) trigger lightpanda-io#2400
-- each subframe navigation re-registers the main frame's V8 context
under the child's frameId and invalidates the executionContextId the
driver had pinned for the main frame. Subsequent Runtime.evaluate
fails with 'Cannot find context with specified id' (Playwright
surfaces this as 'Execution context was destroyed', Puppeteer hangs
in IsolatedWorld.evaluate waiting for a 'context' event). The proper
fix is per-frame V8 inspector context groups (or per-frame
IsolatedWorld), discussed in lightpanda-io#2400; this method gives drivers a
clean opt-in workaround in the meantime.

Mechanism: new bool field Session.subframe_loading_enabled (default
true). Frame.iframeAddedCallback short-circuits when false, marking
the iframe as _executed so the parser doesn't re-deliver it.

Verified against the puppeteer-core repro on
https://www.allbirds.com/products/mens-wool-runners (which
instantiates ~11 web-pixel iframes during initial render):

  baseline (subframe loading ON):
    page.title()  works (lucky timing) but server segfaults on
                  disconnect from the worker re-entrancy bug; iframes
                  do load and trigger the executionContextId churn

  with LP.setSubframeLoading(false):
    [opt-out] LP.setSubframeLoading reply: {}
    [ok] goto status=200 elapsed=6166ms
    [stats] frame_attached_events_seen=0
    [ok] page.title() = "Allbirds Wool Runners, Men's | ..."
    [ok] evaluate(1+1) = 2
    [ok] evaluate(document.title) = "Allbirds Wool Runners, Men's | ..."
    [ok] body.innerHTML.length = 923161

521/521 unit tests still pass.

Complementary to LP.setSubframeLoading (preceding commit): exposes
the same iframe-skip behavior as a CLI option that applies to all
sessions in the process. Useful for:

  * the 'fetch' subcommand (no CDP driver to call LP.setSubframeLoading)
  * 'serve' deployments where the operator wants iframes off by
    default for every connecting client (the LP method can still
    re-enable per-session if needed)
  * Playwright's chromium.connectOverCDP, which can't reliably issue
    custom CDP methods on Lightpanda today: BrowserContext.newCDPSession
    and Browser.newBrowserCDPSession both attach a new CRSession that
    collides with the STARTUP-session reuse from lightpanda-io#2399, triggering a
    Playwright internal assertion. With --disable-subframes set on the
    server, Playwright doesn't need to issue any custom CDP -- every
    session inherits subframes-off and the executionContextId churn
    from lightpanda-io#2400 never trips.

Verified:

  serve --disable-subframes + plain puppeteer-core goto
    [ok] goto status=200 elapsed=6354ms frameAttached=0

  fetch --disable-subframes --dump html https://www.allbirds.com/...
    exit=0
    html bytes: 1021562
    title: <title>Allbirds Wool Runners, Men's | ...</title>
    iframe count in dumped html: 2  (still in DOM, just not loaded)

521/521 unit tests pass.
@staylor
Contributor Author

staylor commented May 10, 2026

@krichprollsch krichprollsch self-requested a review May 11, 2026 09:54
@krichprollsch
Member

I like the idea of giving the client control over disabling subframe loading 👍
I agree with you: it doesn't fix #2400, but it can be a workaround for some specific scenarios, and it can be useful in general.

I'm wondering if we should use a more generic CDP command/CLI option to prepare for the future.
I'm thinking about disabling external CSS loading if we start supporting it.
Maybe LP.configureLoading with {"subFrame": false} (default would be true) and in the future {"subFrame": false, "css": false}.
For the CLI it could be --disable-loading subframe (and in the future --disable-loading subframe,css).

But I don't have a strong opinion; the risk is ending up with something too generic.
WDYT @karlseguin?

We need an e2e test in demo to reproduce the issue.
I will take a look at it.
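
For reference, the shape proposed in the comment above could be sketched as follows. This is hypothetical: neither --disable-loading nor LP.configureLoading exists in this PR, and parseDisableLoading and KNOWN_RESOURCES are names invented for the sketch:

```javascript
// Hypothetical mapping from a "--disable-loading subframe,css" value to the
// proposed LP.configureLoading parameter object.
const KNOWN_RESOURCES = ['subframe', 'css'];

// Parses a value such as "subframe,css" into { subFrame: false, css: false }.
// Resource types that are not listed stay enabled (true), matching the
// proposed default.
function parseDisableLoading(value) {
  const disabled = new Set(
    value.split(',').map((s) => s.trim().toLowerCase()).filter(Boolean)
  );
  for (const name of disabled) {
    if (!KNOWN_RESOURCES.includes(name)) {
      throw new Error(`unknown resource type: ${name}`);
    }
  }
  return {
    subFrame: !disabled.has('subframe'),
    css: !disabled.has('css'),
  };
}

const onlySubframes = parseDisableLoading('subframe');
const both = parseDisableLoading('subframe,css');
```

Rejecting unknown names keeps a typo from silently leaving a resource type enabled, which is one way to limit the "too generic" risk raised above.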

Comment thread src/browser/Frame.zig
// parser doesn't keep handing it back to us, but skip the child
// frame creation / navigation / notification entirely — no child
// Frame, no Page.frameAttached, no Runtime.executionContextCreated.
iframe._executed = true;
Member

I think it would be nice to add a debug log in this case (and to move the block after src creation)

log.debug(.frame, "skip iframe loading", .{.src = src});

@karlseguin
Collaborator

I agree for CDP, a single command is nice. It lets you disable multiple things in a single call. For the CLI, it feels less important.

@karlseguin karlseguin merged commit b272b0e into lightpanda-io:main May 12, 2026
94 of 99 checks passed
@github-actions github-actions Bot locked and limited conversation to collaborators May 12, 2026

3 participants