
perf(engine,producer): worker-offload JPEG encode for drawElement fast capture #1444

Open
vanceingalls wants to merge 3 commits into drawelement-fast-capture from drawelement-worker-encode

Conversation

@vanceingalls

Collaborator

What

Brief description of the change.

Why

Why is this change needed?

How

How was this implemented? Any notable design decisions?

Test plan

How was this tested?

  • Unit tests added/updated
  • Manual testing performed
  • Documentation updated (if applicable)

Collaborator Author

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.

This stack of pull requests is managed by Graphite.

vanceingalls and others added 3 commits June 14, 2026 22:58
…t capture

Moves per-frame JPEG encode (~7.4ms, 57% of frame cost) off the page main
thread into an in-page OffscreenCanvas Worker, then pipelines so frame N
encodes while frame N+1 seeks+paints. Target ~1.65× (1 worker) wall-time
speedup on macOS hardware-GPU drawElement renders.
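
The figures above imply a ceiling for the pipelined speedup, which can be checked with a back-of-envelope calculation. The sketch below is not from the PR: it assumes a frame splits cleanly into "encode" and "everything else" (seek + paint), and that the two stages overlap perfectly once encode moves to the worker. Only the 7.4 ms and 57% inputs come from the commit message.

```typescript
// Back-of-envelope model. The two-stage split and perfect overlap are assumptions.
const encodeMs = 7.4; // per-frame JPEG encode (from the commit message)
const encodeShare = 0.57; // encode's share of total frame cost (from the commit message)

const serialFrameMs = encodeMs / encodeShare; // total cost of one frame on the serial path
const producePaintMs = serialFrameMs - encodeMs; // what stays on the main thread

// With two fully overlapped stages, steady-state cost per frame is the slower stage.
const pipelinedFrameMs = Math.max(encodeMs, producePaintMs);
const idealSpeedup = serialFrameMs / pipelinedFrameMs;

console.log(serialFrameMs.toFixed(1), producePaintMs.toFixed(1), idealSpeedup.toFixed(2));
// prints "13.0 5.6 1.75"
```

Under these assumptions encode remains the bottleneck stage, so the ideal speedup is 1 / 0.57 ≈ 1.75×, and the ~1.65× target sits just below that ceiling.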

New machinery:
- EngineConfig.enableDrawElementWorkerEncode (default false, env HF_DE_WORKER_ENCODE)
- drawElementService: WorkerEncodeState, initDrawElementWorkerEncode,
  cleanupDrawElementWorkerEncode, produceDrawElementFrame
- frameCapture: CaptureSession.workerEncodeEnabled, captureFrameToBufferPipelined
- captureStreamingStage: runWorkerEncodePipelineLoop helper + depth-2 dispatch

Gated off by default. No effect on BeginFrame/Linux, SwiftShader, PNG, or
any render without useDrawElement+enableDrawElementWorkerEncode=true.
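
The depth-2 dispatch described above can be sketched as a loop that keeps at most two frames in flight. This is a minimal illustration, not the code in `runWorkerEncodePipelineLoop`: `runDepth2Pipeline`, `Produce`, and `Sink` are hypothetical names, `produce` stands in for seek + paint + handoff to the worker, and abort handling is omitted.

```typescript
// Hypothetical shapes: `produce` seeks and paints a frame, hands it to the worker,
// and returns a promise for the encoded bytes. `sink` consumes frames in order.
type Produce = (frame: number) => Promise<{ encoded: Promise<Uint8Array> }>;
type Sink = (frame: number, jpeg: Uint8Array) => void | Promise<void>;

async function runDepth2Pipeline(
  totalFrames: number,
  produce: Produce,
  sink: Sink,
): Promise<void> {
  let prev: { frame: number; encoded: Promise<Uint8Array> } | null = null;

  for (let frame = 0; frame < totalFrames; frame++) {
    // Seek + paint the current frame while the previous one may still be encoding.
    const { encoded } = await produce(frame);

    // Drain the previous frame only after the current one has been dispatched,
    // so at most two frames are ever in flight.
    if (prev) await sink(prev.frame, await prev.encoded);
    prev = { frame, encoded };
  }

  // Flush the last in-flight frame.
  if (prev) await sink(prev.frame, await prev.encoded);
}
```

Frames still reach the sink in order, because each frame is drained before the one after it.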

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Liveness/correctness:
- Worker encode failures now propagate: the in-page worker wraps encode in
  try/catch + null-checks getContext and posts {id,error} on failure; its
  onerror posts a fatal signal (id=-1). The node binding rejects the matching
  pending promise (or all of them on fatal) instead of leaving encodeResult
  pending forever — previously any worker throw hung the render to timeout.
- Pipeline loop attaches a no-op catch to the orphaned in-flight encode on
  abort/throw so cleanup's rejection is not an unhandled promise rejection.
- drainPrev checks assertNotAborted before awaiting so aborts are observed
  while parked on the encode wait, not one frame later.
- captureFrameToBufferPipelined wraps capture in captureFrameErrorDiagnostics,
  restoring per-frame frame-error PNG/HTML/JSON the serial path produced.
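
The failure-propagation protocol in the first bullet can be sketched on the node side as a map of pending promises keyed by frame id. This is an illustration under assumptions: the `PendingEncodes` class and the `data` field name are hypothetical, and only the `{id, error}` shape and the fatal `id = -1` signal come from the commit message.

```typescript
// Message shapes: per-frame success or failure, plus a fatal signal with id === -1.
// The `data` field name is an assumption for this sketch.
type WorkerMessage = { id: number; data: string } | { id: number; error: string };

type Settlers = { resolve: (b64: string) => void; reject: (e: Error) => void };

class PendingEncodes {
  private pending = new Map<number, Settlers>();
  private nextId = 0;

  // Register a frame and get back the promise its worker reply will settle.
  track(): { id: number; result: Promise<string> } {
    const id = this.nextId++;
    const result = new Promise<string>((resolve, reject) => {
      this.pending.set(id, { resolve, reject });
    });
    return { id, result };
  }

  // Route one worker message to the matching pending promise.
  handle(msg: WorkerMessage): void {
    if (msg.id === -1) {
      // Fatal: the worker itself failed, so every in-flight encode is rejected.
      const err = new Error("error" in msg ? msg.error : "worker failed");
      this.pending.forEach((entry) => entry.reject(err));
      this.pending.clear();
      return;
    }
    const entry = this.pending.get(msg.id);
    if (!entry) return; // unknown or already-settled id
    this.pending.delete(msg.id);
    if ("error" in msg) entry.reject(new Error(msg.error));
    else entry.resolve(msg.data);
  }

  pendingCount(): number {
    return this.pending.size;
  }
}
```

Rejecting by id keeps one bad frame from affecting the others, while the fatal path guarantees nothing stays pending after a worker crash.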

Lifecycle/efficiency:
- __hfFrameReady binding tracked via a separate workerEncodeBoundPages WeakSet
  so a re-init after cleanup doesn't call exposeFunction twice ('already exists').
- URL.revokeObjectURL after Worker construction (was leaked per init).
- Drop the unconditional setTimeout(0) around createImageBitmap (~1-4ms/frame
  of macrotask latency on the produce critical path; encode runs in the worker).
- Reset nextId on session reuse; remove the unreachable BeginFrame branch from
  the pipelined path (gated to beginFrameTimeTicks===0).
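
The binding guard in the first lifecycle bullet can be sketched as follows. This is a hedged illustration: `BindablePage` is a minimal stand-in for the page object (Puppeteer's `Page.exposeFunction` rejects when the name is already bound), and `ensureFrameReadyBinding` plus the `currentHandler` indirection are assumptions of this sketch, not names from the PR.

```typescript
// Minimal structural stand-in for the page API used here.
interface BindablePage {
  exposeFunction(name: string, fn: (payload: string) => void): Promise<void>;
}

// Pages that already carry the binding. A WeakSet lets closed pages be collected.
const boundPages = new WeakSet<BindablePage>();
// Assumption of this sketch: route calls through a per-page handler slot, so a
// re-init can swap the handler without touching the binding itself.
const currentHandler = new WeakMap<BindablePage, (payload: string) => void>();

async function ensureFrameReadyBinding(
  page: BindablePage,
  onFrame: (payload: string) => void,
): Promise<void> {
  currentHandler.set(page, onFrame);
  if (boundPages.has(page)) return; // binding outlives cleanup, so never expose twice

  await page.exposeFunction("__hfFrameReady", (payload) => {
    const handler = currentHandler.get(page);
    if (handler) handler(payload);
  });
  boundPages.add(page); // recorded only after exposeFunction succeeded
}
```

Tracking the binding separately from the per-session encode state is what lets cleanup reset the session while a later init stays safe.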

Base64 stays in the worker (off the main thread) by design; a binary
side-channel to node is a follow-up. Gated off by default.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Co-authored-by: Miguel Ángel <miguel07alm@protonmail.com>

… (re-review)

Re-review found the prior orphan-rejection fix attached to the wrong promise.

- Unhandled-rejection guard moved to source: produceDrawElementFrame attaches a
  no-op .catch to every encodeResult at creation, covering the depth-2 loop's
  orphaned in-flight frame. Removed the ineffective loop-level prev.catch.
- Per-frame encode watchdog (30s): a lost worker message no longer hangs the
  render to the protocol timeout (onerror->id=-1 only covered crashes).
- Empty-frame guard: payload-less worker success rejects instead of resolving a
  0-byte Buffer ffmpeg would write as a corrupt frame.
- Worker reuses one OffscreenCanvas across frames (was per-frame alloc).
- Close the ImageBitmap on the worker-missing reject path (GPU leak).
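
The first three guards in this commit compose naturally around a single promise. The sketch below is illustrative only: `guardEncodeResult` is a hypothetical helper, the payload is kept as a base64 string for simplicity, and only the 30 s default, the at-creation no-op catch, and the empty-payload rejection come from the commit message.

```typescript
// Wraps the raw worker reply with a watchdog, an empty-frame guard, and a
// no-op catch attached at creation time.
function guardEncodeResult(
  raw: Promise<string | undefined>,
  timeoutMs: number = 30000,
): Promise<string> {
  const guarded = new Promise<string>((resolve, reject) => {
    // Watchdog: a lost worker message rejects instead of hanging the render.
    const timer = setTimeout(
      () => reject(new Error("encode watchdog: no worker reply within " + timeoutMs + "ms")),
      timeoutMs,
    );
    raw.then(
      (b64) => {
        clearTimeout(timer);
        // Empty-frame guard: a payload-less success must not become a 0-byte frame.
        if (!b64) reject(new Error("worker returned an empty frame"));
        else resolve(b64);
      },
      (err) => {
        clearTimeout(timer);
        reject(err);
      },
    );
  });

  // No-op catch at the source: an orphaned in-flight frame no longer triggers an
  // unhandled rejection, while anyone who awaits `guarded` still sees the error.
  guarded.catch(() => {});
  return guarded;
}
```

Attaching the catch where the promise is created covers every consumer at once, which is why a loop-level catch on a derived promise is unnecessary.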

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@vanceingalls force-pushed the drawelement-worker-encode branch from 1b418a9 to 52fd9c2 on June 15, 2026 06:08
