feat: widget buffers in blob store + update_comm request type with coalescing #811

@rgbkrk

Phase D of #761: Widget buffers in blob store + update_comm request

After Phase C (#810) eliminates the broadcast paths, this phase unifies the binary data strategy and adds a clean request type for frontend-initiated widget state changes.

Widget buffers through the blob store

Currently, widget binary buffers (numpy arrays, image data) are carried inline as Vec<Vec<u8>> on CommSnapshot.buffers and base64-encoded in JSON broadcasts. Base64 encodes every 3 bytes as 4 characters, so this inflates the binary payload by about 33%, and it bypasses the blob store.

New approach: When the daemon processes a comm_open or comm_msg with buffers:

  1. Store each buffer in the blob store → get hash
  2. Walk buffer_paths and replace the value at each path in state with a {"$blob": "<hash>"} sentinel
  3. Write the modified state JSON to doc.comms[comm_id].state
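A minimal sketch of step 2, written in TypeScript for illustration only (the daemon itself lives in the Rust crates). The names insertBlobRefs, Json, and BufferPath are hypothetical, and the sketch assumes hashes[i] corresponds to bufferPaths[i] and that each path is a list of object keys or array indices:

```typescript
// JSON-like value; widget state is assumed to be plain JSON.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// A buffer path addresses one value in the state via object keys or array indices.
type BufferPath = (string | number)[];

/**
 * Returns a copy of `state` in which the value at each buffer path is replaced
 * by a {"$blob": hash} sentinel. Assumption: hashes[i] belongs to bufferPaths[i].
 */
function insertBlobRefs(state: Json, bufferPaths: BufferPath[], hashes: string[]): Json {
  if (bufferPaths.length !== hashes.length) {
    throw new Error("buffer_paths and hashes must have the same length");
  }
  // Deep copy so the caller's state is left untouched (state is plain JSON).
  const out: Json = JSON.parse(JSON.stringify(state));
  bufferPaths.forEach((path, i) => {
    if (path.length === 0) throw new Error("empty buffer path");
    let node: any = out;
    // Walk to the parent of the target, creating plain objects for missing segments.
    for (const key of path.slice(0, -1)) {
      if (node[key] === null || typeof node[key] !== "object") node[key] = {};
      node = node[key];
    }
    node[path[path.length - 1]] = { $blob: hashes[i] };
  });
  return out;
}
```

Creating missing intermediate segments as plain objects is a defensive choice made for this sketch; a real implementation may prefer to reject paths whose parent does not exist.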

Frontend hydration: When reading widget state from the doc:

  1. Parse state JSON
  2. Scan for {"$blob": hash} sentinels
  3. Fetch each blob via HTTP → ArrayBuffer
  4. Replace sentinels with ArrayBuffers
  5. Pass hydrated state to widget renderer
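The hydration steps above could be sketched roughly as follows. This is an assumption-laden outline, not the actual frontend code: isBlobRef, collectBlobHashes, and hydrate are hypothetical names, a sentinel is assumed to be an object whose only key is "$blob", and the HTTP fetch (step 3) is left to the caller, who passes the fetched blobs in as a hash-to-ArrayBuffer lookup.

```typescript
// A sentinel is assumed to be an object with exactly one key, "$blob", holding a string hash.
function isBlobRef(v: unknown): v is { $blob: string } {
  if (typeof v !== "object" || v === null || Array.isArray(v)) return false;
  const keys = Object.keys(v);
  return keys.length === 1 && keys[0] === "$blob" &&
    typeof (v as { [k: string]: unknown })["$blob"] === "string";
}

// Step 2: scan the parsed state and list each distinct blob hash once.
function collectBlobHashes(value: unknown, found: string[] = []): string[] {
  if (isBlobRef(value)) {
    if (found.indexOf(value.$blob) === -1) found.push(value.$blob);
  } else if (Array.isArray(value)) {
    value.forEach((v) => collectBlobHashes(v, found));
  } else if (typeof value === "object" && value !== null) {
    const obj = value as { [k: string]: unknown };
    Object.keys(obj).forEach((k) => collectBlobHashes(obj[k], found));
  }
  return found;
}

// Step 4: return a copy of the state with every sentinel replaced by its ArrayBuffer.
function hydrate(value: unknown, blobs: { [hash: string]: ArrayBuffer }): unknown {
  if (isBlobRef(value)) {
    const buf = blobs[value.$blob];
    if (buf === undefined) throw new Error(`missing blob ${value.$blob}`);
    return buf;
  }
  if (Array.isArray(value)) return value.map((v) => hydrate(v, blobs));
  if (typeof value === "object" && value !== null) {
    const obj = value as { [k: string]: unknown };
    const out: { [k: string]: unknown } = {};
    Object.keys(obj).forEach((k) => { out[k] = hydrate(obj[k], blobs); });
    return out;
  }
  return value;
}
```

Deduplicating hashes before fetching means a blob referenced from several places in the state is downloaded once and the same ArrayBuffer is shared by every reference.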

For the security iframe: The parent window resolves all blob refs before posting via postMessage (structured clone transfers ArrayBuffers efficiently). The iframe never accesses the blob HTTP server directly.
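One possible way for the parent to hand hydrated state to the iframe is to pass the ArrayBuffers in postMessage's transfer list, so they are moved rather than copied. The helper below is a hypothetical sketch (collectArrayBuffers is not an existing function in the codebase), and the message shape in the trailing comment is illustrative only:

```typescript
// Collects each distinct ArrayBuffer found in a hydrated state tree.
// Deduplication matters: listing the same buffer twice in a transfer list
// makes postMessage throw a DataCloneError.
function collectArrayBuffers(value: unknown, found: ArrayBuffer[] = []): ArrayBuffer[] {
  if (value instanceof ArrayBuffer) {
    if (found.indexOf(value) === -1) found.push(value);
  } else if (Array.isArray(value)) {
    value.forEach((v) => collectArrayBuffers(v, found));
  } else if (typeof value === "object" && value !== null) {
    const obj = value as { [k: string]: unknown };
    Object.keys(obj).forEach((k) => collectArrayBuffers(obj[k], found));
  }
  return found;
}

// Illustrative use in the parent window (message shape is an assumption):
//   iframe.contentWindow?.postMessage(
//     { type: "widget_state", state: hydrated },
//     targetOrigin,
//     collectArrayBuffers(hydrated),
//   );
```

Transferred buffers become detached in the parent, so this only fits if the parent no longer needs them; omitting the transfer list makes structured clone copy the buffers instead.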

update_comm request type

Currently, frontend-initiated widget state changes (slider drag) go through SendComm which wraps a full Jupyter comm_msg. This is overloaded — SendComm handles both state updates and custom messages.

New request:

{"action": "update_comm", "comm_id": "widget-1", "state_delta": {"value": 42}}

The daemon:

  1. Reads current state from doc.comms[comm_id].state
  2. Merges delta
  3. Writes merged state to doc
  4. Sends comm_msg with method: "update" to kernel
  5. Responds {"result": "ok"}
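The five daemon steps can be sketched as below, in TypeScript for illustration (the real handler would live in crates/runtimed). Several things here are assumptions rather than part of the proposal: the merge is a shallow key-by-key overwrite, the kernel message carries only the delta, an unknown comm_id raises an error, and CommDoc, KernelSend, and handleUpdateComm are hypothetical names.

```typescript
type CommState = { [key: string]: unknown };

interface UpdateCommRequest {
  action: "update_comm";
  comm_id: string;
  state_delta: CommState;
}

// Hypothetical stand-ins for the synced doc and the kernel channel.
interface CommDoc { comms: { [commId: string]: { state: CommState } } }
type KernelSend = (msg: { comm_id: string; data: { method: "update"; state: CommState } }) => void;

function handleUpdateComm(doc: CommDoc, req: UpdateCommRequest, sendToKernel: KernelSend): { result: string } {
  // 1. Read current state from doc.comms[comm_id].state.
  const entry = doc.comms[req.comm_id];
  if (entry === undefined) throw new Error(`unknown comm_id ${req.comm_id}`);
  // 2. Merge the delta (assumption: shallow merge, delta keys win).
  const merged: CommState = { ...entry.state, ...req.state_delta };
  // 3. Write the merged state back to the doc.
  entry.state = merged;
  // 4. Send a comm_msg with method "update" to the kernel (assumption: delta only).
  sendToKernel({ comm_id: req.comm_id, data: { method: "update", state: req.state_delta } });
  // 5. Respond.
  return { result: "ok" };
}
```

Whether nested objects in the delta should be merged deeply or replaced wholesale is left open by the description above; the shallow merge here replaces them wholesale.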

Coalescing for high-frequency updates: The daemon coalesces rapid update_comm requests (configurable window, default 16ms). Multiple deltas for the same comm_id within the window are merged into one doc write and one kernel message. This bounds CRDT history growth during slider drags.
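As a rough illustration of the coalescing behaviour, the sketch below merges deltas per comm_id and flushes them once per window. It is written in TypeScript rather than the daemon's Rust, UpdateCoalescer is a hypothetical name, and it assumes the window starts at the first pending delta and that later deltas overwrite earlier keys:

```typescript
type Delta = { [key: string]: unknown };
type Schedule = (fn: () => void, ms: number) => void;

class UpdateCoalescer {
  private pending: { [commId: string]: Delta } = {};
  private timerArmed = false;

  constructor(
    // Called once per comm_id per window with the merged delta
    // (this is where one doc write and one kernel message would happen).
    private onFlush: (commId: string, merged: Delta) => void,
    private windowMs: number = 16,
    // Injectable timer so the behaviour can be tested without real delays.
    private schedule: Schedule = (fn, ms) => { setTimeout(fn, ms); },
  ) {}

  push(commId: string, delta: Delta): void {
    // Later deltas overwrite earlier values for the same key.
    this.pending[commId] = { ...(this.pending[commId] || {}), ...delta };
    // Arm a single timer at the first delta of a window.
    if (!this.timerArmed) {
      this.timerArmed = true;
      this.schedule(() => this.flush(), this.windowMs);
    }
  }

  flush(): void {
    const batch = this.pending;
    this.pending = {};
    this.timerArmed = false;
    Object.keys(batch).forEach((id) => this.onFlush(id, batch[id]));
  }
}
```

With this shape, any number of deltas for one comm_id arriving inside a window produce a single onFlush call, which is what bounds the number of doc writes during a drag.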

Frontend optimistic updates: The frontend applies state changes locally (in WidgetStore / React state) immediately. The doc sync carries the daemon's write back, reconciling with the optimistic state.

SendComm retained for custom messages only

SendComm stays but is restricted to method: "custom" messages (button clicks, ipycanvas draw commands, model.send()). These are opaque, imperative, and can't be modeled as state.

Implementation

crates/notebook-doc/:

  • update_comm_state_with_blobs(comm_id, state_json, buffer_hashes, buffer_paths) — resolves buffer_paths to {"$blob": hash} sentinels in state JSON

crates/runtimed/:

  • In IOPub handler: store buffers in blob store before writing to doc
  • Add NotebookRequest::UpdateComm { comm_id, state_delta } variant
  • Handle in handle_notebook_request: read current state, merge delta, write to doc, forward to kernel
  • Add coalescing: buffer rapid UpdateComm requests, flush on 16ms timer
  • Remove CommSnapshot.buffers field (replaced by blob refs in state)

crates/notebook-protocol/:

  • Add UpdateComm to NotebookRequest enum
  • Remove buffers: Vec<Vec<u8>> from CommSnapshot (or keep as legacy compat)

Frontend:

  • Blob hydration layer: scan state JSON for {"$blob": hash}, fetch from HTTP, replace with ArrayBuffer
  • useCommRouter.sendUpdate() → sends UpdateComm request instead of full SendComm
  • Remove base64 buffer encoding from outbound widget messages (state deltas don't carry buffers — those go through blob store on the daemon side)

Testing

  • Widget with binary buffers (plotly, ipyimage) → buffers in blob store, refs in state
  • New window sees widget with binary data (blob refs resolve correctly)
  • Slider drag at 60Hz → coalesced to at most one doc write per 16ms window (~60 doc writes/sec or fewer)
  • Button click → SendComm with method: "custom" still works
  • Frontend optimistic update → reconciles with doc sync

Size

Medium-Large — blob integration + new request type + frontend hydration layer.

Part of #761. Depends on #810.

Labels: architecture, enhancement, ipywidgets, sync
