Summary
Widget state currently lives in two ephemeral in-memory replicas — CommState (daemon, Rust) and WidgetStore (frontend, JS) — connected by a tokio::broadcast channel. This architecture has several problems:
- Silent data loss: Broadcast channel lag silently drops messages
- No persistence: Widget state dies with the kernel. A daemon restart loses everything.
- CommSync replay is fragile: Late-joining clients get a snapshot sorted by insertion order, but if a widget references another that was created later (unlikely but possible with layout manipulation), the replay breaks.
- No echo suppression: Frontend updates round-trip through the kernel and come back as duplicate store mutations, causing unnecessary React re-renders.
- Output widget capture is a message-forwarding workaround: The daemon intercepts cell outputs and re-routes them as comm_msg(custom) messages to Output widgets, because there's no shared document for the widget to read from.
The fix: put widget state in the Automerge CRDT document, the same way cell outputs and metadata already live there.
What moves to the CRDT
A comms/ map in the notebook doc (not a separate doc — see rationale below):
    ROOT/
      schema_version: u64      ← bump to 3
      ...existing cells/, metadata/...
      comms/                   ← Map keyed by comm_id (NEW)
        {comm_id}/
          target_name: Str     ← "jupyter.widget"
          model_module: Str    ← "@jupyter-widgets/controls" | "anywidget" | ...
          model_name: Str      ← "IntSliderModel" | "OutputModel" | ...
          state: Str           ← JSON-encoded widget state (blob refs as {"$blob": "<hash>"})
          outputs/             ← List<Str> (OutputModel only: manifest hashes, same format as cell outputs)
          seq: u64             ← Insertion order for dependency-correct replay
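To make the shape of one comms/ entry concrete, here is a minimal sketch in Python. It is an illustration only: the real document is an Automerge doc written by the Rust daemon, and `CommEntry`, `make_entry`, the comm ids, and the `"abc123"` hash are hypothetical names and values chosen for the example.

```python
import json
from dataclasses import dataclass, field


@dataclass
class CommEntry:
    """One value under comms/{comm_id}/ in the proposed schema (illustrative)."""
    target_name: str
    model_module: str
    model_name: str
    state: str                      # JSON-encoded widget state
    seq: int                        # insertion order, used for replay
    outputs: list[str] = field(default_factory=list)  # manifest hashes (OutputModel only)


def make_entry(seq, model_module, model_name, state):
    # Hypothetical helper: encode the state dict as the JSON string the schema stores.
    return CommEntry(
        target_name="jupyter.widget",
        model_module=model_module,
        model_name=model_name,
        state=json.dumps(state, sort_keys=True),
        seq=seq,
    )


comms = {
    "slider-1": make_entry(0, "@jupyter-widgets/controls", "IntSliderModel",
                           {"value": 5, "min": 0, "max": 10}),
    # Binary payloads are not inlined; the state carries a blob sentinel instead.
    "img-1": make_entry(1, "@jupyter-widgets/controls", "ImageModel",
                        {"format": "png", "value": {"$blob": "abc123"}}),
}
```

Keeping state as a single JSON string means a reader decodes it once per entry, while seq and outputs stay directly addressable fields.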
Why same doc, not separate doc
The original version of this issue proposed a separate Automerge doc scoped to the kernel session. After the notebook-sync DocHandle refactor (#786) and the native metadata migration (#791), the same-doc approach is better:
- One sync mechanism, not two. The metadata migration proved that adding structured data to the notebook doc works. Adding comms/ is the same pattern: put_json_at_key() already exists.
- No second sync connection per client. Each client already syncs the notebook doc. A second doc means a second DocHandle, second sync task, second set of connection plumbing.
- DocHandle.with_doc() is atomic across cells and comms. The daemon can clear outputs, write to doc.comms[widget_id].outputs, and sync, all in one lock acquisition. With two docs, cross-doc atomicity requires coordination.
- Output widget simplification. If cell outputs and widget outputs are in the same doc, the daemon writes to one of two locations using the same blob manifest pipeline. No custom message protocol needed.
- New clients get everything from one sync. No CommSync, no Phase 1.5 handshake.
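The atomicity argument can be sketched with a toy model. This is Python, not the Rust DocHandle, and it assumes only what the bullet states: one lock guards the whole document, so a closure can touch cells and comms together. `DocHandleModel`, `reroute_capture`, and the ids and hashes are hypothetical.

```python
import threading


class DocHandleModel:
    """Hypothetical stand-in for DocHandle: one lock guards the whole document."""

    def __init__(self):
        self._lock = threading.Lock()
        self._doc = {"cells": {}, "comms": {}}
        self.lock_acquisitions = 0

    def with_doc(self, fn):
        # The closure sees cells and comms together under a single lock.
        with self._lock:
            self.lock_acquisitions += 1
            return fn(self._doc)


def reroute_capture(doc, cell_id, widget_id, manifest_hash):
    # Clear the cell's outputs and append to the widget's outputs in one step.
    doc["cells"][cell_id]["outputs"] = []
    doc["comms"][widget_id]["outputs"].append(manifest_hash)


handle = DocHandleModel()
handle.with_doc(lambda d: d["cells"].update({"cell-1": {"outputs": ["old-hash"]}}))
handle.with_doc(lambda d: d["comms"].update({"out-1": {"outputs": []}}))
handle.with_doc(lambda d: reroute_capture(d, "cell-1", "out-1", "new-hash"))
```

With two separate documents, the last call would need two handles and some ordering rule between them; here a concurrent reader sees either the state before the closure ran or the state after it.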
The concerns about the same-doc approach are mitigable:
- Lifetime (kernel session vs notebook file): The daemon clears doc.comms on kernel shutdown. Same effect as destroying a separate doc.
- High-frequency updates (slider drag): The daemon coalesces rapid updates (16ms window). Automerge overhead for a single scalar write is a ~50-100 byte sync message, trivial for a Unix socket.
- History growth: Compaction on kernel restart (snapshot the doc, discard history) handles this.
- .ipynb persistence: The save-to-disk path already selectively reads cells and metadata, so it simply ignores comms/.
What moves
| Data | Current location | CRDT location |
|---|---|---|
| Widget existence (open/close) | CommState HashMap + broadcast | doc.comms map |
| Widget state (slider value, button style, etc.) | CommState + WidgetStore | doc.comms[id].state |
| Output widget captured outputs | Custom message dance (daemon → broadcast → frontend → sendUpdate back) | doc.comms[id].outputs: daemon writes directly, same manifest pipeline as cell outputs |
| Binary buffers (images, numpy arrays) | In-memory on CommSnapshot.buffers, base64 | Blob store, hash refs in state as {"$blob": "<hash>"} sentinels |
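The last row can be sketched as follows. This is a hedged Python illustration, not the daemon's code. It assumes widget messages carry buffer_paths alongside the binary buffers, as in the Jupyter widget messaging protocol, and it assumes SHA-256 content hashing and a dict-backed blob store. The issue specifies neither of those last two, and `externalize_buffers` is a hypothetical name.

```python
import hashlib
import json


def externalize_buffers(state, buffer_paths, buffers, blob_store):
    """Return JSON for `state` with each buffer replaced by a {"$blob": hash} sentinel."""
    state = json.loads(json.dumps(state))  # deep copy of the JSON-compatible state
    for path, buf in zip(buffer_paths, buffers):
        digest = hashlib.sha256(buf).hexdigest()   # assumed hashing scheme
        blob_store[digest] = bytes(buf)            # binary payload lives outside the doc
        target = state
        for key in path[:-1]:
            target = target[key]
        target[path[-1]] = {"$blob": digest}
    return json.dumps(state, sort_keys=True)


blob_store = {}
png_bytes = b"\x89PNG placeholder bytes"
encoded = externalize_buffers({"format": "png"}, [["value"]], [png_bytes], blob_store)
decoded = json.loads(encoded)
```

Because the sentinel is plain JSON, the state string stays small no matter how large the image or array is, and identical buffers resolve to the same blob store key.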
What stays on events/broadcasts
| Data | Why |
|---|---|
| Custom messages (method: "custom", model.send()) | Non-idempotent ordered events (button clicks, ipycanvas draw commands). CRDTs model state, not event streams. |
| ExecutionStarted, ExecutionDone, QueueChanged | UI animation hints. The doc is authoritative; events are just fast-path signals. |
| KernelError, EnvProgress, EnvSyncState, FileChanged | Genuinely ephemeral |
What this eliminates
- CommSync handshake: New clients just sync the doc. The entire Phase 1.5 disappears.
- CommState struct (most of it): The daemon writes to the doc instead of maintaining a parallel HashMap. Output widget capture routing stays (it's routing logic, not state).
- 5 broadcast variants: CommSync, Comm (for open/update/close), Output, OutputsCleared, DisplayUpdate, all replaced by doc sync. Broadcast surface: 13 → 8 variants.
- Echo suppression problem: CRDT merge of identical state is a no-op.
- Output widget custom message protocol: No more {method: "output", output: ...} / {method: "clear_output", wait: bool}. The daemon writes captured outputs to doc.comms[widget_id].outputs using the same blob manifest pipeline as cell outputs. The frontend renders them the same way.
- closedModels set in WidgetStore: If it's not in the doc, it doesn't exist.
- sendUpdate feedback loop for Output widgets.
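The echo-suppression point can be illustrated with a small sketch. It models the intended outcome (an identical update produces no change and therefore no re-render) with an explicit comparison in Python. It does not reproduce Automerge's merge internals, and `merge_state` is a hypothetical helper.

```python
import json

_MISSING = object()  # sentinel so a missing key never compares equal to a real value


def merge_state(state_json, delta):
    """Merge `delta` into JSON-encoded state; report which keys actually changed."""
    current = json.loads(state_json)
    changed = [k for k, v in delta.items() if current.get(k, _MISSING) != v]
    if not changed:
        return state_json, []  # identical state: nothing to write, nothing to re-render
    current.update(delta)
    return json.dumps(current, sort_keys=True), changed


state = json.dumps({"value": 5, "min": 0}, sort_keys=True)
state, changed_on_echo = merge_state(state, {"value": 5})  # kernel echoes the same value
state, changed_on_edit = merge_state(state, {"value": 7})  # a real change
```

A frontend driven by the list of changed keys would skip the echo entirely and re-render only on the second call.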
Precedent from recent refactors
| PR | What it proved |
|---|---|
| #786, #789 notebook-sync DocHandle | with_doc(\|doc\| ...) gives synchronous access. SyncCommand enum shrank from ~15 variants to 4. Perfect for comm writes. |
| #791, #800 Native metadata | put_json_at_key() recursively stores JSON as native Automerge types. Dual-write → remove legacy path. Same playbook applies. |
| #797 Sync-before-ExecutionDone | Validates "events are hints, not state." The fix enforces doc sync before broadcast — same principle for CommSync elimination. |
| #755, #789 Python reads from doc | Python already ignores Output broadcasts and reads from the doc via confirm_sync() + get_cells(). Same pattern extends to get_comms(). |
Implementation plan
Phase A: Schema + dual-write — #808
- Add comms map to NotebookDoc::new(), migrate_v2_to_v3(), bump schema_version to 3
- Add put_comm, update_comm_state, remove_comm, get_comms, clear_comms methods
- For OutputModel: append_comm_output, clear_comm_outputs
- Daemon dual-writes to doc.comms AND CommState on comm_open / comm_msg(update) / comm_close
- Keep CommSync as fallback, so no behavior changes
- Size: Medium
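A rough model of the Phase A dual-write, sketched in Python rather than the daemon's Rust. The method names come from the list above; their signatures, the `NotebookDocModel` and `DaemonModel` classes, and the use of a plain dict in place of the Automerge document are assumptions made for illustration. Reading the module and model name from the `_model_module` and `_model_name` state keys follows the ipywidgets convention.

```python
import json


class NotebookDocModel:
    """Toy stand-in for NotebookDoc; a plain dict replaces the Automerge document."""

    def __init__(self):
        self.root = {"schema_version": 3, "cells": {}, "metadata": {}, "comms": {}}
        self._next_seq = 0

    def put_comm(self, comm_id, target_name, state):
        self.root["comms"][comm_id] = {
            "target_name": target_name,
            "model_module": state.get("_model_module", ""),
            "model_name": state.get("_model_name", ""),
            "state": json.dumps(state, sort_keys=True),
            "outputs": [],
            "seq": self._next_seq,
        }
        self._next_seq += 1

    def update_comm_state(self, comm_id, delta):
        entry = self.root["comms"][comm_id]
        merged = json.loads(entry["state"])
        merged.update(delta)
        entry["state"] = json.dumps(merged, sort_keys=True)

    def remove_comm(self, comm_id):
        self.root["comms"].pop(comm_id, None)

    def get_comms(self):
        # Creation order, so dependents can be replayed after their dependencies.
        return sorted(self.root["comms"].items(), key=lambda item: item[1]["seq"])

    def clear_comms(self):
        self.root["comms"].clear()


class DaemonModel:
    def __init__(self):
        self.doc = NotebookDocModel()
        self.comm_state = {}  # legacy in-memory replica, kept during Phase A

    def on_comm_open(self, comm_id, target_name, state):
        self.comm_state[comm_id] = dict(state)          # legacy path
        self.doc.put_comm(comm_id, target_name, state)  # new path

    def on_comm_update(self, comm_id, delta):
        self.comm_state[comm_id].update(delta)
        self.doc.update_comm_state(comm_id, delta)

    def on_comm_close(self, comm_id):
        self.comm_state.pop(comm_id, None)
        self.doc.remove_comm(comm_id)


daemon = DaemonModel()
daemon.on_comm_open("slider-1", "jupyter.widget", {
    "_model_module": "@jupyter-widgets/controls",
    "_model_name": "IntSliderModel",
    "value": 5,
})
daemon.on_comm_update("slider-1", {"value": 9})
doc_state = json.loads(daemon.doc.get_comms()[0][1]["state"])
```

While both paths are live, comparing the doc state against the legacy replica, as the final line sets up, is a cheap consistency check before Phase B starts reading from the doc.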
Phase B: Frontend + Python read from doc — #809
- Add get_comms() to WASM NotebookHandle and notebook-sync DocHandle
- Frontend watches doc.comms after sync_applied, drives WidgetStore from doc state
- Python session.get_widgets() reads from doc after confirm_sync()
- CommSync still sent as backup during transition
- Size: Medium-Large
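One way the "drive WidgetStore from doc state" step could work is a snapshot diff after each sync. The sketch below is Python for consistency with the other examples (the real frontend is JS), and `diff_comms` plus the op tuples are hypothetical. It assumes each entry exposes the seq and state fields from the schema.

```python
def diff_comms(prev, curr):
    """Return ordered ops that bring a widget store from `prev` to `curr`."""
    ops = []
    # Widgets that vanished from the doc are closed.
    for comm_id in sorted(prev.keys() - curr.keys()):
        ops.append(("close", comm_id))
    # New widgets open in seq order so dependencies exist before dependents.
    for comm_id in sorted(curr.keys() - prev.keys(), key=lambda c: curr[c]["seq"]):
        ops.append(("open", comm_id))
    # Surviving widgets update only if their encoded state changed.
    for comm_id in sorted(prev.keys() & curr.keys()):
        if prev[comm_id]["state"] != curr[comm_id]["state"]:
            ops.append(("update", comm_id))
    return ops


prev = {
    "slider-1": {"seq": 0, "state": '{"value": 5}'},
    "button-1": {"seq": 1, "state": '{"description": "Go"}'},
}
curr = {
    "slider-1": {"seq": 0, "state": '{"value": 7}'},
    "box-1": {"seq": 3, "state": '{"children": ["IPY_MODEL_label-1"]}'},
    "label-1": {"seq": 2, "state": '{"value": "hi"}'},
}
ops = diff_comms(prev, curr)
```

Here label-1 opens before box-1 even though the dict lists box-1 first, because the open ops are ordered by seq.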
Phase C: Eliminate parallel paths — #810
- Remove CommSync, Comm (open/update/close), Output, OutputsCleared, DisplayUpdate broadcast variants
- Output widget captured outputs → doc.comms[widget_id].outputs (same blob pipeline as cell outputs)
- update_display_data scans doc.comms[*].outputs in addition to cells
- Reduce CommState to OutputCaptureRouter (capture routing logic only)
- Apply sync-before-event pattern for execution_started (like #797, "fix(daemon): sync doc to peer before forwarding ExecutionDone")
- Size: Large. This is where the real simplification lands
Phase D: Binary unification + update_comm — #811
- Widget buffers through blob store: {"$blob": "<hash>"} sentinels in state JSON
- New UpdateComm { comm_id, state_delta } request type (clean path for frontend→kernel state changes)
- Daemon coalesces rapid update_comm requests (16ms window) to bound CRDT history growth
- Frontend optimistic updates reconcile with doc sync
- SendComm retained only for method: "custom" messages
- Size: Medium-Large
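The coalescing step might look like the sketch below. It is Python with explicit timestamps so the behavior is deterministic, where a real daemon would use a timer. Only the 16ms window comes from the plan; `UpdateCoalescer`, `push`, and `flush_due` are hypothetical names, and "flush once the first pending update is at least one window old" is one possible policy.

```python
class UpdateCoalescer:
    """Merge rapid state deltas per comm and release them once per window."""

    def __init__(self, window_ms=16):
        self.window_ms = window_ms
        self._pending = {}  # comm_id -> (first_seen_ms, merged_delta)

    def push(self, comm_id, delta, now_ms):
        if comm_id in self._pending:
            _, merged = self._pending[comm_id]
            merged.update(delta)  # later values overwrite earlier ones
        else:
            self._pending[comm_id] = (now_ms, dict(delta))

    def flush_due(self, now_ms):
        # Release every comm whose oldest pending update has aged a full window.
        due = {
            comm_id: merged
            for comm_id, (first_seen, merged) in self._pending.items()
            if now_ms - first_seen >= self.window_ms
        }
        for comm_id in due:
            del self._pending[comm_id]
        return due


coalescer = UpdateCoalescer(window_ms=16)
coalescer.push("slider-1", {"value": 1}, now_ms=0)
coalescer.push("slider-1", {"value": 2}, now_ms=5)
coalescer.push("slider-1", {"value": 3}, now_ms=10)
too_early = coalescer.flush_due(now_ms=10)
flushed = coalescer.flush_due(now_ms=16)
```

Three drag events become one doc write carrying only the final value, which is what bounds history growth during a slider drag.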
Edge cases
- clear_output(wait=True): Daemon buffers one output, clears + appends atomically on next output. Logic stays daemon-side, writes to CRDT instead of sending custom message.
- High-frequency updates (slider drag, play widget): Coalesced by daemon (16ms window). Frontend optimistic local state for immediate feedback.
- Kernel session scoping: doc.clear_comms() on kernel shutdown. Compaction opportunity.
- ipywidgets 8 echo_update: CRDT approach makes it unnecessary, since merge of identical state is a no-op.
- anywidget AFM interface: model.get(key) → read from doc. model.set(key, value) → UpdateComm request. model.on("change:key", cb) → watch doc changes. model.send() → SendComm (irreducible stream).
- Container widget ordering: seq field ensures get_comms() returns widgets in creation order for dependency-correct replay (layouts must be instantiated after their children).
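The clear_output(wait=True) case is the least obvious of these, so here is a minimal Python sketch of the deferred clear. `OutputCaptureModel` is a hypothetical model of a single Output widget's outputs list. In the proposal this logic lives in the Rust daemon, which would apply the clear and the append inside one doc write.

```python
class OutputCaptureModel:
    """Hypothetical model of one Output widget's outputs list in doc.comms."""

    def __init__(self):
        self.outputs = []            # manifest hashes, as in doc.comms[id].outputs
        self._clear_pending = False

    def clear_output(self, wait=False):
        if wait:
            self._clear_pending = True   # defer: keep old outputs visible for now
        else:
            self.outputs = []
            self._clear_pending = False

    def append_output(self, manifest_hash):
        if self._clear_pending:
            # Clear and append in one write so readers never see an empty list.
            self.outputs = [manifest_hash]
            self._clear_pending = False
        else:
            self.outputs.append(manifest_hash)


capture = OutputCaptureModel()
capture.append_output("hash-a")
capture.clear_output(wait=True)
after_deferred_clear = list(capture.outputs)  # old output is still present here
capture.append_output("hash-b")
```

The old output remains visible until its replacement arrives, which avoids the flicker that wait=True exists to prevent.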