Use compact runtime shape refs internally #4624
❌ 1 Tests Failed:
Claude Code Review
Summary
This PR introduces in-memory numeric
What's Working Well
Issues Found
Critical (Must Fix)
1. File:
The previous review treated this as "purely a leak (routing stays correct)". That conclusion is incomplete. The shape subsystem ( After such a reconnect, any code that resolves an existing shape's
(Steady-state change routing collector->consumer happens to keep working because both sides hold the same old ids internally — the break is specifically at every fresh The unbounded leak the previous review described is also still real and stacks on top: the old Suggested fix: don't re-mint ids for shapes that already have one. On
Suggestions (Nice to Have)
2. File:
3. Add reconcile/concurrency coverage for the identity layer (unchanged, and now directly testable)
The new
Issue Conformance
Still no linked issue. The PR description explains the rationale (cut the memory cost of holding millions of binary handles in ETS/maps/sets) and the implementation matches the described scope, including the deliberate scope cuts (
Previous Review Status
Iteration 5 -> 6 (only new commit:
Review iteration: 6 | 2026-06-19 |
c0b57f3 to c82a144
alco left a comment
I have only reviewed a part of it.
So far, the addition of optionality between shape_ref and shape_handle in multiple modules/functions looks like added complexity we would like to avoid. What's the heuristic I should use when reading the code to know whether a given function can end up taking a shape ref or a shape handle at runtime?
def add_shape(stack_id, shape_handle, shape, :create)
    when is_stack_id(stack_id) and is_shape_handle(shape_handle) do
  GenServer.call(
This should just call the local add_shape() function to avoid duplicating GenServer.call() and the timeout magic number.
| to_add: Map.put(state.to_add, shape_handle, shape),
  to_remove: MapSet.delete(state.to_remove, shape_handle),
  to_schedule_waiters: Map.put(state.to_schedule_waiters, shape_handle, from)
| to_add: Map.put(state.to_add, shape_ref, {shape_handle, shape}),
Hmm, so these maps and sets may now have two shapes of values? I feel like this change is missing better in-code documentation clarifying the tradeoffs. To an uninformed reader it looks odd to have similar handle_call clauses where one has the additional shape_handle param.
    end
  end

  defp shape_handle_for_ref(stack_id, shape_id) when is_shape_id(shape_id) do
Why the double naming of shape_ref and shape_id referring to the same thing?
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Convert the internal routing pipeline from the binary shape_handle to the numeric shape_id in one atomic change. ConsumerRegistry, EventRouter/Filter, ShapeLogCollector (+ RequestBatcher/FlushTracker), Partitions and DependencyLayers are now keyed by id, along with the consumer's registration name, flush notifications and the removal path. Client-facing Registry pub-sub, Storage, PublicationManager, the subquery index seeding/mark_ready and the cleaner's deferred phase remain keyed by handle.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Close the Chunk 3 transitional mixed-key state: the subquery index's
registration side was keyed by shape_id while seeding/membership was still
keyed by shape_handle, leaving every subquery shape permanently in routing
fallback. Flip the remaining index callers (seed_membership, mark_ready,
add_value, remove_value) and the index's own parameters to shape_id so the
exact-membership reader (membership_or_fallback?/member?) matches the
seeded entries again.
Also key the RefResolver and the materializer (process name, link-values
ETS cache, and the {:materializer_changes, dep_id, ...} routing message) by
shape_id, resolving dependency handle -> id at the consumer boundary.
move_broadcast.ex and subquery_tags.ex stay on shape_handle: their tag hash
md5('stack_id' || 'shape_handle' || value) must byte-match the Postgres side
and is streamed/persisted in tag headers.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…r_log Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
449886b to 6655b70
Summary
Introduces an in-memory numeric shape_id (the "runtime shape ref") that replaces the binary shape_handle for all internal routing, filtering, and identity. Shapes are held millions of times across ETS tables, maps, and sets; each shape_handle is a ~30-byte binary, so swapping the internal key to a small integer is a substantial memory win. The largest amplifier is the subquery index, where the key was duplicated 4–6× per node per shape.
shape_handle is preserved at every external/durable boundary: the HTTP API, on-disk + storage-ETS keys, the ShapeStatus authority's public face, the client change-notification Registry, and the subquery tag-hash. The shape_id is in-memory only: never persisted, freshly minted each boot.
How to review
The change is deliberately structured so most of it is trivial to review. It breaks into four parts:
1. Mechanical changes (the bulk: ~17 of 23 lib files are 90–100% mechanical)
Pure shape_handle → shape_id renames: function params, @specs, ETS tuple keys, message tuples, local vars, and is_shape_handle → is_integer guard swaps. No behaviour change. These modules are key swaps only, no algorithm change: consumer_registry.ex, event_router.ex, filter.ex, filter/indexes/subquery_index.ex, materializer.ex, where_clause.ex, request_batcher.ex, partitions.ex, dependency_layers.ex (algorithm unchanged; only the caller now supplies resolved ids), ref_resolver.ex, state.ex, effects.ex, setup_effects.ex, event_handler/subqueries/{steady,buffering}.ex, dynamic_consumer_supervisor.ex (partition hash now over id; behaviour-equivalent).
2. Core logic: shape_status.ex (the one place to read closely)
This is the feature. ShapeStatus becomes the authority that mints ids and holds the bidirectional map:
- shape_id_table ETS (id → handle reverse map) + mint_id/1 (atomic :ets.update_counter on a :seq row).
- id_for_handle/2, handle_for_id/2, fetch_shape_by_id/2, and shape_handle_for_log/2 (returns "unknown, id: N" on miss so log/telemetry callers can resolve unconditionally).
- shape_meta_table row grows to a 6-tuple (id appended); ids are minted both in add_shape/2 (new shapes) and populate_shape_meta_table/2 (restore/boot), and the reverse-map row is cleaned up in remove_shape/2.
The id is never written to SQLite; the shape_db/connection/query persistence layer is untouched.
3. Planned boundary (handle ↔ id resolution, as designed)
Resolution happens once at each edge, never on the steady-state hot path:
- shape_log_collector.ex dependency_ids/2 and event_handler_builder.ex dep_id_for_handle!/2: resolve a shape's persisted dependency handles → ids at consumer setup.
- consumer.ex consumer_pid_for_handle/2: the single handle→id resolver for cold-path/handle-boundary callers; routing-identity functions (name, stop, consumer_pid) are id-only.
- shape_handle_for_log/2 used at id-only log/telemetry sites.
Deliberately kept on shape_handle: the API/plug boundary, all Storage.* and on-disk paths, the client Elixir Registry pub-sub (register_for_changes / Registry.dispatch), PublicationManager / RelationTracker (one-entry-per-shape, low memory value), and the subquery tag-hash (subquery_tags.ex / move_broadcast.ex are unchanged; their md5("#{stack_id}#{shape_handle}…") is streamed/persisted in :tags headers and must byte-match the Postgres-side hash in querying.ex).
4. Implementation-time additions (the only non-mechanical extras; review these)
Almost all are the unavoidable consequence of introducing a lookup: once a path resolves handle→id, it must handle the lookup missing (a shape concurrently deleted):
- shape_cache.ex restore path (:error -> halt), shape_cleaner.ex (resolve id before ShapeStatus.remove_shape deletes the mapping; nil-guard helpers stop_consumer/3, remove_from_shape_log_collector/2; suspend-path :error -> :ok), consumer.ex all_materializers_alive?.
- shape_cache.ex start_shape/4 + start_materializers/3 restructured to a with chain: a materializer-id miss now aborts and purges before starting the consumer, instead of starting both regardless (symmetric graceful handling).
- snapshotter.ex looks the consumer up via ConsumerRegistry.whereis(id) directly so a snapshot-failure report still reaches it after cleanup has removed the handle→id mapping.
- shape_status.ex put_handle_id/3: a Mix.env() == :test-gated ETS seeder for routing tests (compile-gated; cannot reach prod).
Guarantees verified
- shape_id never reaches the wire (no shape_id in api/, plug/, log_items.ex, or :tags / electric-handle headers).
- ConsumerRegistry = id; client Elixir Registry = handle.
Testing
Full suite green: 391 doctests, 9 properties, 1833 tests, 0 failures. Subquery move-in/move-out integration and electric-handle header round-trip covered by existing suites.
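
For readers unfamiliar with the pattern described under "Core logic", the following is a minimal, hypothetical sketch of an id authority backed by ETS: an atomic counter row mints integer ids, a forward table maps handle to id, and a reverse table maps id back to handle. The module name ShapeIdSketch, the table layout, and the function bodies are illustrative assumptions and are not the code in shape_status.ex; only the function names mint_id, id_for_handle, handle_for_id, shape_handle_for_log and the "unknown, id: N" fallback string are taken from the description above.

```elixir
defmodule ShapeIdSketch do
  @moduledoc """
  Hypothetical sketch: mints integer ids and keeps a handle <-> id map in
  two ETS tables. Illustrative only; not the ShapeStatus implementation.
  """

  # `fwd` maps handle -> id. `rev` maps id -> handle and also holds the
  # {:seq, n} counter row used for minting.
  def new do
    fwd = :ets.new(:handle_to_id, [:set, :public])
    rev = :ets.new(:id_to_handle, [:set, :public])
    :ets.insert(rev, {:seq, 0})
    %{fwd: fwd, rev: rev}
  end

  # :ets.update_counter/3 increments and returns the new value atomically,
  # so concurrent callers never receive the same id.
  def mint_id(%{rev: rev}), do: :ets.update_counter(rev, :seq, 1)

  # Returns the existing id if the handle is already known, otherwise mints
  # one. The check-then-insert is not atomic, so this sketch assumes a
  # single writer process.
  def add_shape(tables, handle) when is_binary(handle) do
    case id_for_handle(tables, handle) do
      {:ok, id} ->
        id

      :error ->
        id = mint_id(tables)
        :ets.insert(tables.fwd, {handle, id})
        :ets.insert(tables.rev, {id, handle})
        id
    end
  end

  def id_for_handle(%{fwd: fwd}, handle) do
    case :ets.lookup(fwd, handle) do
      [{^handle, id}] -> {:ok, id}
      [] -> :error
    end
  end

  def handle_for_id(%{rev: rev}, id) when is_integer(id) do
    case :ets.lookup(rev, id) do
      [{^id, handle}] -> {:ok, handle}
      [] -> :error
    end
  end

  # Total function for log/telemetry call sites: never fails on a miss.
  def shape_handle_for_log(tables, id) do
    case handle_for_id(tables, id) do
      {:ok, handle} -> handle
      :error -> "unknown, id: #{id}"
    end
  end

  # Removes both directions of the mapping; returns :error if the handle
  # was already gone (for example, concurrently deleted).
  def remove_shape(tables, handle) do
    with {:ok, id} <- id_for_handle(tables, handle) do
      :ets.delete(tables.fwd, handle)
      :ets.delete(tables.rev, id)
      :ok
    end
  end
end

tables = ShapeIdSketch.new()
id = ShapeIdSketch.add_shape(tables, "handle-a")
{:ok, "handle-a"} = ShapeIdSketch.handle_for_id(tables, id)
:ok = ShapeIdSketch.remove_shape(tables, "handle-a")
ShapeIdSketch.shape_handle_for_log(tables, id)
# => "unknown, id: 1"
```

The last call illustrates why every handle→id or id→handle path has to tolerate a miss: once the mapping is removed, a caller still holding the id only gets the fallback string, and a caller holding the handle gets :error.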