
Fix flaky PublicationManager relation tracker restart test #4635

Open
erik-the-implementer wants to merge 2 commits into main from fix/4606-relation-tracker-restart-race


@erik-the-implementer

Contributor

Fixes #4606.

Problem

The PublicationManagerTest "handles relation tracker restart" test is flaky. It stops the RelationTracker (the supervisor then restarts it) and relies on assert_pub_tables as a "wait for restart" barrier — but it isn't one.

The relation is already in the publication and is not removed during the restart, so assert_pub_tables(ctx, [ctx.relation], …) passes immediately, potentially before the supervisor has restarted and re-registered the process. The subsequent PublicationManager.remove_shape/2 call (→ GenServer.call(name(stack_id), …)) then races with a "no process" exit.
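The failure mode can be reproduced in isolation. The sketch below is not taken from the test suite: `NoprocSketch` and the registered name are hypothetical, and it uses a plain atom name, whereas the real `name(stack_id)` may resolve through a registry. It only shows that `GenServer.call/2` exits with a `:noproc` reason when nothing is registered under the target name, which is the window between the tracker stopping and the supervisor re-registering it.

```elixir
defmodule NoprocSketch do
  # Calls a registered name and reports whether the call exited because
  # no process was registered under that name at the time of the call.
  def safe_call(name, msg) do
    {:ok, GenServer.call(name, msg)}
  catch
    :exit, {:noproc, _} -> {:error, :noproc}
  end
end

# Nothing is registered under this hypothetical name, so the call exits
# with :noproc instead of reaching a server.
{:error, :noproc} = NoprocSketch.safe_call(:relation_tracker_sketch_not_started, :ping)
```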

Fix

Wait for the RelationTracker to be re-registered, then for it to finish restoring its filters from ShapeStatus, before issuing further calls:

assert wait_until(fn -> is_pid(GenServer.whereis(relation_tracker_name)) end, 2_000)
:ok = PublicationManager.wait_for_restore(ctx.stack_id, timeout: 2_000)

Rather than add a bespoke helper, this promotes the existing private wait_until poller (previously in shape_cache_test.exs) into the shared Support.TestUtils as wait_until/2, and migrates its original call site to the shared version.
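For readers unfamiliar with the idiom, a poller of this kind might look roughly like the sketch below. This is an assumption about its shape, not the code in `Support.TestUtils`: the module name, the polling interval, and the boolean return value are all illustrative.

```elixir
defmodule WaitUntilSketch do
  # Hypothetical stand-in for Support.TestUtils.wait_until/2; the real
  # helper may differ in return value and polling interval.
  @poll_interval_ms 10

  # Polls `fun` until it returns a truthy value or `timeout_ms` elapses.
  # Returns true on success and false on timeout, so it composes with `assert`.
  def wait_until(fun, timeout_ms) when is_function(fun, 0) and is_integer(timeout_ms) do
    deadline = System.monotonic_time(:millisecond) + timeout_ms
    poll(fun, deadline)
  end

  defp poll(fun, deadline) do
    cond do
      fun.() ->
        true

      System.monotonic_time(:millisecond) >= deadline ->
        false

      true ->
        Process.sleep(@poll_interval_ms)
        poll(fun, deadline)
    end
  end
end

# Usage: wait for a process to be registered under a (hypothetical) name,
# mirroring the is_pid(GenServer.whereis(...)) check in the fix.
name = :wait_until_sketch_demo

spawn(fn ->
  Process.sleep(50)
  {:ok, _pid} = Agent.start(fn -> :ready end, name: name)
end)

true = WaitUntilSketch.wait_until(fn -> is_pid(GenServer.whereis(name)) end, 2_000)
```

Using a monotonic deadline rather than counting iterations keeps the timeout meaningful even when the condition function itself is slow.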

Verification

15/15 seeded runs of the target test pass; the full publication_manager_test.exs (26 tests) and the migrated shape_cache_test.exs test are green. Test-only change, so no changeset.

🤖 Generated with Claude Code

alco and others added 2 commits June 19, 2026 11:53
Fixes #4606.

The "handles relation tracker restart" test stopped the RelationTracker and
relied on assert_pub_tables as a "wait for restart" barrier. It isn't one: the
relation is already in the publication and isn't removed during the restart, so
the assertion passes immediately - potentially before the supervisor has
restarted and re-registered the process. The subsequent remove_shape call then
raced with a "no process" exit.

Wait for the RelationTracker to be re-registered and to finish restoring its
filters before issuing further calls.

Reuse the existing wait-on-condition idiom rather than adding a bespoke helper:
promote the generic poller to Support.TestUtils as wait_until/2.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

codecov Bot commented Jun 19, 2026


Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 59.46%. Comparing base (ee0da19) to head (2f90aab).
✅ All tests successful. No failed tests found.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #4635   +/-   ##
=======================================
  Coverage   59.46%   59.46%           
=======================================
  Files         385      385           
  Lines       43039    43039           
  Branches    12383    12383           
=======================================
  Hits        25591    25591           
- Misses      17371    17373    +2     
+ Partials       77       75    -2     
| Flag | Coverage Δ |
| --- | --- |
| packages/agents | 72.64% <ø> (ø) |
| packages/agents-mcp | 77.70% <ø> (ø) |
| packages/agents-mobile | 80.67% <ø> (ø) |
| packages/agents-runtime | 83.46% <ø> (ø) |
| packages/agents-server | 75.45% <ø> (-0.03%) ⬇️ |
| packages/agents-server-ui | 7.51% <ø> (ø) |
| packages/electric-ax | 51.06% <ø> (ø) |
| packages/experimental | 87.73% <ø> (ø) |
| packages/react-hooks | 86.48% <ø> (ø) |
| packages/start | 82.83% <ø> (ø) |
| packages/typescript-client | 91.94% <ø> (+0.23%) ⬆️ |
| packages/y-electric | 56.05% <ø> (ø) |
| typescript | 59.46% <ø> (ø) |
| unit-tests | 59.46% <ø> (ø) |


alco self-assigned this Jun 19, 2026
Development

Successfully merging this pull request may close these issues.

Flaky test: PublicationManagerTest "handles relation tracker restart" races the supervisor restart

2 participants