[multicast,instance,test-flake] reorder instance_stop and send stop before tearing down multicast member state#10402

Open
zeeshanlakhani wants to merge 2 commits into main from zl/flake-test_join_by_ip_existing_group

Conversation

@zeeshanlakhani
Collaborator

Handles (closes) #9711 (was fixed downstream in a PR'ed branch).

@zeeshanlakhani
Collaborator Author

@jgallagher minor one up here btw.

Comment thread nexus/src/app/instance.rs
opctx,
InstanceUuid::from_untyped_uuid(authz_instance.id()),
)
.await?;
Contributor

What happens if this query fails? (E.g., no db connections are available or something spurious)

Collaborator Author

Yeah, good catch, as I wrapped this up "too quickly" and it had some varying semantics.

With this change, a failure in that query would return a 500 to the caller for a stop that had already succeeded, short-circuiting past the reconciler activation. State is self-healing thanks to the multicast reconciler, but surfacing a failure for a successful operation is a regression relative to the previous behavior (though the previous behavior had your original issue in the test).

Rather than quick-fix it, I'm going to move the detach out of instance_stop and into the instance_update saga next to the existing migration-time multicast sled_id update block and gate on the deprovisioning.
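
For readers following the thread, the failure mode discussed above can be sketched in isolation. This is a toy model, not the code in nexus/src/app/instance.rs: `Nexus`, `StopError`, `send_stop`, `detach_multicast_members`, and `activate_reconciler` are all hypothetical stand-ins. The first method shows how propagating the cleanup error with `?` reports a failure for a stop that already went out and skips the reconciler activation. The second shows the kind of best-effort quick fix the comment mentions and explicitly sets aside in favor of moving the detach into the instance_update saga.

```rust
// Hypothetical stand-ins for illustration only; none of these names come from omicron.
#[derive(Debug, PartialEq)]
enum StopError {
    Db(String),
}

#[derive(Default)]
struct Nexus {
    db_available: bool,         // false simulates "no db connections are available"
    stop_sent: bool,            // set once the stop request has gone out
    reconciler_activated: bool, // set once the multicast reconciler is poked
    warnings: Vec<String>,      // stands in for a logger
}

impl Nexus {
    fn send_stop(&mut self) -> Result<(), StopError> {
        self.stop_sent = true;
        Ok(())
    }

    // Fallible database-backed cleanup of multicast member state.
    fn detach_multicast_members(&self) -> Result<(), StopError> {
        if self.db_available {
            Ok(())
        } else {
            Err(StopError::Db("no connections available".to_string()))
        }
    }

    fn activate_reconciler(&mut self) {
        self.reconciler_activated = true;
    }

    // Shape of the regression: `?` on the cleanup returns an error to the
    // caller even though the stop succeeded, and skips the activation.
    fn stop_propagating(&mut self) -> Result<(), StopError> {
        self.send_stop()?;
        self.detach_multicast_members()?;
        self.activate_reconciler();
        Ok(())
    }

    // One possible quick fix (not the approach chosen in this PR): record
    // the cleanup failure, still activate the reconciler, report success.
    fn stop_best_effort(&mut self) -> Result<(), StopError> {
        self.send_stop()?;
        if let Err(e) = self.detach_multicast_members() {
            self.warnings.push(format!(
                "multicast detach failed; leaving cleanup to the reconciler: {e:?}"
            ));
        }
        self.activate_reconciler();
        Ok(())
    }
}
```

In this sketch, with `db_available` left at its default of `false`, `stop_propagating` returns an error with `stop_sent` set and `reconciler_activated` unset, while `stop_best_effort` returns `Ok(())` and records one warning.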
