[multicast,instance,test-flake] reorder instance_stop and send stop before tearing down multicast member state#10402
…efore tearing down multicast member state Handles (closes) #9711 (was fixed downstream in a PR'ed branch).
@jgallagher minor one up here btw.
    opctx,
    InstanceUuid::from_untyped_uuid(authz_instance.id()),
)
.await?;
What happens if this query fails? (E.g., no db connections are available or something spurious)
Yeah, good catch. I wrapped this up "too quickly" and it ended up with inconsistent semantics.
With this change, if the query fails we'd return a 500 to the caller for a stop that already succeeded, short-circuiting past the reconciler activation. The state is self-healing thanks to the multicast reconciler, but a visible failure for a successful operation is a regression compared to the previous behavior (though the previous ordering had the issue you originally hit in the test).
Rather than quick-fix it, I'm going to move the detach out of instance_stop and into the instance_update saga next to the existing migration-time multicast sled_id update block and gate on the deprovisioning.
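To make the failure mode above concrete, here is a minimal sketch of the control flow under discussion. Every name in it (`Effects`, `send_stop`, `detach_members`, `activate_reconciler`, `StopError`) is a hypothetical stand-in, not a Nexus API, and the `db_ok` flag only simulates a spurious database failure. The second function shows the kind of best-effort quick fix this comment decides against; it is included for contrast only, and it does not represent the saga-based change.

```rust
// Hypothetical stand-ins for illustration only; none of these are Nexus APIs.
#[derive(Debug, PartialEq)]
enum StopError {
    Db(String),
}

// Records which side effects actually happened.
#[derive(Default)]
struct Effects {
    stop_sent: bool,
    members_detached: bool,
    reconciler_activated: bool,
}

// Stands in for sending the stop request; always succeeds in this sketch.
fn send_stop(fx: &mut Effects) -> Result<(), StopError> {
    fx.stop_sent = true;
    Ok(())
}

// Stands in for the multicast member detach query. `db_ok = false`
// simulates a spurious failure such as no available connections.
fn detach_members(fx: &mut Effects, db_ok: bool) -> Result<(), StopError> {
    if db_ok {
        fx.members_detached = true;
        Ok(())
    } else {
        Err(StopError::Db("no connections available".to_string()))
    }
}

// Stands in for activating the multicast reconciler.
fn activate_reconciler(fx: &mut Effects) {
    fx.reconciler_activated = true;
}

// The shape questioned in review: `?` on the detach after the stop was sent.
fn stop_then_detach_propagating(fx: &mut Effects, db_ok: bool) -> Result<(), StopError> {
    send_stop(fx)?;
    detach_members(fx, db_ok)?; // on failure we return here...
    activate_reconciler(fx); // ...so the reconciler is never activated
    Ok(())
}

// A best-effort variant (assumed, NOT what the PR ends up doing):
// log the detach failure and rely on the reconciler to repair state.
fn stop_then_detach_best_effort(fx: &mut Effects, db_ok: bool) -> Result<(), StopError> {
    send_stop(fx)?;
    if let Err(e) = detach_members(fx, db_ok) {
        eprintln!("detach failed, leaving cleanup to the reconciler: {e:?}");
    }
    activate_reconciler(fx);
    Ok(())
}
```

With `db_ok = false`, the propagating variant returns an error even though the stop was sent, and the reconciler is not activated; the best-effort variant returns `Ok(())` and still activates the reconciler, leaving the detach to be retried later.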