Block multiple sled reservations with the same gen #10479
Open
jmpesp wants to merge 1 commit into
If multiple instance-start sagas concurrently attempt to allocate for the same instance, this temporarily results in multiple rows in `sled_resource_vmm` with different propolis ids for the same instance id. One of the instance-start sagas will succeed, while the other(s) will unwind (due to an "instance changed state before it could be started" error from `sis_move_to_starting`) and remove the `sled_resource_vmm` record that they added by matching on that saga's propolis id.

There's never been a uniqueness constraint for instance id in the `sled_resource_vmm` table, because there can't be: otherwise we'd never be able to migrate an instance (which makes a new record on a different sled for the same instance).

For an instance start that performs any new local storage allocation, this is a problem: the latent assumption in inserting / updating local storage related records is that this type of duplication could not occur, and that if the insert succeeded then the allocation will only be performed once. Because this is not true, the CTE will happily stomp all over the local storage allocation related records, and that leads to the orphaning seen in the linked issue.

The fix is to add a uniqueness constraint to `sled_resource_vmm` that ensures only one record exists for a given instance id plus instance state generation number. This will not affect migration because the instance state generation is bumped in that case.

This commit also changes the local storage related unit tests to clearly specify the ncpus and memory for the fake instances, as inspecting the `sled_resource_vmm` records produced by the test showed the resources didn't match the instance specification.

Fixes oxidecomputer/customer-support#1184.
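To illustrate the effect of the constraint described above, here is a minimal sketch using Python's standard-library `sqlite3` as a stand-in database. It is not the schema or the CTE from this PR: the column names (`propolis_id`, `sled_id`, `instance_id`, `instance_gen`) and the helpers `make_db`, `try_reserve`, and `release` are hypothetical, chosen only to show that a uniqueness constraint on (instance id, generation) rejects a second reservation at the same generation while still allowing a migration-style reservation at a bumped generation.

```python
import sqlite3


def make_db():
    """Create an in-memory table loosely modeled on sled_resource_vmm.

    Assumption: column names are illustrative, not the real schema.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE sled_resource_vmm (
            propolis_id  TEXT PRIMARY KEY,
            sled_id      TEXT NOT NULL,
            instance_id  TEXT NOT NULL,
            instance_gen INTEGER NOT NULL,
            -- At most one reservation per instance per state generation.
            UNIQUE (instance_id, instance_gen)
        )
        """
    )
    return conn


def try_reserve(conn, propolis_id, sled_id, instance_id, instance_gen):
    """Attempt a reservation; return False if the constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO sled_resource_vmm VALUES (?, ?, ?, ?)",
                (propolis_id, sled_id, instance_id, instance_gen),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def release(conn, propolis_id):
    """Remove a reservation by propolis id, as an unwinding saga would."""
    with conn:
        conn.execute(
            "DELETE FROM sled_resource_vmm WHERE propolis_id = ?",
            (propolis_id,),
        )


if __name__ == "__main__":
    conn = make_db()
    # Two racing start attempts observe the same instance generation.
    print(try_reserve(conn, "propolis-a", "sled-1", "instance-1", 5))
    print(try_reserve(conn, "propolis-b", "sled-2", "instance-1", 5))
    # A migration reserves on another sled after the generation is bumped.
    print(try_reserve(conn, "propolis-c", "sled-2", "instance-1", 6))
```

In this model the losing start attempt never gets a row, so there is nothing for later local storage bookkeeping to overwrite; the migration-style insert still succeeds because its generation differs.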