Handle concurrently deleted local storage dataset by jmpesp · Pull Request #10480 · oxidecomputer/omicron

jmpesp · 2026-05-21T19:37:38Z

Surface if a dataset is gone, and instead of returning a 500, match on the new DatasetNotFound variant and return Ok() from delete_local_storage_dataset.

Fixes #10477

Surface if a dataset is gone, and instead of returning a 500, match on the new DatasetNotFound variant and return Ok() from `delete_local_storage_dataset`. Fixes oxidecomputer#10477

hawkw · 2026-05-21T20:49:14Z

    },
 }

+impl From<SetValueErrorRaw> for EnsureDatasetErrorRaw {


Part of me kind of wonders if we might want to implement DatasetNotFound as a generic wrapper around any other error that's either the inner error or DatasetNotFound, and if that might reduce some boilerplate here? Something like:

#[derive(thiserror::Error)] enum DatasetError<T> { #[error("Dataset not found")] NotFound, #[error(transparent)] Other(#[from] T) }

although, it's equally possible that this is actually more annoying than just doing it the current way

hawkw · 2026-05-21T20:50:28Z

+    #[error("Dataset not found")]
+    DatasetNotFound,
+
    #[error("ZFS execution error")]
    Execution(#[from] crate::ExecutionError),

    #[error("Unexpected output from ZFS commands: {0}")]
    Output(String),

    #[error("Dataset does not exist")]
    DoesNotExist,


what's the difference between DatasetNotFound and the DoesNotExist error which was already here? should that one be removed and should we be erturning DatasetNotFound instead?

if they have different semantics, it would be nice to have a comment explaining that (and maybe naming that makes the distinction clearer?), because this sure looks confusing.

hawkw · 2026-05-21T20:56:47Z

+                crate::ExecutionError::CommandFailure(info)
+                    if info.stderr.contains("does not exist") =>
+                {


i kinda wonder if we ought to have an execute_async wrapper for zfs commands, or something, that does this detection and returns either an ExecutionError or a DatasetNotFound, rather than forcing every caller to detect it individually? the caller-side detection seems sketchier to me, because now you have to remember to check for this contents in info.stderr every time you call a zfs command. meanwhile, having the function that runs the command do it gives the caller an enum which they are forced to handle.

take it or leave it, just a thought.

hawkw · 2026-05-21T20:57:29Z

+                    values: name_values
+                        .iter()
+                        .map(|(k, v)| format!("{k}={v}"))
+                        .join(","),


we are now doing this twice, btw --- you have a

let values = name_values.iter().map(|(k, v)| format!("{k}={v}")).join(",");

at the top of the map_err closure...

Suggested change

values: name_values

.iter()

.map(|(k, v)| format!("{k}={v}"))

.join(","),

values,

hawkw · 2026-05-21T20:59:56Z

+                RemoveReservationErrorInner::DatasetNotFound => {
+                    // If the parent dataset is no longer found, then a
+                    // concurrent deletion occurred. Return Ok
+                    return Ok(());
+                }


hm, is this really the only place we might encounter a dataset that was already removed by a concurrent deletion? should we also handle this error at the point where we actually delete the child volume dataset, too?

Handle concurrently deleted local storage dataset

d2d7311

Surface if a dataset is gone, and instead of returning a 500, match on the new DatasetNotFound variant and return Ok() from `delete_local_storage_dataset`. Fixes oxidecomputer#10477

jmpesp requested a review from leftwo May 21, 2026 19:37

hawkw reviewed May 21, 2026

View reviewed changes

leftwo approved these changes May 21, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Handle concurrently deleted local storage dataset#10480

Handle concurrently deleted local storage dataset#10480
jmpesp wants to merge 1 commit into
oxidecomputer:mainfrom
jmpesp:handle_gone_local_storage_dataset

jmpesp commented May 21, 2026

Uh oh!

hawkw May 21, 2026

Uh oh!

hawkw May 21, 2026

Uh oh!

hawkw May 21, 2026

Uh oh!

hawkw May 21, 2026

Uh oh!

hawkw May 21, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

jmpesp commented May 21, 2026

Uh oh!

hawkw May 21, 2026

Choose a reason for hiding this comment

Uh oh!

hawkw May 21, 2026

Choose a reason for hiding this comment

Uh oh!

hawkw May 21, 2026

Choose a reason for hiding this comment

Uh oh!

hawkw May 21, 2026

Choose a reason for hiding this comment

Uh oh!

hawkw May 21, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants