Add parcels._datasets.remote #2574

Merged
VeckoTheGecko merged 43 commits into Parcels-code:main from VeckoTheGecko:open-dataset
Apr 13, 2026

@VeckoTheGecko VeckoTheGecko commented Apr 13, 2026

Description

This PR adds tooling for ingesting remote datasets that:

  • Serve different purposes
    • testing or tutorials
  • Are backed by netCDF files or (the new approach) by a zipped Zarr store
    • This requires minimal changes to the original data repo (apart from adding a new folder, data-zarr, for when we add zipped Zarr files)
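
The two axes described above (purpose and storage backend) could be modelled roughly as follows. This is a minimal sketch for illustration, not the code added in this PR: `Purpose`, `Backend`, `RemoteDataset`, `_REGISTRY`, `find_dataset`, and the dataset names and paths are all hypothetical. Only the `data-zarr` folder name comes from the description.

```python
from dataclasses import dataclass
from enum import Enum


class Purpose(Enum):
    TESTING = "testing"
    TUTORIAL = "tutorial"


class Backend(Enum):
    NETCDF = "netcdf"
    ZARR_ZIP = "zarr-zip"


@dataclass(frozen=True)
class RemoteDataset:
    name: str
    purpose: Purpose
    backend: Backend
    path: str  # path relative to the data repo root (hypothetical layout)


# Hypothetical entries; the real dataset names are not listed in this PR description.
_REGISTRY = [
    RemoteDataset("example_netcdf", Purpose.TUTORIAL, Backend.NETCDF, "data/example_netcdf"),
    RemoteDataset("example_zarr", Purpose.TESTING, Backend.ZARR_ZIP, "data-zarr/example_zarr.zip"),
]


def find_dataset(name: str, purpose: Purpose) -> RemoteDataset:
    """Look up a dataset by name, checking the purpose explicitly."""
    for ds in _REGISTRY:
        if ds.name == name and ds.purpose is purpose:
            return ds
    # List only the datasets registered for the requested purpose.
    available = sorted(ds.name for ds in _REGISTRY if ds.purpose is purpose)
    raise ValueError(f"No {purpose.value} dataset named {name!r}; available: {available}")
```

Matching on both name and purpose means a testing dataset is never returned for a tutorial request, even when the name exists in the registry.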

This PR also:

  • Removes the download_example_dataset util (and renames the list_example_dataset util)
  • Removes the data downloading utils from the root parcels namespace
  • Creates a parcels._datasets.remote module as well as a thin wrapper module parcels.tutorial
  • Adds a function parcels._datasets.remote.open_dataset which returns an xarray dataset
  • Updates all references across the codebase so that CI passes (except references in v3 folders that are yet to be migrated)

Checklist

AI Disclosure

  • This PR contains AI-generated content.
    • I have tested any AI-generated content in my PR.
    • I take responsibility for any AI-generated content in my PR.
    • Describe how you used it (e.g., by pasting your prompt): Writing docstrings and updating examples to use new ingestion method

def open_dataset(self) -> xr.Dataset: ...


class _V3Dataset(_ParcelsDataset):

I thought that this approach was better than migrating the datasets to Zarr (i.e., working closer to the source netcdf files, without having to think about deprecation cycles and data storage etc.)

@erikvansebille erikvansebille left a comment

Looks good; a few small comments below

@VeckoTheGecko VeckoTheGecko enabled auto-merge (squash) April 13, 2026 15:50
It previously didn't respect the purpose since it relied on the KeyError
@VeckoTheGecko VeckoTheGecko merged commit ffb4f2f into Parcels-code:main Apr 13, 2026
12 of 13 checks passed
@github-project-automation github-project-automation bot moved this from Backlog to Done in Parcels development Apr 13, 2026
Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

Update tutorial dataset storage and loading

2 participants