Add sieve filter to remove small raster clumps #1159

Merged
brendancol merged 2 commits into master from issue-1149
Apr 2, 2026
@brendancol
Contributor

Summary

  • New sieve() function: removes small connected clumps from classified rasters by replacing regions below a pixel-count threshold with their largest neighbor's value
  • 4- and 8-connectivity, selective sieving via skip_values, all four backends (numpy and dask+numpy native, cupy and dask+cupy with CPU fallback)
  • 40 tests: correctness, edge cases, NaN handling, validation, cascading merges, dask memory guards
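To make the connectivity option concrete, here is a small sketch using `scipy.ndimage.label` directly (not the `sieve()` function from this PR, whose exact signature is not shown here). It counts clumps on a diagonal line of pixels under both connectivity rules; the array and variable names are illustrative.

```python
import numpy as np
from scipy import ndimage

# Three pixels that touch only at their corners.
mask = np.array([[1, 0, 0],
                 [0, 1, 0],
                 [0, 0, 1]], dtype=bool)

# Rank 1 gives the cross-shaped (4-connected) structuring element,
# rank 2 gives the full 3x3 (8-connected) one.
four = ndimage.generate_binary_structure(2, 1)
eight = ndimage.generate_binary_structure(2, 2)

_, n_four = ndimage.label(mask, structure=four)
_, n_eight = ndimage.label(mask, structure=eight)

print(n_four, n_eight)  # 3 1
```

Under 4-connectivity each diagonal pixel is its own single-pixel clump and would be a sieve candidate; under 8-connectivity the three pixels form one clump.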

Details

Labels connected components per unique value with scipy.ndimage.label, builds a region adjacency graph from vectorized array shifts, then merges the smallest regions into their largest spatial neighbor until everything meets the threshold.
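The steps described above can be sketched roughly as follows. This is a simplified illustration, not the code in `xrspatial/sieve.py`: the helper names, the `sieve_sketch` signature, and the one-merge-per-relabel loop are assumptions made for clarity, and it assumes a 2D integer array without NaNs.

```python
import numpy as np
from scipy import ndimage


def label_regions(arr, structure):
    """Label connected components separately for each unique value."""
    labels = np.zeros(arr.shape, dtype=np.int64)
    count = 0
    for value in np.unique(arr):
        lab, k = ndimage.label(arr == value, structure=structure)
        mask = lab > 0
        labels[mask] = lab[mask] + count  # keep label ids globally unique
        count += k
    return labels, count


def adjacency(labels, connectivity):
    """Build a region adjacency map by comparing shifted views of the labels."""
    pairs = [(labels[:, :-1], labels[:, 1:]), (labels[:-1, :], labels[1:, :])]
    if connectivity == 8:
        pairs += [(labels[:-1, :-1], labels[1:, 1:]),
                  (labels[:-1, 1:], labels[1:, :-1])]
    neighbors = {}
    for a, b in pairs:
        diff = (a != b) & (a > 0) & (b > 0)
        if not diff.any():
            continue
        for i, j in np.unique(np.stack([a[diff], b[diff]], axis=1), axis=0):
            neighbors.setdefault(int(i), set()).add(int(j))
            neighbors.setdefault(int(j), set()).add(int(i))
    return neighbors


def sieve_sketch(data, threshold, connectivity=4, skip_values=()):
    """Hypothetical sketch: merge clumps smaller than `threshold` pixels."""
    out = np.array(data, copy=True)
    rank = 1 if connectivity == 4 else 2
    structure = ndimage.generate_binary_structure(2, rank)
    while True:
        labels, count = label_regions(out, structure)
        sizes = np.bincount(labels.ravel(), minlength=count + 1)
        neighbors = adjacency(labels, connectivity)
        merged = False
        # Visit regions from smallest to largest; merge the first eligible one.
        for region in np.argsort(sizes, kind="stable"):
            region = int(region)
            if region == 0:
                continue  # label 0 means "unlabeled"
            if sizes[region] >= threshold:
                break  # every remaining region already meets the threshold
            mask = labels == region
            if out[mask][0] in skip_values or region not in neighbors:
                continue
            biggest = max(neighbors[region], key=lambda r: sizes[r])
            out[mask] = out[labels == biggest][0]
            merged = True
            break
        if not merged:
            return out


raster = np.array([[1, 1, 1, 1],
                   [1, 2, 1, 1],
                   [1, 1, 3, 3],
                   [1, 1, 3, 3]])
sieved = sieve_sketch(raster, threshold=2)
cascaded = sieve_sketch(raster, threshold=5)
kept = sieve_sketch(raster, threshold=2, skip_values=(2,))
```

With `threshold=2` only the single `2` pixel is absorbed into the surrounding `1` region. With `threshold=5` the 4-pixel `3` block is also absorbed after relabeling, which is the cascading-merge case. Relabeling after every merge is slow but keeps the sketch short; a real implementation would update the adjacency graph incrementally.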

Dask paths compute the full array into memory first (same as regions()) because connected-component labeling is a global operation. Memory guards prevent OOM on large arrays.
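The PR does not show how the memory guards are written. One plausible shape, given purely as an assumption, is a check that estimates the working-set size before materializing the dask array and raises early if it exceeds a budget. The function name, the `working_copies` multiplier, and the `budget_fraction` default below are all hypothetical.

```python
import math

import numpy as np


def check_fits_in_memory(shape, dtype, available_bytes,
                         working_copies=5, budget_fraction=0.5):
    """Hypothetical guard: raise MemoryError if the estimate exceeds the budget.

    `working_copies` is an assumed multiplier for the input, label array,
    and temporaries; it is not taken from the xrspatial source.
    """
    needed = math.prod(shape) * np.dtype(dtype).itemsize * working_copies
    if needed > available_bytes * budget_fraction:
        raise MemoryError(
            f"estimated {needed} bytes needed, budget is "
            f"{int(available_bytes * budget_fraction)} bytes"
        )
    return needed


# A 1000 x 1000 float64 raster needs about 40 MB under these assumptions.
needed = check_fits_in_memory((1000, 1000), "float64", available_bytes=10**9)
```

In a dask code path, a check like this would run on the array's `shape` and `dtype` metadata before the full array is computed, so an oversized input fails fast instead of exhausting memory.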

New files:

  • xrspatial/sieve.py -- implementation
  • xrspatial/tests/test_sieve.py -- 40 tests
  • examples/user_guide/48_Sieve_Filter.ipynb -- user guide notebook

Modified files:

  • xrspatial/__init__.py -- added sieve export
  • docs/source/reference/zonal.rst -- added API entry
  • README.md -- added row to Morphological feature matrix

Closes #1149

Test plan

  • pytest xrspatial/tests/test_sieve.py -- 40/40 passing
  • Verify notebook renders correctly
  • Verify docs build with make html

Implements a sieve() function that identifies connected components of
same-value pixels and replaces regions smaller than a threshold with
the value of their largest spatial neighbor.  Supports 4- and
8-connectivity, selective sieving via skip_values, and all four
backends (numpy, cupy via CPU fallback, dask+numpy, dask+cupy).
github-actions bot added the performance label (PR touches performance-sensitive code) on Apr 1, 2026
brendancol merged commit 0914a9c into master on Apr 2, 2026
11 checks passed