Describe the bug
to_geotiff streams a dask-backed DataArray one tile-row at a time and bounds peak memory with streaming_buffer_bytes (default 256 MB). The column-segmentation budget is sized from the output tile geometry:
# _writer.py:1030
bytes_per_tile_col = th * tw * bytes_per_sample * samples
That matches reality for a plain windowed read. It does not match a map_overlap source. When the array is open_geotiff(chunks=...).xrs.slope() (same for aspect, curvature, hillshade), slicing one tile-row dask_data[r0:r1, seg].compute() materializes every source chunk-row the band intersects plus a one-chunk halo — not the tile-row's output footprint.
So when the source chunks are taller than the tile (e.g. chunks=512 read, tile_size=256) and the raster is wide (LANDFIRE CONUS), each compute pulls several full-width source chunk-rows. The 256 MB budget never sees that, the column segmentation never tightens enough, and the compute OOMs.
Measured on a 2048x8192 float32 raster read at chunks=512, slope, one 256-row output strip:
- output strip the budget assumes: 8 MB
- source reads actually materialized: 32 MB (2 chunk-rows x 16 col-chunks; the depth-1 halo pulls the neighbor chunk-row)
The overshoot scales with source_chunk_height / tile_height and with raster width, so a tall chunk on a CONUS-width raster is multiple GB per compute.
Expected behavior
Peak materialized memory per compute stays within streaming_buffer_bytes regardless of how the source was chunked. The column budget should come from the source chunk geometry — the source chunk-rows a tile-row band touches, plus a one-chunk halo — instead of the output tile height.
Repro
import numpy as np, xarray as xr, dask.array as da
from xrspatial import slope
from xrspatial.geotiff import to_geotiff
base = da.random.random((2048, 8192), chunks=(512, 512)).astype('float32')
agg = xr.DataArray(base, dims=['y', 'x'])
to_geotiff(slope(agg), 'slope.tif') # each tile-row pulls ~4x the budgeted bytes
Additional context
- Default
cog=False streaming path, _writer.py:1024-1099.
- Workaround today: read with short chunks matching the tile, e.g.
open_geotiff(path, chunks=256).
- Strip layout (
tiled=False) does no horizontal segmentation at all; bounding its peak needs a separate change and is out of scope here.
Describe the bug
to_geotiffstreams a dask-backed DataArray one tile-row at a time and bounds peak memory withstreaming_buffer_bytes(default 256 MB). The column-segmentation budget is sized from the output tile geometry:That matches reality for a plain windowed read. It does not match a
map_overlapsource. When the array isopen_geotiff(chunks=...).xrs.slope()(same for aspect, curvature, hillshade), slicing one tile-rowdask_data[r0:r1, seg].compute()materializes every source chunk-row the band intersects plus a one-chunk halo — not the tile-row's output footprint.So when the source chunks are taller than the tile (e.g.
chunks=512read,tile_size=256) and the raster is wide (LANDFIRE CONUS), each compute pulls several full-width source chunk-rows. The 256 MB budget never sees that, the column segmentation never tightens enough, and the compute OOMs.Measured on a 2048x8192 float32 raster read at
chunks=512, slope, one 256-row output strip:The overshoot scales with
source_chunk_height / tile_heightand with raster width, so a tall chunk on a CONUS-width raster is multiple GB per compute.Expected behavior
Peak materialized memory per compute stays within
streaming_buffer_bytesregardless of how the source was chunked. The column budget should come from the source chunk geometry — the source chunk-rows a tile-row band touches, plus a one-chunk halo — instead of the output tile height.Repro
Additional context
cog=Falsestreaming path,_writer.py:1024-1099.open_geotiff(path, chunks=256).tiled=False) does no horizontal segmentation at all; bounding its peak needs a separate change and is out of scope here.