Resolve CRS model type via pyproj instead of EPSG number range (#2277) (#2280)
* Fix CRS model type guess from EPSG number range (#2277)
The GeoTIFF writer decided GeographicTypeGeoKey vs ProjectedCSTypeGeoKey
from the EPSG number range (4326 plus 4000-4999). Anything else got
written as projected. That silently mis-tagged geographic CRSes
registered outside the legacy block -- 6318 (NAD83(2011)), 7844
(GDA2020), 9057 (WGS 84 (G2139)), 8252 (NAD83(FBN)), etc. -- which
corrupts the CRS at write time.
The writer now consults pyproj.CRS.from_epsg(...).is_geographic when
pyproj is available. When pyproj is not installed and the code falls
outside the hard-coded fallback set (4326 + 4000-4999), the writer
raises UnknownCRSModelTypeError instead of guessing. Silent CRS
corruption is worse than an explicit error at write time.
New typed error UnknownCRSModelTypeError lives next to the existing
GeoTIFFAmbiguousMetadataError family and is exported from
xrspatial.geotiff.
Tests cover the codes that previously regressed (6318, 7844, 9057,
8252), the legacy-range codes (4326, 4269, 4267), real projected codes
(32610, 3857, 2193), the pyproj-missing fail-closed path, and the
pyproj-installed-but-unknown-code path.
* Address review: tighten EPSG fallback, CHANGELOG, pyproj extra (#2277)
- Narrow `_KNOWN_GEOGRAPHIC_EPSG_FALLBACK` from the full 4000-4999
block to a vetted allowlist (4326, 4269, 4267, 4258, 4283, 4322,
4230, 4019, 4047). The 4000-4999 range contained projected codes
(4087/4088 World Equidistant Cylindrical, 4499 CGCS2000 GK 21) that
the old fallback would have mis-tagged as geographic, the exact
dual of the bug this PR is fixing. Without pyproj, classify only
what we can manually verify.
- Add EPSG 4087/4088/4499 regression test so the projected-inside-
legacy-range case stays covered.
- Drop the noise lines from the new tests: the fragile `flat.index`
assertion and the dummy `assert _geotags is not None`.
- Refresh the helper's docstring / inline comment to mention both
fallback paths (pyproj-missing AND pyproj-installed-but-DB-failed).
- Append the new `UnknownCRSModelTypeError` to the public-API
contract test in test_features.py so it stays in `__all__`.
- Add `pyproj` to the `geotiff` extra in setup.cfg so installing
`xarray-spatial[geotiff]` picks up the dependency the writer now
prefers.
- CHANGELOG entry under Unreleased calls out the behaviour change.
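The decision order described above (pyproj first, vetted allowlist second, otherwise fail closed) can be sketched roughly as follows. This is an illustration, not the code in the PR: `classify_epsg`, `pyproj_is_geographic`, the `resolver` parameter, and the `ValueError` base class are hypothetical names and choices, and applying the allowlist when pyproj is installed but cannot resolve the code is an assumption.

```python
from typing import Callable, Optional

# Allowlist values quoted from the commit message above.
KNOWN_GEOGRAPHIC_EPSG_FALLBACK = frozenset(
    {4326, 4269, 4267, 4258, 4283, 4322, 4230, 4019, 4047}
)


class UnknownCRSModelTypeError(ValueError):
    """Stand-in for the real error class; the base class is an assumption."""


def pyproj_is_geographic(epsg: int) -> Optional[bool]:
    """Ask pyproj whether the code is geographic; None if it cannot say."""
    try:
        import pyproj
        from pyproj.exceptions import CRSError
    except ImportError:
        return None  # pyproj is not installed
    try:
        return bool(pyproj.CRS.from_epsg(epsg).is_geographic)
    except CRSError:
        return None  # pyproj is installed but cannot resolve this code


def classify_epsg(
    epsg: int,
    resolver: Callable[[int], Optional[bool]] = pyproj_is_geographic,
) -> str:
    """Return 'geographic' or 'projected', or raise instead of guessing."""
    answer = resolver(epsg)
    if answer is not None:
        return "geographic" if answer else "projected"
    if epsg in KNOWN_GEOGRAPHIC_EPSG_FALLBACK:
        return "geographic"
    raise UnknownCRSModelTypeError(
        f"cannot determine the model type for EPSG:{epsg}; install pyproj"
    )
```

The `resolver` argument exists only so the pyproj-missing path can be exercised without uninstalling pyproj, for example `classify_epsg(6318, resolver=lambda code: None)` raises, while `classify_epsg(4326, resolver=lambda code: None)` falls back to the allowlist.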
CHANGELOG.md: 1 addition & 0 deletions
@@ -5,6 +5,7 @@
### Unreleased
#### Bug fixes and improvements
+- Resolve the GeoTIFF writer's `GeographicTypeGeoKey` / `ProjectedCSTypeGeoKey` decision via pyproj instead of an EPSG number range. The legacy heuristic (4326 + 4000-4999 -> geographic, else projected) silently mis-tagged geographic CRSes registered outside the 4000-4999 block (NAD83(2011) = 6318, GDA2020 = 7844, WGS 84 (G2139) = 9057, etc.) as projected and projected codes inside the block (4087 / 4088 / 4499) as geographic, corrupting the CRS at write time. The writer now calls `pyproj.CRS.from_epsg(...).is_geographic`. When pyproj can't classify a code (uninstalled, or installed but the local PROJ database lacks the entry), the writer raises the new `UnknownCRSModelTypeError` rather than guessing -- a small vetted allowlist (4326, 4269, 4267, 4258, 4283, 4322, 4230, 4019, 4047) is still honoured for the pyproj-missing case. `pyproj` is now listed under the `geotiff` extra. (#2277)
- Shut down the per-tile compression `ThreadPoolExecutor` on every exit path of the streaming tiled-write code in `to_geotiff`. The old code only called `shutdown(wait=True)` after the tile-row loop completed, so any mid-stream raise (compression failure, dask compute failure, file write failure) bypassed shutdown and leaked worker threads. The loop now runs inside `try/finally` and the finally calls `shutdown(wait=True, cancel_futures=True)` so queued tiles get dropped on the error path instead of blocking the unwind. The pool's workers carry an `xrspatial-geotiff-tile-compress` `thread_name_prefix` so leak-detection tests can tell them apart from dask's own offload/scheduler pools. (#2276)
- Remove read-side emission of the 13 deprecated GeoTIFF attrs (`crs_name`, `geog_citation`, `datum_code`, `angular_units`, `semi_major_axis`, `inv_flattening`, `linear_units`, `projection_code`, `vertical_crs`, `vertical_citation`, `vertical_units`, `colormap_rgba`, `cmap`) and bump `attrs['_xrspatial_geotiff_contract']` from 1 to 2. Downstream code that read these via `attrs[key]` now sees `KeyError`; migrate to `attrs.get(key)` or derive the value from `attrs['crs']` / `attrs['crs_wkt']` with pyproj. The `.xrs.plot()` accessor still surfaces palette colormaps by building a `ListedColormap` from the canonical `attrs['colormap']`. (#2016)
- Accept numpy integer scalars as the `crs=` argument to `to_geotiff` / `write_geotiff_gpu`. The validator already allowed `numbers.Integral`, but the writers gated EPSG assignment on `isinstance(crs, int)`, so `np.int32` / `np.int64` / `np.uint16` values passed validation then silently fell through with no EPSG written. (#2082)
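The executor-shutdown entry (#2276) describes a pattern that can be sketched with the standard library alone. The following is a minimal illustration under stated assumptions, not the `to_geotiff` code: `write_tiles` and `sink` are hypothetical names, and `zlib.compress` stands in for the real per-tile compression. `cancel_futures=True` requires Python 3.9 or later.

```python
import threading
import zlib
from concurrent.futures import ThreadPoolExecutor

# Prefix quoted from the changelog entry; it lets leak checks find these workers.
PREFIX = "xrspatial-geotiff-tile-compress"


def write_tiles(tiles, sink):
    """Compress tiles on a pool and hand each result to sink, in order."""
    pool = ThreadPoolExecutor(max_workers=2, thread_name_prefix=PREFIX)
    try:
        futures = [pool.submit(zlib.compress, tile) for tile in tiles]
        for future in futures:
            sink(future.result())  # may raise mid-stream
    finally:
        # Runs on success and on error: drop queued work, join the workers.
        pool.shutdown(wait=True, cancel_futures=True)


def leaked_workers():
    """Names of live threads that belong to the compression pool."""
    return [t.name for t in threading.enumerate() if t.name.startswith(PREFIX)]
```

If `sink` raises partway through, the `finally` clause still joins the workers, so `leaked_workers()` returns an empty list afterwards.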