[2/3] Add assign_stitch_groups (tile-cut stitching algorithm)#1193
[2/3] Add assign_stitch_groups (tile-cut stitching algorithm)#1193timtreis wants to merge 17 commits into
Conversation
|
I have a problem with the uns keys we give so far like: .uns["tiling_stitch"]. Can't we make a rule that whatever we write to uns as metadata we use the name of the function that wrote it? I would be more obvious imo. Also a test fails atm |
selmanozleyen
left a comment
There was a problem hiding this comment.
There can be probably more performance issues but I think we just need to see how this performs in real life. So we can tackle the real bottlenecks
Add sq.experimental.tl.assign_stitch_groups + StitchParams: groups tile-cut cell pieces flagged by calculate_tiling_qc by pairing facing cut edges and scoring each pair with a transparent weighted mean of five geometric features (iou, endpoint_match, merge_compactness, merge_solidity, gap_proximity). Only .obs columns + a tiling_stitch audit block are written; the labels element is never modified. Weights default flat-equal and are tunable via feature_weights; no coefficients are fitted or shipped. Also: - wire the calculate_tiling_qc hook warning about dropped stitch columns - extract shared helpers (resolve_params, equivalent_diameter, largest_contour, iter_chunked_regionprops) into experimental/utils, reused by _tiling_qc and _tiling to remove duplication; QC numerics unchanged (visual baselines pass) - save_diagnostics writes a zarr-safe per-pair dict-of-arrays to .uns - clamp merge_solidity and keep gap_proximity neutral when closing is disabled Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Lock the validation-sweep outcome: at min_confidence=0.5 the deterministic fixture recovers >=50% of cut pieces with no intact false-merges. min_confidence default stays 0.7 (full attainable recall, zero false merges); gap_proximity kept in the 5-feature score. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Two figures answering the reviewer's request to understand the stitch_confidence math: merge_compactness/merge_solidity separating a true cut from a false merge, and the min_confidence validation sweep (recall/precision over synthetic layouts). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Drop granular unit tests of private internals (gap_proximity arithmetic, feature_weights validation matrix, param-resolution already covered by PR-1, determinism, numpy coercion). Keep the obs/uns contract PR-C consumes, error paths, idempotency/inplace, the QC-rerun hook, multiscale, the diagnostics zarr round-trip, and the visual. 33 tests/552 lines -> 15 tests/259 lines. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… not the repo) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Per /simplify: revert the merged _tiling.py and _tiling_qc.py back to main (keep only the _warn_if_dropping_stitch_columns hook the scverse#1170 review asked for); the shared util helpers are now consumed by the new stitch module only, no churn in merged files. Also drop the redundant score_formula string (it is derivable from the recorded feature_weights + score_features) and loop the per-feature diagnostics fill instead of one line per feature. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Slim the v1 surface to the core feature: stitch_confidence is the flat (unweighted) mean of the five geometric features; no feature_weights knob and no save_diagnostics path. Both can land later if a concrete need appears. Also drop the changelog entry (batch docs separately). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Consolidate the test file into one behavioural class + a visual class (matching test_tiling_qc.py) with a one-line module docstring. Drop the `# ----` banner dividers (used only by the recent tiling modules, in no core squidpy file) for plain section comments, and rename the "Stage N" section labels to plain descriptions (the numbering had a confusing gap). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Drop the `# ----...` section-divider banners (used in only a few experimental files, in no core squidpy module) for plain single-line section comments. Comment-only change; no behaviour affected. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… crops Two performance fixes from @selmanozleyen's review of assign_stitch_groups, both behavior-preserving (the existing public tests pin the results). - Early-prune in _score_pairs: compute an optimistic upper bound on the flat-mean score from the cheap geometry features (the two shape features are each <= 1) and skip the costly union reconstruction when even the best case can't reach min_confidence. Sound: the bound never underestimates the real score, so no passing pair is dropped. New unit test asserts the shape step is skipped for a below-threshold pair and still runs for a passing one. - Stop re-reading the labels array per pair: _extract_cut_edges already reads each outlier cell's bbox crop for contour tracing, so it now also returns a {label_id -> boolean bbox mask} dict. _merge_shape_features reconstructs the merge union in memory from those crops (placing each at its bbox offset, exact border clamping preserved) instead of fetching a fresh union crop from the (dask-backed) array on every candidate. Fetches are now bounded by the number of outlier cells, not the number of candidate pairs. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
f65dd9b to
08e9fdf
Compare
…ake_stitched_labels xref) The -W docs build failed on two unresolved references in autodoc'd symbols: - Each StitchParams field docstring started with "Advanced: ...". napoleon parses a leading "word:" as a typed field, so it tried to cross-reference "Advanced" as a class. Drop the redundant prefix (the class docstring already frames these as advanced knobs); verified locally with an isolated sphinx build that the six warnings disappear. - assign_stitch_groups referenced make_stitched_labels, which is the unmerged PR-C function (not in the API docs). Suppress the link with the `!` prefix. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The early-prune in _score_pairs hardcoded the scoring internals (the `+2.0` for the two shape features and the `/n_features` flat mean). Extract _max_achievable_score, built on _score_pair_features with the deferred _SHAPE_FEATURES assumed at their 1.0 max, so the bound and the real score share one definition and stay in sync if the feature set changes. Also dedupes the per-candidate feature dict (`known` reused for both the bound and the final score). Behavior unchanged. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Agree, the chopping into sub-PR thing bloated everything a bit. The final 3/3 PR now will hopefully tie everything back together so that we can actually optimise |
The StitchVisual_seam_group_recolor PlotTester had no committed baseline, so the stable CI jobs failed with 'Baseline image ... does not exist' (1 failed, 1036 passed). Add the baseline rendered by CI (ubuntu-latest artifact), per the project's reference-image workflow - never generated locally (a local macOS render differs by ~RMS 53 vs the tolerance 50 from font/AA platform variance; the Linux baseline matches the Linux CI run). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #1193 +/- ##
==========================================
- Coverage 75.33% 75.30% -0.03%
==========================================
Files 56 59 +3
Lines 7922 8415 +493
Branches 1292 1392 +100
==========================================
+ Hits 5968 6337 +369
- Misses 1444 1524 +80
- Partials 510 554 +44
🚀 New features to boost your workflow:
|
The macOS CI render differed from the Linux baseline by RMS 53 (> tol 50), and the failed-diff showed the *whole* imshow region misaligned - including the "before" panel, which has no stitching, so it could only be a rendering shift, not an algorithm difference. Cause: tight_layout sizes the axes from the title text extents, which differ across platforms (fonts), shifting the image sub-pixel so every high-contrast cell edge mismatches. Fix the layout instead of papering over it with a high tolerance: drop the titles and tight_layout, pin the geometry with a fixed subplots_adjust. The figure now has no text and the image renders to identical pixels everywhere, so the default tolerance passes on both platforms. Baseline regenerated from CI. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…pling Even after dropping titles/tight_layout, Linux vs macOS still differed by RMS ~28, localized to a row band - the signature of nearest-neighbour resampling when the 100px zoom is upscaled to the axes height: the two matplotlib versions round the source row differently at the boundary. (Confirmed it was rendering, not the algorithm: the "before" panel, which has no stitching, differed *more* than the "after" panel.) Build the before/after panels as numpy arrays, draw the dashed seam into the array, and imshow on a full-figure axis sized exactly to the data (figsize * DPI == array shape). 1:1 nearest with no text and no line AA renders identically on every platform/matplotlib version, so the default tolerance passes with no per-test override. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Linux render from the ubuntu CI artifact (204x100, 1:1). Cross-platform delta to the macOS render is RMS ~30 < the default tolerance 50 (down from the original 53 once the tight_layout/title layout shift was removed); the residual is matplotlib Agg cross-version edge rasterization, not algorithm (verified: the no-stitching 'before' panel differs as much, and no integer pixel shift reconciles them).
The min_confidence early-prune is behavior-preserving, so the existing public tests already lock the result; a dedicated test that monkeypatches a private function and hand-builds _CutEdge objects is exactly the internal coupling this suite was trimmed away from. _max_achievable_score (sharing _score_pair_features) makes the prune's soundness self-evident without it.
What
Second PR of the 3-way split of #1170. Adds
sq.experimental.tl.assign_stitch_groups(+StitchParams): given theis_outlier=Truecells flagged bycalculate_tiling_qc(PR-A, #1188), it pairs facing cut edges across tile boundaries, scores each candidate pair, and assembles high-confidence pairs into stitch groups via union-find.It only annotates
.obs(four columns) plus a.uns["tiling_stitch"]audit block; the labels element is never modified. Materialising a stitched labels element is PR-C (make_stitched_labels, follows this one).The score: a transparent flat mean, not a fitted model
@selmanozleyen asked to review the math/intuition of
stitch_confidenceand whether it is a novel/fitted formula. Short answer: it is not fitted. An earlier prototype on this branch (ba60119f) shipped an L2-regularised logistic regression with coefficients fit on 2197 synthetic disk pairs. We removed it deliberately — fixture-fit weights claim a calibration they cannot honour on real data, and squidpy should not silently encode a synthetic distribution. The score is now the flat (unweighted) mean of five standard, bounded geometric descriptors, each in[0, 1]. The features are recorded in.uns["tiling_stitch"].For a facing cut-edge pair (each a 1-D chord
extent=(lo, hi)at perpendicularcoord):iou— 1-D extent IoU of the two chords:overlap / (len_a + len_b - overlap). Do the two cut faces line up laterally?endpoint_match—max(0, 1 - (|a_lo-b_lo| + |a_hi-b_hi|) / max(len_a, len_b)). Do the faces start/end at the same lateral positions?merge_compactness—min(4*pi*A / P^2, 1)of the closed union mask's largest component. A re-joined real cell is compact; a false merge is elongated. Strongest single discriminator.merge_solidity—area / convex_hull_areaof the same component (clamped to 1). A false merge creates a concave waist at the join.gap_proximity—clip(1 - gap / (2*close_radius), 0, 1). The seam gap relative to the closing reach (the scale at which morphological closing could actually bridge it) — independent of themax_gapsearch radius; neutral (1.0) when closing is disabled.stitch_confidence(pair) = mean(the five features). Per-cell group confidence is theminover the group's pairwise confidences (weakest link).Runnable MREs (pure numpy/scipy/skimage, no squidpy)
The 1-D edge features:
The 2-D merged-shape features (error handling elided):
Validation (sweep, not fit)
Weights are fixed by reasoning (flat-equal); the ground-truth fixture only validates them. Sweeping
min_confidenceover 3 synthetic layouts (recall = fraction of cut pieces stitched; precision = cut / (cut + intact-false-merge)):(The recall ceiling is < 1 because not every cut piece has a stitchable partner.) Default
min_confidence=0.7gives full attainable recall with zero false merges and headroom before the high-threshold drop-off. Users should still tune it for their data — the score is heuristic, not a calibrated probability.Obs / uns contract (consumed additively by PR-C)
.obs:stitch_group_id(int),is_stitched(bool),n_pieces(int32),stitch_confidence(float64;NaN= not an outlier,1.0= evaluated-solo, composite = stitched)..uns["tiling_stitch"]: params,score_features, and the run counts.PR-C's
make_stitched_labelsreads only this contract (it does not import_tiling_stitch).Notes for review
_tiling_stitch.py(defensive branches in the geometry helpers); the algorithm paths are covered bytests/experimental/test_tiling_stitch.py.hatch-test.py3.13-pre: the pre-release-deps job is a known upstream flake (pre-existing on Add tile-cut stitching follow-up to calculate_tiling_qc #1170/[1/3] Bundle TilingQCParams; add resolve_labels_array helper #1188), not caused by this PR.TestStitchVisualrenders a seam recolour (before bylabel_id, after bystitch_group_id); its baseline PNG is bootstrapped from the first CI run's artifact (not generated locally).resolve_params,equivalent_diameter+largest_contour,iter_chunked_regionprops) intoexperimental/utils/, reused by_tiling_qc/_tilingto remove duplication. QC numerics are unchanged — the committed QC/tiling visual baselines pass.Out of scope
make_stitched_labels+ materialisation — PR-C (next, stacks on this).