feat: parallelize per-library spatial graph construction #1144
Marius1311 wants to merge 1 commit into
Add an `n_jobs` parameter to `spatial_neighbors()` to compute per-library graphs in parallel via joblib. It defaults to 1 (sequential, no behavior change); set it to -1 to use all CPUs.

When `library_key` is set, each library's graph is already computed independently, so this is a trivially parallel workload. For datasets with many libraries (e.g., multi-sample spatial transcriptomics), this should give a near-linear speedup.
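A minimal sketch of the idea, for illustration only. It is not the code from this PR and does not call squidpy: `knn_edges` and `per_library_graph` are hypothetical names, the neighbor search is a brute-force stand-in, and the worker pool is the standard library's `ThreadPoolExecutor` rather than joblib so the example has no dependencies. It only shows the structure: group observations by library, build each library's graph independently, and dispatch the groups to workers when `n_jobs` is not 1.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
import math
import os


def knn_edges(points, k):
    """Brute-force kNN edges for one library.

    `points` is a list of (obs_index, x, y) tuples; returns (source, target) pairs.
    """
    edges = []
    for i, xi, yi in points:
        # Sort the other points of the same library by distance (ties broken by index).
        dists = sorted(
            (math.hypot(xi - xj, yi - yj), j) for j, xj, yj in points if j != i
        )
        edges.extend((i, j) for _, j in dists[:k])
    return edges


def per_library_graph(coords, libraries, k=2, n_jobs=1):
    """Build one kNN graph per library and concatenate the edges.

    n_jobs=1 runs sequentially; n_jobs=-1 uses one worker per CPU.
    """
    groups = defaultdict(list)
    for idx, ((x, y), lib) in enumerate(zip(coords, libraries)):
        groups[lib].append((idx, x, y))

    tasks = [groups[lib] for lib in sorted(groups)]
    if n_jobs == 1:
        results = [knn_edges(pts, k) for pts in tasks]
    else:
        workers = (os.cpu_count() or 1) if n_jobs == -1 else n_jobs
        # Libraries share no edges, so each task can run without coordination.
        with ThreadPoolExecutor(max_workers=workers) as pool:
            results = list(pool.map(lambda pts: knn_edges(pts, k), tasks))
    return sorted(edge for edges in results for edge in edges)


coords = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
libraries = ["a", "a", "a", "b", "b", "b"]
sequential = per_library_graph(coords, libraries, k=1, n_jobs=1)
parallel = per_library_graph(coords, libraries, k=1, n_jobs=-1)
```

Both calls should return the same edge list, with no edge connecting observations from different libraries. Threads running pure-Python code will not show a real speedup; the pool here only demonstrates how the work is split.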
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## main #1144 +/- ##
==========================================
+ Coverage 74.05% 74.08% +0.03%
==========================================
Files 39 39
Lines 6495 6503 +8
Branches 1122 1122
==========================================
+ Hits 4810 4818 +8
Misses 1230 1230
Partials 455 455
Thank you, but we are working on making updates to
|
Hi @Marius1311, #1147 has been merged. See squidpy/src/squidpy/gr/_build.py, lines 793 to 799 in da789d0. Do you want to work on this?
|
Thanks @grst! In my own tests, the parallelization didn't actually lead to any speed improvement, and I'm not sure why.
|
So we can close this if you like.
|
I observed something similar, and I think a large part of the runtime is spent on subsetting the SpatialData/AnnData object before any nearest-neighbor code is even triggered. I still think it should be possible to optimize this, but it's likely not the trivial fix we initially thought.
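One way to check this hypothesis is to time the two phases separately for each library. The sketch below is a toy harness under loud assumptions: `time_phases` is a hypothetical helper, the "subset" step is a plain list filter standing in for slicing the real object, and the neighbor step is a brute-force search. It shows how the split could be measured, not what the numbers are for squidpy.

```python
import random
import time


def time_phases(n_obs=2000, n_libraries=20, seed=0):
    """Return accumulated seconds spent subsetting vs. searching neighbors."""
    rng = random.Random(seed)
    coords = [(rng.random(), rng.random()) for _ in range(n_obs)]
    libraries = [rng.randrange(n_libraries) for _ in range(n_obs)]

    timings = {"subset": 0.0, "neighbors": 0.0}
    for lib in range(n_libraries):
        t0 = time.perf_counter()
        # Stand-in for subsetting the data object to one library.
        subset = [c for c, l in zip(coords, libraries) if l == lib]
        t1 = time.perf_counter()
        # Stand-in for the neighbor search: nearest other point, brute force.
        for i, (xi, yi) in enumerate(subset):
            min(
                ((xi - xj) ** 2 + (yi - yj) ** 2
                 for j, (xj, yj) in enumerate(subset) if j != i),
                default=None,
            )
        t2 = time.perf_counter()
        timings["subset"] += t1 - t0
        timings["neighbors"] += t2 - t1
    return timings


timings = time_phases(n_obs=500, n_libraries=5)
```

If the subsetting share dominates on real data, parallelizing only the neighbor search would leave most of the runtime untouched, which would be consistent with the missing speedup reported above.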
|
Closing in favor of #1198 |