Open
Split the flag-driven _correlate_internal into three focused functions (_predict_with_observed, _extract_observations, _extract_training_data) with shared validation logic, improving readability and traceability of each code path.
- Remove pass-through __init__ methods from Pydantic models, moving param docs to class docstrings (Spectrum, ProcessingResult, ProteomeSearchSpace, ModificationConfig, _PeptidoformSearchSpace)
- Extract _annotations_to_tuples helper for repeated FragmentAnnotation destructuring in _spectrum_processing.py
- Add Spectrum.inverse_log2_transform() as the canonical inverse of log2_transform(), replacing duplicated inline expressions
- Remove stray debugpy import in xgb_models.py
- CLI: Make logging level case insensitive
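To illustrate the inverse_log2_transform() item above, the round trip between the two transforms can be sketched as follows. This is a minimal sketch, not the project's code: the pseudo-count value of 0.001 and the free-function form are assumptions made for illustration, whereas the commit adds the inverse as a method on Spectrum.

```python
import math

# Assumed pseudo-count that keeps log2 defined for zero intensities.
# The offset actually used by Spectrum.log2_transform() may differ.
PSEUDO_COUNT = 0.001


def log2_transform(intensities):
    """Forward transform: log2(intensity + pseudo-count) for each peak."""
    return [math.log2(x + PSEUDO_COUNT) for x in intensities]


def inverse_log2_transform(log_intensities):
    """Inverse transform: 2**value - pseudo-count for each peak."""
    return [2.0 ** y - PSEUDO_COUNT for y in log_intensities]


# Round trip: applying the inverse to the forward result recovers the input
# up to floating-point error.
intensities = [0.0, 0.25, 1.0, 1500.0]
round_trip = inverse_log2_transform(log2_transform(intensities))
```

Keeping the inverse next to the forward transform means the pseudo-count is defined in one place, which is the duplication the commit message says it removes.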
Update the Python spectrum-processing flow to consume AnnotatedMS2Spectrum and Rust target extraction directly, and refresh the related tests.
Summary
This PR simplifies the Python/Rust boundary for annotation and target extraction, and keeps the correlate flow working with both raw and pre-annotated preloaded spectra. Final performance improved substantially after the Rust-side follow-up.
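One way a preloaded-spectrum path could accept both raw and pre-annotated input is to normalize everything to the annotated form at the entry point. The sketch below is illustrative only: RawSpectrum, AnnotatedSpectrum, annotate, and ensure_annotated are hypothetical stand-ins, not the names or signatures used in this PR, where annotation is done on the Rust side.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class RawSpectrum:
    """Hypothetical stand-in for a raw observed spectrum."""
    mz: List[float]
    intensity: List[float]


@dataclass
class AnnotatedSpectrum:
    """Hypothetical stand-in for AnnotatedMS2Spectrum."""
    mz: List[float]
    intensity: List[float]
    annotations: List[str]  # one label per peak, "" when unmatched


def annotate(spectrum: RawSpectrum) -> AnnotatedSpectrum:
    """Placeholder annotation step that labels every peak as unmatched."""
    return AnnotatedSpectrum(
        mz=spectrum.mz,
        intensity=spectrum.intensity,
        annotations=[""] * len(spectrum.mz),
    )


def ensure_annotated(
    spectrum: Union[RawSpectrum, AnnotatedSpectrum],
) -> AnnotatedSpectrum:
    """Annotate raw input; pass pre-annotated input through unchanged."""
    if isinstance(spectrum, AnnotatedSpectrum):
        return spectrum
    return annotate(spectrum)
```

With this shape, downstream steps such as target extraction only ever see the annotated type, so the raw/pre-annotated distinction does not leak past the entry point.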
Changelog
Added
- ms2pip_extract_targets(...) in the correlate pipeline.
- AnnotatedMS2Spectrum objects directly through the preloaded-spectrum path.
Changed
- ms2pip_compute_theoretical_mz(...) results directly from Rust without wrapping them again in np.array(...).
- annotate_ms2_spectra(...) no longer receives seq_lens.
- annotate_spectrum(...) to return AnnotatedMS2Spectrum instead of Python tuple-converted annotations.
- MatchedSpectrum to store annotated_spectrum instead of peak_annotations.
- _validate_and_extract_targets(...) to batch target extraction through a single Rust call.
- correlate_single(...) to use Rust target extraction as well.
- tests/test_spectrum_processing.py to validate Rust target extraction directly using AnnotatedMS2Spectrum/FragmentAnnotation.
- ms2rescore-rs dependency in pyproject.toml to >=0.5.0a3,<2.
Removed
- _annotations_to_tuples(...) from the active annotation/target-extraction flow.
- targets_from_annotations(...) from the active correlate flow.
Fixed