-
Notifications
You must be signed in to change notification settings - Fork 0
feat: Sage API parity — frames, converters, config #2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Changes from 15 commits
Commits
Show all changes
34 commits
Select commit
Hold shift + click to select a range
ab9e944
Add design spec for Sage API parity in timsrust_cpp_bridge
timosachsenberg bb53a62
Address spec review: fix types, sentinels, add missing fields
timosachsenberg eb5e0aa
Address second spec review: clarify types, derivations, memory
timosachsenberg 9f6f1d3
Add implementation plan for Sage API parity
timosachsenberg 072a013
Address Codex review: sync header per-chunk, fix null ptrs, propagate…
timosachsenberg 7209494
feat: extend TimsFfiSpectrum with index, isolation, charge, precursor…
timosachsenberg b10b065
feat: add TimsFfiFrame type, FrameReader, and converters to TimsDataset
timosachsenberg 07b1616
feat: add tims_get_frame, tims_get_frames_by_level, tims_free_frame_a…
timosachsenberg ccb7375
Fix Chunk 2 code quality issues: ms_level type, struct docs, formatting
timosachsenberg e439d1f
Add error messages and improve formatting in get_frame
timosachsenberg 898a57e
feat: add tims_convert_tof_to_mz and tims_convert_scan_to_im FFI exports
timosachsenberg ee81eae
feat: add config builder, tims_open_with_config, and C header updates
timosachsenberg 2e77803
feat: update C++ example with frame, converter, and extended spectrum…
timosachsenberg ef2dcb2
Fix header comment: filename and frame buffer invalidation scope
timosachsenberg a4a0af4
Address CodeRabbit review: error codes, stale errors, overflow checks
timosachsenberg d6ff05e
docs: add testing strategy design spec
timosachsenberg a57dc0b
fix: null-check and finalize bugs found during spec review
timosachsenberg 9259011
docs: add testing strategy implementation plan
timosachsenberg 0e03cca
feat: add test infrastructure (Cargo config, public modules, test hel…
timosachsenberg 18ed983
test: add FFI lifecycle tests for open/close, config builder, and ope…
timosachsenberg 0b2a245
test: add error handling, spectrum, and frame FFI tests
timosachsenberg 024a651
test: add query/metadata and converter FFI tests
timosachsenberg 0ce8582
test: add real data integration tests for DDA and DIA datasets
timosachsenberg 48fa1d9
test: add C++ Catch2 test suite for ABI and smoke tests
timosachsenberg c695a85
ci: add GitHub Actions CI workflow for stub and integration tests
timosachsenberg 36928c2
fix: gate stub-only tests with #[cfg] and serialize global error tests
timosachsenberg e37726c
ci: configure integration test dataset download from release artifacts
timosachsenberg 5cfb010
ci: run integration tests on all PRs with cached datasets
timosachsenberg a17c7cd
docs: add testing section to README
timosachsenberg db6a644
fix: make stub converter methods public and force-link rlib in tests
timosachsenberg 58db9bf
fix: make stub types pub(crate) and handle timsrust config panics in …
timosachsenberg 3f97a15
fix: use libc::c_char for error buffers (ARM portability)
timosachsenberg 689bfbd
chore: remove superpowers spec documents
timosachsenberg 3ec0c71
docs: add OpenMS integration plan for Bruker TDF support
timosachsenberg File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
1,398 changes: 1,398 additions & 0 deletions
1,398
docs/superpowers/plans/2026-03-11-sage-api-parity.md
Large diffs are not rendered by default.
Oops, something went wrong.
208 changes: 208 additions & 0 deletions
208
docs/superpowers/specs/2026-03-11-sage-api-parity-design.md
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,208 @@ | ||
| # Sage API Parity — Design Spec | ||
|
|
||
| Expose all timsrust functionality that [Sage](https://github.com/lazear/sage) uses through the timsrust_cpp_bridge FFI. | ||
|
|
||
| ## Context | ||
|
|
||
| Sage (sage-cloudpath crate) uses timsrust for: | ||
| - **SpectrumReader** with configurable `SpectrumReaderConfig` (processing params, DIA frame splitting) | ||
| - **FrameReader** for raw MS1 frame-level access (tof_indices, intensities, scan_offsets) | ||
| - **MetadataReader** for `Tof2MzConverter` and `Scan2ImConverter` (raw index → physical value conversion) | ||
| - **Spectrum** fields not yet exposed: `isolation_width`, `index`, `precursor.charge`, `precursor.intensity`, `precursor.frame_index` | ||
|
|
||
| The current bridge exposes spectrum-level access only, with no frame-level access, no converters, and no configurable reader construction. | ||
|
|
||
| ## Design | ||
|
|
||
| ### 1. Extended `TimsFfiSpectrum` | ||
|
|
||
| Append new fields to the existing struct (no existing field offsets change): | ||
|
|
||
| ```c | ||
| typedef struct tims_spectrum { | ||
| // existing | ||
| double rt_seconds; | ||
| double precursor_mz; | ||
| uint8_t ms_level; | ||
| uint32_t num_peaks; | ||
| float *mz; | ||
| float *intensity; | ||
| double im; | ||
| // new | ||
| uint32_t index; // spectrum index from SpectrumReader (Spectrum.index) | ||
| double isolation_width; // isolation window width (0.0 if N/A) | ||
| double isolation_mz; // isolation window center m/z (0.0 if N/A) | ||
| uint8_t charge; // precursor charge (0 = unknown) | ||
| double precursor_intensity; // precursor intensity (NaN = unknown) | ||
| uint32_t frame_index; // precursor's frame index (UINT32_MAX if N/A) | ||
| } tims_spectrum; | ||
| ``` | ||
|
|
||
| Sentinel values for optional fields: `0` for charge, `UINT32_MAX` for frame_index (emitted when the spectrum has no precursor, i.e. MS1), `NaN` for precursor_intensity, `0.0` for isolation_width/isolation_mz. Keeps the struct flat and C-friendly. | ||
|
|
||
| Notes: | ||
| - `precursor.charge` is `Option<usize>` in timsrust — the `usize → u8` cast is safe since charge values are always small (1–6 in practice). | ||
| - `precursor.intensity` is `Option<f64>`, preserved as `double` to avoid precision loss. | ||
| - `precursor.frame_index` is a plain `usize` (not optional) in timsrust — the `UINT32_MAX` sentinel applies only to MS1 spectra where no `Precursor` exists. | ||
|
|
||
| ### 2. Frame-Level Access | ||
|
|
||
| #### New type: `TimsFfiFrame` | ||
|
|
||
| ```c | ||
| typedef struct tims_frame { | ||
| uint32_t index; // frame index | ||
| double rt_seconds; // retention time | ||
| uint8_t ms_level; // 1=MS1, 2=MS2, 0=Unknown | ||
| uint32_t num_scans; // number of scans (derived as frame.scan_offsets.len() - 1) | ||
| uint32_t num_peaks; // total peaks (length of tof_indices & intensities) | ||
| uint32_t *tof_indices; // raw TOF indices, flat array | ||
| uint32_t *intensities; // raw intensities, flat array | ||
| uint64_t *scan_offsets; // per-scan offsets into flat arrays (length: num_scans + 1) | ||
| } tims_frame; | ||
| ``` | ||
|
|
||
| Raw indices are preserved (not converted to m/z) so callers can perform efficient discrete-domain operations like binning/summing on TOF indices before converting. | ||
|
|
||
| Implementation notes: | ||
| - `scan_offsets` is `Vec<usize>` in timsrust. The bridge copies to `Vec<u64>` for a stable 64-bit ABI. 32-bit targets are not supported. | ||
| - `ms_level` maps from timsrust's `MSLevel` enum: `MS1 → 1`, `MS2 → 2`, `Unknown → 0`. | ||
|
|
||
| #### New functions | ||
|
|
||
| **Single-frame access (handle-owned buffers):** | ||
| ```c | ||
| tims_status tims_get_frame(tims_dataset *ds, uint32_t index, tims_frame *out); | ||
| ``` | ||
| Buffers are owned by the dataset handle, valid until the next call to `tims_get_frame` on that handle. Frame and spectrum buffers are independent — calling `tims_get_spectrum` does not invalidate frame buffers and vice versa. | ||
|
|
||
| **Batch filtered access (caller-owned, malloc'd):** | ||
| ```c | ||
| tims_status tims_get_frames_by_level( | ||
| tims_dataset *ds, | ||
| uint8_t ms_level, | ||
| tims_frame **out_frames, | ||
| uint32_t *out_count | ||
| ); | ||
| void tims_free_frame_array(tims_dataset *ds, tims_frame *frames, uint32_t count); | ||
| ``` | ||
| Uses `FrameReader::get_all_ms1()` / `get_all_ms2()` on the Rust side (internally parallel). Invalid `ms_level` values (anything other than 1 or 2) return an empty array with `out_count = 0` and `Ok` status. | ||
|
|
||
| `tims_free_frame_array` frees per-frame `tof_indices`, `intensities`, and `scan_offsets` arrays, then the frame array itself. | ||
|
|
||
| ### 3. Converters | ||
|
|
||
| Methods on the dataset handle. `MetadataReader::new()` is called at open time, and the returned `Metadata`'s converters (`mz_converter`, `im_converter`) are cached inside `TimsDataset`. | ||
|
|
||
| **Single-value conversion:** | ||
| ```c | ||
| double tims_convert_tof_to_mz(tims_dataset *ds, uint32_t tof_index); | ||
| double tims_convert_scan_to_im(tims_dataset *ds, uint32_t scan_index); | ||
| ``` | ||
|
|
||
| **Batch conversion (caller-provided output buffer):** | ||
| ```c | ||
| tims_status tims_convert_tof_to_mz_array( | ||
| tims_dataset *ds, | ||
| const uint32_t *tof_indices, uint32_t count, | ||
| double *out_mz | ||
| ); | ||
| tims_status tims_convert_scan_to_im_array( | ||
| tims_dataset *ds, | ||
| const uint32_t *scan_indices, uint32_t count, | ||
| double *out_im | ||
| ); | ||
| ``` | ||
|
|
||
| Batch versions take caller-provided output buffers (no malloc — caller knows the size). Single-value versions return the result directly (converter is always valid once dataset is open). Returns `NaN` if handle is NULL. | ||
|
|
||
| ### 4. Configurable Reader Construction | ||
|
|
||
| **Opaque config builder:** | ||
| ```c | ||
| typedef struct tims_config tims_config; | ||
|
|
||
| tims_config *tims_config_create(void); | ||
| void tims_config_free(tims_config *cfg); | ||
|
|
||
| // SpectrumProcessingParams setters | ||
| void tims_config_set_smoothing_window(tims_config *cfg, uint32_t window); | ||
| void tims_config_set_centroiding_window(tims_config *cfg, uint32_t window); | ||
| void tims_config_set_calibration_tolerance(tims_config *cfg, double tolerance); | ||
| void tims_config_set_calibrate(tims_config *cfg, uint8_t enabled); // 0 = disabled, non-zero = enabled | ||
|
|
||
| // FrameWindowSplittingConfiguration setters | ||
| // (exact setters TBD — will be finalized during implementation by inspecting | ||
| // timsrust 0.4.2's FrameWindowSplittingConfiguration fields. Note: the | ||
| // UniformMobility variant takes an Option<Scan2ImConverter>, which may require | ||
| // opening the dataset first to obtain the converter — this chicken-and-egg | ||
| // constraint may limit which DIA splitting modes are configurable pre-open.) | ||
|
|
||
| // Open with config (existing tims_open remains for default config) | ||
| tims_status tims_open_with_config( | ||
| const char *path, | ||
| const tims_config *cfg, | ||
| tims_dataset **out | ||
| ); | ||
| ``` | ||
|
|
||
| Current `tims_open()` is unchanged and continues to use timsrust defaults. | ||
|
|
||
| ## Rust-Side Architecture | ||
|
|
||
| ### Changes to `TimsDataset` (dataset.rs) | ||
|
|
||
| - Add `frame_reader: FrameReader` — constructed at open time alongside `SpectrumReader` | ||
| - Add `mz_converter: Tof2MzConverter` and `im_converter: Scan2ImConverter` — from `MetadataReader::new()` at open time | ||
| - Add frame buffers: `tof_buf: Vec<u32>`, `int_buf_u32: Vec<u32>`, `scan_offset_buf: Vec<u64>` for single-frame handle-owned access | ||
| - Populate new `TimsFfiSpectrum` fields in `get_spectrum()` and `tims_get_spectra_by_rt()` | ||
| - `num_frames` field can be replaced by `frame_reader.len()` | ||
|
|
||
| ### New file: `config.rs` | ||
|
|
||
| - `TimsFfiConfig` wrapper around `SpectrumReaderConfig` | ||
| - Setter methods mapping to individual config fields | ||
| - Used by `tims_open_with_config()` to build the `SpectrumReader` | ||
|
|
||
| ### Changes to `types.rs` | ||
|
|
||
| - Add `TimsFfiFrame` repr(C) struct | ||
| - Extend `TimsFfiSpectrum` with new fields | ||
|
|
||
| ### Changes to `lib.rs` | ||
|
|
||
| - New FFI exports for all new functions | ||
| - `tims_open_with_config()` passes `SpectrumReaderConfig` (including `FrameWindowSplittingConfiguration`) to the builder via `with_config()`. The builder internally resolves converter dependencies during `finalize()`. | ||
| - Frame functions delegate to `TimsDataset` methods | ||
| - Converter functions delegate to cached converters | ||
|
|
||
| ### Stub mode (without `with_timsrust`) | ||
|
|
||
| All new functions get stub implementations: | ||
| - Frame functions return empty frames / zero counts | ||
| - Converters return identity (input cast to f64) | ||
| - Config functions create/free a dummy struct | ||
| - `tims_open_with_config` ignores config, behaves like `tims_open` | ||
|
|
||
| ## New Function Summary | ||
|
|
||
| | Function | Category | Memory | | ||
| |---|---|---| | ||
| | `tims_get_frame` | Frame: single | Handle-owned | | ||
| | `tims_get_frames_by_level` | Frame: batch | Caller-owned (malloc) | | ||
| | `tims_free_frame_array` | Frame: cleanup | — | | ||
| | `tims_convert_tof_to_mz` | Converter: single | Return value | | ||
| | `tims_convert_scan_to_im` | Converter: single | Return value | | ||
| | `tims_convert_tof_to_mz_array` | Converter: batch | Caller-provided buffer | | ||
| | `tims_convert_scan_to_im_array` | Converter: batch | Caller-provided buffer | | ||
| | `tims_config_create` | Config: lifecycle | Returns Box'd | | ||
| | `tims_config_free` | Config: lifecycle | — | | ||
| | `tims_config_set_*` | Config: setters | — | | ||
| | `tims_open_with_config` | Config: open | — | | ||
|
|
||
| ## Non-Goals | ||
|
|
||
| - No new error codes (existing `TimsFfiStatus` values suffice) | ||
| - No DIA-specific API (DDA/DIA handled uniformly through SpectrumReaderConfig) | ||
| - No thread-safety changes (same single-handle-single-thread model) | ||
| - No changes to existing function signatures or behavior | ||
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Uh oh!
There was an error while loading. Please reload this page.