Merged
Changes from 15 commits
34 commits
ab9e944
Add design spec for Sage API parity in timsrust_cpp_bridge
timosachsenberg Mar 11, 2026
bb53a62
Address spec review: fix types, sentinels, add missing fields
timosachsenberg Mar 11, 2026
eb5e0aa
Address second spec review: clarify types, derivations, memory
timosachsenberg Mar 11, 2026
9f6f1d3
Add implementation plan for Sage API parity
timosachsenberg Mar 11, 2026
072a013
Address Codex review: sync header per-chunk, fix null ptrs, propagate…
timosachsenberg Mar 11, 2026
7209494
feat: extend TimsFfiSpectrum with index, isolation, charge, precursor…
timosachsenberg Mar 11, 2026
b10b065
feat: add TimsFfiFrame type, FrameReader, and converters to TimsDataset
timosachsenberg Mar 11, 2026
07b1616
feat: add tims_get_frame, tims_get_frames_by_level, tims_free_frame_a…
timosachsenberg Mar 11, 2026
ccb7375
Fix Chunk 2 code quality issues: ms_level type, struct docs, formatting
timosachsenberg Mar 11, 2026
e439d1f
Add error messages and improve formatting in get_frame
timosachsenberg Mar 11, 2026
898a57e
feat: add tims_convert_tof_to_mz and tims_convert_scan_to_im FFI exports
timosachsenberg Mar 11, 2026
ee81eae
feat: add config builder, tims_open_with_config, and C header updates
timosachsenberg Mar 11, 2026
2e77803
feat: update C++ example with frame, converter, and extended spectrum…
timosachsenberg Mar 11, 2026
ef2dcb2
Fix header comment: filename and frame buffer invalidation scope
timosachsenberg Mar 11, 2026
a4a0af4
Address CodeRabbit review: error codes, stale errors, overflow checks
timosachsenberg Mar 12, 2026
d6ff05e
docs: add testing strategy design spec
timosachsenberg Mar 12, 2026
a57dc0b
fix: null-check and finalize bugs found during spec review
timosachsenberg Mar 12, 2026
9259011
docs: add testing strategy implementation plan
timosachsenberg Mar 12, 2026
0e03cca
feat: add test infrastructure (Cargo config, public modules, test hel…
timosachsenberg Mar 12, 2026
18ed983
test: add FFI lifecycle tests for open/close, config builder, and ope…
timosachsenberg Mar 12, 2026
0b2a245
test: add error handling, spectrum, and frame FFI tests
timosachsenberg Mar 12, 2026
024a651
test: add query/metadata and converter FFI tests
timosachsenberg Mar 12, 2026
0ce8582
test: add real data integration tests for DDA and DIA datasets
timosachsenberg Mar 12, 2026
48fa1d9
test: add C++ Catch2 test suite for ABI and smoke tests
timosachsenberg Mar 12, 2026
c695a85
ci: add GitHub Actions CI workflow for stub and integration tests
timosachsenberg Mar 12, 2026
36928c2
fix: gate stub-only tests with #[cfg] and serialize global error tests
timosachsenberg Mar 12, 2026
e37726c
ci: configure integration test dataset download from release artifacts
timosachsenberg Mar 12, 2026
5cfb010
ci: run integration tests on all PRs with cached datasets
timosachsenberg Mar 12, 2026
a17c7cd
docs: add testing section to README
timosachsenberg Mar 12, 2026
db6a644
fix: make stub converter methods public and force-link rlib in tests
timosachsenberg Mar 12, 2026
58db9bf
fix: make stub types pub(crate) and handle timsrust config panics in …
timosachsenberg Mar 12, 2026
3f97a15
fix: use libc::c_char for error buffers (ARM portability)
timosachsenberg Mar 12, 2026
689bfbd
chore: remove superpowers spec documents
timosachsenberg Mar 12, 2026
3ec0c71
docs: add OpenMS integration plan for Bruker TDF support
timosachsenberg Mar 12, 2026
1,398 changes: 1,398 additions & 0 deletions docs/superpowers/plans/2026-03-11-sage-api-parity.md

Large diffs are not rendered by default.

208 changes: 208 additions & 0 deletions docs/superpowers/specs/2026-03-11-sage-api-parity-design.md
@@ -0,0 +1,208 @@
# Sage API Parity — Design Spec

Expose all timsrust functionality that [Sage](https://github.com/lazear/sage) uses through the timsrust_cpp_bridge FFI.

## Context

Sage (sage-cloudpath crate) uses timsrust for:
- **SpectrumReader** with configurable `SpectrumReaderConfig` (processing params, DIA frame splitting)
- **FrameReader** for raw MS1 frame-level access (tof_indices, intensities, scan_offsets)
- **MetadataReader** for `Tof2MzConverter` and `Scan2ImConverter` (raw index → physical value conversion)
- **Spectrum** fields not yet exposed: `isolation_width`, `index`, `precursor.charge`, `precursor.intensity`, `precursor.frame_index`

The current bridge exposes spectrum-level access only, with no frame-level access, no converters, and no configurable reader construction.

## Design

### 1. Extended `TimsFfiSpectrum`

Append new fields to the existing struct (no existing field offsets change):

```c
typedef struct tims_spectrum {
    // existing
    double rt_seconds;
    double precursor_mz;
    uint8_t ms_level;
    uint32_t num_peaks;
    float *mz;
    float *intensity;
    double im;
    // new
    uint32_t index;             // spectrum index from SpectrumReader (Spectrum.index)
    double isolation_width;     // isolation window width (0.0 if N/A)
    double isolation_mz;        // isolation window center m/z (0.0 if N/A)
    uint8_t charge;             // precursor charge (0 = unknown)
    double precursor_intensity; // precursor intensity (NaN = unknown)
    uint32_t frame_index;       // precursor's frame index (UINT32_MAX if N/A)
} tims_spectrum;
```

Optional fields use sentinel values, which keeps the struct flat and C-friendly: `0` for `charge`, `NaN` for `precursor_intensity`, `0.0` for `isolation_width` and `isolation_mz`, and `UINT32_MAX` for `frame_index` (emitted when the spectrum has no precursor, i.e. MS1).

Notes:
- `precursor.charge` is `Option<usize>` in timsrust — the `usize → u8` cast is safe since charge values are always small (1–6 in practice).
- `precursor.intensity` is `Option<f64>`, preserved as `double` to avoid precision loss.
- `precursor.frame_index` is a plain `usize` (not optional) in timsrust — the `UINT32_MAX` sentinel applies only to MS1 spectra where no `Precursor` exists.
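
On the consumer side, the sentinels above can be decoded with a few small predicates. The following is a minimal sketch, not the bridge's API: `PrecursorFields` is a hypothetical mirror of the new fields so the example compiles on its own, and the helper names are illustrative.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical mirror of the precursor-related fields described above;
// the real definition lives in the bridge header.
struct PrecursorFields {
    std::uint8_t  ms_level;
    double        isolation_width;      // 0.0 if N/A
    double        isolation_mz;         // 0.0 if N/A
    std::uint8_t  charge;               // 0 = unknown
    double        precursor_intensity;  // NaN = unknown
    std::uint32_t frame_index;          // UINT32_MAX if no precursor
};

// True when frame_index is not the "no precursor" sentinel.
inline bool has_precursor(const PrecursorFields& s) {
    return s.frame_index != std::numeric_limits<std::uint32_t>::max();
}

// True when the charge state is known.
inline bool has_charge(const PrecursorFields& s) { return s.charge != 0; }

// NaN never compares equal to anything, so std::isnan is required here.
inline bool has_precursor_intensity(const PrecursorFields& s) {
    return !std::isnan(s.precursor_intensity);
}

// Precondition-checked accessor: only call when has_charge(s) is true.
inline int charge_of(const PrecursorFields& s) {
    assert(has_charge(s));
    return s.charge;
}
```

Comparing `precursor_intensity` against `NAN` with `==` would always be false, which is why the check goes through `std::isnan`, as the C++ example in this PR also does.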

### 2. Frame-Level Access

#### New type: `TimsFfiFrame`

```c
typedef struct tims_frame {
    uint32_t index;         // frame index
    double rt_seconds;      // retention time
    uint8_t ms_level;       // 1=MS1, 2=MS2, 0=Unknown
    uint32_t num_scans;     // number of scans (derived as frame.scan_offsets.len() - 1)
    uint32_t num_peaks;     // total peaks (length of tof_indices & intensities)
    uint32_t *tof_indices;  // raw TOF indices, flat array
    uint32_t *intensities;  // raw intensities, flat array
    uint64_t *scan_offsets; // per-scan offsets into flat arrays (length: num_scans + 1)
} tims_frame;
```

Raw indices are preserved (not converted to m/z) so callers can perform efficient discrete-domain operations like binning/summing on TOF indices before converting.
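
As an illustration of such a discrete-domain operation, the sketch below sums intensities that fall into the same TOF bin using integer arithmetic only. The function name and the choice of `std::map` are assumptions made for the example; nothing here is part of the bridge.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

// Sum intensities that share a TOF bin. bin_width is in raw TOF index units
// (1 keeps exact indices). Keys of the result are the first index of each bin.
inline std::map<std::uint32_t, std::uint64_t>
bin_by_tof(const std::uint32_t* tof, const std::uint32_t* intensity,
           std::size_t n, std::uint32_t bin_width) {
    assert(bin_width > 0);
    std::map<std::uint32_t, std::uint64_t> bins;
    for (std::size_t i = 0; i < n; ++i) {
        // Integer division floors the index to the start of its bin.
        std::uint32_t key = (tof[i] / bin_width) * bin_width;
        bins[key] += intensity[i];  // 64-bit accumulator avoids overflow
    }
    return bins;
}
```

Only the surviving bin keys would then need to go through a TOF to m/z conversion, which is the saving the paragraph above refers to.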

Implementation notes:
- `scan_offsets` is `Vec<usize>` in timsrust. The bridge copies to `Vec<u64>` for a stable 64-bit ABI. 32-bit targets are not supported.
- `ms_level` maps from timsrust's `MSLevel` enum: `MS1 → 1`, `MS2 → 2`, `Unknown → 0`.
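
The flat-array layout can be walked scan by scan using `scan_offsets`. This is a minimal sketch assuming that the peaks of scan `s` occupy the half-open range `[scan_offsets[s], scan_offsets[s + 1])`; `FrameView` is a hypothetical mirror of `tims_frame` so the example is self-contained.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical mirror of the tims_frame fields used below.
struct FrameView {
    std::uint32_t num_scans;
    std::uint32_t num_peaks;
    const std::uint32_t* tof_indices;
    const std::uint32_t* intensities;
    const std::uint64_t* scan_offsets;  // length: num_scans + 1
};

// Total intensity per scan; scans without peaks contribute 0.
inline std::vector<std::uint64_t> intensity_per_scan(const FrameView& f) {
    std::vector<std::uint64_t> sums(f.num_scans, 0);
    for (std::uint32_t s = 0; s < f.num_scans; ++s) {
        const std::uint64_t begin = f.scan_offsets[s];
        const std::uint64_t end = f.scan_offsets[s + 1];
        assert(begin <= end && end <= f.num_peaks);
        for (std::uint64_t i = begin; i < end; ++i) {
            sums[s] += f.intensities[i];
        }
    }
    return sums;
}
```

An empty scan shows up as two equal consecutive offsets, so no special casing is needed.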

#### New functions

**Single-frame access (handle-owned buffers):**
```c
timsffi_status tims_get_frame(tims_dataset *ds, uint32_t index, tims_frame *out);
```
Buffers are owned by the dataset handle and remain valid until the next call to `tims_get_frame` on that handle or until `tims_close`. Frame and spectrum buffers are independent: calling `tims_get_spectrum` does not invalidate frame buffers, and vice versa.
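
Because the handle reuses these buffers, a caller that needs to keep a frame across calls has to copy it out first. The sketch below shows one way to do that; `FrameView` and `OwnedFrame` are hypothetical types introduced for the example, not part of the bridge.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical mirror of the pointer fields of tims_frame.
struct FrameView {
    std::uint32_t num_scans;
    std::uint32_t num_peaks;
    const std::uint32_t* tof_indices;
    const std::uint32_t* intensities;
    const std::uint64_t* scan_offsets;  // length: num_scans + 1
};

// Deep copy that stays valid after the handle overwrites its buffers.
struct OwnedFrame {
    std::vector<std::uint32_t> tof_indices;
    std::vector<std::uint32_t> intensities;
    std::vector<std::uint64_t> scan_offsets;
};

inline OwnedFrame copy_frame(const FrameView& f) {
    OwnedFrame out;
    if (f.num_peaks > 0) {
        assert(f.tof_indices != nullptr && f.intensities != nullptr);
        out.tof_indices.assign(f.tof_indices, f.tof_indices + f.num_peaks);
        out.intensities.assign(f.intensities, f.intensities + f.num_peaks);
    }
    if (f.scan_offsets != nullptr) {
        out.scan_offsets.assign(f.scan_offsets, f.scan_offsets + f.num_scans + 1);
    }
    return out;
}
```

With the real header, the view would be filled from the `tims_frame` returned by `tims_get_frame` immediately after the call succeeds.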

**Batch filtered access (caller-owned, malloc'd):**
```c
timsffi_status tims_get_frames_by_level(
    tims_dataset *ds,
    uint8_t ms_level,
    uint32_t *out_count,
    tims_frame **out_frames
);
void tims_free_frame_array(tims_dataset *ds, tims_frame *frames, uint32_t count);
```
Uses `FrameReader::get_all_ms1()` / `get_all_ms2()` on the Rust side (internally parallel). An invalid `ms_level` (anything other than 1 or 2) is not treated as an error: the call returns an empty array with `out_count = 0` and an OK status.
`tims_free_frame_array` frees per-frame `tof_indices`, `intensities`, and `scan_offsets` arrays, then the frame array itself.
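
In C++ callers, the pairing of the batch call with its free function can be tied to scope. The guard below is a sketch under assumptions: `Handle` and `Frame` are template parameters so the example compiles without the bridge header (with it they would be `tims_dataset` and `tims_frame`), and the class name is hypothetical.

```cpp
#include <cassert>

// Scope guard for an array returned by a batch call. The free function is
// invoked exactly once, on destruction, and only for a non-null array.
template <typename Handle, typename Frame>
class FrameArrayGuard {
public:
    using FreeFn = void (*)(Handle*, Frame*, unsigned int);

    FrameArrayGuard(Handle* handle, Frame* frames, unsigned int count, FreeFn free_fn)
        : handle_(handle), frames_(frames), count_(count), free_(free_fn) {
        assert(free_fn != nullptr);
    }
    ~FrameArrayGuard() {
        if (frames_ != nullptr) free_(handle_, frames_, count_);
    }
    // Non-copyable: two guards must never free the same array.
    FrameArrayGuard(const FrameArrayGuard&) = delete;
    FrameArrayGuard& operator=(const FrameArrayGuard&) = delete;

    const Frame* data() const { return frames_; }
    unsigned int size() const { return count_; }

private:
    Handle* handle_;
    Frame* frames_;
    unsigned int count_;
    FreeFn free_;
};
```

Assuming the header from this PR, usage would look like `FrameArrayGuard<tims_dataset, tims_frame> ms1(handle, frames, count, tims_free_frame_array);` placed right after a successful `tims_get_frames_by_level` call. The guard has to go out of scope before `tims_close(handle)`, since the free function takes the handle.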

### 3. Converters

Methods on the dataset handle. `MetadataReader::new()` is called at open time, and the returned `Metadata`'s converters (`mz_converter`, `im_converter`) are cached inside `TimsDataset`.

**Single-value conversion:**
```c
double tims_convert_tof_to_mz(tims_dataset *ds, uint32_t tof_index);
double tims_convert_scan_to_im(tims_dataset *ds, uint32_t scan_index);
```

**Batch conversion (caller-provided output buffer):**
```c
timsffi_status tims_convert_tof_to_mz_array(
    tims_dataset *ds,
    const uint32_t *tof_indices, uint32_t count,
    double *out_mz
);
timsffi_status tims_convert_scan_to_im_array(
    tims_dataset *ds,
    const uint32_t *scan_indices, uint32_t count,
    double *out_im
);
```

Batch versions write into caller-provided output buffers of at least `count` doubles (no malloc, since the caller already knows the size). Single-value versions return the result directly, because the converter is always valid once the dataset is open; they return `NaN` if the handle is NULL.
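
The caller-provided-buffer contract can be wrapped so the output is always sized before the call. This is a sketch, not the bridge's API: `convert_batch` is a hypothetical helper, and `convert` stands in for a small lambda that would call `tims_convert_tof_to_mz_array` (or the scan variant) and report whether the returned status was OK.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Sizes the output buffer, then invokes convert(in, count, out), which is
// expected to return true on success.
template <typename Convert>
bool convert_batch(Convert convert,
                   const std::vector<std::uint32_t>& indices,
                   std::vector<double>& out) {
    // The C API takes a 32-bit count, so larger inputs must be chunked.
    assert(indices.size() <= std::numeric_limits<std::uint32_t>::max());
    out.resize(indices.size());
    if (indices.empty()) return true;  // nothing to convert
    return convert(indices.data(),
                   static_cast<std::uint32_t>(indices.size()),
                   out.data());
}
```

Passing an identity lambda as `convert` gives the behaviour the spec describes for stub mode, which makes the helper testable without a dataset.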

### 4. Configurable Reader Construction

**Opaque config builder:**
```c
typedef struct tims_config tims_config;

tims_config *tims_config_create(void);
void tims_config_free(tims_config *cfg);

// SpectrumProcessingParams setters
void tims_config_set_smoothing_window(tims_config *cfg, uint32_t window);
void tims_config_set_centroiding_window(tims_config *cfg, uint32_t window);
void tims_config_set_calibration_tolerance(tims_config *cfg, double tolerance);
void tims_config_set_calibrate(tims_config *cfg, uint8_t enabled); // 0 = disabled, non-zero = enabled

// FrameWindowSplittingConfiguration setters
// (exact setters TBD — will be finalized during implementation by inspecting
// timsrust 0.4.2's FrameWindowSplittingConfiguration fields. Note: the
// UniformMobility variant takes an Option<Scan2ImConverter>, which may require
// opening the dataset first to obtain the converter — this chicken-and-egg
// constraint may limit which DIA splitting modes are configurable pre-open.)

// Open with config (existing tims_open remains for default config)
timsffi_status tims_open_with_config(
    const char *path,
    const tims_config *cfg,
    tims_dataset **out
);
```

Current `tims_open()` is unchanged and continues to use timsrust defaults.
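
From C++, the create/free pair of the opaque config maps naturally onto `std::unique_ptr` with a custom deleter. The following is a minimal sketch: `ConfigPtr` and `make_config` are hypothetical names, and the config type is a template parameter so the example compiles without the bridge header.

```cpp
#include <cassert>
#include <memory>

// Owning pointer whose deleter is the C "free" function for the config.
template <typename Config>
using ConfigPtr = std::unique_ptr<Config, void (*)(Config*)>;

// Calls create() once and arranges for destroy() to run when the returned
// pointer goes out of scope. If create() returns null, destroy() is skipped.
template <typename Config>
ConfigPtr<Config> make_config(Config* (*create)(), void (*destroy)(Config*)) {
    assert(create != nullptr && destroy != nullptr);
    return ConfigPtr<Config>(create(), destroy);
}
```

Assuming the header from this PR, a caller could write `auto cfg = make_config<tims_config>(tims_config_create, tims_config_free);`, apply setters through `cfg.get()`, and pass `cfg.get()` to `tims_open_with_config`. Whether the config may be freed immediately after the open call is not stated in the spec, so keeping it alive until the dataset is open is the conservative choice.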

## Rust-Side Architecture

### Changes to `TimsDataset` (dataset.rs)

- Add `frame_reader: FrameReader` — constructed at open time alongside `SpectrumReader`
- Add `mz_converter: Tof2MzConverter` and `im_converter: Scan2ImConverter` — from `MetadataReader::new()` at open time
- Add frame buffers: `tof_buf: Vec<u32>`, `int_buf_u32: Vec<u32>`, `scan_offset_buf: Vec<u64>` for single-frame handle-owned access
- Populate new `TimsFfiSpectrum` fields in `get_spectrum()` and `tims_get_spectra_by_rt()`
- `num_frames` field can be replaced by `frame_reader.len()`

### New file: `config.rs`

- `TimsFfiConfig` wrapper around `SpectrumReaderConfig`
- Setter methods mapping to individual config fields
- Used by `tims_open_with_config()` to build the `SpectrumReader`

### Changes to `types.rs`

- Add `TimsFfiFrame` repr(C) struct
- Extend `TimsFfiSpectrum` with new fields

### Changes to `lib.rs`

- New FFI exports for all new functions
- `tims_open_with_config()` passes `SpectrumReaderConfig` (including `FrameWindowSplittingConfiguration`) to the builder via `with_config()`. The builder internally resolves converter dependencies during `finalize()`.
- Frame functions delegate to `TimsDataset` methods
- Converter functions delegate to cached converters

### Stub mode (without `with_timsrust`)

All new functions get stub implementations:
- Frame functions return empty frames / zero counts
- Converters return identity (input cast to f64)
- Config functions create/free a dummy struct
- `tims_open_with_config` ignores config, behaves like `tims_open`

## New Function Summary

| Function | Category | Memory |
|---|---|---|
| `tims_get_frame` | Frame: single | Handle-owned |
| `tims_get_frames_by_level` | Frame: batch | Caller-owned (malloc) |
| `tims_free_frame_array` | Frame: cleanup | — |
| `tims_convert_tof_to_mz` | Converter: single | Return value |
| `tims_convert_scan_to_im` | Converter: single | Return value |
| `tims_convert_tof_to_mz_array` | Converter: batch | Caller-provided buffer |
| `tims_convert_scan_to_im_array` | Converter: batch | Caller-provided buffer |
| `tims_config_create` | Config: lifecycle | Returns Box'd |
| `tims_config_free` | Config: lifecycle | — |
| `tims_config_set_*` | Config: setters | — |
| `tims_open_with_config` | Config: open | — |

## Non-Goals

- No new error codes (existing `TimsFfiStatus` values suffice)
- No DIA-specific API (DDA/DIA handled uniformly through SpectrumReaderConfig)
- No thread-safety changes (same single-handle-single-thread model)
- No changes to existing function signatures or behavior
52 changes: 52 additions & 0 deletions examples/cpp_client.cpp
@@ -154,6 +154,58 @@ int main(int argc, char** argv) {
    std::cout << " file_info scan: " << info_ms << " ms (wall)\n";
    std::cout << " (timsrust internal wall_ms: " << info.wall_ms << " ms)\n";

    // ---- Frame-level access demo --------------------------------------------
    std::cout << "\n-- Frame-level access --\n";
    unsigned int total_frames = tims_num_frames(handle);
    if (total_frames > 0) {
        // Single frame: pointers in `frame` refer to handle-owned buffers.
        tims_frame frame{};
        timsffi_status fs = tims_get_frame(handle, 0, &frame);
        if (fs == TIMSFFI_OK) {
            std::cout << "Frame 0: index=" << frame.index
                      << " rt=" << std::fixed << std::setprecision(2) << frame.rt_seconds << "s"
                      << " ms_level=" << (int)frame.ms_level
                      << " scans=" << frame.num_scans
                      << " peaks=" << frame.num_peaks << "\n";
        }

        // Batch: get all MS1 frames (caller-owned array, freed below)
        tims_frame* ms1_frames = nullptr;
        unsigned int ms1_count = 0;
        auto t_ms1 = Clock::now();
        timsffi_status bs = tims_get_frames_by_level(handle, 1, &ms1_count, &ms1_frames);
        double ms1_ms = elapsed_ms(t_ms1);
        if (bs == TIMSFFI_OK) {
            std::cout << "MS1 frames: " << ms1_count
                      << " (fetched in " << std::setprecision(1) << ms1_ms << " ms)\n";
        } else {
            std::cerr << "tims_get_frames_by_level failed\n";
        }
        if (ms1_frames) tims_free_frame_array(handle, ms1_frames, ms1_count);
    }

    // ---- Converter demo -----------------------------------------------------
    std::cout << "\n-- Converters --\n";
    double mz_example = tims_convert_tof_to_mz(handle, 100000);
    double im_example = tims_convert_scan_to_im(handle, 500);
    std::cout << "TOF 100000 -> m/z " << std::setprecision(4) << mz_example << "\n";
    std::cout << "Scan 500 -> IM " << std::setprecision(4) << im_example << "\n";

    // ---- Extended spectrum fields demo --------------------------------------
    std::cout << "\n-- Extended spectrum fields --\n";
    if (tims_num_spectra(handle) > 0) {
        tims_spectrum spec{};
        if (tims_get_spectrum(handle, 0, &spec) == TIMSFFI_OK) {
            std::cout << "Spectrum 0: index=" << spec.index
                      << " ms_level=" << (int)spec.ms_level
                      << " charge=" << (int)spec.charge
                      << " isolation_width=" << std::setprecision(2) << spec.isolation_width
                      << " isolation_mz=" << spec.isolation_mz
                      << " frame_index=" << spec.frame_index
                      << " precursor_intensity=";
            // NaN is the sentinel for "unknown precursor intensity".
            if (std::isnan(spec.precursor_intensity))
                std::cout << "N/A";
            else
                std::cout << std::setprecision(0) << spec.precursor_intensity;
            std::cout << "\n";
        }
    }

    tims_close(handle);
    return 0;
}
84 changes: 83 additions & 1 deletion include/timsrust_cpp_bridge.h
@@ -1,4 +1,4 @@
-/* include/timsffi.h */
+/* include/timsrust_cpp_bridge.h */

#ifndef TIMSFFI_H
#define TIMSFFI_H
@@ -27,6 +27,13 @@ typedef struct {
    const float* mz;
    const float* intensity;
    double im;
    /* Sage-parity fields */
    uint32_t index;             /* spectrum index from SpectrumReader */
    double isolation_width;     /* isolation window width (0.0 if N/A) */
    double isolation_mz;        /* isolation window center m/z (0.0 if N/A) */
    uint8_t charge;             /* precursor charge (0 = unknown) */
    double precursor_intensity; /* precursor intensity (NaN = unknown) */
    uint32_t frame_index;       /* precursor frame index (UINT32_MAX for MS1) */
} tims_spectrum;

typedef struct {
@@ -38,6 +45,17 @@ typedef struct {
    uint8_t is_ms1;
} tims_swath_window;

typedef struct {
    uint32_t index;
    double rt_seconds;
    uint8_t ms_level;             /* 1=MS1, 2=MS2, 0=Unknown */
    uint32_t num_scans;
    uint32_t num_peaks;           /* total peaks (length of tof_indices & intensities) */
    const uint32_t* tof_indices;  /* raw TOF indices, flat array */
    const uint32_t* intensities;  /* raw intensities, flat array */
    const uint64_t* scan_offsets; /* per-scan offsets (length: num_scans + 1) */
} tims_frame;

/* functions: tims_open, tims_close, tims_num_spectra, tims_get_spectrum, ... */
/* Function prototypes (C ABI)
* Note: mz/intensity pointers returned from `tims_get_spectrum` currently
@@ -143,6 +161,70 @@ typedef struct {
*/
timsffi_status tims_file_info(tims_dataset* handle, tims_file_info_t* out);

/* -------------------------------------------------------------------------
* Frame-level access
* ------------------------------------------------------------------------- */

/* Fill out a frame structure for the given index. Returns status code.
* Pointers in the output point to internal buffers owned by the handle;
* valid until the next call to tims_get_frame on the same handle or
* tims_close(). Frame and spectrum buffers are independent.
*/
timsffi_status tims_get_frame(tims_dataset* handle, unsigned int index, tims_frame* out_frame);

/* Retrieve all frames at the given MS level (1 or 2). Returns an
* allocated array in *out_frames and sets *out_count. Caller must free
* with tims_free_frame_array(handle, frames, count). Invalid ms_level
* returns an empty array with TIMSFFI_OK.
*/
timsffi_status tims_get_frames_by_level(tims_dataset* handle, uint8_t ms_level, unsigned int* out_count, tims_frame** out_frames);

/* Free frames previously returned by tims_get_frames_by_level. Frees each
* per-frame tof_indices/intensities/scan_offsets buffer and then the array.
*/
void tims_free_frame_array(tims_dataset* handle, tims_frame* frames, unsigned int count);

/* -------------------------------------------------------------------------
* Index converters (TOF -> m/z, scan -> ion mobility)
* ------------------------------------------------------------------------- */

/* Convert a single TOF index to m/z. Returns NaN if handle is NULL. */
double tims_convert_tof_to_mz(const tims_dataset* handle, uint32_t tof_index);

/* Convert a single scan index to ion mobility (1/K0). Returns NaN if handle is NULL. */
double tims_convert_scan_to_im(const tims_dataset* handle, uint32_t scan_index);

/* Batch convert TOF indices to m/z. Caller provides output buffer. */
timsffi_status tims_convert_tof_to_mz_array(const tims_dataset* handle,
const uint32_t* tof_indices, uint32_t count,
double* out_mz);

/* Batch convert scan indices to ion mobility. Caller provides output buffer. */
timsffi_status tims_convert_scan_to_im_array(const tims_dataset* handle,
const uint32_t* scan_indices, uint32_t count,
double* out_im);

/* -------------------------------------------------------------------------
* Opaque configuration for SpectrumReader construction
* ------------------------------------------------------------------------- */

typedef struct tims_config tims_config;

/* Create a new config with default values. Caller must free with tims_config_free. */
tims_config *tims_config_create(void);

/* Free a config created by tims_config_create. */
void tims_config_free(tims_config *cfg);

/* SpectrumProcessingParams setters */
void tims_config_set_smoothing_window(tims_config *cfg, uint32_t window);
void tims_config_set_centroiding_window(tims_config *cfg, uint32_t window);
void tims_config_set_calibration_tolerance(tims_config *cfg, double tolerance);
void tims_config_set_calibrate(tims_config *cfg, uint8_t enabled); /* 0=off, non-zero=on */

/* Open dataset with custom config. Existing tims_open uses defaults. */
timsffi_status tims_open_with_config(const char* path, const tims_config* cfg, tims_dataset** out);


#ifdef __cplusplus
}