[MRG] Fixes #441: expose PAA segment_indices() #671
Open
jbbqqf wants to merge 1 commit into
Expose the start/end indices of each PAA segment so callers can map a PAA value back to the original-time-series range it summarises. The indices match the slicing convention already used in ``PiecewiseAggregateApproximation._transform`` (segment width ``sz_fit // n_segments`` with trailing samples dropped), so ``paa_data[i_seg]`` equals ``ts[start_i:end_i].mean()`` by construction. A regression test in ``tests/test_piecewise.py`` checks that contract on both a divisible and a non-divisible series length.
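The slicing convention described above can be sketched in a few lines of plain Python. This is a standalone illustration, not tslearn's code: the helper name `paa_segment_bounds` is hypothetical, and lists stand in for arrays so the snippet needs nothing beyond the standard library.

```python
from statistics import fmean


def paa_segment_bounds(sz_fit, n_segments):
    """Return half-open (start, end) pairs, one per PAA segment.

    Hypothetical helper: mirrors the convention stated in the PR text
    (segment width sz_fit // n_segments, trailing samples dropped).
    """
    sz_segment = sz_fit // n_segments
    return [(i * sz_segment, (i + 1) * sz_segment) for i in range(n_segments)]


# Divisible case: 6 samples, 3 segments of width 2.
ts = [-1, 2, 0.1, -1, 1, -1]
bounds = paa_segment_bounds(len(ts), 3)

# Each PAA value is the mean of the slice its (start, end) pair selects.
paa_values = [fmean(ts[start:end]) for start, end in bounds]

# Non-divisible case: 7 samples still give width 2, so index 6 is dropped.
bounds_non_divisible = paa_segment_bounds(7, 3)
```

Under this convention both the 6-sample and the 7-sample series produce the same three intervals; the seventh sample simply falls outside every segment.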
Codecov Report
✅ All modified and coverable lines are covered by tests.

Coverage Diff

| | main | #671 | +/- |
|---|---|---|---|
| Coverage | 93.70% | 93.72% | +0.01% |
| Files | 73 | 73 | |
| Lines | 6986 | 7008 | +22 |
| Hits | 6546 | 6568 | +22 |
| Misses | 440 | 440 | |
Contributor
I believe #441 was about retrieving the indices induced by a particular dataset rather than the fitted one. In the case I'm wrong, and indices for the fitted dataset are required, an attribute with sklearn syntax (something like `indices_`) is probably a better fit.
Summary
Adds `PiecewiseAggregateApproximation.segment_indices()`, a small accessor that returns the `(start, end)` indices each PAA segment summarises in the original-length time series. This lets users locate "where the PAA value changes", the question raised in #441, without re-deriving the segment-width logic from `_transform`.

Fixes #441 (Index of changes in PAA and SAX).
Context
Issue #441 asks for a way to map a `paa_data[i]` value back to the original-series index range it averages. The information already exists implicitly inside `_transform` (`sz_segment = sz_fit // n_segments`; segment `i` covers `ts[i*sz_segment : (i+1)*sz_segment]`), but callers had to re-derive it. Exposing it as a method:

- […] `sz_fit // n_segments` formula),
- […] `paa_data[i] == ts[start_i:end_i].mean()` invariant testable,
- […] `transform`/`inverse_transform`.

Changes

- `tslearn/piecewise/piecewise.py`: new `segment_indices()` method on `PiecewiseAggregateApproximation`. Returns `np.ndarray` of shape `(n_segments, 2)` with half-open intervals `[start, end)` matching `_transform`. Raises `NotFittedError` before `fit`. A short inline comment notes that the segment-width formula must stay in sync with `_transform` so that `paa_data[i_seg] == ts[start_i:end_i].mean()` holds.
- `tests/test_piecewise.py`: regression test `test_paa_segment_indices` that:
  - […] `NotFittedError` before fitting,
  - […] `paa_data` match `ts[start:end].mean()` cell-by-cell,
  - […]
- `CHANGELOG.md`: entry under `[Towards v0.9.0] / Added`.

The new method is purely additive: no existing PAA / SAX behaviour changes.
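To make the contract listed above concrete (half-open pairs, an error before fitting), here is a minimal sketch. It is not the PR's implementation: `TinyPAA` is a hypothetical stand-in that only remembers the fitted length, `NotFittedError` is redefined locally instead of being imported from scikit-learn, and a list of pairs stands in for the `(n_segments, 2)` array.

```python
class NotFittedError(ValueError):
    """Local stand-in for scikit-learn's exception (assumption: keeps the sketch import-free)."""


class TinyPAA:
    """Hypothetical stand-in for the estimator; it only tracks the fitted length."""

    def __init__(self, n_segments):
        self.n_segments = n_segments
        self._sz_fit = None  # set by fit()

    def fit(self, series):
        self._sz_fit = len(series)
        return self

    def segment_indices(self):
        if self._sz_fit is None:
            raise NotFittedError("call fit() before segment_indices()")
        # Must stay in sync with the transform's slicing:
        # segment width is sz_fit // n_segments, trailing samples are dropped.
        width = self._sz_fit // self.n_segments
        return [[i * width, (i + 1) * width] for i in range(self.n_segments)]


paa = TinyPAA(n_segments=3)

# Before fitting, the accessor refuses to answer.
try:
    paa.segment_indices()
    raised_before_fit = False
except NotFittedError:
    raised_before_fit = True

# After fitting on 7 samples, three half-open pairs of width 2 come back.
indices = paa.fit([0.0] * 7).segment_indices()
```

Because the result is derived from the fitted length on each call, nothing extra has to be stored on the object in this sketch.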
Reproduce BEFORE/AFTER yourself (copy-paste)
What I ran locally
- `pytest tests/test_piecewise.py -v` → 5 passed (the existing 4 PAA/SAX tests + the new one).
- `pytest --doctest-modules tslearn/piecewise/piecewise.py` → 7 passed (covers the new docstring example).

Edge cases tested

| Case | Expected | Covered by |
|---|---|---|
| `PiecewiseAggregateApproximation(n_segments=3).segment_indices()` | `NotFittedError` | `test_paa_segment_indices` |
| `sz=6, n_segments=3` on `[-1, 2, 0.1, -1, 1, -1]` | `[[0,2],[2,4],[4,6]]` and `paa_data[i] == ts[start:end].mean()` | `test_paa_segment_indices` (loop assertion) |
| `sz=7, n_segments=3` | `[[0,2],[2,4],[4,6]]` (trailing sample dropped, matching `_transform`) | `test_paa_segment_indices` (non-divisible block) |

Risk / blast radius
Additive only: a new method on an existing class. No changes to `fit`, `transform`, `inverse_transform`, `distance`, or `distance_paa`. Cannot break existing callers; cannot affect serialisation (no new attribute is stored on the estimator).

Release note
PR drafted with assistance from Claude Code. The change was reviewed manually against tslearn-team/tslearn's source (`tslearn/piecewise/piecewise.py:147`, `_transform`) and the reproducer block above was used during development; it is the same one a reviewer can paste verbatim.