optimize: extend internalConcat dispatch for value-presented sources#2978
Draft
He-Pin wants to merge 3 commits into
Draft
optimize: extend internalConcat dispatch for value-presented sources#2978He-Pin wants to merge 3 commits into
He-Pin wants to merge 3 commits into
Conversation
…sented sources Motivation: FlattenMerge previously only fast-pathed `SingleSource`. For all other inner sources -- `Source.empty`, `Source(List)`, `Source.fromJavaStream`, `Source.future(Future.successful(...))`, range/iterator/repeat sources -- each one paid the cost of materializing a `SubSinkInlet` and `subFusingMaterializer.materialize(...)`. FlattenConcat already does this optimization via `TraversalBuilder.getValuePresentedSource`; FlattenMerge should benefit too, especially because the single-arg `flatMapConcat(f)` internally uses `FlattenMerge(1)` and so depends on FlattenMerge for its hot path. Modification: - Generalize FlattenMerge to dispatch on `getValuePresentedSource` (instead of `getSingleSource`) and consume `SingleSource`, `IterableSource`, `IteratorSource`, `RangeSource`, `RepeatSource`, `JavaStreamSource`, `FutureSource`, `FailedSource`, and empty sources in-place without materialization, mirroring FlattenConcat. - Add an `InflightSource[T]` family inside the new `FlattenMerge` companion to occupy a breadth slot for multi-element value-presented sources. Track them via a new `pendingInflightSources` counter so `activeSources` correctly bounds the breadth budget. - Preserve merge semantics: when an inflight source still has more elements after a push, re-enqueue it so other concurrent sources keep interleaving (instead of draining one source first, which is the concat behaviour). - Fold completed `Future`s and `FailedSource` directly: success pushes or queues a single element, failure calls `failStage`. - Pending `Future`s register a callback via `getAsyncCallback` and occupy a breadth slot until completion. - Empty inner sources are discarded in place (no slot consumed). Result: - `flatMapMerge(breadth, ...)` and the default `flatMapConcat(...)` (which routes through `FlattenMerge(1)`) skip substream materialization for value-presented inner sources, reducing per-source GC and stage overhead. - All existing FlattenMerge / flatMapConcat tests pass; new tests cover empty / single / iterable / range / java stream / completed and delayed future / failed inner sources across breadth = 1..128. - Internal API only (`@InternalApi private[pekko]`); MiMa is clean. References: - The optimization mirrors FlattenConcat's value-presented-source handling introduced in 1.2.0.
…enMerge Motivation: After the previous commit, FlattenMerge grew its own copy of the `InflightSource[T]` hierarchy (Iterator/Range/Repeat/CompletedFuture/ PendingFuture) duplicating what FlattenConcat already had in its companion object. Two near-identical families across two files is a maintenance hazard: any future tweak to the value-presented optimization (e.g. adding a new source type, fixing a Java-stream cleanup leak) would have to be mirrored, and the families had already drifted in small ways (e.g. `tryPull`/`cancel`/`materialize` declared abstract in concat with no-op overrides on every subclass; concat used `isClosed = true` for the completed-future variant while merge used `!_hasNext`). Modification: - Extract the common `InflightSource[T]` base and the five value-presented subclasses (Iterator/Range/Repeat/CompletedFuture/PendingFuture) into a new `pekko.stream.impl.fusing.InflightSources` package-private object. - Promote `tryPull` / `cancel` / `materialize` from abstract to concrete no-op defaults, so the value-presented subclasses no longer carry empty overrides. Stages that wrap a real `SubSinkInlet` (only FlattenConcat's `attachAndMaterializeSource` does this) override what they need. - Align `InflightCompletedFutureSource.isClosed` to FlattenConcat's `true` semantics — behaviorally equivalent in both stages, but more faithful to the source being a one-shot cached value. - Drop the `sealed` modifier on `InflightSource` so FlattenConcat's attached-substream anonymous subclass can still extend it from another file in the same package. - Remove the duplicate definitions from FlattenConcat's companion (now unused, drop the empty companion entirely) and from FlattenMerge's companion. Both stages import from the shared object instead. Result: - Net -176 lines of duplication; one canonical home for the optimization's data types. - Future additions (e.g. extending the optimization to other stream-of- streams stages such as `MergeMany`-style operators) only need to reference `InflightSources`. - All FlattenConcat / FlattenMerge / flatMapConcat parallelism tests remain green; MiMa is clean (`@InternalApi private[fusing]`).
Motivation:
`FlowOps#internalConcat` previously had only one fast-path: `SingleSource`
on the right-hand side was rerouted through the lightweight `SingleConcat`
stage instead of the general two-port `Concat[U](2, detached)` fan-in graph
(which materializes the whole substream plus a detacher buffer). All other
value-presented sources (`IterableSource`, `IteratorSource`, `RangeSource`,
`RepeatSource`, `JavaStreamSource`, `FutureSource`, `FailedSource`) still
took the heavy `concatGraph` path even though their data is already in
memory or trivially producible — the fan-in machinery and substream
materialization were pure overhead. Heavy `concat` users (pekko-http and
others) carry that cost on every materialization.
Modification:
Add four small specialized `GraphStage[FlowShape[E, E]]` siblings of
`SingleConcat`, each passing through elements while upstream is alive and
draining its captured value-presented payload on `onUpstreamFinish`:
- `IterableConcat[E](createIterator)` — emits via `emitMultiple`, covers
`IterableSource`, `IteratorSource`, `RangeSource`, `JavaStreamSource`.
- `RepeatConcat[E](elem)` — swaps `OutHandler` so each `onPull` pushes
`elem`, covers `RepeatSource`.
- `FailedConcat[E](failure)` — calls `failStage(failure)`, covers
`FailedSource`.
- `FutureConcat[E](future)` — emits/fails for completed futures, otherwise
swaps `OutHandler` (to avoid pulling the now-closed `in` port) and
registers an async callback that resolves once the future completes.
`internalConcat` is extended to dispatch via
`TraversalBuilder.getValuePresentedSource` and pattern-match the eight
value-presented source types (existing `SingleSource` path is preserved).
The `detached` flag is irrelevant for these stages — the right-hand data is
already present, so the one-element pre-fetch buffer that `detached=true`
provides has nothing to fetch (matching `SingleConcat`'s precedent).
Result:
For the eight value-presented source types, `concat` and `concatLazy` no
longer pay for substream materialization or the two-port fan-in graph.
Observable behavior is unchanged for all other sources, which still take the
existing `concatGraph` path. Eleven directional tests added to
`AbstractFlowConcatSpec` cover each new dispatch and assert (a) values
delivered correctly and (b) zero substream materialization for value-
presented sources. All `*FlowConcatSpec`, `*FlowConcatLazySpec`,
`*FlowConcatAllSpec`, `*FlowConcatAllLazySpec`, and `*GraphConcatSpec` pass.
MiMa is clean (all new stages are `private[pekko]` / `InternalApi`).
f01178f to
c5a4e58
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Motivation
FlowOps#internalConcatpreviously had only one fast-path: aSingleSourceon the right-hand side ofconcat/concatLazywas rerouted through the lightweightSingleConcatstage instead of the general two-portConcat[U](2, detached)fan-in graph (which materializes the whole substream plus a detacher buffer). All other value-presented sources (IterableSource,IteratorSource,RangeSource,RepeatSource,JavaStreamSource,FutureSource,FailedSource) still took the heavyconcatGraphpath even though their data is already in memory or trivially producible — the fan-in machinery and substream materialization were pure overhead. Heavyconcatusers (pekko-http and others) carry that cost on every materialization.This mirrors the optimization just shipped for
FlattenConcat/FlattenMergein #2977, applied to theconcatoperator chain.Modification
Add four small specialized
GraphStage[FlowShape[E, E]]siblings ofSingleConcat, each passing through elements while upstream is alive and draining its captured value-presented payload ononUpstreamFinish:IterableConcat[E](createIterator)IterableSource,IteratorSource,RangeSource,JavaStreamSourceemitMultiple(out, createIterator(), () => completeStage())RepeatConcat[E](elem)RepeatSourceOutHandlerso eachonPullpusheselemFailedConcat[E](failure)FailedSourcefailStage(failure)FutureConcat[E](future)FutureSourceOutHandlerto a no-op (avoid pulling closedin), register async callback that emits/fails when resolved.internalConcatis extended to dispatch viaTraversalBuilder.getValuePresentedSourceand pattern-match the eight value-presented source types (existingSingleSourcepath is preserved). Thedetachedflag is irrelevant for these stages — the right-hand data is already present, so the one-element pre-fetch buffer thatdetached=trueprovides has nothing to fetch (matchingSingleConcat's precedent).Result
For the eight value-presented source types,
concatandconcatLazyno longer pay for substream materialization or the two-port fan-in graph. Observable behavior is unchanged for all other sources, which still take the existingconcatGraphpath.Tests
Eleven directional tests added to
AbstractFlowConcatSpec(cover bothconcatandconcatLazypaths):optimize iterable concat/range concat/iterator concat/java-stream concat— assertIterableConcatin the traversal builder, output correct.optimize repeat concat— assertRepeatConcat(0), bounded with.take(6).optimize failed concat— assertFailedConcat, error propagated.optimize completed-future concat— confirms it routes throughSingleConcat(upstream optimization).optimize pending-future concat— uses an unresolvedPromise, assertsFutureConcatand correct delivery after resolution.optimize failed-future concat— confirms it routes throughFailedConcat(upstream optimization).avoid downstream substream materialization for value-presented sources— counter-based assertion of zero substream materialization.Verification commands:
All
*ConcatSpec(96 tests) pass. MiMa clean (all new stages areprivate[pekko]/InternalApi). Fullstream-tests/test(3027 tests) passes; the only failure was an unrelatedBoundedSourceQueueSpecflake that does not useconcatand passes in isolation.References
Stacked on #2977.