Skip to content

optimize: FlattenMerge avoids substream materialization for value-presented sources#2977

Open
He-Pin wants to merge 2 commits into
apache:mainfrom
He-Pin:optimize-flatten-merge-avoid-materialization
Open

optimize: FlattenMerge avoids substream materialization for value-presented sources#2977
He-Pin wants to merge 2 commits into
apache:mainfrom
He-Pin:optimize-flatten-merge-avoid-materialization

Conversation

@He-Pin
Copy link
Copy Markdown
Member

@He-Pin He-Pin commented May 17, 2026

Motivation

FlattenMerge previously only fast-pathed SingleSource. Every other
inner source — Source.empty, Source(List), Source.fromJavaStream,
Source.future(Future.successful(...)), range / iterator / repeat
sources — paid the cost of materializing a SubSinkInlet plus a full
subFusingMaterializer.materialize(...) round-trip. FlattenConcat
already eliminates this via TraversalBuilder.getValuePresentedSource
(introduced in 1.2.0); FlattenMerge should benefit too.

This is especially important because the single-argument
flatMapConcat(f) is implemented as via(new FlattenMerge(1)), so all
default flatMapConcat users go through FlattenMerge and miss out on
the value-presented optimization until now.

Modification

  • Switch FlattenMerge.addSource from getSingleSource to
    getValuePresentedSource, dispatching on SingleSource,
    IterableSource, IteratorSource, RangeSource, RepeatSource,
    JavaStreamSource, FutureSource, FailedSource, and empty sources
    in place — mirroring FlattenConcat.
  • Introduce an InflightSource[T] family in a new FlattenMerge
    companion to occupy a breadth slot for multi-element value-presented
    sources without materializing a substream. Tracked via a new
    pendingInflightSources counter so activeSources still bounds the
    breadth budget correctly.
  • Preserve merge semantics: after each push from a multi-element
    inflight source, re-enqueue it so other concurrent sources keep
    interleaving (rather than draining one source first as FlattenConcat
    does).
  • Completed Futures / FailedSource are folded directly: success
    pushes or queues a single element; failure calls failStage.
  • Pending Futures register an getAsyncCallback and occupy a slot
    until completion.
  • Empty inner sources are discarded in place and consume no slot.

FlattenMerge is @InternalApi private[pekko] so this is purely an
internal performance change; MiMa is clean.

Result

  • flatMapMerge(breadth, ...) and the default flatMapConcat(...)
    (which routes through FlattenMerge(1)) skip substream materialization
    for value-presented inner sources, cutting per-source GC and stage
    overhead.
  • Behaviour is unchanged for non-value-presented sources (regular
    Sources built from graphs, lazy futures, etc.) — they still
    materialize a SubSinkInlet as before.

Tests

  • All existing FlowFlattenMergeSpec tests pass.
  • All existing FlowFlatMapConcatParallelismSpec tests pass (covers the
    FlattenMerge(1) path used by single-arg flatMapConcat).
  • New tests in FlowFlattenMergeSpec mirror
    FlowFlatMapConcatParallelismSpec:
    • work with value presented sources with breadth: {1,2,4,8,16,32,64,128}
      — covers empty / single / iterable / completed-future / lazy-future /
      delayed-future inner sources.
    • work with generated value presented sources with breadth: ...
      randomised mix of value-presented sources at scale.
    • work with value presented failed sources — failure inside the
      optimized path.
    • avoid pre-materialization for value-presented sources and
      not materialize value-presented sources — assert the fast path
      actually fires.

Local commands run:

  • sbt "stream-tests/testOnly org.apache.pekko.stream.scaladsl.FlowFlattenMergeSpec" → 37/37 pass
  • sbt "stream-tests/testOnly *FlatMap* *Flatten* *PrefixAndTail* *Concat*Spec" → 232/232 pass
  • sbt scalafmtAll headerCreateAll
  • sbt "stream/mimaReportBinaryIssues" → no issues

References

  • Mirrors the value-presented-source optimization added to
    FlattenConcat in 1.2.0 (TraversalBuilder.getValuePresentedSource).

He-Pin added 2 commits May 17, 2026 18:09
…sented sources

Motivation:
FlattenMerge previously only fast-pathed `SingleSource`. For all other
inner sources -- `Source.empty`, `Source(List)`, `Source.fromJavaStream`,
`Source.future(Future.successful(...))`, range/iterator/repeat sources --
each one paid the cost of materializing a `SubSinkInlet` and
`subFusingMaterializer.materialize(...)`. FlattenConcat already does this
optimization via `TraversalBuilder.getValuePresentedSource`; FlattenMerge
should benefit too, especially because the single-arg `flatMapConcat(f)`
internally uses `FlattenMerge(1)` and so depends on FlattenMerge for its
hot path.

Modification:
- Generalize FlattenMerge to dispatch on `getValuePresentedSource`
  (instead of `getSingleSource`) and consume `SingleSource`,
  `IterableSource`, `IteratorSource`, `RangeSource`, `RepeatSource`,
  `JavaStreamSource`, `FutureSource`, `FailedSource`, and empty sources
  in-place without materialization, mirroring FlattenConcat.
- Add an `InflightSource[T]` family inside the new `FlattenMerge`
  companion to occupy a breadth slot for multi-element value-presented
  sources. Track them via a new `pendingInflightSources` counter so
  `activeSources` correctly bounds the breadth budget.
- Preserve merge semantics: when an inflight source still has more
  elements after a push, re-enqueue it so other concurrent sources keep
  interleaving (instead of draining one source first, which is the
  concat behaviour).
- Fold completed `Future`s and `FailedSource` directly: success pushes
  or queues a single element, failure calls `failStage`.
- Pending `Future`s register a callback via `getAsyncCallback` and
  occupy a breadth slot until completion.
- Empty inner sources are discarded in place (no slot consumed).

Result:
- `flatMapMerge(breadth, ...)` and the default `flatMapConcat(...)`
  (which routes through `FlattenMerge(1)`) skip substream materialization
  for value-presented inner sources, reducing per-source GC and stage
  overhead.
- All existing FlattenMerge / flatMapConcat tests pass; new tests cover
  empty / single / iterable / range / java stream / completed and
  delayed future / failed inner sources across breadth = 1..128.
- Internal API only (`@InternalApi private[pekko]`); MiMa is clean.

References:
- The optimization mirrors FlattenConcat's value-presented-source
  handling introduced in 1.2.0.
…enMerge

Motivation:
After the previous commit, FlattenMerge grew its own copy of the
`InflightSource[T]` hierarchy (Iterator/Range/Repeat/CompletedFuture/
PendingFuture) duplicating what FlattenConcat already had in its
companion object. Two near-identical families across two files is a
maintenance hazard: any future tweak to the value-presented optimization
(e.g. adding a new source type, fixing a Java-stream cleanup leak) would
have to be mirrored, and the families had already drifted in small ways
(e.g. `tryPull`/`cancel`/`materialize` declared abstract in concat with
no-op overrides on every subclass; concat used `isClosed = true` for the
completed-future variant while merge used `!_hasNext`).

Modification:
- Extract the common `InflightSource[T]` base and the five value-presented
  subclasses (Iterator/Range/Repeat/CompletedFuture/PendingFuture) into
  a new `pekko.stream.impl.fusing.InflightSources` package-private object.
- Promote `tryPull` / `cancel` / `materialize` from abstract to concrete
  no-op defaults, so the value-presented subclasses no longer carry empty
  overrides. Stages that wrap a real `SubSinkInlet` (only FlattenConcat's
  `attachAndMaterializeSource` does this) override what they need.
- Align `InflightCompletedFutureSource.isClosed` to FlattenConcat's
  `true` semantics — behaviorally equivalent in both stages, but more
  faithful to the source being a one-shot cached value.
- Drop the `sealed` modifier on `InflightSource` so FlattenConcat's
  attached-substream anonymous subclass can still extend it from another
  file in the same package.
- Remove the duplicate definitions from FlattenConcat's companion (now
  unused, drop the empty companion entirely) and from FlattenMerge's
  companion. Both stages import from the shared object instead.

Result:
- Net -176 lines of duplication; one canonical home for the
  optimization's data types.
- Future additions (e.g. extending the optimization to other stream-of-
  streams stages such as `MergeMany`-style operators) only need to
  reference `InflightSources`.
- All FlattenConcat / FlattenMerge / flatMapConcat parallelism tests
  remain green; MiMa is clean (`@InternalApi private[fusing]`).
@He-Pin He-Pin force-pushed the optimize-flatten-merge-avoid-materialization branch from 8f297aa to 81c6aa4 Compare May 17, 2026 11:26
@He-Pin He-Pin added the t:stream Pekko Streams label May 17, 2026
@He-Pin He-Pin added this to the 2.0.0-M3 milestone May 17, 2026
@He-Pin He-Pin requested a review from pjfanning May 17, 2026 11:39
@He-Pin
Copy link
Copy Markdown
Member Author

He-Pin commented May 17, 2026

I will do a follow up optimization around the ++ operation once this got merged.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

Comment on lines +144 to +146
case javaStream: JavaStreamSource[T, _] @unchecked => addInflightIteratorSource(
javaStream.open().iterator.asScala.asInstanceOf[Iterator[T]])
case failed: FailedSource[T] @unchecked => addCompletedFutureElem(Failure(failed.failure))
Comment on lines +375 to +395
s"not materialize value-presented sources, breadth = $breadth" in {
val materializationCounter = new AtomicInteger(0)
// SingleSource / IterableSource / RangeSource / FutureSource etc. are value-presented;
// FlattenMerge should consume them directly without paying for substream materialization.
// We attach a side effect to a non-value-presented wrapper to detect any unwanted materialization.
val n = breadth * 3
val probe = Source(1 to n)
.flatMapMerge(breadth, value => Source.single(value))
.map { v =>
materializationCounter.incrementAndGet()
v
}
.runWith(TestSink())

probe.request(n.toLong)
probe.expectNextN(n.toLong).toSet should ===((1 to n).toSet)
probe.expectComplete()
// The downstream map fires once per emitted element, but each Source.single
// is consumed without spinning up its own substream materialization.
materializationCounter.get() shouldBe n
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

t:stream Pekko Streams

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants