[AURON #1840] Preserve collect_set first-occurrence order#2285
[AURON #1840] Preserve collect_set first-occurrence order#2285peter941221 wants to merge 5 commits into
Conversation
46ef225 to
f092709
Compare
|
Rebased this onto current master and kept the fix minimal. The change still just removes the size-based swap in AccSet::merge and adds order assertions for the merge cases. New CI is running on f092709. |
|
do we really need to keep this order? if i dont understand wrong, this order we be discarded in bucket-merge spill strategy even if we maintain it in updating |
|
@richox thanks for calling out the spill path. I checked that path and added a focused regression for it. The spill bucket merge still comes back through the same I also checked the spill round-trip itself. To make that concrete, I pushed So I think we still need this fix for the spill path too. |
There was a problem hiding this comment.
Pull request overview
This PR aims to match Spark’s collect_set no-shuffle behavior by preserving first-occurrence ordering during accumulator merges, avoiding RHS-first ordering caused by size-based swapping.
Changes:
- Removed
AccSet::merge’s size-based accumulator swap to preserve encounter order. - Added regression tests to validate first-occurrence order, including a spill/unspill merge path.
- Added a source-compatible overload for
NativeHelper.getDefaultNativeMetricsto support an in-progress keyed migration.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| spark-extension/src/main/scala/org/apache/spark/sql/auron/NativeHelper.scala | Adds a compatibility overload delegating to the keyed metrics API. |
| native-engine/datafusion-ext-plans/src/agg/collect.rs | Adjusts AccSet::merge behavior and expands tests to cover ordering expectations. |
Comments suppressed due to low confidence (1)
native-engine/datafusion-ext-plans/src/agg/collect.rs:567
AccSet::mergedrainsother.setbut leavesother.list.rawintact. This breaksAccSetinvariants (list contains values not reflected in the set), can cause merged-away values to be serialized/spilled later viaAccSetColumn::save_raw, and makesAccSetColumn::mem_usedaccounting incorrect becausemerge_itemssubtractsother_value_mem_sizeeven though the underlying list memory is still retained. Also, forInternalSet::Huge, iterating the hash table does not preserve first-occurrence order; sorting by the stored position offset restores encounter order for large sets.
pub fn merge(&mut self, other: &mut Self) {
for pos_len in std::mem::take(&mut other.set).into_iter() {
self.append_raw(other.list.ref_raw(pos_len));
}
}
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
What changed
AccSet::mergeno longer swaps accumulators by set size. That swap changed the encounter order ofcollect_set, so no-shuffle Spark checks could see values inrhs-firstorder instead of first-occurrence order.Why
Spark's
collect_setpreserves first-occurrence order in the no-shuffle path used by the affected aggregate suite.Testing
git diff --checkcargo +nightly test --manifest-path native-engine/datafusion-ext-plans/Cargo.toml test_acc_set_merge -- --nocapture(blocked here byrdkafka-sysWindows build failure:%1 is not a valid Win32 application. (os error 193))