Use the new tuning API internally for detail::select|three_way_partition::dispatch and DevicePartition #8925
Walkthrough
This PR rewires DevicePartition/select dispatch to the new tuning API, replaces manual temp-size/allocation with env-based dispatch and policy_selector functors in benchmarks, updates Thrust partition dispatch, and adds tuning tests validating the tuned execution paths.
Changes
Tuning API Integration for DevicePartition and Select Dispatch
Actionable comments posted: 2
📒 Files selected for processing (6)
cub/benchmarks/bench/partition/flagged.cu
cub/benchmarks/bench/partition/if.cu
cub/benchmarks/bench/partition/three_way.cu
cub/cub/device/device_partition.cuh
cub/test/catch2_test_device_partition_env.cu
thrust/thrust/system/cuda/detail/partition.h
🥳 CI Workflow Results
🟩 Finished in 1h 52m: Pass: 100%/340 | Total: 7d 14h | Max: 1h 51m | Hits: 62%/627542
…ion::dispatch and DevicePartition Fixes: NVIDIA#8879 Fixes: NVIDIA#8380
{
template <typename Derived, typename InputIt, typename StencilIt, typename OutputIt, typename Predicate, typename OffsetT>
struct DispatchPartitionIf
cudaError_t THRUST_RUNTIME_FUNCTION dispatch_partition(
Suggested change:
- cudaError_t THRUST_RUNTIME_FUNCTION dispatch_partition(
+ [[nodiscard]] cudaError_t THRUST_RUNTIME_FUNCTION dispatch_partition(
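For context on the suggestion above, here is a minimal sketch of what `[[nodiscard]]` buys on an error-returning dispatch function. It is plain C++17 so that it compiles without the CUDA toolkit; `status_t`, `dispatch_partition_sketch`, `run_partition_sketch`, and `usage_sketch` are hypothetical names made up for illustration and are not part of Thrust or CUB.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical stand-in for cudaError_t (assumption: not the real CUDA type).
enum class status_t
{
  success       = 0,
  invalid_value = 1
};

// Hypothetical dispatch function. [[nodiscard]] asks the compiler to warn
// whenever a caller drops the returned status on the floor.
[[nodiscard]] inline status_t dispatch_partition_sketch(int num_items)
{
  return num_items < 0 ? status_t::invalid_value : status_t::success;
}

// A caller that consumes the status, so no warning is issued here.
inline bool run_partition_sketch(int num_items)
{
  const status_t status = dispatch_partition_sketch(num_items);
  if (status != status_t::success)
  {
    std::fprintf(stderr, "dispatch failed with status %d\n", static_cast<int>(status));
    return false;
  }
  return true;
}

// Example usage: the result is stored before being asserted on, so the call
// still happens when NDEBUG disables assert.
inline void usage_sketch()
{
  const bool ok = run_partition_sketch(8);
  assert(ok);
  (void) ok;
}
```

Writing `dispatch_partition_sketch(8);` as a bare statement would typically produce an unused-result warning on mainstream compilers, which is the class of mistake the attribute is meant to surface at call sites.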
cub.bench.partition.three_way.base on SM75;80;86;90;100
cub.bench.partition.if.base on SM75;80;86;90;100
cub.bench.partition.flagged.base on SM75;80;86;90;100
Fixes: #8879
Fixes: #8380