Refactor extraction of tuning policy selectors #8975

Open
bernhardmgruber wants to merge 1 commit into NVIDIA:main from bernhardmgruber:ref_tuning_extraction

@bernhardmgruber
Contributor

I realized that dispatch_with_env_and_tuning is not a good default mechanism for extracting the policy selector and falling back to a default, since it leads to a lot of redundancy in setting up the default policy selector.

In this PR I am providing two alternatives:

  1. For CUB APIs not exposed by CCCL.C, we can pass the tuning environment to the dispatch function and handle the extraction there. This would not work with CCCL.C, which relies on stateful policy selectors.
  2. For CUB APIs exposed by CCCL.C, we provide a single wrapper function around the dispatch function in the device layer.
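
To make the difference between the two alternatives concrete, here is a minimal, standalone C++17 sketch. Every name in it (policy, default_selector, my_tuning, stateful_selector, tuning_env, selector_or, tile_size, dispatch_from_env, and the simplified dispatch and select_tuning_and_dispatch) is a hypothetical stand-in and not the CUB or libcu++ API. It also assumes that the environment names the selector only as a type, which is one plausible reason a stateful selector cannot travel through alternative 1.

```cpp
#include <type_traits>

// Hypothetical stand-ins; none of these names are the real CUB/libcu++ API.
struct policy
{
  int threads_per_block;
  int items_per_thread;
};

// Stateless default selector: maps an architecture id to a policy.
struct default_selector
{
  constexpr policy operator()(int /*arch*/) const { return {256, 4}; }
};

// Stateless user tuning, named by the environment as a type.
struct my_tuning
{
  constexpr policy operator()(int /*arch*/) const { return {128, 16}; }
};

// Stateful selector: the policy lives in a data member, not in the type
// (assumed here to resemble what a C binding has to pass around).
struct stateful_selector
{
  policy p;
  constexpr policy operator()(int /*arch*/) const { return p; }
};

struct empty_env {};

template <class Selector>
struct tuning_env
{
  using policy_selector = Selector;
};

// Selector type named by the environment, or Default if it names none.
template <class Env, class Default, class = void>
struct selector_or
{
  using type = Default;
};
template <class Env, class Default>
struct selector_or<Env, Default, std::void_t<typename Env::policy_selector>>
{
  using type = typename Env::policy_selector;
};

// Stand-in for a kernel launch: returns the tile size the policy implies.
constexpr int tile_size(policy p)
{
  return p.threads_per_block * p.items_per_thread;
}

// Alternative 1: the dispatch function receives the tuning environment and
// extracts the selector itself. The selector is default-constructed from its
// type, so a selector carrying data members cannot be handed in this way.
template <class TuningEnv>
constexpr int dispatch_from_env(TuningEnv, int arch)
{
  using selector_t = typename selector_or<TuningEnv, default_selector>::type;
  return tile_size(selector_t{}(arch));
}

// Alternative 2: the dispatch function takes a selector object ...
template <class Selector>
constexpr int dispatch(Selector selector, int arch)
{
  return tile_size(selector(arch));
}

// ... and one device-layer wrapper does the extraction and defaulting.
template <class TuningEnv>
constexpr int select_tuning_and_dispatch(TuningEnv, int arch)
{
  using selector_t = typename selector_or<TuningEnv, default_selector>::type;
  return dispatch(selector_t{}, arch);
}
```

In this sketch, a caller holding a stateful selector can call dispatch(stateful_selector{{64, 2}}, arch) directly, while environment-based callers go through select_tuning_and_dispatch. With dispatch_from_env there is no entry point that accepts a selector object at all.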

@bernhardmgruber bernhardmgruber requested a review from a team as a code owner May 13, 2026 21:54
@github-project-automation github-project-automation Bot moved this to Todo in CCCL May 13, 2026
@cccl-authenticator-app cccl-authenticator-app Bot moved this from Todo to In Review in CCCL May 13, 2026
coderabbitai Bot commented May 13, 2026

📝 Walkthrough

Summary by CodeRabbit

  • Refactor
    • Optimized internal dispatch mechanisms for adjacent difference, merge, and merge-sort device algorithms to use environment-based policy selection, improving code maintainability and consistency across the library.

Walkthrough

The dispatch mechanisms of three device algorithms (adjacent difference, merge, merge-sort) are refactored from explicit PolicySelector template parameters to environment-based TuningEnvT forwarding. The dispatch layer implementations now derive their policy selectors by querying the tuning environment, and the algorithm overloads replace dispatch_with_env_and_tuning with dispatch_with_env plus direct tuning_env propagation.

Changes

Environment-based dispatch refactoring for adjacent difference, merge, and merge-sort

| Layer / File(s) | Summary |
|---|---|
| Dispatch layer infrastructure refactoring: `cub/cub/device/dispatch/dispatch_adjacent_difference.cuh`, `cub/cub/device/dispatch/dispatch_merge.cuh` | Both dispatch implementations add `cuda/std/__execution/env.h`, replace PolicySelector template parameter with TuningEnvT, and derive `policy_selector_t` from the tuning environment via `__query_result_or_t`. Kernel instantiations updated to use derived `policy_selector_t`. |
| Adjacent difference and merge algorithm integration: `cub/cub/device/device_adjacent_difference.cuh`, `cub/cub/device/device_merge.cuh` | Four adjacent-difference overloads (SubtractLeftCopy, SubtractLeft, SubtractRightCopy, SubtractRight) and two merge overloads (MergeKeys, MergePairs) switch from `dispatch_with_env_and_tuning` to `dispatch_with_env`, capturing `tuning_env` and forwarding it into dispatch implementations. |
| MergeSort algorithm with tuning helper: `cub/cub/device/device_merge_sort.cuh` | New private `select_tuning_and_dispatch` helper queries tuning environment for merge-sort policy and dispatches to `detail::merge_sort::dispatch`. Seven environment overloads (SortPairs, SortPairsCopy, SortKeys, SortKeysCopy, StableSortPairs, StableSortKeys, StableSortKeysCopy) adopt `dispatch_with_env` plus the new helper. |
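
The "query the environment, otherwise use a default" step described in this summary can be illustrated with a small standalone C++17 sketch. The names below (get_policy_selector_t, default_policy_selector, custom_policy_selector, env_without_tuning, env_with_tuning, query_result_or_t, dispatch_items_per_thread) are all hypothetical; the sketch only mimics the general shape of a query-or-default type trait and makes no claim about the actual signature or behavior of the library's internal `__query_result_or_t`.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical query tag; the real query used by the PR is not shown here.
struct get_policy_selector_t {};

struct default_policy_selector
{
  static constexpr int items_per_thread = 4;
};
struct custom_policy_selector
{
  static constexpr int items_per_thread = 11;
};

// An environment answers a query through a query(tag) member function.
struct env_without_tuning {};
struct env_with_tuning
{
  custom_policy_selector query(get_policy_selector_t) const { return {}; }
};

// Result type of env.query(Query{}), or Default when that call is ill-formed.
template <class Env, class Query, class Default, class = void>
struct query_result_or
{
  using type = Default;
};
template <class Env, class Query, class Default>
struct query_result_or<
  Env, Query, Default,
  std::void_t<decltype(std::declval<const Env&>().query(std::declval<Query>()))>>
{
  using type = decltype(std::declval<const Env&>().query(std::declval<Query>()));
};

template <class Env, class Query, class Default>
using query_result_or_t = typename query_result_or<Env, Query, Default>::type;

// A dispatch entry point templated on the environment rather than on the
// selector: the selector type is derived inside, so callers only forward
// the environment they were given.
template <class TuningEnvT>
constexpr int dispatch_items_per_thread()
{
  using policy_selector_t =
    query_result_or_t<TuningEnvT, get_policy_selector_t, default_policy_selector>;
  return policy_selector_t::items_per_thread;
}
```

With these definitions, dispatch_items_per_thread<env_without_tuning>() evaluates to 4 (the default selector) and dispatch_items_per_thread<env_with_tuning>() evaluates to 11 (the selector returned by the query).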

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: dc6a4ef3-3b13-47e5-b4d3-eff83e9bb37b

📥 Commits

Reviewing files that changed from the base of the PR and between 22912a7 and 2e33c08.

📒 Files selected for processing (5)
  • cub/cub/device/device_adjacent_difference.cuh
  • cub/cub/device/device_merge.cuh
  • cub/cub/device/device_merge_sort.cuh
  • cub/cub/device/dispatch/dispatch_adjacent_difference.cuh
  • cub/cub/device/dispatch/dispatch_merge.cuh

Comment thread cub/cub/device/device_merge_sort.cuh
@github-actions
Contributor

😬 CI Workflow Results

🟥 Finished in 2h 21m: Pass: 37%/283 | Total: 4d 08h | Max: 2h 20m | Hits: 37%/354677
