Skip to content

Add env DeviceTopK without temp_storage args#8982

Open
gonidelis wants to merge 6 commits into
NVIDIA:mainfrom
gonidelis:env_topk
Open

Add env DeviceTopK without temp_storage args#8982
gonidelis wants to merge 6 commits into
NVIDIA:mainfrom
gonidelis:env_topk

Conversation

@gonidelis
Copy link
Copy Markdown
Member

fixes #8969

@copy-pr-bot
Copy link
Copy Markdown
Contributor

copy-pr-bot Bot commented May 14, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@cccl-authenticator-app cccl-authenticator-app Bot moved this from Todo to In Progress in CCCL May 14, 2026
@gonidelis gonidelis marked this pull request as ready for review May 14, 2026 05:56
@gonidelis gonidelis requested a review from a team as a code owner May 14, 2026 05:56
@cccl-authenticator-app cccl-authenticator-app Bot moved this from In Progress to In Review in CCCL May 14, 2026
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 14, 2026

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 2e0f0a05-822e-4ee2-b14d-af5ea86709d4

📥 Commits

Reviewing files that changed from the base of the PR and between 1637e52 and 776655d.

📒 Files selected for processing (1)
  • cub/test/catch2_test_device_topk_env.cu
🚧 Files skipped from review as they are similar to previous changes (1)
  • cub/test/catch2_test_device_topk_env.cu

📝 Walkthrough

Summary by CodeRabbit

  • New Features

    • Added environment-based DeviceTopK overloads for Max/Min Keys and Pairs that manage temporary storage internally and accept an execution environment (e.g., stream, determinism, ordering).
  • Tests

    • Added unit tests validating env-based TopK behavior for built-in and custom key types (with decomposer), including stream-based execution and edge cases; host-side sorting used to verify correctness.

important:

Walkthrough

Added env-dispatched DeviceTopK overloads (Max/Min Keys and Pairs, each with decomposer and non-decomposer variants) that allocate temporary storage via the execution environment, and added tests validating these overloads with built-in and custom types.

Changes

Environment-Based TopK API

Layer / File(s) Summary
Environment dispatch integration
cub/cub/device/device_topk.cuh
Added #include for cub/detail/env_dispatch.cuh to enable env-dispatched temporary-storage allocation.
MaxPairs environment overloads
cub/cub/device/device_topk.cuh
Two env-based DeviceTopK::MaxPairs variants (non-decomposer and decomposer) that internally allocate temp storage via detail::dispatch_with_env and forward to detail::dispatch_topk with topk::select::max.
MinPairs environment overloads
cub/cub/device/device_topk.cuh
Two env-based DeviceTopK::MinPairs variants (non-decomposer and decomposer) that allocate temp storage via detail::dispatch_with_env, include API docs updates, and forward to detail::dispatch_topk with topk::select::min.
MaxKeys environment overloads
cub/cub/device/device_topk.cuh
Two env-based DeviceTopK::MaxKeys variants (non-decomposer and decomposer) for key-only path; allocate temp storage via detail::dispatch_with_env and call detail::dispatch_topk with topk::select::max.
MinKeys environment overloads
cub/cub/device/device_topk.cuh
Two env-based DeviceTopK::MinKeys variants (non-decomposer and decomposer) for key-only path; allocate temp storage via detail::dispatch_with_env and call detail::dispatch_topk with topk::select::min.
Environment-based API test coverage
cub/test/catch2_test_device_topk_env_api.cu
New test TU with topk_custom_t and topk_custom_decomposer_t; six Catch2 tests exercising Max/Min Keys and Pairs using env + cuda::stream_ref for built-in and decomposer inputs.
Existing test suite expansion
cub/test/catch2_test_device_topk_env.cu
Added thrust/host_vector and <algorithm>, sorted_top_k helper, multiple env-based tests validating outputs against host-sorted expectations and edge case where K == num_items.

Assessment against linked issues

Objective Addressed Explanation
Add env-overloads for MaxPairs, MinPairs, MaxKeys, MinKeys that handle memory allocation using memory resource from environment [#8969]
Eliminate requirement for caller-managed d_temp_storage and temp_storage_bytes parameters in environment-based API [#8969]
Provide test coverage for new environment-based overloads with standard and custom types [#8969]

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
cub/test/catch2_test_device_topk_env_api.cu (1)

4-4: 💤 Low value

suggestion: The include insert_nested_NVTX_range_guard.h is not used in this file. Consider removing it.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 0a4f69d0-88d3-41d1-bc72-f0a079b4fc3e

📥 Commits

Reviewing files that changed from the base of the PR and between 5d79bc2 and 1cd8c3d.

📒 Files selected for processing (3)
  • cub/cub/device/device_topk.cuh
  • cub/test/catch2_test_device_topk_api.cu
  • cub/test/catch2_test_device_topk_env_api.cu

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 1c53e6de-7677-45e1-9484-0d81da4be3ae

📥 Commits

Reviewing files that changed from the base of the PR and between 1cd8c3d and 1637e52.

📒 Files selected for processing (2)
  • cub/cub/device/device_topk.cuh
  • cub/test/catch2_test_device_topk_env.cu
🚧 Files skipped from review as they are similar to previous changes (1)
  • cub/cub/device/device_topk.cuh

Comment thread cub/test/catch2_test_device_topk_env.cu
@github-actions
Copy link
Copy Markdown
Contributor

🥳 CI Workflow Results

🟩 Finished in 1h 24m: Pass: 100%/283 | Total: 2d 21h | Max: 1h 23m | Hits: 92%/214138

See results here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: In Review

Development

Successfully merging this pull request may close these issues.

cub::DeviceTopK does not have env-overloads handling memory allocation

1 participant