Restore resource_cast memory limit check #1231

Open
bdice wants to merge 2 commits into NVIDIA:release/26.06 from bdice:restore-resource-cast-memory-limit

Conversation

bdice (Contributor) commented May 17, 2026

Description

Restore the memory-limit-aware branch in get_device_memory_size() that was temporarily removed during the CCCL/RMM memory-resource migration in #1035.

The function now uses cuda::mr::resource_cast on rmm::mr::get_current_device_resource_ref() to detect an active rmm::mr::limiting_resource_adaptor. When present, it reports the limit/usage/free values and returns the smaller of CUDA total memory and the adaptor allocation limit, matching the previous behavior.
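As a rough illustration of the branch described above, the selection logic reduces to a min over two values when a limit is present. The sketch below is plain C++ with a hypothetical `memory_query` struct standing in for what the real function reads from `cudaMemGetInfo` and from the adaptor. `effective_device_memory_size` and the struct fields are assumed names for illustration, not identifiers from cuda_helpers.cuh, and no RMM or CUDA calls are made.

```cpp
#include <algorithm>
#include <cstddef>
#include <optional>

// Hypothetical stand-in for the values the real function obtains from CUDA
// and from the limiting adaptor; this is not the cuOpt code itself.
struct memory_query {
  std::size_t cuda_total_bytes;                 // total device memory reported by CUDA
  std::optional<std::size_t> allocation_limit;  // set only when a limiting adaptor is active
};

// Returns the smaller of the CUDA total and the adaptor limit when a limit is
// present; otherwise returns the CUDA total unchanged.
inline std::size_t effective_device_memory_size(const memory_query& q)
{
  if (q.allocation_limit.has_value()) {
    return std::min(q.cuda_total_bytes, *q.allocation_limit);
  }
  return q.cuda_total_bytes;
}
```

With a 1000-byte device and a 400-byte limit this yields 400; with no limit, or with a limit above the CUDA total, it yields the CUDA total.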

bdice requested a review from a team as a code owner May 17, 2026 16:23
bdice requested review from Kh4ster and rg20 May 17, 2026 16:23
coderabbitai Bot commented May 17, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: ffca1565-d87f-435f-b8c7-6c9de3825c98

📥 Commits

Reviewing files that changed from the base of the PR and between 7d66043 and aa2d665.

📒 Files selected for processing (1)
  • cpp/src/utilities/cuda_helpers.cuh

📝 Walkthrough

This PR updates the CUDA helpers header to include RMM/RAFT memory-resource headers and changes get_device_memory_size() to consult the current RMM device resource, cast to a limiting_resource_adaptor if available, and return min(cuda-reported total_mem, adaptor allocation_limit); otherwise it returns total_mem.

Changes

Device memory reporting with RMM limiting adaptors

| Layer / File(s) | Summary |
| --- | --- |
| RMM resource headers and dependencies (cpp/src/utilities/cuda_helpers.cuh) | CUDA memory-resource, RAFT utilities, and RMM limiting/per-device adaptor headers are added to enable limiting adaptor detection and casting. |
| Device memory size query with limiting adaptor detection (cpp/src/utilities/cuda_helpers.cuh) | get_device_memory_size() now reads CUDA total memory via cudaMemGetInfo, inspects the current RMM device resource, and attempts to cast it to rmm::mr::limiting_resource_adaptor; if the cast succeeds it returns min(total_mem, allocation_limit), otherwise it returns total_mem. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately summarizes the main change: restoring a resource_cast memory limit check in get_device_memory_size(). |
| Description check | ✅ Passed | The description is directly related to the changeset, explaining the restoration of the memory-limit-aware branch and how resource_cast is used. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
cpp/src/utilities/cuda_helpers.cuh (1)

254-259: ⚡ Quick win

Gate debug prints behind a debug flag.

Unconditional printf in get_device_memory_size() can spam stdout and add sync overhead when this helper is called frequently. Consider guarding these lines with a compile-time debug flag.

Suggested patch
-    printf("limiting_adaptor->get_allocation_limit(): %fMiB\n",
-           limiting_adaptor->get_allocation_limit() / (double)1e6);
-    printf("used_mem: %fMiB\n", limiting_adaptor->get_allocated_bytes() / (double)1e6);
-    printf("free_mem: %fMiB\n",
-           (limiting_adaptor->get_allocation_limit() - limiting_adaptor->get_allocated_bytes()) /
-             (double)1e6);
+#ifdef CUOPT_ENABLE_RESOURCE_LIMIT_DEBUG
+    printf("limiting_adaptor->get_allocation_limit(): %fMiB\n",
+           limiting_adaptor->get_allocation_limit() / (double)1e6);
+    printf("used_mem: %fMiB\n", limiting_adaptor->get_allocated_bytes() / (double)1e6);
+    printf("free_mem: %fMiB\n",
+           (limiting_adaptor->get_allocation_limit() - limiting_adaptor->get_allocated_bytes()) /
+             (double)1e6);
+#endif
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cpp/src/utilities/cuda_helpers.cuh` around lines 254 - 259, The three
unconditional printf calls inside get_device_memory_size() (printing
limiting_adaptor->get_allocation_limit(), get_allocated_bytes(), and free
memory) should be wrapped in a compile-time debug macro so they don't run in
production; add a macro guard (e.g., CUDA_HELPERS_DEBUG or similar) around those
prints in cpp/src/utilities/cuda_helpers.cuh and only emit the diagnostics when
the macro is defined, keeping the rest of get_device_memory_size() unchanged.
Ensure the macro name is documented in the header comment so callers can enable
it at build time.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@cpp/src/utilities/cuda_helpers.cuh`:
- Around line 254-259: The printed memory diagnostics use the label "MiB" but
divide by 1e6 (decimal MB); update the three printf calls that reference
limiting_adaptor->get_allocation_limit() and
limiting_adaptor->get_allocated_bytes() to divide by 1024.0 * 1024.0 (or use a
named constant like BYTES_PER_MIB) so the units correctly reflect mebibytes, and
keep the "MiB" labels unchanged.

---

Nitpick comments:
In `@cpp/src/utilities/cuda_helpers.cuh`:
- Around line 254-259: The three unconditional printf calls inside
get_device_memory_size() (printing limiting_adaptor->get_allocation_limit(),
get_allocated_bytes(), and free memory) should be wrapped in a compile-time
debug macro so they don't run in production; add a macro guard (e.g.,
CUDA_HELPERS_DEBUG or similar) around those prints in
cpp/src/utilities/cuda_helpers.cuh and only emit the diagnostics when the macro
is defined, keeping the rest of get_device_memory_size() unchanged. Ensure the
macro name is documented in the header comment so callers can enable it at build
time.
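The unit mismatch flagged in the inline comment above (a "MiB" label paired with division by 1e6) can be illustrated with a small sketch. The helper names `to_mib` and `to_mb` and the two constants are hypothetical, chosen here for illustration; they are not taken from cuda_helpers.cuh.

```cpp
#include <cstddef>

// Bytes per mebibyte (2^20) and per decimal megabyte (10^6).
constexpr double bytes_per_mib = 1024.0 * 1024.0;
constexpr double bytes_per_mb  = 1e6;

// Converts a byte count to mebibytes (the unit the "MiB" label promises).
constexpr double to_mib(std::size_t bytes)
{
  return static_cast<double>(bytes) / bytes_per_mib;
}

// Converts a byte count to decimal megabytes (what dividing by 1e6 computes).
constexpr double to_mb(std::size_t bytes)
{
  return static_cast<double>(bytes) / bytes_per_mb;
}
```

For 1,000,000 bytes, `to_mb` gives 1.0 while `to_mib` gives roughly 0.954, so printing the former under a "MiB" label overstates the value by about 4.9%.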
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 430867ae-8a10-47b2-948f-2993d59e032c

📥 Commits

Reviewing files that changed from the base of the PR and between 3e60a3a and 7d66043.

📒 Files selected for processing (1)
  • cpp/src/utilities/cuda_helpers.cuh

Comment thread cpp/src/utilities/cuda_helpers.cuh Outdated
bdice self-assigned this May 17, 2026
bdice added the non-breaking (Introduces a non-breaking change) and improvement (Improves an existing functionality) labels May 17, 2026
anandhkb added this to the 26.06 milestone May 18, 2026
anandhkb (Contributor) commented:

Requesting @rg20's review.

anandhkb added the P0 label May 19, 2026

rg20 (Contributor) left a comment

@bdice is this still needed?

Comment thread cpp/src/utilities/cuda_helpers.cuh Outdated
auto res = rmm::mr::get_current_device_resource_ref();
auto limiting_adaptor = cuda::mr::resource_cast<rmm::mr::limiting_resource_adaptor>(&res);
if (limiting_adaptor) {
printf("limiting_adaptor->get_allocation_limit(): %fMiB\n",

Reviewer (Contributor):

Remove the printf

Author (Contributor):

This is just restoring previous code. I can remove it, just noting that I didn't originally add it.

Author (Contributor):

Completed in aa2d665.

rgsl888prabhu changed the base branch from main to release/26.06 May 20, 2026 17:30
rg20 modified the milestones: 26.06, 26.08 May 26, 2026
bdice requested a review from rg20 May 28, 2026 15:05
coderabbitai Bot commented May 28, 2026

Actionable comments posted: 0

github-actions Bot commented Jun 5, 2026

🔔 Hi @anandhkb @bdice, this pull request has had no activity for 7 days. Please update or let us know if it can be closed. Thank you!

If this is an "epic" issue, then please add the "epic" label to this issue.
If it is a PR and not ready for review, then please convert this to draft.
If you just want to switch off this notification, then use the "skip inactivity reminder" label.
