[STF] Add per-handle exec_place stream resources#8905
Conversation
Move pooled exec_place streams into an explicit resources registry so STF contexts own their cached streams and avoid stale handles after CUDA context teardown.
|
@coderabbitai full review |
✅ Actions performedFull review triggered. |
📝 WalkthroughSummary by CodeRabbit
suggestion: This PR refactors CUDA stream pool lifecycle from per- suggestion: Stream Pool Registry Refactor
Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Comment |
There was a problem hiding this comment.
Actionable comments posted: 13
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: a7a53086-da15-44b1-9f06-0ee688df1ddc
📒 Files selected for processing (19)
cudax/include/cuda/experimental/__places/exec/cuda_stream.cuhcudax/include/cuda/experimental/__places/exec/green_context.cuhcudax/include/cuda/experimental/__places/exec_place_resources.cuhcudax/include/cuda/experimental/__places/place_partition.cuhcudax/include/cuda/experimental/__places/places.cuhcudax/include/cuda/experimental/__stf/graph/graph_task.cuhcudax/include/cuda/experimental/__stf/internal/async_resources_handle.cuhcudax/include/cuda/experimental/__stf/internal/backend_ctx.cuhcudax/include/cuda/experimental/__stf/internal/stf_places_extended_exports.cuhcudax/include/cuda/experimental/__stf/stream/interfaces/slice.cuhcudax/include/cuda/experimental/__stf/stream/internal/event_types.cuhcudax/include/cuda/experimental/__stf/stream/reduction.cuhcudax/include/cuda/experimental/__stf/stream/stream_ctx.cuhcudax/include/cuda/experimental/__stf/stream/stream_task.cuhcudax/test/places/stream_pool.cucudax/test/stf/CMakeLists.txtcudax/test/stf/cpp/test_pick_stream.cucudax/test/stf/cpp/test_pick_stream_green_context.cudocs/cudax/places.rst
| inline cudaStream_t | ||
| exec_place::pick_stream(::cuda::experimental::stf::async_resources_handle& h, bool for_computation) const | ||
| { | ||
| return pick_stream(h.get_place_resources(), for_computation); | ||
| } |
There was a problem hiding this comment.
critical: Add required API annotation and [[nodiscard]] attribute.
-inline cudaStream_t
-exec_place::pick_stream(::cuda::experimental::stf::async_resources_handle& h, bool for_computation) const
+[[nodiscard]] inline _CCCL_HOST_API cudaStream_t
+exec_place::pick_stream(::cuda::experimental::stf::async_resources_handle& h, bool for_computation) constAs per coding guidelines: All functions must have API annotations, and non-void returns should be [[nodiscard]].
| inline size_t exec_place::stream_pool_size(::cuda::experimental::stf::async_resources_handle& h) const | ||
| { | ||
| return stream_pool_size(h.get_place_resources()); | ||
| } |
There was a problem hiding this comment.
critical: Add required API annotation and [[nodiscard]] attribute.
-inline size_t exec_place::stream_pool_size(::cuda::experimental::stf::async_resources_handle& h) const
+[[nodiscard]] inline _CCCL_HOST_API ::std::size_t exec_place::stream_pool_size(::cuda::experimental::stf::async_resources_handle& h) constAlso fully qualify size_t as ::std::size_t.
As per coding guidelines: All functions must have API annotations, non-void returns should be [[nodiscard]], and standard types must be fully qualified.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| inline size_t exec_place::stream_pool_size(::cuda::experimental::stf::async_resources_handle& h) const | |
| { | |
| return stream_pool_size(h.get_place_resources()); | |
| } | |
| [[nodiscard]] inline _CCCL_HOST_API ::std::size_t exec_place::stream_pool_size(::cuda::experimental::stf::async_resources_handle& h) const | |
| { | |
| return stream_pool_size(h.get_place_resources()); | |
| } |
- Add [[nodiscard]] to non-void returns on exec_place_resources::get/size, the virtual get_stream_pool, async_resources_handle::get_place_resources, and the 5 async_resources_handle& convenience overloads on exec_place. - Mark dstream const in stream_reduction_operator op/init overrides. - Clarify pick_stream() call form in docs/cudax/places.rst.
Add the C entry points needed to share per-place stream pools between standalone place code and STF contexts.
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 1
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 9c5fb806-40f1-4b93-a2ec-4ee222462b9e
📒 Files selected for processing (8)
c/experimental/stf/include/cccl/c/experimental/stf/stf.hc/experimental/stf/src/stf.cuc/experimental/stf/test/test_places.cppcudax/include/cuda/experimental/__places/exec_place_resources.cuhcudax/include/cuda/experimental/__places/places.cuhcudax/include/cuda/experimental/__stf/internal/async_resources_handle.cuhcudax/include/cuda/experimental/__stf/stream/reduction.cuhdocs/cudax/places.rst
✅ Files skipped from review due to trivial changes (1)
- docs/cudax/places.rst
🚧 Files skipped from review as they are similar to previous changes (4)
- cudax/include/cuda/experimental/__stf/stream/reduction.cuh
- cudax/include/cuda/experimental/__stf/internal/async_resources_handle.cuh
- cudax/include/cuda/experimental/__places/exec_place_resources.cuh
- cudax/include/cuda/experimental/__places/places.cuh
Track whether C resource handles own their underlying registry so borrowed context resources can be released without corrupting context teardown.
|
/ok to test 12c5ee9 |
🥳 CI Workflow Results🟩 Finished in 3h 39m: Pass: 100%/59 | Total: 1d 13h | Max: 1h 21m | Hits: 7%/255304See results here. |
Move pooled exec_place streams into an explicit resources registry so STF contexts own their cached streams and avoid stale handles after CUDA context teardown.
Description
closes
Checklist