
[STF] Properly destroy CUDA streams and do not try to initialize CUDA while capturing #8919

Open
caugonnet wants to merge 4 commits into NVIDIA:main from caugonnet:stream-ctx-capture-safe

@caugonnet
Contributor

Destroy pool-owned streams when the stream pool itself is destroyed, and initialize the CUDA runtime only once, so that consecutive stream_ctx instances on the same caller stream serialize without explicit synchronization.
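
The ownership rule in the description can be illustrated with a minimal sketch. It is not the code from this PR: `FakeStream`, `fake_stream_create`, `fake_stream_destroy` and `StreamPool` are hypothetical names, and the fake functions only stand in for `cudaStreamCreate` / `cudaStreamDestroy` so that the example compiles without a GPU. The point is simply that the pool destroys exactly the streams it created, in its destructor.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for cudaStream_t so the sketch compiles without CUDA.
struct FakeStream { int id; };

// Counters standing in for calls to cudaStreamCreate / cudaStreamDestroy.
static int g_created = 0;
static int g_destroyed = 0;

static FakeStream* fake_stream_create() { return new FakeStream{g_created++}; }
static void fake_stream_destroy(FakeStream* s) { ++g_destroyed; delete s; }

// A pool that lazily creates streams and owns them: every stream it created
// is destroyed when the pool itself is destroyed.
class StreamPool {
public:
  explicit StreamPool(std::size_t capacity)
      : slots_(capacity == 0 ? 1 : capacity, nullptr) {}

  // Non-copyable: copying would lead to streams being destroyed twice.
  StreamPool(const StreamPool&) = delete;
  StreamPool& operator=(const StreamPool&) = delete;

  ~StreamPool() {
    for (FakeStream* s : slots_) {
      if (s != nullptr) {
        fake_stream_destroy(s);
      }
    }
  }

  // Round-robin over the slots, creating a stream on first use of a slot.
  FakeStream* next() {
    FakeStream*& slot = slots_[cursor_];
    cursor_ = (cursor_ + 1) % slots_.size();
    if (slot == nullptr) {
      slot = fake_stream_create();
    }
    return slot;
  }

private:
  std::vector<FakeStream*> slots_;
  std::size_t cursor_ = 0;
};
```

In this sketch a stream handed in by the caller would never be stored in `slots_`, so it would not be destroyed by the pool; only pool-created streams are.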

Description

closes

Checklist

  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@caugonnet caugonnet self-assigned this May 12, 2026
@caugonnet caugonnet added the stf Sequential Task Flow programming model label May 12, 2026
@copy-pr-bot
Contributor

copy-pr-bot Bot commented May 12, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@github-project-automation github-project-automation Bot moved this to Todo in CCCL May 12, 2026
@cccl-authenticator-app cccl-authenticator-app Bot moved this from Todo to In Progress in CCCL May 12, 2026
caugonnet added 2 commits May 12, 2026 12:15
Describe the runtime initialization invariant without relying on implementation history.
Skip CUDA runtime initialization when constructing a stream_ctx from an already-capturing user stream; the stream itself implies CUDA is initialized, and normal contexts still initialize before issuing work.
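
The rule in the second commit message might look roughly like the sketch below. This is an illustration under stated assumptions, not the PR's implementation: `CallerStream`, `fake_runtime_init`, `ensure_runtime_initialized` and `prepare_context` are hypothetical names, and the `capturing` flag stands in for what `cudaStreamIsCapturing` would report on a real `cudaStream_t`. Initialization runs at most once per process and is skipped entirely when the caller stream is already capturing.

```cpp
#include <cassert>

// Hypothetical stand-in for a caller-provided stream. `capturing` mimics the
// status that cudaStreamIsCapturing would return for a real stream.
struct CallerStream { bool capturing; };

// Counts how many times the simulated runtime initialization actually ran.
static int g_init_calls = 0;

static void fake_runtime_init() { ++g_init_calls; }

// Runs the initialization at most once per process. A function-local static
// is initialized exactly once, and thread-safely, since C++11.
static void ensure_runtime_initialized() {
  static const bool done = (fake_runtime_init(), true);
  (void)done;
}

// Returns true if this call went through the initialization path, false if it
// was skipped because the caller stream is already capturing. A capturing
// stream implies the runtime is already initialized.
static bool prepare_context(const CallerStream* stream) {
  if (stream != nullptr && stream->capturing) {
    return false;
  }
  ensure_runtime_initialized();
  return true;
}
```

Contexts built without a caller stream, or on a stream that is not capturing, still take the initialization path before issuing work, which matches the behavior described in the commit message.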
@caugonnet caugonnet changed the title [STF] Chain back-to-back stream contexts [STF] Properly destroy CUDA streams and do not try to initialize CUDA while capturing May 12, 2026
@caugonnet caugonnet marked this pull request as ready for review May 12, 2026 11:38
@caugonnet caugonnet requested review from a team as code owners May 12, 2026 11:38
@caugonnet caugonnet requested review from alliepiper and andralex May 12, 2026 11:38
@cccl-authenticator-app cccl-authenticator-app Bot moved this from In Progress to In Review in CCCL May 12, 2026
@caugonnet
Contributor Author

/ok to test 7059d3b

@github-actions
Contributor

🥳 CI Workflow Results

🟩 Finished in 1h 21m: Pass: 100%/55 | Total: 1d 13h | Max: 1h 21m | Hits: 5%/306648

See results here.

Labels

stf Sequential Task Flow programming model

Projects

Status: In Review

1 participant