[AIMIGRAPHX-885][AIMIGRAPHX-987] Use External Stream Contexts#4775
[AIMIGRAPHX-885][AIMIGRAPHX-987] Use External Stream Contexts#4775TedThemistokleous wants to merge 38 commits into
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## develop #4775 +/- ##
===========================================
- Coverage 92.88% 92.65% -0.23%
===========================================
Files 587 587
Lines 30348 30356 +8
===========================================
- Hits 28187 28124 -63
- Misses 2161 2232 +71
🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Pull request overview
Enables MIGraphX GPU async execution to run directly on a caller-provided HIP stream (external stream contexts) to reduce internal synchronization/stalls, and adds GPU tests to validate external-stream behavior and fallback behavior.
Changes:
- Add external-stream support in
gpu::context/hip_device::stream(override stream used by the context during async eval). - Adjust async synchronization logic (
wait_for/finish_on) to avoid creating/using an extra internal stream when an external stream is provided. - Add a comprehensive new GPU test suite covering external stream override, async eval behavior, and fallback paths.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
src/targets/gpu/include/migraphx/gpu/context.hpp |
Adds external stream override plumbing and modifies async sync behavior to use caller stream. |
test/gpu/external_stream.cpp |
Adds new GPU tests for external stream override, async eval correctness, and state cleanup expectations. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
bdevorem
left a comment
There was a problem hiding this comment.
couple questions, thanks Ted
d2cfb6b to
563c78b
Compare
bdevorem
left a comment
There was a problem hiding this comment.
I think a rebase/merge will solve the CI problems. lgtm otherwise
pfultz2
left a comment
There was a problem hiding this comment.
I dont think we should do this in the wait_for and finish_on functions as it could change the semantics of the function. Instead we should add a use_queue method to the context interface and use that directly.
Sure rebased this off develop |
7b92f18 to
a164acb
Compare
Okay let me add this. This is similar and just do the create/set in the use_queue or use_external() thread?
So is the idea then run_async() -> bind the stream if its not the null/default stream? otherwise we just create an internal stream for regular run()? |
Regressions detected 🔴 * No develop baseline was found for this PR's branch point; compared against the latest available develop run instead. |
|
|
Added changes based on Paul's comments so that we odn't modify wait_for, finish_on and just use a use_queue and set_queue_context. |
CharlieL7
left a comment
There was a problem hiding this comment.
There's the tools/include/context.hpp file that looks like the file that generates the type erasure interface. The use_queue interface should probably be generated from there.
Can easily check has_value() as nullptr means default stream here.
|
is the codecov report up to date? might want to handle that before merging too |
…tr and non null stream
…sues with typing.
| // save time?" (which may itself be empty, meaning no binding). A | ||
| // single-level optional cannot distinguish "no save" from "saved | ||
| // empty binding" now that external_stream is itself an optional. | ||
| std::optional<std::optional<hipStream_t>> previous_stream{}; |
There was a problem hiding this comment.
Remove previous_stream. The previous stream is always s.
Avoid repeat in setting raw stream handles for rocblas/miopen
…ce we've removed set_external_stream. Use set_queue instead
8c6d7c3 to
f0a9151
Compare
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Motivation
Customer workload seeing some stalls during inference. This allows us to use the customer hipSteam passed to context via run_async so that we don't need to internally sync and manage a thread within MIGraphX. This allows the synchronization to be handled externally.
As an added benefit if not external thread is used we should fall back to the old fork_join run on the GPU where we internally create a stream to sync events onto.
Technical Details
Adds additional conditions to the wait_for , finish_on calls in context.cpp such that we avoid new stream creation for async runs while also simplifying much of the code.
Test cases have been added for this to ensure we don't break existing functionality.
Additional code added to ensure we set external libraries like BLAS and MIOPEN to use the default stream on clear
Changelog Category
Add a
CHANGELOG.mdentry for any option other thanNot Applicable