Skip to content

Cleanup of "Set gpu tbp"#767

Open
TysonRayJones wants to merge 35 commits into
set_gpu_tpbfrom
gpu-tpb-cleanup
Open

Cleanup of "Set gpu tbp"#767
TysonRayJones wants to merge 35 commits into
set_gpu_tpbfrom
gpu-tpb-cleanup

Conversation

@TysonRayJones
Copy link
Copy Markdown
Member

No description provided.

These exemptions for communicator NULL-ness enable an error message to reach the user even then the user has called MPI_Init themselves but then triggered a validation error before the communicator could be set
for consistency with e.g. env.isMultithreaded. The user arg to e.g. initQuESTEnv() is kept as "userOwnsMpi" for a very superficial consistency with e.g. "useMultithreaded" :^)
by just using extern. This is terrible and inadvisable, but offers an easier understanding of the software architecture (and so is easier to fix correctly) than the previous "macros change which signatures this header exposes" design.

Also, we removed the unnecessary avoiding of defining comm_setMpiComm when SUBCOMM was not defined, which made the architecture even more confusing. Now, SUBCOMM only influences the contents of subcommunicator.cpp and subcommunicator.hpp. Simple!
although I suspect this is a poor choice of name. The logic should always be considered "compiled" when MPI is known, and we choose instead whether to "expose" the MPI signature to the users
without triggering an internal error
replacing the original internal error. Note that all of the other MPI functions between comm_config.cpp and comm_subroutines.cpp are unguarded; we should create a macro around them
- added comm_isActive to indicate whether QuEST is using MPI (which is distinct from whether MPI itself is initialised)
- renamed comm_isInit to comm_isMpiInit, since it queries MPI directly/globally, and when true, does not indicate whether QuEST is actually using MPI
- record isMpiUserOwned within comm_config.cpp, since failed-validation must not kill user-owned MPI, and it must know user-ownership before QuESTEnv succeeds/records it (because validation can fail DURING QuESTEnv initialisation)
- explicitly divided (through doc) comm_config.cpp into things which query MPI globally, and thinks which query only QuEST's MPI env/communicator
-
as found by Codex! All hail our new overlords
@TysonRayJones TysonRayJones marked this pull request as draft May 31, 2026 22:34
@TysonRayJones
Copy link
Copy Markdown
Member Author

Above has no changes; just merging (with conflict resolution) of #765, since it contains the new experimental.h

Comment on lines +304 to +308
// why the loops below are explicitly compile-time unrolled. Beware that when
// numThreadsPerBlock is increased from 128, this kernel will still behave
// correctly, but privateCache below will spill over into local memory at a
// performance penalty for NumTargs <= 5, with spillage occurring for fewer
// NumTargs as numThreadsPerBlock increases.
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@otbrown just flagging this!

(existing checks were internal errors, and incomplete)
@TysonRayJones TysonRayJones mentioned this pull request Jun 1, 2026
7 tasks
Comment thread quest/include/environment.h Outdated
Woof that's a lot of boilerplate - but at least we have the safest environment variables in the business! 😅
since it should be added in a separate PR with the other intendedly programmatically-accessible fields. I know in my heart of hearts that if I left isHipCompiled attached, the other fields would never follow hehe
@TysonRayJones TysonRayJones marked this pull request as ready for review June 1, 2026 06:24
@TysonRayJones TysonRayJones requested a review from otbrown June 1, 2026 06:24
@TysonRayJones
Copy link
Copy Markdown
Member Author

TysonRayJones commented Jun 1, 2026

@otbrown Note this diff is polluted when a merge from #765, so that I could make use of experimental.h. I wouldn't bother looking at the diff anyway, it's a doozy! Instead, I flag the main changes:

  • moved setQuESTNumGpuThreadsPerBlock() (and the getter) out of environment.cpp, and into experimental.cpp (in the future, they're appropriate to move to debug.cpp)
  • replaced the cmake QUEST_DEFAULT_NUM_THREADS_PER_BLOCK variable with an environment variable QUEST_DEFAULT_NUM_GPU_THREADS_PER_BLOCK (to be consistent with e.g. QUEST_DEFAULT_VALIDATION_EPSILON)
  • added user-input validation
  • documented setQuESTNumGpuThreadsPerBlock (and the new env-var)
  • replaced the unit tests

Let me know if that's all fine, and I'll do the merging (since I'll need to merge upstream first then solve conflicts)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant