Skip to content

Enabling GPU direct communication for Cray-MPICH#724

Open
JPRichings wants to merge 6 commits into
QuEST-Kit:develfrom
JPRichings:devel
Open

Enabling GPU direct communication for Cray-MPICH#724
JPRichings wants to merge 6 commits into
QuEST-Kit:develfrom
JPRichings:devel

Conversation

@JPRichings
Copy link
Copy Markdown
Contributor

This needs testing but here is an initial suggestion for checking support for GPU enabled MPI in CRAY_MPICH this should enable direct inter GPU comms for a wider range of systems in the HPC space that don't always support OpenMPI for there scale out network.

Plan to test on MI210 nodes on ARCHER2 when they are available.

@JPRichings
Copy link
Copy Markdown
Contributor Author

JPRichings commented Apr 25, 2026

Should have updated this before:

Hardware: 1 node 4 gpus Mi210 ARCHER2

before (no MPI direct comms):

QFT run time: 9.49068s
Total run time: 10.2344s

after (with MPI direct comms):

QFT run time: 2.29135s
Total run time: 2.92993s

I think it might be worth switching on GPU direct comms... ;)

@otbrown otbrown changed the title initial commit enabling GPU direct communictation for Cray-MPICH initial commit enabling GPU direct communication for Cray-MPICH Apr 27, 2026
@JPRichings
Copy link
Copy Markdown
Contributor Author

JPRichings commented Apr 28, 2026

Test all failing due to:

[ 80%] Building CXX object _deps/catch2-build/src/CMakeFiles/Catch2.dir/catch2/matchers/catch_matchers_predicate.cpp.o
/Users/runner/work/QuEST/QuEST/quest/src/comm/comm_config.cpp:75:5: error: use of undeclared identifier 'MPI_Get_library_version'
   75 |     MPI_Get_library_version(version_string, resultlen);
      |     ^
1 error generated.
  • I will look to add macro guard around this mpi call to fix non-mpi compilation.
  • need to set some sensible default buffer sizes
  • Need to remove the debugging messages for the code

@JPRichings
Copy link
Copy Markdown
Contributor Author

need to return NONE for comm_whichMpi and false for comm_set_isMpiGpuAware

Comment thread quest/include/environment.h Outdated
@JPRichings JPRichings changed the title initial commit enabling GPU direct communication for Cray-MPICH Enabling GPU direct communication for Cray-MPICH May 28, 2026
@otbrown otbrown marked this pull request as ready for review May 30, 2026 18:37
@TysonRayJones
Copy link
Copy Markdown
Member

Hmm I think this diff is unnecessarily convoluted. It seeks to extend the systems for which we can detect GPU-awareness, but creates a comm_config.cpp dependency on an API env function:

bool comm_isMpiGpuAware() {
	return getQuESTEnv().isMPIGPUAware;
}

and performs brittle string processing to establish the MPI version...

enum Mpi_version {NONE, OPENMPI, CRAYMPICH};

int comm_whichMpi() {

    #ifdef COMPILE_MPI
        char version_string[MPI_MAX_LIBRARY_VERSION_STRING];
    #else
        char version_string[];
    #endif

    int resultlen[] = {0};

    #ifdef COMPILE_MPI
        MPI_Get_library_version(version_string, resultlen);
    #endif
    enum Mpi_version version = NONE;

    // Check if Openmpi used
    #ifdef OPEN_MPI
        version = OPENMPI;
    #endif

    // Check if Cray MPI used 
	
    const char* cray_string = "CRAY MPICH";

    std::string v_string = version_string;

    if (v_string.find(cray_string) != string::npos) {
        version = CRAYMPICH;
    }

    return version;

}

just so that it can merely look for a single env variable, MPICH_GPU_SUPPORTED_ENABLED:

bool comm_set_isMpiGpuAware() {

   int mpi_lib = comm_whichMpi();
  
    if(OPENMPI==mpi_lib) {
        // definitely not GPU-aware if compiler declares it is not
        #if defined(MPIX_CUDA_AWARE_SUPPORT) && ! MPIX_CUDA_AWARE_SUPPORT
            return false;
        #endif

        // check CUDA-awareness at run-time if we know it's principally supported
        #if defined(MPIX_CUDA_AWARE_SUPPORT)
            return (bool) MPIX_Query_cuda_support();
	#endif
    }

    if(CRAYMPICH==mpi_lib) { 
          
       const char* var = std::getenv("MPICH_GPU_SUPPORT_ENABLED");
       
       return (bool) var;
    }

    // if we can't ascertain CUDA-awareness, just assume no to avoid seg-fault
    return false;
}

In my understanding, this is all unnecessary because the check of "is GPU aware" is always fail safe. If we cannot detect an affirmative "yes", then we default to "no". Checking the MPI versions is redundant, given we can just check for the macros directly - when they are not defined, the check is null (NOT false) and we safely proceed.

The original comm_isMpiGpuAware function was:

bool comm_isMpiGpuAware() {

    // definitely not GPU-aware if compiler declares it is not
    #if defined(MPIX_CUDA_AWARE_SUPPORT) && ! MPIX_CUDA_AWARE_SUPPORT
        return false;
    #endif

    // check CUDA-awareness at run-time if we know it's principally supported
    #if defined(MPIX_CUDA_AWARE_SUPPORT)
        return (bool) MPIX_Query_cuda_support();
    #endif

    // if we can't ascertain CUDA-awareness, just assume no to avoid seg-fault
    return false;
}

If my understanding is correct, to support CRAY MPICH, we just need to introduce 3 lines before the return false, like:

    // check whether an MPICH env-var indicates support (we assume it never lies!)
    static const char* var = std::getenv("MPICH_GPU_SUPPORT_ENABLED"); // load once
    if (var && std::atoi(var))
        return true;

The only functional difference between that and this PR's diff is that the PR will never try to find the MPICH env-var when it knows OpenMPI is used. That doesn't add any robustness (the OpenMPI checks were safe, since undefined macros in C default to falsiness), except if we believe OpenMPI systems might have an erroneous MPICH_GPU_SUPPORT_ENABLED=1 env-var. And of course that is absurd, and much less likely than the true danger inherent to both these solutions; that an MPICH user will have somehow erroneously overriden MPICH_GPU_SUPPORT_ENABLED=1.

(Note too this branch doesn't compile, beyond just the isMPIGPUAware typo; MPI_MAX_LIBRARY_VERSION_STRING is not defined in Clang's MPI)

It seems best to close the PR and create a fresh one with the 3 lines mentioned above. Have I understood all this right?

@TysonRayJones
Copy link
Copy Markdown
Member

TysonRayJones commented May 31, 2026

Note I like the idea of attaching isMpiGpuAware to the environment at runtime. We should keep that, but then of course report it within reportQuESTEnv() and getEnvironmentString() so users can see it! I can add all that in a new PR

(Also note to self; rename getEnvironmentString() to getQuESTEnvString() - iirc this function exists for a pyQuEST hook, and @rrmeister can tolerate an API break ;) )

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants