Enabling GPU direct communication for Cray-MPICH #724
Should have updated this before:

Hardware: 1 node, 4 GPUs (MI210), ARCHER2
Before (no MPI direct comms): QFT run time: 9.49068s
After (with MPI direct comms): QFT run time: 2.29135s

I think it might be worth switching on GPU direct comms... ;)

Tests all failing due to:

Need to return NONE for comm_whichMpi and false for comm_set_isMpiGpuAware.

Hmm, I think this diff is unnecessarily convoluted. It seeks to extend the systems for which we can detect GPU-awareness, but creates a

bool comm_isMpiGpuAware() {
    return getQuESTEnv().isMPIGPUAware;
}

and performs brittle string processing to establish the MPI version...

enum Mpi_version {NONE, OPENMPI, CRAYMPICH};
int comm_whichMpi() {
#ifdef COMPILE_MPI
    char version_string[MPI_MAX_LIBRARY_VERSION_STRING];
#else
    char version_string[];
#endif
    int resultlen[] = {0};
#ifdef COMPILE_MPI
    MPI_Get_library_version(version_string, resultlen);
#endif
    enum Mpi_version version = NONE;
    // Check if Openmpi used
#ifdef OPEN_MPI
    version = OPENMPI;
#endif
    // Check if Cray MPI used
    const char* cray_string = "CRAY MPICH";
    std::string v_string = version_string;
    if (v_string.find(cray_string) != string::npos) {
        version = CRAYMPICH;
    }
    return version;
}

just so that it can merely look for a single env variable,

bool comm_set_isMpiGpuAware() {
    int mpi_lib = comm_whichMpi();
    if(OPENMPI==mpi_lib) {
        // definitely not GPU-aware if compiler declares it is not
#if defined(MPIX_CUDA_AWARE_SUPPORT) && ! MPIX_CUDA_AWARE_SUPPORT
        return false;
#endif
        // check CUDA-awareness at run-time if we know it's principally supported
#if defined(MPIX_CUDA_AWARE_SUPPORT)
        return (bool) MPIX_Query_cuda_support();
#endif
    }
    if(CRAYMPICH==mpi_lib) {
        const char* var = std::getenv("MPICH_GPU_SUPPORT_ENABLED");
        return (bool) var;
    }
    // if we can't ascertain CUDA-awareness, just assume no to avoid seg-fault
    return false;
}

In my understanding, this is all unnecessary because the "is GPU aware" check is always fail-safe. If we cannot detect an affirmative "yes", then we default to "no". Checking the MPI versions is redundant, given we can just check for the macros directly - when they are not defined, the check is a no-op (NOT false) and we safely proceed. The original

bool comm_isMpiGpuAware() {
    // definitely not GPU-aware if compiler declares it is not
#if defined(MPIX_CUDA_AWARE_SUPPORT) && ! MPIX_CUDA_AWARE_SUPPORT
    return false;
#endif
    // check CUDA-awareness at run-time if we know it's principally supported
#if defined(MPIX_CUDA_AWARE_SUPPORT)
    return (bool) MPIX_Query_cuda_support();
#endif
    // if we can't ascertain CUDA-awareness, just assume no to avoid seg-fault
    return false;
}

If my understanding is correct, to support Cray MPICH, we just need to introduce 3 lines before the

    // check whether an MPICH env-var indicates support (we assume it never lies!)
    static const char* var = std::getenv("MPICH_GPU_SUPPORT_ENABLED"); // load once
    if (var && std::atoi(var))
        return true;

The only functional difference between that and this PR's diff is that the PR will never try to find the MPICH env-var when it knows OpenMPI is used. That doesn't add any robustness (the OpenMPI checks were safe, since undefined macros evaluate to 0 in C preprocessor conditionals), except if we believe OpenMPI systems might have an erroneous

(Note too this branch doesn't compile, beyond just the

It seems best to close the PR and create a fresh one with the 3 lines mentioned above. Have I understood all this right?
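
For reference, the fail-safe check discussed above can be put together as a minimal sketch. It assumes the three env-var lines go immediately before the final `return false`, which is one reading of the suggestion rather than a confirmed placement. The names `envValueIndicatesGpuSupport` and `sketch_isMpiGpuAware` are hypothetical and are not part of the PR or the project. The `MPIX_CUDA_AWARE_SUPPORT` macro and `MPIX_Query_cuda_support()` are only available when Open MPI's extension header has been included; no MPI header is included here, so both `#if` blocks compile out and the function falls through to the env-var check.

```cpp
#include <cstdlib>

// Hypothetical helper (not in the PR): interprets the value of the
// MPICH_GPU_SUPPORT_ENABLED env-var. Unset, empty, non-numeric and "0"
// all count as "no"; only a non-zero integer counts as "yes".
bool envValueIndicatesGpuSupport(const char* value) {
    return value != nullptr && std::atoi(value) != 0;
}

// Sketch of the fail-safe check (hypothetical name, not the project's code).
bool sketch_isMpiGpuAware() {

    // definitely not GPU-aware if the MPI headers declare it is not;
    // this block is compiled out when the macro is undefined
#if defined(MPIX_CUDA_AWARE_SUPPORT) && ! MPIX_CUDA_AWARE_SUPPORT
    return false;
#endif

    // check CUDA-awareness at run-time if the headers say it may be supported;
    // this block is also compiled out when the macro is undefined
#if defined(MPIX_CUDA_AWARE_SUPPORT)
    return (bool) MPIX_Query_cuda_support();
#endif

    // assumption: the env-var check sits just before the final fallback,
    // and the variable is read once per process
    static const bool fromEnv =
        envValueIndicatesGpuSupport(std::getenv("MPICH_GPU_SUPPORT_ENABLED"));
    if (fromEnv)
        return true;

    // no affirmative "yes" was found anywhere, so assume "no"
    return false;
}
```

Parsing the value with `std::atoi`, instead of casting the pointer to `bool` as the PR does, means that `MPICH_GPU_SUPPORT_ENABLED=0` is treated as "not GPU-aware" rather than as "set, therefore aware".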

Note I like the idea of attaching (Also note to self; rename

This needs testing, but here is an initial suggestion for checking support for GPU-enabled MPI in Cray-MPICH. This should enable direct inter-GPU comms for a wider range of systems in the HPC space that don't always support OpenMPI for their scale-out network.
Plan to test on MI210 nodes on ARCHER2 when they are available.