Add gpu tag for ROCm-LLVM and refactor method #228
add_gpu_property = ''

# If none of the gpu packages are in the easyconfig, do not process further
dep_names = {dep[0] for dep in ec_dict['dependencies']}
Have you tested this? If ROCm-LLVM is the toolchain, I wonder whether it actually appears in the dependencies.
I doubt it. For one of the other hooks (can't remember which), where we searched a dependency list for CUDA / cuDNN, I added logic to also check the toolchain and trigger if it's rocm-compilers, rompi, etc. This should probably do something similar. @zerefwayne, search for rocm-compilers in eb_hooks.py and you'll probably find my logic in the other hook.
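A minimal sketch of the kind of combined check described above, for illustration only. The names `ROCM_TOOLCHAINS` and `uses_rocm` are hypothetical and not taken from eb_hooks.py, and the sketch assumes `ec_dict` exposes the toolchain as a dict with a `name` key; the actual logic in the other hook may look different.

```python
# Hypothetical names: ROCM_TOOLCHAINS and uses_rocm are illustrative only,
# they are not the identifiers used in eb_hooks.py.
ROCM_TOOLCHAINS = {'rocm-compilers', 'rompi'}


def uses_rocm(ec_dict):
    """Return True if ROCm-LLVM is a dependency or the toolchain is ROCm-based."""
    # Names of all listed dependencies (first element of each dependency tuple)
    dep_names = {dep[0] for dep in ec_dict.get('dependencies', [])}
    # Assumption: the toolchain is available as {'name': ..., 'version': ...}
    toolchain_name = ec_dict.get('toolchain', {}).get('name')
    return 'ROCm-LLVM' in dep_names or toolchain_name in ROCM_TOOLCHAINS
```

With this shape, an easyconfig built with a ROCm-based toolchain is detected even when ROCm-LLVM never appears in its dependency list.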
It would be better if we can identify ROCm-LLVM explicitly so that we can store the version in an environment variable. That will be useful later in Lmod when we are checking if the version is supported by the driver.
I don't know if you have a toolchain instance at this point, but if you did, you could use it to find the ROCm-LLVM versionsuffix:
ec.toolchain.dependencies()
What I'm trying to do in #231 is also relevant here.
if rocm_llvm_dep is not None:
    add_gpu_property = 'add_property("arch","gpu")'
    versionsuffix = rocm_llvm_dep[2] if len(rocm_llvm_dep) > 2 else ''
    pkg_versions['ROCm-LLVM'] = rocm_llvm_dep[1] + versionsuffix
For this, I think it is more interesting to store the ROCm version, as later we need something that we can compare with the output we get from rocm_smi (like we do for nvidia_smi and the CUDA version).
pkg_versions['ROCm-LLVM'] = rocm_llvm_dep[1] + versionsuffix
rocm_prefix = "-ROCm-"
if versionsuffix.startswith(rocm_prefix):
    rocm_version = versionsuffix[len(rocm_prefix):]
else:
    raise EasyBuildError(f"Invalid format for ROCm versionsuffix: {versionsuffix}")
pkg_versions['ROCm'] = rocm_version
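The prefix stripping in the suggestion above can be tried in isolation. This is a hedged, standalone sketch: the function name `rocm_version_from_versionsuffix` is hypothetical, and it raises `ValueError` instead of `EasyBuildError` only so that it runs without EasyBuild installed. The version string in the usage note is an illustrative value, not one taken from a real easyconfig.

```python
def rocm_version_from_versionsuffix(versionsuffix, rocm_prefix="-ROCm-"):
    """Extract the ROCm version from a versionsuffix such as '-ROCm-<version>'.

    Hypothetical helper for illustration; the hook itself would raise
    EasyBuildError rather than ValueError.
    """
    if not versionsuffix.startswith(rocm_prefix):
        raise ValueError(f"Invalid format for ROCm versionsuffix: {versionsuffix}")
    # Everything after the prefix is treated as the ROCm version
    return versionsuffix[len(rocm_prefix):]
```

For example, `rocm_version_from_versionsuffix("-ROCm-6.4.1")` returns `"6.4.1"`, while an empty versionsuffix raises an error. That matters because the code above the suggestion falls back to `''` when the dependency tuple has no third element.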
Easyconfigs which include ROCm-LLVM as a dependency should also be tagged with gpu. It doesn't need to be dropped to a build dependency as it is redistributable (unlike CUDA). The loop seems to iterate over dependencies twice; it can be simplified to one pass.
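One way the single-pass idea could look, as a sketch under assumptions rather than the refactor actually made in this PR. The function name `collect_gpu_versions` and the default `gpu_packages` tuple are hypothetical; dependencies are assumed to be tuples of `(name, version[, versionsuffix])`, as in the snippets quoted in the review.

```python
def collect_gpu_versions(dependencies, gpu_packages=('CUDA', 'ROCm-LLVM')):
    """Collect versions of GPU packages in one pass over the dependencies.

    Hypothetical helper: names and defaults are illustrative only.
    """
    pkg_versions = {}
    for dep in dependencies:
        name, version = dep[0], dep[1]
        if name in gpu_packages:
            # The versionsuffix is optional in a dependency tuple
            versionsuffix = dep[2] if len(dep) > 2 else ''
            pkg_versions[name] = version + versionsuffix
    return pkg_versions
```

An empty result then doubles as the "no gpu packages, do not process further" check, so a separate pass to build the set of dependency names is not needed.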