{2025.06}[foss-2024a] Siesta 5.4.2 CUDA 12.6.0 #1506
|
bot: build repo:eessi.io-2025.06-software instance:eessi-bot-surf for:arch=x86_64/amd/zen4,accel=nvidia/cc90 |
|
New job on instance
|
|
The test step is failing. It's not fully clear to me why, though; it looks like it could be a segfault. @casparvl Any ideas? |
|
bot: build repo:eessi.io-2025.06-software instance:eessi-bot-jsc for:arch=aarch64/nvidia/grace,accel=nvidia/cc90 |
|
New job on instance
|
|
I can see the same error on a local build for aarch64/nvidia/grace/cc90 |
@TopRichard Is that manually, with (just) I also tried on a Grace Hopper node manually with The |
That is manually using the |
|
The problem did not occur when testing with the build bot @ JSC, so the mystery continues... Hoping to hear back from @casparvl on the test @ SURF |
There seems to be a weird issue with the SURF bot: the job ran out of memory. The same happened to a GROMACS+CUDA build job in #1482 and a LAMMPS+CUDA build job in #1461 (comment). |
|
The issue with the SURF bot has been resolved, let's try again.
bot: build repo:eessi.io-2025.06-software instance:eessi-bot-surf for:arch=x86_64/amd/zen4,accel=nvidia/cc90 |
|
New job on instance
|
|
Pointing to the sif file now in repos.cfg. Let's see if that also fixes UGent. |
|
@casparvl this is why the test-suite is failing at UGent |
|
Stupid wrong repo |
|
New job on instance
|
|
bot: build repo:eessi.io-2025.06-software instance:eessi-bot-vsc-ugent for:arch=x86_64/amd/zen3,accel=nvidia/cc80 |
|
New job on instance
|
|
bot: build repo:eessi.io-2025.06-software instance:eessi-bot-vsc-ugent for:arch=x86_64/intel/cascadelake,accel=nvidia/cc70 |
|
New job on instance
|
|
bot: build repo:eessi.io-2025.06-software instance:eessi-bot-surf for:arch=x86_64/intel/icelake,accel=nvidia/cc80 |
|
New job on instance
|
|
bot: build repo:eessi.io-2025.06-software instance:eessi-bot-mc-aws on:arch=zen2 for:arch=x86_64/amd/zen2,accel=nvidia/cc70 |
|
New job on instance
|
|
New job on instance
|
Unable to download or merge changes between the source branch and the destination branch. |
|
New job on instance
|
Unable to download or merge changes between the source branch and the destination branch. |
|
|
@AnthoniAlcaraz We have now compiled all the native GPU builds for Siesta. We ran into trouble with the cross-compiled builds: the tests are failing there. Do you know if those tests expect an actual GPU to be available? |
Hi Lara, yes, if an NVIDIA GPU is detected on the node, the tests should run without issues. The failures you're seeing are expected when the build is configured to use a GPU but runs on a node without one. Can we see the content of |
|
Yes, here they are. They all fail because they call cuda_getdevicecount. So I'm guessing we should skip those ctests when cross-compiling. Maybe something like what we did for LAMMPS, for everything but the native targets: https://github.com/EESSI/software-layer-scripts/blob/9aa5e0a9abf247d4bd16e720532e05a1fee4d9e7/eb_hooks.py#L1559 |
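To make the idea concrete, here is a rough sketch of "only run the GPU tests when a GPU is actually visible". It is not the LAMMPS hook linked above and not a tested EESSI hook. `gpu_available` and `maybe_skip_gpu_tests` are hypothetical helper names, the plain dict only stands in for easyconfig parameters, and it assumes that setting the `runtest` parameter to `False` is enough to disable the ctest step for this easyblock, which would need checking.

```python
import shutil
import subprocess


def gpu_available(which=shutil.which, run=subprocess.run):
    """Return True if nvidia-smi is on PATH and lists at least one GPU.

    `which` and `run` are injectable so the check can be exercised
    on a node without a GPU or driver.
    """
    if which("nvidia-smi") is None:
        return False
    try:
        result = run(["nvidia-smi", "-L"], capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.SubprocessError):
        return False
    # `nvidia-smi -L` prints one line per device, each starting with "GPU <index>:"
    return result.returncode == 0 and "GPU " in result.stdout


def maybe_skip_gpu_tests(ec_params, has_gpu):
    """Hypothetical helper: return a copy of the parameters with tests disabled if no GPU.

    ASSUMPTION: `ec_params` is a plain dict standing in for easyconfig
    parameters, and `runtest = False` disables the test step.
    """
    updated = dict(ec_params)
    if not has_gpu:
        updated["runtest"] = False
    return updated


if __name__ == "__main__":
    params = maybe_skip_gpu_tests({"name": "Siesta", "runtest": "test"}, gpu_available())
    print(params)
```

In a real hook the decision would more likely be based on whether the build target matches the build node (native vs. cross-compiled), as suggested above, rather than on probing `nvidia-smi`; the probe is only used here to keep the sketch self-contained.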
|
@laraPPr It definitely makes sense to me to skip any GPU tests if they expect to actually find a GPU (rather than passing or being skipped automatically). So we can basically run |
|
Rather than poking |
Adding the latest SIESTA version with CUDA