From 62e367cc17db5b24cafd64cb04c40e72df866d41 Mon Sep 17 00:00:00 2001
From: Thijs Vogels
Date: Tue, 26 May 2026 12:57:24 +0000
Subject: [PATCH] docs: note CUDA_VISIBLE_DEVICES workaround for multi-GPU systems

Importing gpu4pyscf allocates memory on every visible CUDA device, which
conflicts with PyTorch and with other processes sharing those GPUs (e.g. in
MPI-parallel workloads). Document the CUDA_VISIBLE_DEVICES workaround and
link to the upstream tracking issue.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---
 README.md | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/README.md b/README.md
index f0ec1d0..ff5e117 100644
--- a/README.md
+++ b/README.md
@@ -123,6 +123,26 @@ ks = SkalaKS(mol, xc="skala-1.1")
 ks.kernel()
 ```
 
+### Known issue: multiple visible GPUs
+
+Skala uses a single GPU, but importing `gpu4pyscf` allocates memory on **every**
+visible CUDA device. This can conflict with PyTorch and with other processes
+sharing those GPUs (e.g. in MPI-parallel workloads).
+
+Restrict CUDA to one device **before** launching Python:
+
+```bash
+CUDA_VISIBLE_DEVICES=0 python my_script.py
+```
+
+For MPI-parallel runs, assign one GPU per local rank:
+
+```bash
+mpirun -np 4 bash -c 'CUDA_VISIBLE_DEVICES=$OMPI_COMM_WORLD_LOCAL_RANK python my_script.py'
+```
+
+Tracked upstream at [pyscf/gpu4pyscf#435](https://github.com/pyscf/gpu4pyscf/issues/435).
+
 ## Getting started: ASE calculator
 
 Skala also provides an [ASE](https://wiki.fysik.dtu.dk/ase/) calculator for energy, force, and geometry optimization workflows:
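
Note on the "Known issue: multiple visible GPUs" section added by this patch: when a launcher cannot easily be wrapped in `bash -c`, the same per-rank restriction could be sketched inside the script itself. This is a minimal sketch, not part of Skala or gpu4pyscf. It assumes that nothing in the process has initialized CUDA yet (so it has to run before `torch`, `gpu4pyscf`, or `skala` are imported). The helper name `pin_visible_gpu` is hypothetical, and `SLURM_LOCALID` is an assumption for Slurm launches that the README does not mention.

```python
import os

# Variables that launchers set to the per-node rank of a process.
# OMPI_COMM_WORLD_LOCAL_RANK is the one used in the README's mpirun example;
# SLURM_LOCALID is an assumption for jobs started with Slurm's srun.
LOCAL_RANK_VARS = ("OMPI_COMM_WORLD_LOCAL_RANK", "SLURM_LOCALID")


def pin_visible_gpu(env=os.environ):
    """Restrict CUDA_VISIBLE_DEVICES in `env` to one device and return it.

    Hypothetical helper: it only edits the environment mapping and does not
    talk to CUDA, so it takes effect only if CUDA is initialized afterwards.
    """
    # Default to local rank 0 when no launcher variable is present.
    local_rank = 0
    for name in LOCAL_RANK_VARS:
        if name in env:
            local_rank = int(env[name])
            break

    # If a scheduler already exported a device list, choose from that list.
    already = env.get("CUDA_VISIBLE_DEVICES", "")
    candidates = [d.strip() for d in already.split(",") if d.strip()]
    if candidates:
        # With more ranks than devices, ranks wrap around and share a GPU.
        device = candidates[local_rank % len(candidates)]
    else:
        # No list available: use the local rank as the device index,
        # mirroring the mpirun one-liner in the README.
        device = str(local_rank)

    env["CUDA_VISIBLE_DEVICES"] = device
    return device


# Intended use at the very top of my_script.py (kept as comments here):
# pin_visible_gpu()
# import gpu4pyscf  # GPU libraries are imported only after pinning
```

Exporting `CUDA_VISIBLE_DEVICES` in the shell, as the README recommends, remains the safer route, because it cannot be defeated by an earlier import that already touched the GPU.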