Quantum ESPRESSO 7.3 with GPU

Webpage

https://www.quantum-espresso.org/

Version

7.3

Build Environment

  • NVIDIA HPC SDK 23.5 (nompi version)
  • Intel MKL 2023.1.0
  • OpenMPI 4.1.6 (CUDA-aware; built with nvhpc 23.5)

Files Required

  • qe-7.3-ReleasePack.tar.gz
  • .gitmodules
    • The release pack does not include this file, but it may be needed to build Wannier90 (W90) and other external components.
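
Before starting the build, it can help to confirm that both files are in place. The following is a minimal sketch, assuming the files sit in the directory used as BASEDIR in the build procedure; check_required_files is a hypothetical helper name, not part of Quantum ESPRESSO.

```shell
#!/bin/sh
# Sketch only: check_required_files is a hypothetical helper, not part of QE.
# Usage: check_required_files <directory> <version>
# Returns 0 if both required files exist, 1 otherwise.
check_required_files() {
    dir=$1
    version=$2
    missing=0
    for f in "qe-${version}-ReleasePack.tar.gz" ".gitmodules"; do
        if [ ! -f "${dir}/${f}" ]; then
            echo "missing: ${dir}/${f}" >&2
            missing=1
        fi
    done
    return "${missing}"
}

# Example (path follows the build script; adjust as needed):
#   check_required_files "/home/users/${USER}/Software/QE/7.3" 7.3 || exit 1
```

The helper only reports which file is absent; where to obtain .gitmodules is not covered here.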

Build Procedure

#!/bin/sh

QE_VERSION=7.3
BASEDIR=/home/users/${USER}/Software/QE/${QE_VERSION}
TARBALL=${BASEDIR}/qe-${QE_VERSION}-ReleasePack.tar.gz
GITMODULES=${BASEDIR}/.gitmodules

INSTDIR=/apl/qe/7.3-gpu
CUDA_HOME=/apl/nvhpc/23.5/Linux_x86_64/23.5/cuda
PARALLEL=12

# --------------------------------------------------------------------
umask 0022

module -s purge
module -s load nvhpc/23.5-nompi
module -s load openmpi/4.1.6/nv23

export LANG=C
export LC_ALL=C
ulimit -s unlimited

if [ ! -d ${INSTDIR} ]; then
 mkdir -p ${INSTDIR}
fi

cd ${INSTDIR}
if [ -d qe-${QE_VERSION} ]; then
 mv qe-${QE_VERSION} qe-erase
 rm -rf qe-erase &
fi

tar zxf ${TARBALL}
mv qe-${QE_VERSION}/* .
mv qe-${QE_VERSION}/.[a-z]* .
rmdir qe-${QE_VERSION}

sed -i -e "s/wget -O/wget --trust-server-names -O/" \
      -e "s/curl -o/curl -L -o/" test-suite/check_pseudo.sh

export MPIF90=mpif90
export MPIFC=mpif90
export MPIF77=mpif90
export MPICC=mpicc
export MPICXX=mpicxx

cp ${GITMODULES} .
rm -rf external/wannier90
mkdir -p external/wannier90

sed -i -e '/external\/wannier90/s/lib/wannier lib/' install/plugins_makefile

FC=nvfortran F90=nvfortran F77=nvfortran CC=nvc CXX=nvc++ \
   ./configure --enable-parallel \
               --enable-openmp \
               --with-scalapack=no \
               --with-cuda=${CUDA_HOME} \
               --with-cuda-cc=80 \
               --with-cuda-runtime=12.1 \
               --with-cuda-mpi=yes

for i in w90; do
  echo "==== $i ===="
  make $i
done

# pwall(pw neb ph pp pwcond acfdt) cp ld1 tddfpt hp xspectra gwl
echo "==== all ===="
make -j${PARALLEL} all

#for i in want; do
#  echo "==== $i ===="
#  make $i
#done

# gipaw for QE 7.3 doesn't seem to be available
# d3q depends on old version of PH code? (setlocq, setlocq_coul)

for i in all_currents epw couple kcw gwl gui; do
  echo "==== $i ===="
  make -j${PARALLEL} $i
done

#for i in yambo; do
#  echo "==== $i ===="
#  make $i
#done

cd test-suite
make pseudo

exit 0
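
After the build script finishes, a quick check that the expected executables exist can catch a target that failed silently. The sketch below assumes the executables are placed under ${INSTDIR}/bin; list_missing_executables is a hypothetical helper, and the executable names in the example are assumptions that should be adjusted to the targets actually built.

```shell
#!/bin/sh
# Sketch only: list_missing_executables is a hypothetical helper, not part of QE.
# Usage: list_missing_executables <bindir> <exe> [<exe> ...]
# Prints the name of every listed executable that is absent or not executable.
list_missing_executables() {
    bindir=$1
    shift
    for exe in "$@"; do
        [ -x "${bindir}/${exe}" ] || echo "${exe}"
    done
}

# Example (executable names are assumptions; adjust to the targets built):
#   list_missing_executables /apl/qe/7.3-gpu/bin pw.x cp.x ph.x
```

Empty output means every listed executable was found.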

Tests

The following script was run on ccgpu (equipped with A30 GPUs).

#!/bin/sh

QE_VERSION=7.3
BASEDIR=/home/users/${USER}/Software/QE/${QE_VERSION}
TARBALL=${BASEDIR}/qe-${QE_VERSION}-ReleasePack.tar.gz
GITMODULES=${BASEDIR}/.gitmodules

INSTDIR=/apl/qe/7.3-gpu
CUDA_HOME=/apl/nvhpc/23.5/Linux_x86_64/23.5/cuda
PARALLEL=12

# --------------------------------------------------------------------
umask 0022

module -s purge
module -s load nvhpc/23.5-nompi
module -s load openmpi/4.1.6/nv23

export MPIF90=mpif90
export MPIFC=mpif90
export MPIF77=mpif90
export MPICC=mpicc
export MPICXX=mpicxx

cd ${INSTDIR}/test-suite

export OMP_NUM_THREADS=1
make run-tests-pw NPROCS=1
make run-tests-cp NPROCS=1
make run-tests-ph NPROCS=1
make run-tests-epw NPROCS=1
make run-tests-hp NPROCS=1
make run-tests-tddfpt NPROCS=1
make run-tests-kcw NPROCS=1
make run-tests-all_currents NPROCS=1
make run-tests-pp NPROCS=1
make run-tests-zg NPROCS=1
#make run-tests-xsd-pw NPROCS=1
make clean
export OMP_NUM_THREADS=2
make run-tests-pw NPROCS=4
make run-tests-cp NPROCS=4
make run-tests-ph NPROCS=4
make run-tests-epw NPROCS=4
make run-tests-hp NPROCS=4
make run-tests-tddfpt NPROCS=4
make run-tests-kcw NPROCS=4
make run-tests-all_currents NPROCS=4
make run-tests-pp NPROCS=4
make run-tests-zg NPROCS=4
#make run-tests-xsd-pw NPROCS=4
cd ..

Test result (serial)

pw: 243 out of 246 tests passed (1 skipped).

  • pw_noncolin - noncolin-rmm.in: **FAILED**.
  • pw_scf - scf-rmm-k.in: **FAILED**.
  • pw_scf - scf-rmm-paro-k.in: **FAILED**.

cp: 27 out of 27 tests passed (8 skipped).
ph: 37 out of 62 tests passed.
epw: 55 out of 113 tests passed (22 skipped).
hp: 29 out of 41 tests passed.
tddfpt: 9 out of 9 tests passed.
kcw: 5 out of 11 tests passed.
all_currents: 0 out of 10 tests passed.
pp: 2 out of 2 tests passed.
zg: 0 out of 1 test passed.

Test result (parallel)

pw: 241 out of 246 tests passed (1 skipped).

  • pw_noncolin - noncolin-rmm.in: **FAILED**.
  • pw_scf - scf-rmm-k.in: **FAILED**.
  • pw_scf - scf-rmm-paro-k.in: **FAILED**.
  • pw_workflow_exx_nscf - uspp-k-restart-1.in (arg(s): 1): **FAILED**.
  • pw_workflow_exx_nscf - uspp-k-restart-2.in (arg(s): 2): **FAILED**.

cp: 27 out of 27 tests passed (8 skipped).
ph: 37 out of 62 tests passed.
epw: 55 out of 113 tests passed (22 skipped).
hp: 29 out of 41 tests passed.
tddfpt: 8 out of 9 tests passed.

  • tddfpt_magnons_fe - Fe.tddfpt_pp_magnons.in (arg(s): 7): **FAILED**.

kcw: 5 out of 11 tests passed.
all_currents: 0 out of 10 tests passed.
pp: 2 out of 2 tests passed.
zg: 0 out of 1 test passed.
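
To compare runs at a glance, the summary lines above can be reduced to pass rates. This is a minimal sketch, assuming the summary lines have been saved to a text file in exactly the form shown; pass_rate is a hypothetical helper, not part of the QE test suite.

```shell
#!/bin/sh
# Sketch only: pass_rate is a hypothetical helper, not part of the QE test suite.
# Reads lines of the form "<name>: <passed> out of <total> test(s) passed ..."
# on stdin and prints "<name> <passed>/<total> <percent>%" for each one.
# Lines in any other form (e.g. the per-test FAILED bullets) are ignored.
pass_rate() {
    LC_ALL=C awk '/out of [0-9]+ tests? passed/ && $5 > 0 {
        name = $1
        sub(/:$/, "", name)          # drop the trailing colon of the suite name
        printf "%s %d/%d %.1f%%\n", name, $2, $5, 100 * $2 / $5
    }'
}

# Example (results.txt is an assumed file name):
#   pass_rate < results.txt
```

For instance, the line "ph: 37 out of 62 tests passed." yields "ph 37/62 59.7%".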

Notes

  • Please also see the notes for the CPU version.
  • There are no notable differences in test results or PW performance between nvhpc 23.5 and 23.9.
  • WANT and YAMBO can't be built with these settings due to syntax errors.