LAMMPS 23Jun22 with GPU support

Webpage

https://www.lammps.org

Version

23Jun22

Build environment

  • Intel oneAPI Compiler Classic 2022.2.1
  • Intel MKL 2022.2.1
  • HPC-X 2.11 (Open MPI 4.1.4)
    • The build itself used HPC-X 2.13.1 (Open MPI 4.1.5); the runtime was then switched to HPC-X 2.11 to avoid errors.
    • The procedure below is written as if HPC-X 2.11 had been used.
  • CUDA 11.6

Files required for the build

  • lammps-stable.tar.gz
  • (some files are downloaded by the script below)

Build procedure

conda environment

Reused the environment created for the LAMMPS 29Sep21 CPU build.

LAMMPS itself

#!/bin/sh

VERSION=23Jun22
NAME=lammps-23Jun2022
INSTALL_PREFIX=/apl/lammps/2022-Jun23-CUDA

BASEDIR=/home/users/${USER}/Software/LAMMPS/${VERSION}
LAMMPS_TARBALL=${BASEDIR}/lammps-stable.tar.gz

WORKDIR=/gwork/users/${USER}/lammps-23Sep2022-cuda
LAMMPS_WORKDIR=${WORKDIR}/${NAME}

GPU_ARCH=sm_80
VMD_MOLFILE_INC=/home/users/${USER}/Software/VMD/1.9.4/vmd-1.9.4a57/plugins/include

PARALLEL=12

#------------------------------------------------------------------
umask 0022
export LANG=C
ulimit -s unlimited

module -s purge

# load intel compiler
. ~/intel/oneapi/compiler/2022.2.1/env/vars.sh

. /apl/lammps/2022-Jun23-CUDA/conda_init.sh

module -s load mkl/2022.2.1
module -s load cuda/11.6
module -s load openmpi/4.1.4-hpcx/intel2022.2.1

export CC=mpicc
export CXX=mpicxx
export FC=mpif90
export MPICC=mpicc
export MPICXX=mpicxx
export MPIFC=mpif90

cd ${WORKDIR} || exit 1
if [ -d ${NAME} ]; then
  mv ${NAME} lammps_erase
  rm -rf lammps_erase &
fi

tar zxf ${LAMMPS_TARBALL}

cd ${NAME}
sed -i -e "s/xHost/march=core-avx2/" cmake/CMakeLists.txt
mkdir build && cd build

# Disabled PKGs:
# FFMPEG, ADIOS, MDI, VTK: not available
# MSCG: gsl too old
# MESSAGE: ZeroMQ support not enabled
# QUIP: failed to build
# ML-HDNNP: failed to build
# KIM: CDDL is incompatible with GPL
# LATTE: technical problem of cmake? (LAPACK and BLAS)
# NETCDF: to avoid EVP_KDF_ctrl error
# MPIIO: not maintained?

cmake ../cmake \
  -DLAMMPS_MACHINE=rccs-cuda \
  -DENABLE_TESTING=on \
  -DCMAKE_INSTALL_PREFIX=${INSTALL_PREFIX} \
  -DCMAKE_C_COMPILER=mpicc \
  -DCMAKE_CXX_COMPILER=mpicxx \
  -DCMAKE_Fortran_COMPILER=mpif90 \
  -DCMAKE_MPI_C_COMPILER=mpicc \
  -DCMAKE_MPI_CXX_COMPILER=mpicxx \
  -DCMAKE_MPI_Fortran_COMPILER=mpif90 \
  -DCMAKE_CXX_FLAGS_DEBUG="-Wall -Wextra -g" \
  -DCMAKE_CXX_FLAGS_RELWITHDEBINFO="-Wall -Wextra -g -O2 -DNDEBUG" \
  -DCMAKE_CXX_FLAGS_RELEASE="-O3 -DNDEBUG" \
  -DCMAKE_Fortran_FLAGS_DEBUG="-Wall -Wextra -g" \
  -DCMAKE_Fortran_FLAGS_RELWITHDEBINFO="-Wall -Wextra -g -O2 -DNDEBUG" \
  -DCMAKE_Fortran_FLAGS_RELEASE="-O3 -DNDEBUG" \
  -DCMAKE_C_FLAGS_DEBUG="-Wall -Wextra -g" \
  -DCMAKE_C_FLAGS_RELWITHDEBINFO="-Wall -Wextra -g -O2 -DNDEBUG" \
  -DCMAKE_C_FLAGS_RELEASE="-O3 -DNDEBUG" \
  -DLAMMPS_EXCEPTIONS=on \
  -DBUILD_SHARED_LIBS=on \
  -DBUILD_TOOLS=on \
  -DBUILD_MPI=on \
  -DBUILD_OMP=on \
  -DFFT=MKL \
  -DFFT_SINGLE=on \
  -DFFT_MKL_THREADS=on \
  -DWITH_JPEG=yes \
  -DWITH_PNG=yes \
  -DWITH_GZIP=yes \
  -DPKG_ASPHERE=on \
  -DPKG_ATC=on \
  -DPKG_AWPMD=on \
  -DPKG_BOCS=on \
  -DPKG_BODY=on \
  -DPKG_BROWNIAN=on \
  -DPKG_CG-DNA=on \
  -DPKG_CG-SDK=on \
  -DPKG_CLASS2=on \
  -DPKG_COLLOID=on \
  -DPKG_COLVARS=on \
  -DPKG_COMPRESS=on \
  -DPKG_CORESHELL=on \
  -DPKG_DIELECTRIC=on \
  -DPKG_DIFFRACTION=on \
  -DPKG_DIPOLE=on \
  -DPKG_DPD-BASIC=on \
  -DPKG_DPD-MESO=on \
  -DPKG_DPD-REACT=on \
  -DPKG_DPD-SMOOTH=on \
  -DPKG_DRUDE=on \
  -DPKG_EFF=on \
  -DPKG_EXTRA-COMPUTE=on \
  -DPKG_EXTRA-DUMP=on \
  -DPKG_EXTRA-FIX=on \
  -DPKG_EXTRA-MOLECULE=on \
  -DPKG_EXTRA-PAIR=on \
  -DPKG_FEP=on \
  -DPKG_GPU=on \
  -DGPU_API=cuda \
  -DGPU_ARCH=${GPU_ARCH} \
  -DPKG_GRANULAR=on \
  -DPKG_H5MD=on \
  -DPKG_INTEL=on \
  -DPKG_INTERLAYER=on \
  -DPKG_KIM=off \
  -DDOWNLOAD_KIM=no \
  -DPKG_KOKKOS=off \
  -DPKG_KSPACE=on \
  -DPKG_LATBOLTZ=on \
  -DPKG_MACHDYN=on \
  -DDOWNLOAD_EIGEN3=on \
  -DPKG_MANIFOLD=on \
  -DPKG_MANYBODY=on \
  -DPKG_MC=on \
  -DPKG_MDI=off \
  -DPKG_MEAM=on \
  -DPKG_MESONT=on \
  -DPKG_MESSAGE=on \
  -DPKG_MGPT=on \
  -DPKG_MISC=on \
  -DPKG_ML-HDNNP=off \
  -DDOWNLOAD_N2P2=no \
  -DPKG_ML-IAP=on \
  -DPKG_ML-PACE=on \
  -DPKG_ML-QUIP=off \
  -DDOWNLOAD_QUIP=no \
  -DPKG_ML-RANN=on \
  -DPKG_ML-SNAP=on \
  -DPKG_MOFFF=on \
  -DPKG_MOLECULE=on \
  -DPKG_MOLFILE=on \
  -DMOLFILE_INCLUDE_DIR=${VMD_MOLFILE_INC} \
  -DPKG_MPIIO=off \
  -DPKG_MSCG=off \
  -DPKG_NETCDF=off \
  -DPKG_OPENMP=on \
  -DPKG_OPT=on \
  -DPKG_ORIENT=on \
  -DPKG_PERI=on \
  -DPKG_PHONON=on \
  -DPKG_PLUGIN=on \
  -DPKG_PLUMED=on \
  -DDOWNLOAD_PLUMED=yes \
  -DPKG_POEMS=on \
  -DPKG_PTM=on \
  -DPKG_PYTHON=on \
  -DPKG_QEQ=on \
  -DPKG_QMMM=on \
  -DPKG_QTB=on \
  -DPKG_REACTION=on \
  -DPKG_REAXFF=on \
  -DPKG_REPLICA=on \
  -DPKG_RIGID=on \
  -DPKG_SCAFACOS=on \
  -DDOWNLOAD_SCAFACOS=yes \
  -DPKG_SHOCK=on \
  -DPKG_SMTBQ=on \
  -DPKG_SPH=on \
  -DPKG_SPIN=on \
  -DPKG_SRD=on \
  -DPKG_TALLY=on \
  -DPKG_UEF=on \
  -DPKG_VORONOI=on \
  -DDOWNLOAD_VORO=yes \
  -DPKG_VTK=off \
  -DPKG_YAFF=on \
  -DBLAS_LIBRARIES="-qmkl" \
  -DCMAKE_BUILD_TYPE=Release

make VERBOSE=1 -j ${PARALLEL}

export OMP_NUM_THREADS=2

#make test # some tests fail; run separately on a GPU compute node
make install

cp -a ../examples ${INSTALL_PREFIX}

cd ${INSTALL_PREFIX}
for f in etc/profile.d/*; do
  ln -s $f .
done

cd lib64
if [ -f liblammps_rccs-cuda.so ]; then
  ln -s liblammps_rccs-cuda.so liblammps.so
fi
if [ -f liblammps_rccs-cuda.so.0 ]; then
  ln -s liblammps_rccs-cuda.so.0 liblammps.so.0
fi
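
The script assumes that lammps-stable.tar.gz unpacks into a directory named ${NAME}. Since the tarball name itself carries no version, a quick check before "tar zxf" may catch a mismatch early. The following is a minimal sketch and not part of the original script; tarball_topdir is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original build script).

# Print the top-level directory name of the gzip-compressed tarball $1,
# taken from the first entry in its listing.
tarball_topdir() {
  tar tzf "$1" | awk -F/ 'NR == 1 { print $1 }'
}

# Possible use before unpacking (variables as defined in the script above):
#   [ "$(tarball_topdir "${LAMMPS_TARBALL}")" = "${NAME}" ] || exit 1
```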

Packages

ASPHERE ATC AWPMD BOCS BODY BROWNIAN CG-DNA CG-SDK CLASS2 COLLOID COLVARS
COMPRESS CORESHELL DIELECTRIC DIFFRACTION DIPOLE DPD-BASIC DPD-MESO
DPD-REACT DPD-SMOOTH DRUDE EFF EXTRA-COMPUTE EXTRA-DUMP EXTRA-FIX EXTRA-MOLECULE
EXTRA-PAIR FEP GPU GRANULAR H5MD INTEL INTERLAYER KSPACE LATBOLTZ MACHDYN
MANIFOLD MANYBODY MC MEAM MESONT MGPT MISC ML-IAP ML-PACE ML-RANN ML-SNAP
MOFFF MOLECULE MOLFILE OPENMP OPT ORIENT PERI PHONON PLUGIN PLUMED POEMS
PTM PYTHON QEQ QMMM QTB REACTION REAXFF REPLICA RIGID SCAFACOS SHOCK
SMTBQ SPH SPIN SRD TALLY UEF VORONOI YAFF
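
The list above should correspond to the -DPKG_*=on options passed to cmake. As a rough cross-check, the names can be pulled out of the build script text. This is only a sketch: enabled_pkgs is a hypothetical helper, and it assumes each option sits on its own line and is spelled with a lowercase "on", as in the script above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original build script).

# Read cmake option lines on stdin and print the name of every package
# enabled with -DPKG_<NAME>=on, one per line. Trailing whitespace and a
# line-continuation backslash after "on" are tolerated.
enabled_pkgs() {
  sed -n 's/^[[:space:]]*-DPKG_\([A-Za-z0-9-]*\)=on[[:space:]\\]*$/\1/p'
}

# Possible use (build.sh is an assumed file name for the script above):
#   enabled_pkgs < build.sh | tr '\n' ' '
```

Any name that this prints but that is missing from the list (or the reverse) is worth a second look.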

Tests

(Only the make test step was run on a GPU compute node.)
A copy of the test logs is placed in /apl/lammps/2022-Jun23-CUDA/Testing/.

The following tests FAILED:
         11 - AtomStyles (Failed)
         42 - ComputeGlobal (Failed)
         94 - MolPairStyle:coul_diel (Failed)
        100 - MolPairStyle:coul_shield (Failed)
        102 - MolPairStyle:coul_slater_long (Failed)
        137 - MolPairStyle:lj_class2_soft (Failed)
        152 - MolPairStyle:lj_cut_soft (Failed)
        158 - MolPairStyle:lj_expand_coul_long (Failed)
        171 - MolPairStyle:lj_sdk_coul_long (Failed)
        172 - MolPairStyle:lj_sdk_coul_table (Failed)
        176 - MolPairStyle:lj_switch3_coulgauss_long (Failed)
        200 - MolPairStyle:tip4p_long_soft (Failed)
        203 - MolPairStyle:wf_cut (Failed)
        211 - AtomicPairStyle:buck_coul_cut_qeq_point (Failed)
        212 - AtomicPairStyle:buck_coul_cut_qeq_shielded (Failed)
        229 - AtomicPairStyle:edip (Failed)
        236 - AtomicPairStyle:meam (Failed)
        237 - AtomicPairStyle:meam_spline (Failed)
        238 - AtomicPairStyle:meam_sw_spline (Failed)
        241 - AtomicPairStyle:reaxff-acks2 (Failed)
        242 - AtomicPairStyle:reaxff-acks2_efield (Failed)
        243 - AtomicPairStyle:reaxff (Failed)
        244 - AtomicPairStyle:reaxff_lgvdw (Failed)
        245 - AtomicPairStyle:reaxff_noqeq (Failed)
        246 - AtomicPairStyle:reaxff_tabulate (Failed)
        247 - AtomicPairStyle:reaxff_tabulate_flag (Failed)
        264 - ManybodyPairStyle:comb (Failed)
        272 - ManybodyPairStyle:ilp-graphene-hbn (Failed)
        273 - ManybodyPairStyle:ilp-graphene-hbn_notaper (Failed)
        277 - ManybodyPairStyle:lcbop (Failed)
        286 - ManybodyPairStyle:pace_product (Failed)
        287 - ManybodyPairStyle:pace_recursive (Failed)
        299 - ManybodyPairStyle:tersoff (Failed)
        304 - ManybodyPairStyle:tersoff_shift (Failed)
        314 - BondStyle:gaussian (Failed)
        357 - KSpaceStyle:ewald_tri (Failed)
        359 - KSpaceStyle:pppm_ad (Failed)
        360 - KSpaceStyle:pppm_cg (Failed)
        362 - KSpaceStyle:pppm_cg_tiled (Failed)
        369 - KSpaceStyle:pppm_disp_tip4p (Failed)
        377 - KSpaceStyle:pppm_tip4p (Failed)
        382 - KSpaceStyle:scafacos_direct (Failed)
        383 - KSpaceStyle:scafacos_ewald (Failed)
        384 - KSpaceStyle:scafacos_fmm (Failed)
        385 - KSpaceStyle:scafacos_fmm_tuned (Failed)
        386 - KSpaceStyle:scafacos_p2nfft (Failed)
        387 - FixTimestep:adapt_coul (Failed)
        390 - FixTimestep:addforce_const (Failed)
        392 - FixTimestep:addtorque_const (Failed)
        411 - FixTimestep:nph (Failed)
        412 - FixTimestep:nph_sphere (Failed)
        414 - FixTimestep:npt_iso (Failed)
        415 - FixTimestep:npt_sphere_aniso (Failed)
        416 - FixTimestep:npt_sphere_iso (Failed)
        440 - FixTimestep:rigid_npt_small (Failed)
        454 - FixTimestep:smd_couple (Failed)
        462 - FixTimestep:temp_csld (Failed)
        483 - DihedralStyle:table_cut_linear (Failed)
        485 - DihedralStyle:table_linear (Failed)
        486 - DihedralStyle:table_spline (Failed)
        496 - ImproperStyle:inversion_harmonic (Failed)

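  • Most failures are minor numerical differences, plus lattice-related failures that appear when the Intel compiler is used. Overall, the build is judged to be fine.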
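
To see where the failures concentrate, the ctest summary can be grouped by the test-name prefix before the colon (for example MolPairStyle or KSpaceStyle). The following is a minimal sketch, assuming the summary lines have the form shown above; failed_by_category is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original procedure).

# Read ctest summary lines such as
#   "  94 - MolPairStyle:coul_diel (Failed)"
# on stdin and print "<count> <category>", largest count first.
failed_by_category() {
  awk '/\(Failed\)/ {
         name = $3               # test name, e.g. MolPairStyle:coul_diel
         sub(/:.*/, "", name)    # keep only the part before the colon
         count[name]++
       }
       END { for (c in count) print count[c], c }' | sort -k1,1nr -k2,2
}
```

Piping a saved copy of the failure summary into failed_by_category gives one line per category, which makes it easier to judge whether the failures cluster in a few style families.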

Notes

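  • Built on ccfep; tested on a GPU compute node.
  • With the HPC-X 2.13.1 runtime, parallel runs failed with communication errors. Switching to the HPC-X 2.11 runtime resolved them.
  • Enabling NETCDF makes the build fail, so it was left off this time. The cause is most likely a problem with the system-side libraries.
  • If the system python 3.6 had been used, the conda environment would most likely have been unnecessary, and it might also have been possible to enable NETCDF.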
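
Because the communication errors depend on which HPC-X runtime is active, it may help to confirm the Open MPI version before launching a parallel job. The following is a minimal sketch and not part of the original procedure: ompi_version and check_ompi are hypothetical helper names, and the parsing assumes the usual first line printed by Open MPI's "mpirun --version", which has the form "mpirun (Open MPI) X.Y.Z".

```shell
#!/bin/sh
# Hypothetical helpers (not part of the original procedure).

# Read `mpirun --version` output on stdin and print the Open MPI version.
# Assumes a line of the form: "mpirun (Open MPI) 4.1.4"
ompi_version() {
  sed -n 's/^mpirun (Open MPI) \([0-9][0-9.]*\).*/\1/p'
}

# Compare the version text on stdin against the expected version in $1.
# Returns 0 on a match, 1 otherwise (with a message on stderr).
check_ompi() {
  found=$(ompi_version)
  if [ "$found" = "$1" ]; then
    return 0
  fi
  echo "expected Open MPI $1 but found '${found}'" >&2
  return 1
}

# Possible use in a job script, after loading the HPC-X 2.11 module:
#   mpirun --version | check_ompi 4.1.4 || exit 1
```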