Known Issues
(Last update: Apr 10, 2024)
- Several problems have been reported for HPC-X 2.13.1 (openmpi/4.1.5-hpcx). Please use other versions of Open MPI or HPC-X.
- For binaries built with HPC-X 2.13.1 (openmpi/4.1.5-hpcx), please use HPC-X 2.11 (openmpi/4.1.4-hpcx) as the runtime library (see the sketch below).
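For reference, a minimal csh job-script sketch of this workaround; the module names are the ones listed above, while the binary name (a.out) and the process count are placeholders:

    #!/bin/csh -f
    # The binary was built with openmpi/4.1.5-hpcx (HPC-X 2.13.1),
    # but openmpi/4.1.4-hpcx (HPC-X 2.11) is loaded for the runtime library.
    module purge
    module load openmpi/4.1.4-hpcx
    # "./a.out" and the process count (-np 16) are placeholders.
    mpirun -np 16 ./a.out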
- Molpro has a problem with Intel MPI 2021.7.1.
- Intel MPI 2021.5.1, 2021.8.0, and 2021.9.0 are free from this issue; Molpro hangs only when the runtime library of Intel MPI 2021.7.1 is employed.
- Molpro built with Open MPI sometimes hangs when the disk option is employed.
- Molpro 2024.1.0 built with Intel MPI (/apl/molpro/2024.1.0) has been free from this issue so far.
- The MVAPICH version is also OK; however, it is terribly slow for some types of calculations.
- The Open MPI version works fine if the "--ga-impl ga" option is added. This option is already included in the sample job scripts (see the sketch below).
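For reference, a minimal csh sketch of a Molpro invocation with this option; the input file name and the process count are placeholders, and the actual sample job scripts on the system may differ:

    #!/bin/csh -f
    # Run the Open MPI build of Molpro with the GA implementation option.
    # "input.inp" and the process count (-n 16) are placeholders.
    molpro --ga-impl ga -n 16 input.inp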
- OpenMolcas may encounter errors when a large number of MPI processes is used.
- The problem seems to lie in the parallel processing of OpenMolcas itself (in RI-related routines and many other places).
- As far as we have observed, serial runs are not affected by this issue. Please increase the number of MPI processes gradually (see the sketch below).
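A minimal csh sketch of stepping up the process count. MOLCAS_NPROCS and the pymolcas driver are the usual OpenMolcas mechanisms but are assumptions here; the input file name is a placeholder, and the system's sample job scripts take precedence if they differ:

    #!/bin/csh -f
    # Start with a small MPI process count and raise it step by step
    # (e.g. 1 -> 2 -> 4 -> 8) while checking that the job still runs correctly.
    # MOLCAS_NPROCS is the usual OpenMolcas parallelism setting (assumption);
    # "input.inp" is a placeholder.
    setenv MOLCAS_NPROCS 4
    pymolcas input.inp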
- The MPI version of Siesta 4.1.5 is sometimes very slow when Open MPI and MKL are employed. We leave it unchanged for the time being because we could not find a solution.
- The Intel MPI 2021.7.1 build does not work properly. Building without MKL and Intel MPI seems to make the computation speed more stable, although the improvement is not remarkable.
- The Intel compiler + ScaLAPACK build (i.e. without MKL) failed the ScaLAPACK tests.
Resolved Issues
- Programs hang when hcoll is enabled.
- This happens for hcoll of HPC-X 2.11 (Open MPI 4.1.4), HPC-X 2.13.1, and MLNX OFED 5.9 (used during FY2022-2023).
- HPC-X 2.11: hcoll v4.7.3208 (1512342c)
- HPC-X 2.13.1: hcoll v4.8.3220 (2e96bc4e)
- MLNX OFED 5.9: hcoll v4.8.3221 (b622bfb) (/opt/mellanox/hcoll)
- Hcoll v4.8.3223 (in MLNX OFED 23.10 (updated in the maintenance on Apr 1-4, 2024) and HPC-X 2.16 (openmpi/4.1.5-hpcx2.16)) has not been verified yet.
- The libhcoll-related issues were effectively resolved by disabling libhcoll in the maintenance on Jan 9, 2024.
- If you want to use libhcoll with Intel MPI, please remove or modify the I_MPI_COLL_EXTERNAL environment variable in your job script.
- If you want to use libhcoll with Open MPI, please remove or modify the OMPI_MCA_coll environment variable in your job script (see the sketch below).
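A minimal csh sketch of re-enabling libhcoll by removing these variables; it assumes they are set by the system-provided job script settings, so please check their current values before changing anything:

    #!/bin/csh -f
    # Drop the settings that disable libhcoll.
    # For Intel MPI:
    unsetenv I_MPI_COLL_EXTERNAL
    # For Open MPI:
    unsetenv OMPI_MCA_coll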
- The Intel MPI problem at the beginning of the operation (in Feb 2023) was solved by modifying parameters of the queuing system.
- The GAMESS parallel-run problem at the beginning of the operation (in Feb 2023) was solved by using Open MPI without hcoll.
- Please do not oversubscribe (e.g. ncpus=32:mpiprocs=64). It does not improve performance (even when "setenv OMPI_MCA_mpi_yield_when_idle 1" is active). Although only half of the total processes do the actual computation, the non-oversubscribed setting still performs better than the other conditions (see the sketch below).
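A minimal sketch of the corresponding PBS resource request in a csh job script; the select count and the core counts (64) are placeholders, and the system's sample job scripts should be followed for the actual values:

    #!/bin/csh -f
    #PBS -l select=1:ncpus=64:mpiprocs=64
    # Keep mpiprocs equal to ncpus (no oversubscription);
    # the values above (select=1, ncpus=64, mpiprocs=64) are placeholders.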
- The NWChem problem in TDDFT etc. at the beginning of the operation (in Feb 2023) was solved by changing the ARMCI_NETWORK build option from OPENIB to MPI-PR (see the sketch below).
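For those building NWChem themselves, a minimal csh sketch of the relevant build-time setting; ARMCI_NETWORK is a standard NWChem/Global Arrays build variable, and the rest of the build procedure is omitted:

    #!/bin/csh -f
    # Select the MPI progress-rank (MPI-PR) ARMCI runtime instead of OPENIB
    # before building NWChem.
    setenv ARMCI_NETWORK MPI-PR
    # ... continue with the usual NWChem build steps ...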