Known Issues

(Last update: Apr 10, 2024)

Known Issues

  • Several problems have been reported for HPC-X 2.13.1 (openmpi/4.1.5-hpcx). Please use other versions of Open MPI or HPC-X.
    • For binaries built with HPC-X 2.13.1 (openmpi/4.1.5-hpcx), please use HPC-X 2.11 (openmpi/4.1.4-hpcx) as the runtime library (see the sketch after this list).
  • Molpro has a problem with Intel MPI 2021.7.1.
    • Molpro hangs only when the runtime library of Intel MPI 2021.7.1 is employed; Intel MPI 2021.5.1, 2021.8.0, and 2021.9.0 are free from this issue.
  • Molpro built with Open MPI sometimes hangs when the disk option is employed.
    • Molpro 2024.1.0 built with Intel MPI (/apl/molpro/2024.1.0) has been free from this issue so far.
    • The MVAPICH build is also unaffected; however, it is extremely slow for some types of calculations.
    • The Open MPI build works fine if the "--ga-impl ga" option is added. That option is already included in the sample job scripts (see the sketch after this list).
  • OpenMolcas may fail when a large number of MPI processes is involved.
    • The problem most likely lies in OpenMolcas's parallel processing (in RI-related routines and many other places).
    • As far as we know, serial runs are not affected by this issue. Please increase the number of MPI processes gradually.
  • The MPI version of Siesta 4.1.5 is sometimes very slow when Open MPI and MKL are employed. We could not find a solution, so we leave it unchanged for the time being.
    • The Intel MPI 2021.7.1 build does not work properly. Not employing MKL and Intel MPI seems to make the computation speed more stable, though the improvement is not remarkable.
    • The Intel Compiler + ScaLAPACK (i.e., without MKL) build failed the ScaLAPACK tests.
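
As a reference for the HPC-X 2.13.1 issue above, the following is a minimal csh job-script sketch that runs a binary built with HPC-X 2.13.1 on top of the HPC-X 2.11 runtime. The module names are the ones listed above; "module purge", the process count, and the binary name (./a.out) are placeholders, so please adapt them to your own job.

    # load the HPC-X 2.11 (Open MPI 4.1.4) runtime instead of 2.13.1
    module purge
    module load openmpi/4.1.4-hpcx
    # launch the binary that was built with HPC-X 2.13.1
    mpirun -np 64 ./a.out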
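
For the Molpro/Open MPI note above, a hedged sketch of the corresponding command line is shown below. The "--ga-impl ga" option is the one already included in the sample job scripts; the process count and input file name are placeholders.

    # use the plain GA implementation to avoid the hang with the disk option
    molpro --ga-impl ga -n 16 your_input.inp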

Completed Issues

  • Programs hang when hcoll is enabled.
    • This happens with the hcoll shipped in HPC-X 2.11 (Open MPI 4.1.4), HPC-X 2.13.1, and MLNX OFED 5.9 (used during FY2022-2023):
      • HPC-X 2.11: hcoll v4.7.3208 (1512342c)
      • HPC-X 2.13.1: hcoll v4.8.3220 (2e96bc4e)
      • MLNX OFED 5.9: hcoll v4.8.3221 (b622bfb) (/opt/mellanox/hcoll)
    • Hcoll v4.8.3223 (in MLNX OFED 23.10, updated in the maintenance on Apr 1-4, 2024, and in HPC-X 2.16 (openmpi/4.1.5-hpcx2.16)) has not been verified yet.
    • The libhcoll-related issues were practically solved by disabling it in the maintenance on Jan 9, 2024.
      • If you want to use libhcoll with Intel MPI, please remove or modify the I_MPI_COLL_EXTERNAL environment variable in your job script (see the sketch after this list).
      • If you want to use libhcoll with Open MPI, please remove or modify the OMPI_MCA_coll environment variable in your job script.
  • The Intel MPI problem at the beginning of operation (in Feb 2023) was solved by modifying parameters of the queuing system.
  • The GAMESS parallel-run problem at the beginning of operation (in Feb 2023) was solved by using Open MPI without hcoll.
    • Please do not oversubscribe (e.g., ncpus=32:mpiprocs=64); it does not improve performance, even when "setenv OMPI_MCA_mpi_yield_when_idle 1" is active. Although only half of the total processes are computation processes, the non-oversubscribed setting still performs better than the other conditions (see the sketch after this list).
  • The NWChem problem in TDDFT etc. at the beginning of operation (in Feb 2023) was solved by changing the ARMCI_NETWORK build option from OPENIB to MPI-PR.
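
For the libhcoll item above, the following is a minimal csh sketch of the corresponding job-script lines. The variable names are those mentioned above; simply removing them is one way to re-enable hcoll, and any replacement values you might set instead depend on the MPI version, so treat this as an assumption-laden example.

    # Intel MPI: remove (or override) the site-provided setting that keeps hcoll disabled
    unsetenv I_MPI_COLL_EXTERNAL
    # Open MPI: remove the setting that excludes the hcoll collective component
    unsetenv OMPI_MCA_coll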
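
For the GAMESS note above, a sketch of a non-oversubscribed resource request is shown below, written with the same select syntax as the oversubscribing example. The "#PBS -l select" directive form and the core count are assumptions; please adjust them to your queue and job size.

    # one MPI process per core (no oversubscription)
    #PBS -l select=1:ncpus=64:mpiprocs=64
    # avoid oversubscription such as ncpus=32:mpiprocs=64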