
FY2022 System Renewal Details

(Last update: Mar 6, 2023) Information about the new system launched in February 2023.

Common Questions

  • For formchk of Gaussian, please check this FAQ item.
  • For "Cgroup mem limit exceeded" message, please check this FAQ item.
  • For "No space left on device" error, please check this FAQ item.
  • Please note that the amount of memory per CPU core is about half that of the previous system. To get a comparable amount of memory, you need to request twice as many cores as on the previous system.
  • There are additional limits for per-core assignment jobs (jobtype=core, i.e. ncpus<64, or gpu jobs) and for jobtype=largemem jobs. The current limit values can be shown by the "jobinfo -s" command.
  • If you run too many jobs in a short period of time, a penalty might be imposed.
    • If you are planning to run thousands of jobs in a day, please merge them into fewer jobs.
  • If a script loaded in ~/.bashrc produces output (the oneAPI setup script is a typical case), sftp (including WinSCP) fails to connect to ccfep because of that output.
    • You may be able to avoid this error by discarding the output, as in "source ~/intel/oneapi/ >& /dev/null".
    • (Loading the script only when $PS1 is non-empty should also work, as may simply moving that line to ~/.bash_profile.)
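As a sketch of the $PS1 workaround above (the "load_oneapi" function here is a stand-in for the actual "source" line; the real script name depends on your installation):

```shell
#!/bin/sh
# ~/.bashrc sketch: run the (noisy) environment setup only in interactive
# shells, so that non-interactive sftp/WinSCP sessions see no output.
# "load_oneapi" is a placeholder for the real "source ~/intel/oneapi/..." line.
load_oneapi() {
    echo "loading oneAPI..."    # stands in for the setup script's output
}

if [ -n "$PS1" ]; then
    # interactive shell: load the environment, discarding any output
    load_oneapi > /dev/null 2>&1
fi
```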

Known Issues (Last Update: Mar 6, 2023)

  • (Completed items are moved to the bottom of this page.) (Mar 6)
  • waitest is not available now.
  • The following applications are not yet installed. Please wait for a while.
    • DIRAC 22, CP2K 2023.1, Quantum ESPRESSO 7.1, Julia 1.8.5
  • If non-ASCII characters are involved in your job submission directory path, "jobinfo -w" cannot show the correct path. (Feb 21)
    • On Feb 17-18, jobinfo printed error messages due to this problem. That jobinfo error is now fixed. (Feb 21)
    • We are still investigating this issue.
  • The hcoll library included in HPC-X outputs warning (or error) messages starting with [LOG_CAT_P2P] and/or [LOG_CAT_MCAST] in some cases. (Mar 1)
    • You can suppress these messages by adding "-mca coll_hcoll_enable 0" to mpirun (this disables hcoll).
    • However, when these messages appear there may be a real problem in the communication between processes (e.g. GAMESS built with HPC-X 2.11). Please do not simply silence the messages without investigation.
      • (In the GAMESS case, suppressing the messages did not improve performance; it only made the problem harder to detect.)
    • If the problem is only with MCAST, the messages can be removed by passing "-x HCOLL_ENABLE_MCAST=0" to mpirun. In that case it is likely not a serious problem.
      • This can be caused by the "knem" module (enabled on both computation and frontend nodes). Not yet investigated at this center.
    • /apl/openmpi/3.1.6 does not use hcoll, so these messages will not appear there.
  • Jobs affected by system trouble may unexpectedly enter the "Hold" state, with jobinfo reporting "(error)" as the reason. (Feb 15)
    • In the previous system, this was easily fixed by simply rerunning the job. In the new system, rerunning does not solve the problem. (Feb 15)
      • (Rerunning is possible, but after completion the job returns to the original error state.)
      • If you don't want jobs to be rerun, please add "#PBS -r n" to your job script. (Please also check the reference manual.)
    • In some cases, these jobs cannot be removed with the jdel command. If you want to delete such jobs, please tell us the job IDs and we will remove them forcibly. (Feb 15)
      • (You can also leave them if you don't mind; they will be removed in the maintenance on Mar 6.)
    • We are still investigating this issue. (Feb 15)
    • The "(error)" reason disappeared after the queuing system trouble on Mar 2, but the status of the jobs is otherwise unchanged. The problem still exists. (Mar 3)
  • molpro 2022.3 occasionally freezes for unclear reasons. (Feb 6)
    • Molpro 2022.2.2 also has this issue; not confirmed with Molpro 2015.
    • The disk option introduced in Molpro 2021.2 seems to trigger the deadlock. We added "--ga-impl ga" to the sample job scripts to disable the disk option. (Feb 8)
      • No HPC-X 2.13.1 related issue has been found for Molpro. (Feb 8)

          How to load my oneAPI environment (Mar 3 update)

          oneAPI Base Toolkit can be downloaded from this page. Compilers, MKL, and MPI are included. Please use the offline or online installer for Linux. If you need the Fortran compiler (ifort, ifx), you also need to install oneAPI HPC Toolkit, which can be downloaded from this page. (Mar 3)

          For bash users, the module-based method below works, but simply loading ~/intel/oneapi/ is easier.
          You can use individual oneAPI components, such as the compilers and MKL, by loading the setup file in each component's directory.
          (e.g. source ~/intel/oneapi/compiler/latest/env/ )

          Here we introduce a simple way using the module command; there may be several other ways to do this.
          It is assumed that oneAPI is already installed under ~/intel.

          $ cd ~/intel/oneapi
          $ sh
          $ cd modulefiles/
          $ module use .
          $ module save

          This gathers the oneAPI module files into the modulefiles/ directory and registers that directory in the module search path.
          The final "module save" command saves the setting. The saved environment will be restored automatically upon login to the RCCS system.
          If you want to use Intel compilers, you need to run "module load compiler/latest".
          (Please check "module avail" for details about packages and their versions.)

          You can also load compilers and other packages before "module save".
          In that case, the compilers and libraries will be available immediately after your next login.

          $ cd modulefiles/
          $ module use .
          $ module load compiler/latest
          $ module load mkl/latest
          $ module load mpi/latest
          $ module save

          If you want to remove the saved module environment, please try "module saverm". Note that your saved environment is independent of the system default settings; changes to the system defaults do not affect your saved environment.



          node type | CPU, GPU | memory [GiB] | # of nodes | local scratch on NVMe SSD [GB]
          TypeC (for vnode/core jobs) | AMD EPYC 7763 (64 cores, 2.45 GHz) [2 CPUs/node] | 256 | 804 | 1536
          TypeF (for large memory vnode jobs) | AMD EPYC 7763 (64 cores, 2.45 GHz) [2 CPUs/node] | 1024 | 14 | 1536
          TypeG (for GPU jobs) | AMD EPYC 7763 (64 cores, 2.45 GHz) [2 CPUs/node] + NVIDIA A100 NVLink 80 GB [8 GPUs/node] | 256 | 16 | 1536

          • Node jobs will be assigned to vnode(s). A vnode has 64 cores.
          • In addition to the global scratch space (corresponding to /work in the previous system), local scratch space on NVMe SSD is available for jobs.
          • Global shared storage: 14.8 PB (Lustre).

          Queue Classes and Queue Factors for FY2023

          Node jobs will be assigned vnodes (64 cores/vnode). A computation node consists of two vnodes.

          jobtype | node type | # of total vnodes (cores) | limit for a job | memory [GiB/core] | queue factor
          vnode | TypeF | 28 vnodes (1,792 cores) | 1-14 vnodes (64-896 cores) | 7.875 | 60 points / (1 vnode * 1h)
          vnode | TypeC | 1,248+ vnodes (79,872+ cores) | 1-50 vnodes (64-3,200 cores) | 1.875 | 45 points / (1 vnode * 1h)
          core | TypeC | 200+ vnodes (12,800+ cores) | 1-63 cores | 1.875 | 1 point / (1 core * 1h)
          gpu | TypeG | 32 vnodes (2,048 cores, 128 GPUs) | 1-48 GPUs, 1-16 cores/GPU | 1.875 | 60 points / (1 GPU * 1h) + 1 point / (1 core * 1h)

          • Group limits (on cores/GPUs and on the number of jobs) depend on the points assigned to the group; the current values can be shown by the "jobinfo -s" command.

          • Jobs must finish before the scheduled maintenance.
          • Only around half of the available nodes accept long jobs (more than one week).
          • You can omit jobtype in your job script except for jobtype=largemem; the other types can be inferred from the requested resources.
          • 80 TypeC nodes (160 vnodes) are shared by "vnode" and "core" jobs.
          • Short "vnode" jobs can be assigned to a "largemem" node.
          • Short "core" jobs can be assigned to a "gpu" node.
          • (Queue factors during FY2022 are different from these values.)
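As a worked example of the queue factors above (a sketch; the helper names and job sizes are ours, the factors are from the FY2023 table):

```shell
#!/bin/sh
# points consumed = queue factor * resource units * elapsed hours
# (factors from the FY2023 table: 45 points per vnode-hour for TypeC vnode
#  jobs, 1 point per core-hour for core jobs)
vnode_job_points() {    # $1 = vnodes, $2 = hours
    echo $(( $1 * $2 * 45 ))
}
core_job_points() {     # $1 = cores, $2 = hours
    echo $(( $1 * $2 * 1 ))
}
```

For example, a 2-vnode TypeC job running for 24 hours consumes 2 * 24 * 45 = 2160 points, and a 16-core core job running for 10 hours consumes 160 points.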

          How to login and how to send/recv files

          Once your user account is assigned, you can log in to the login node (ccfep).
          A login guide is available on this page. The procedure itself is almost the same as in the previous system.

          How to submit jobs (jsub)

          You need to prepare a job script to submit a job.
          Creating a job script from scratch is not an easy task; we recommend using a sample job script as a template.
          The links in the "Job Submission Guides" section of the quick start guide page may be helpful in this regard.
          For Gaussian, please check the g16sub guide below. Some resource definition examples are shown below.

          example 1: use 5 nodes (640 (128*5) cores, 320 (64*5) MPI)

          #PBS -l select=5:ncpus=128:mpiprocs=64:ompthreads=2

          example 1.5:  use 10 vnodes (64 for each) (640 (64*10) cores, 320(32*10) MPI)

          #PBS -l select=10:ncpus=64:mpiprocs=32:ompthreads=2

          example 2: 16-core (16 MPI)

          #PBS -l select=1:ncpus=16:mpiprocs=16:ompthreads=1

          example 3: 16-core (16 OpenMP) +  1 GPU

          #PBS -l select=1:ncpus=16:mpiprocs=1:ompthreads=16:ngpus=1

          note: there are 8 GPUs in a node. The number of CPU cores per GPU (ncpus/ngpus) must be <= 16.
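The constraint in the note above can be checked mechanically; a minimal sketch (the function name is ours, and it checks one select chunk, i.e. one node):

```shell
#!/bin/sh
# A GPU request for one node is valid when 1 <= ngpus <= 8 (8 GPUs per node)
# and ncpus <= 16 * ngpus (at most 16 CPU cores per GPU).
valid_gpu_request() {   # $1 = ncpus, $2 = ngpus
    [ "$2" -ge 1 ] && [ "$2" -le 8 ] && [ "$1" -le $(( $2 * 16 )) ]
}
```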

          example 4: 64 cores (1 vnode), large memory node (~500 GB of memory / vnode)

          #PBS -l select=1:ncpus=64:mpiprocs=32:ompthreads=2:jobtype=largemem

          Job type specification, jobtype=largemem, is necessary for this case.

          Changes from the previous system

          • You can omit the queue specification (-q H).
          • You can also omit jobtype from the resource definition, except for large memory jobs (jobtype=largemem).
          • You can use a local disk as scratch space on computation nodes. (It cannot be accessed directly from other nodes.) The path is /lwork/users/${USER}/${PBS_JOBID}. This directory is removed at the end of the job.
            • The capacity is 11.9 GB * ncpus, where ncpus is the number of CPU cores available to you on that node (not the total core count of your job).
            • For example, with select=2:ncpus=64:mpiprocs=64..., 64 * 11.9 GB is available on each node.
          • You can use the huge scratch space /gwork/users/${USER}, corresponding to /work/users/${USER} of the previous system. This disk is shared among nodes.
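The /lwork capacity rule above can be written as a small helper (a sketch; the function name is ours, the 11.9 GB/core figure is from the notes above):

```shell
#!/bin/sh
# Estimated /lwork capacity on one node: 11.9 GB * (cores available there).
lwork_capacity_gb() {   # $1 = ncpus available on that node
    awk -v n="$1" 'BEGIN { printf "%.1f\n", n * 11.9 }'
}
```

For example, ncpus=64 on a node gives 64 * 11.9 = 761.6 GB of local scratch on that node.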


          How to check job status (jobinfo)

          The status of submitted jobs can be checked with the jobinfo command on the login node (ccfep).
          Basic usage and sample output are shown below. You don't need to specify the queue name (-q H).
          In other respects, the new command is almost identical to the previous system's.

          jobinfo -c

          You can see the latest status of your jobs. However, some details such as number of GPUs are not available.
          Combinations with other options are also restricted.

          [user@ccfep ~]$ jobinfo -c

          Queue   Job ID Name            Status CPUs User/Grp       Elaps Node/(Reason)
          H       4008946 sample3.csh    Run      16  uuu/---     1:51:14 ccc001        
          H       4008952     Run      16  uuu/---     1:51:08 ccc001        
          H       4010452  Queue     1  uuu/---     0:02:40 (gpu)         
          Job Status at 2023-01-28 14:00:12


          You can get detailed information of your jobs. There will be a delay of up to several minutes, though.

          (You may want to use this combined with -m or other options.)

          [user@ccfep ~]$ jobinfo

          Queue   Job ID Name            Status CPUs User/Grp       Elaps Node/(Reason)
          H(c)    4008946 sample3.csh    Run      16  uuu/ggg     1:51:14 ccc001        
          H(c)    4008952     Run      16  uuu/ggg     1:51:08 ccc001
          H(g)    4010452  Queue   1+1  uuu/ggg     0:02:40 (gpu)          
          Job Status at 2023-01-28 14:00:12
          For the latest status, please use '-c' option without '-m', '-w', '-s'.

          jobinfo -s

          Summary of CPU and GPU usages of you and your group will be shown. Queue summary information is also shown.

          [user@ccfep ~]$ jobinfo -s

          User/Group Stat:
           queue: H                       | user(qf7)             | group(za0)           
             NJob (Run/Queue/Hold/RunLim) |    0/    0/    0/-    |    0/    0/    0/2560
             CPUs (Run/Queue/Hold/RunLim) |    0/    0/    0/-    |    0/    0/    0/2560
             GPUs (Run/Queue/Hold/RunLim) |    0/    0/    0/-    |    0/    0/    0/48  
             core (Run/Queue/Hold/RunLim) |    0/    0/    0/600  |    0/    0/    0/600 
          note: "core" limit is for per-core assignment jobs (jobtype=core/gpu*)

          Queue Status (H):
                job         | free   |     free     | # jobs  |  requested  
                type        | nodes  | cores (gpus) | waiting | cores (gpus)
          week jobs
          1-4 vnodes        |    706 |  90368       |       0 |      0      
          5+  vnodes        |    506 |  64768       |       0 |      0      
          largemem          |     14 |   1792       |       0 |      0      
          core              |    180 |  23040       |       0 |      0      
          gpu               |     16 |   2048 (128) |       0 |      0 (0)  
          long jobs
          1-4 vnodes        |    326 |  41728       |       0 |      0      
          5+  vnodes        |    226 |  28928       |       0 |      0      
          largemem          |      7 |    896       |       0 |      0      
          core              |     50 |   6400       |       0 |      0      
          gpu               |      8 |   1024 (64)  |       0 |      0 (0)  


          How to submit Gaussian jobs (g16sub/g09sub)

          Usually, jobs are submitted with the jsub command explained above. However, a special command, g16sub, is available for Gaussian 16.
          g16sub generates a job script from your Gaussian input and then submits the generated job.
          By default, 8 cores are used for the calculation with a 72-hour time limit.
          (The g09sub command for Gaussian 09 is also available. Its usage is almost identical to g16sub.)

          basic usage (8 cores, 72 hours)

          [user@ccfep somewhere]$ g16sub input.gjf

          more cores, longer walltime (16 cores, 168 hours)

          [user@ccfep somewhere]$ g16sub -np 16 --walltime 168:00:00 input.gjf


          • If you want to use a large memory node (jobtype=largemem), you need to add "-j largemem".
            • jobtype=vnode will be used automatically if -np 64 or -np 128 is specified.

          Compiler Environment

          gcc, aocc, and NVIDIA HPC SDK are already installed.
          For Intel oneAPI, only the libraries are installed; the compilers (icc, ifort, icpc, etc.) are not.
          If you need the Intel compilers, please install oneAPI Base Toolkit and/or oneAPI HPC Toolkit yourself, into your own directory.

          For gcc, the system default (8.5) and the gcc-toolset versions (9.2, 10.3, and 11.2) are installed.
          The gcc-toolset gccs can be used via the module command. (e.g. module load gcc-toolset/11)


          Software related issues

          A list of installed software is available on the Package Program List page.
          A similar list is shown by the "module avail" command. (To quit module avail, press the "q" key or scroll to the bottom of the page.)

          Other minor notices are listed below.

          • Many software packages no longer provide csh setup scripts. We recommend that csh users use the module command instead.


            module command (Environment Modules)

            • In a job script, csh users need to run "source /etc/profile.d/modules.csh" before using the module command.
              • Likewise, ". /etc/profile.d/" (the corresponding setup file for sh) is necessary in /bin/sh or /bin/bash job scripts if your login shell is /bin/csh.
            • You should add -s to module command in the script. (e.g. module -s load openmpi/3.1.6)
            • You can save current module status by "module save" command. The saved status will be restored automatically upon login.


            Completed Issues

            • "remsh" command to check local scratch (/lwork) and ramdisk (/ramd/users/${USER}/${PBS_JOBID}) of computation nodes is not yet available.(done)
            • joblog is in preparation.
            • jobinfo -s, jobinfo -n, and plain jobinfo sometimes showed an error message regarding json. (Feb 15) This has been fixed. (Feb 16)
              • jobinfo -c was not affected by this issue.
              • If the error was shown, running the same command again gave the correct result. We will fix this issue soon. (Feb 15) Done. (Feb 16)
            • The new g16sub/g09sub uses the local disk /lwork as scratch space by default. If the "-N" option is added, the shared disk /gwork (large but slow) is used as scratch space instead of the local disk /lwork (small but fast). (Feb 1)
            • AlphaFold 2.3.1 is available. (Feb 6)
            • remsh is installed (/usr/local/bin/remsh). (Feb 6)
            • joblog is installed (/usr/local/bin/joblog). (Feb 6)
              • CPU points of jobs in this February are also shown, but the points are not consumed.
            • NBO 7.0.10 installed. Gaussian 16 C.01, C.02 use it for NBO7. (Feb 14)
            • Python 3.10 environment (miniforge) was prepared in /apl/conda/20230214. Please source /apl/conda/20230214/ or /apl/conda/20230214/conda_init.csh to load the environment. (Feb 14)
            • Additional limit will be added for jobtype=largemem during the maintenance on Mar 6 (Mon). (Feb 24)
              • The limit will be 896 cores (7 nodes) per group.
            • The CPU core allocation rule for jobtype=core with ompthreads > 1 was modified. (Feb 20)
              • For OpenMPI, "mpirun --map-by slot:pe=${OMP_NUM_THREADS}" or "mpirun --map-by numa:pe=${OMP_NUM_THREADS}" should work. An MPI process and its OpenMP threads are then assigned to a single NUMA node (which consists of 16 cores).
                • (The mapping settings above should also be applicable to jobtype=vnode (ncpus=64 or ncpus=128) runs. There can be some performance differences between slot and numa assignment.)
              • In the case of ncpus=20:mpiprocs=5:ompthreads=4, 4 cores from each of five NUMA nodes are available to your job.
              • (The rule depends heavily on the number of mpiprocs. Please ask us if you have a problem with this.)
              • For gpu jobs, --map-by slot:pe=${OMP_NUM_THREADS} should work if ompthreads > 1 && ncpus/ngpus <= 8 && ngpus <= 2. (Feb 22)
                • If you have trouble with parallel performance (especially with OpenMP), please ask us. (Feb 22)
            • The Intel MPI problem was solved by modifying parameters of the queuing system. (Feb 3)
              • "export I_MPI_HYDRA_BOOTSTRAP=pdsh" is still necessary for multi-node jobs. It has therefore been added to the default settings, so you don't need to set it manually. (Feb 3)
              • GRRM17 MPI runs do not work. We are investigating this issue. Normal (non-MPI) runs of GRRM17 work fine.
              • At this time, it may be safer to avoid Intel MPI if possible, since it failed to launch processes in some cases.
                • When using an Intel MPI installed in your own directory, setting the "I_MPI_HYDRA_BOOTSTRAP" environment variable to "ssh" might solve the problem.
                • "export I_MPI_HYDRA_BOOTSTRAP=pdsh" (bash case) seems to be better; however, we have already identified cases where this does not fix the problem. (Feb 1)
            • HPC-X 2.13.1 has problems in many applications. We are planning to review software using HPC-X 2.13.1. (Feb 7)
              • Survey completed. (Feb 13)
              • No problems related to HPC-X 2.13.1 were found for cp2k-9.1, genesis (CPU), lammps (CPU), molpro, namd (CPU), nwchem, openmolcas, or siesta. (Feb 25)
                • (If you know of issues with those applications, please tell us.)
                • (The namd GPU version does not use MPI... sorry.)
              • namd-2.14 (CPU version) failed when a large number of MPI processes were employed. Solved. (Feb 22)
                • Switching the OpenMPI runtime library to HPC-X 2.11 (OpenMPI-4.1.4) seems to fix the problem.
                • The sample and module files have been modified. Please use the fixed ones.
              • genesis (CPU version) failed when a large number of MPI processes were employed. Solved. (Feb 25)
                • Switching the OpenMPI runtime library to HPC-X 2.11 (OpenMPI-4.1.4) seems to fix the problem.
                • The sample and module files have been modified. Please use the fixed ones.
              • When performing multi-node parallel computations with Gromacs (2021.4, 2021.6, 2022.4) built with gcc and HPC-X 2.13.1 (OpenMPI-4.1.5), there were cases where the process did not terminate although the computation appeared to be finished. Solved. (Jan ??)
                • In this case, the problem was solved by using HPC-X 2.11 instead of 2.13.1.
              • Problem identified for AMBER (pmemd GPU) built with gcc+HPC-X 2.13.1. (Feb 1) solved (Feb 1)
                • Just changing runtime library to HPC-X 2.11 seems to fix the problem. (Feb 1)
                • module file of AMBER was modified. Now amber uses HPC-X 2.11 for runtime library. (Feb 1)
              • QE-6.8 built with the Intel compiler and HPC-X 2.13.1 failed almost certainly in multi-node jobs. (Feb 7) Solved. (Feb 7)
                • Changing the runtime library from HPC-X 2.13.1 to HPC-X 2.11 fixed the problem. (Feb 7)
              • lammps 2022-Jun23 (GPU version) crashed with an MPI error. (Feb 8) Solved. (Feb 10)
                • Changing the runtime library from HPC-X 2.13.1 to HPC-X 2.11 solved the problem. (Feb 10)
                • In addition, we have prepared a wrapper script for efficient handling of multiple GPUs in a node. (Feb 10)
              • genesis 2.0.3-cuda crashed with an MPI error when multiple nodes were involved. (Feb 10) Solved. (Feb 10)
                • Changing the runtime library from HPC-X 2.13.1 to HPC-X 2.11 solved the problem. (Feb 10)
            • /usr/bin/python3 (currently 3.9) will be replaced with python 3.6.8 in the maintenance on Mar 6 (Mon). (Feb 7)
              • This is to fix the various problems related to python.
              • Applications depending on python 3.9 (OpenMolcas and NWchem-7.0.2) will also be replaced on Mar 6.
            • /usr/bin/perl (currently version 5.32) will be replaced with perl version 5.26 in the maintenance on Mar 6 (Mon). (Mar 3)
            • Currently it is not possible to send files via scp to other compute nodes during a job. We are currently working on it. (Feb 2) Fixed (Feb 6)
            • There is a problem with GAMESS parallel runs. We have been investigating this issue. (Feb 9)
              • Please use /apl/gamess/2022R2-openmpi or /apl/gamess/2021R1-openmpi for GAMESS parallel runs for the time being. (Feb 10)
              • They will be /apl/gamess/2022R2 and /apl/gamess/2021R1 in the maintenance on Mar 6 (old ones will be removed). (Feb 10)
                • Note: inter-node communications are not optimized in these openmpi versions (please ignore UCX error messages). This problem will be addressed in the preparation of next version along with MPI library configuration. (Feb 10)
              • On Mar 6, /apl/gamess/2022R2 and /apl/gamess/2021R1 will be replaced with Open MPI 3 version. (Feb 24)
                • This version is free from the UCX issue above; its multi-node parallel performance is better than the others.
                • Please don't try oversubscribing (e.g. ncpus=32:mpiprocs=64). It doesn't improve performance (even when "setenv OMPI_MCA_mpi_yield_when_idle 1" is active). Although the number of computation processes is half the total number of processes, this version shows better performance than the others.
            • nwchem has a problem with multi-node runs (confirmed for a TDDFT case). HPC-X 2.13.1 is unrelated to this issue. (Feb 9)
              • Please use /apl/nwchem/7.0.2-mpipr for NWChem 7.0.2. (Feb 13)
              • In case of reactionplus, please use /apl/reactionplus/1.0/nwchem-6.8-mpipr. (Feb 13)
                • Samples and module files are modified to load new ones (-mpipr).
              • In the maintenance on Mar 6, they will be re-installed as /apl/nwchem/7.0.2 and /apl/reactionplus/1.0/nwchem-6.8.
              • This may be due to the ARMCI_NETWORK=OPENIB setting. We have been investigating this issue. (Feb 9)
                • Changing ARMCI_NETWORK from OPENIB to MPI-PR fixes the problem. (Feb 13)
            • OpenMolcas is left unchanged for the time being. (Feb 13)
              • Test 908 works fine with 16 MPI processes but fails with 32 MPI processes. This depends purely on the number of processes; whether the run is intra- or inter-node is irrelevant. (Feb 13)
              • This may be due to the --with-openib setting of GlobalArrays. We have been investigating this issue. (Feb 10)
                • The --with-mpi-pr and --with-mpi3 builds do not work at all. The --with-openib and --with-mpi (default) builds work if the number of processes is small enough. (Feb 13)
                • Using Intel MPI instead of Open MPI does not help; this is probably not a problem of the MPI implementation. OpenMP works fine. (Feb 13)
            • The MPI version of Siesta is sometimes very slow when Open MPI and MKL are employed. We have left it unchanged for the time being because we could not find a solution. (Feb 13)
              • The Intel MPI version does not work properly. (Feb 13)
              • Building without MKL and Intel MPI seems to improve the stability of the computation speed, but not remarkably. (Feb 13)
              • Intel Compiler + ScaLAPACK (i.e. without MKL) was not tested because the ScaLAPACK tests did not pass in that configuration. (Feb 13)