Either several (at least two) Intel/AMD-based Linux boxes with identical or similar hardware configurations, running in the same local network environment. Each computer can be a single-CPU workstation or a dual- (quad-, eight-, etc.) CPU/core SMP/multicore system; it does not matter.
Or, alternatively, a single Intel/AMD-based SMP/multicore system running under Linux. In this case, it is desirable (although not necessary) to have a high-quality hardware RAID controller installed as well; this will considerably improve the overall performance of disk-intensive jobs. Other things that can help include separate physical disks for temporary files (see the notes on parallel I/O below).
The TCP/IP protocol must be enabled and configured correctly on each system.
You may need to install the MPICH binaries and (note: 32-bit!) libraries. You can download MPICH from the MPICH homepage. Please consult the MPICH documentation and manual pages before you start experimenting with parallel PC GAMESS/Firefly runs.
The MPICH-linked PC GAMESS/Firefly binaries must be present on all the computers on which you plan to run the PC GAMESS/Firefly in parallel.
Finally, one has to carefully read these MUST READ documents:
Create a pcgamess.pg or procgrp file that is suitable for your environment. Please read the MPICH documentation for information on how this file should be organized. This file must reside on the computer where the master PC GAMESS/Firefly copy will run.
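For illustration only (the hostname, process count, and path below are placeholders, and the exact syntax depends on your MPICH version), a ch_p4-style procgroup file describing the local master plus one remote node typically looks like this:

      local 0
      node2 1 /home/me/pcgamess

The first line describes the master host ("local", with 0 extra local processes); each subsequent line names a remote host, the number of processes to start there, and the full path to the PC GAMESS/Firefly binary on that host.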
The simplest command line for the parallel PC GAMESS/Firefly run is as follows:
      ./pcgamess DIR0 DIR1 DIR2 ... DIRN < MPICH options >
Here, DIR0, DIR1, DIR2, etc. are the working directories of the master PC GAMESS/Firefly process (i.e., of MPI RANK=0), of the second PC GAMESS/Firefly instance (MPI RANK=1), of the third instance, and so on. Only absolute paths are allowed.
< MPICH options > are the optional MPICH-specific options (see MPICH documentation for the list).
For example, you can use something like the following:
      ./pcgamess /home/me/mydir/wrk0 /home/me/mydir/wrk1 "/home/me/my dir/wrk2" -p4pg /home/me/procgrp
Depending on the cluster topology used, the three directories above must exist prior to PC GAMESS/Firefly execution, either on a single computer, on two different computers, or on three different computers. The input file must be in the master working directory (i.e., in /home/me/mydir/wrk0 for the example above).
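As a minimal preparation sketch for this example (assuming all three directories live on the same machine; the input file name myjob.inp is purely illustrative), the setup could look like:

      mkdir -p /home/me/mydir/wrk0 /home/me/mydir/wrk1 "/home/me/my dir/wrk2"
      cp myjob.inp /home/me/mydir/wrk0/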
Alternatively, instead of creating custom procgrp files and passing them as an argument directly to the PC GAMESS/Firefly binaries, you can use the mpirun command to launch PC GAMESS/Firefly in parallel.
Assume you have two nodes, e.g., node pb1 (master node) and node pb2 (slave). Assume you want to use /scratch/pcgamess on pb1 and /tmp/pcgamess on pb2 to store temporary working files, and that on both systems the PC GAMESS/Firefly binaries reside in /home/peter/bin (with /home most likely being mounted via NFS or similar on both systems).
Then you should:
Put the input file into /scratch/pcgamess on pb1
Put fastdiag.ex, pcgp2p.ex, and p4stuff.ex (if any) into /scratch/pcgamess on pb1 and into /tmp/pcgamess on pb2 (a setup sketch is shown after these steps)
Run PC GAMESS/Firefly from pb1 by either typing:
mpirun -np 2 /home/peter/bin/pcgamess /scratch/pcgamess /tmp/pcgamess
to get output on stdout, or
mpirun -np 2 /home/peter/bin/pcgamess -o /home/peter/test.out /scratch/pcgamess /tmp/pcgamess
to redirect the output into /home/peter/test.out on pb1
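A minimal setup sketch for this two-node example (assuming passwordless ssh from pb1 to pb2, that the *.ex files are kept next to the binaries in /home/peter/bin, and that the input file name is only illustrative):

      # on pb1 (master)
      mkdir -p /scratch/pcgamess
      cp myjob.inp /scratch/pcgamess/
      cp /home/peter/bin/*.ex /scratch/pcgamess/
      # create the slave working directory on pb2 and copy the *.ex files there
      ssh pb2 mkdir -p /tmp/pcgamess
      scp /home/peter/bin/*.ex pb2:/tmp/pcgamess/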
Another way is:
/home/peter/bin/pcgamess -o /home/peter/test.out /scratch/pcgamess /tmp/pcgamess -p4pg /home/peter/procgrp
with the /home/peter/procgrp file like this (assuming pb1 is the master):
local 0
pb2 1 /home/peter/bin/pcgamess
You can use the P4_RSHCOMMAND environment variable to set the remote shell to be used by MPICH to launch processes on the remote nodes. E.g.,
export P4_RSHCOMMAND=ssh
Note that the fully statically linked Linux/MPICH PC GAMESS/Firefly binaries are indeed fully statically linked, including the various NSS libraries, etc., both to avoid glibc compatibility issues and to somewhat increase the amount of memory available to the program. However, in some cases this causes problems on misconfigured standalone systems and on NIS-based clusters.
In particular, if you are getting the following error message:
p4_error: create_procgroup: getpwuid failed: 0
the solution is to write a simple script and to "mpirun" it rather than the PC GAMESS/Firefly binaries themselves. The script is as follows (edit it to fit your paths):
#! /bin/bash
ssh -t -t $HOSTNAME /home/gran/pcgamess "$@"
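For example, the script can then be launched via mpirun like this (the script name run_ff.sh and the working directories are hypothetical placeholders):

      chmod +x /home/gran/run_ff.sh
      mpirun -np 2 /home/gran/run_ff.sh /home/gran/wrk0 /home/gran/wrk1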
To use this workaround, you need to be able to log in locally on the computing nodes via ssh using passwordless authentication. If this does not solve your problem, or if you are running PC GAMESS/Firefly on a NIS-based cluster, try using the dynamically or fully dynamically linked MPICH binaries.
While running PC GAMESS/Firefly in parallel on a standalone SMP system, performance degradation is possible because of simultaneous I/O operations. In this case, the use of a high-quality RAID array or of separate physical disks can help. If the problem persists, for dual- (or quad-, eight-, etc.) CPU/core SMP/multicore systems the better solution is probably to switch to direct computation methods, which require much less disk I/O.
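For instance, direct SCF (which recomputes the two-electron integrals instead of storing them on disk) is usually requested with a fragment like the following in the input file; treat this as an illustrative line, not a complete input:

      $SCF DIRSCF=.TRUE. $END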
The default value of AOINTS is DUP, which is probably optimal for low-speed networks (10 and 100 Mbps Ethernet). On the other hand, for faster networks and for SMP systems the optimal value could be AOINTS=DIST. You can change the default by using the AOINTS keyword in the $SYSTEM group, so you can check which setting is faster for your systems.
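For example, to try the distributed mode on a fast network or an SMP box, one could add the following illustrative fragment to the input file:

      $SYSTEM AOINTS=DIST $END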
There are four keywords in the $SYSTEM group which can help in the case of MPI-related problems. Do not modify the default values unless you are absolutely sure that you need to do this. They are as follows:
MXBCST (integer) - the maximum size (in DP words) of the message used in the broadcast operation. The default is 32768. You can change it to see whether this helps.
MPISNC (logical) - activates the strategy in which the broadcast operation call periodically synchronizes all MPI processes, thus freeing the wp4 global memory pool. The default is .false. Setting it to .true. should resolve most buffer-overflow problems at the cost of somewhat reduced performance.
MXBNUM (integer) - the maximum number of broadcast operations which can be performed before the global synchronization call is done. Relevant if MPISNC=.true. The default is 100.
LENSNC (integer) - the maximum total length (in DP words) of all messages which can be broadcast before the global synchronization call is done. Relevant if MPISNC=.true. The default depends on the number of processes used (meaningful values vary from 20000 to, say, 262144 or even more).
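An illustrative $SYSTEM fragment combining these keywords (the values shown are only examples within the ranges quoted above, not recommendations):

      $SYSTEM MPISNC=.TRUE. MXBCST=32768 MXBNUM=100 LENSNC=65536 $END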
Last updated: March 18, 2009