Standalone Intel/AMD-based SMP/multicore system running under Windows NT 4.0/Windows 2000/Windows XP/Windows Server 2003/Windows Vista/Windows Server 2008/Windows 7. It is desirable (although not necessary) to have a high-quality hardware RAID controller installed as well; this will considerably improve the overall performance of disk-intensive jobs. Other things that can help are:
The required library mpich_smp.dll is included in the PC GAMESS/Firefly distribution, so you do not need to install any additional packages.
You have to carefully read these MUST READ documents:
Finally, you should replace the default mpibind.dll with the MPI binding DLL for NT-MPICH-SMP.
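For example, assuming the NT-MPICH-SMP binding DLL ships as a separate file inside the distribution (the source path below is only a placeholder; the actual location and file name may differ in your version), the replacement is a simple copy performed in the PC GAMESS/Firefly directory:

      rem NOTE: the source path below is hypothetical; locate the actual
      rem NT-MPICH-SMP binding DLL in your PC GAMESS/Firefly distribution.
      copy /Y BINDINGS\NT-MPICH-SMP\mpibind.dll mpibind.dll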
The simplest command line for a parallel PC GAMESS/Firefly run is as follows:
      PCGAMESS.EXE DIR0 DIR1 DIR2 ... DIRN -np number_of_cpu_cores_to_use
Here, DIR0, DIR1, DIR2, etc. are the working directories of the master PC GAMESS/Firefly process (i.e., of MPI RANK=0), of the second instance of PC GAMESS/Firefly (MPI RANK=1), of the third instance, and so on. Both absolute and relative paths are allowed. Relative paths are relative to the initial working directory from which you launched PC GAMESS/Firefly.
For example, you can use something like the following:
      pcgamess.exe d:\mydir\wrk0 "e:\my dir\wrk1" -np 2
to launch PC GAMESS/Firefly on two CPU cores.
Note that the directories above must exist prior to PC GAMESS/Firefly execution! The input file must be in the master working directory (i.e., in d:\mydir\wrk0 for the example above).
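For instance, a minimal batch-file sketch that creates the working directories and then launches the run (reusing the directory names from the example above) could look like this:

      mkdir d:\mydir\wrk0
      mkdir "e:\my dir\wrk1"
      rem ... place the input file into the master directory d:\mydir\wrk0 here ...
      pcgamess.exe d:\mydir\wrk0 "e:\my dir\wrk1" -np 2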
While running PC GAMESS/Firefly in parallel on a standalone SMP system, performance degradation is possible because of simultaneous I/O operations. In this case, the use of a high-quality RAID array or of separate physical disks for the different working directories can help. If the problem persists on SMP/multicore systems with two or more CPUs/cores, a better solution is probably to switch to the direct computation methods, which require much less disk I/O.
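For example, for SCF-level jobs the direct mode can typically be requested in the input file via the standard GAMESS-style DIRSCF keyword (a sketch only; check the keyword documentation of your version for other direct-mode options):

      $SCF DIRSCF=.TRUE. $END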
The default value for AOINTS is DUP. It is probably optimal for low-speed networks (10 and 100 Mbps Ethernet). On the other hand, for faster networks and for SMP systems the optimal value could be AOINTS=DIST. You can change the default by using the AOINTS keyword in the $SYSTEM group, so you can check which setting is faster for your systems.
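For example, to switch to the distributed handling of AO integrals, add the following to your input file (based on the keyword described above):

      $SYSTEM AOINTS=DIST $END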
There are four keywords in the $SYSTEM group which can help in the case of MPI-related problems. Do not modify the default values unless you are absolutely sure that you need to do so. They are as follows:
MXBCST (integer) - the maximum size (in DP words) of the message used in a broadcast operation. Default is 32768.
MPISNC (logical) - activates the strategy in which broadcast operations periodically synchronize all MPI processes. Default is .false.
MXBNUM (integer) - the maximum number of broadcast operations which can be performed before the global synchronization call is made. Relevant only if MPISNC=.true. Default is 100.
LENSNC (integer) - the maximum total length (in DP words) of all messages which can be broadcast before the global synchronization call is made. Relevant only if MPISNC=.true. The default depends on the number of processes used (meaningful values vary from 20000 to, say, 262144 or even more).
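For example, a $SYSTEM group that enables periodic synchronization could look like this (the particular values are illustrative only, not recommendations):

      $SYSTEM MPISNC=.TRUE. MXBNUM=50 LENSNC=65536 $END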
Last updated: March 18, 2009