Starting from PC GAMESS version 6.3, support for a new proprietary parallel-mode communication interface (called the P2P interface) is implemented as part of PC GAMESS. This interface is very flexible and was specifically designed to overcome the limitations of the MPI and DDI interfaces. The PC GAMESS/Firefly-specific parallel MP2 energy and energy gradient method=1 modules support the P2P communication model. It is expected that in the future more and more computational methods and algorithms in PC GAMESS/Firefly will support P2P.
To take advantage of this interface, you need:
The PC GAMESS/Firefly running in parallel mode over MPI, as usual.
The dynamic library implementing the P2P interface. It is called pcgp2p.dll (Win32) or pcgp2p.ex (Linux). The PC GAMESS/Firefly distribution contains the library suitable for your OS. On Windows, it should be placed into the PC GAMESS/Firefly home directory on each computing node; on Linux, it should be placed into the PC GAMESS/Firefly working directory on each node (note: not into the PC GAMESS/Firefly home directory, /usr/lib, etc.). On Linux, the file name should be all lowercase.
To activate the P2P interface, add $P2P P2P=.T. $END to the input file.
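The Linux placement step above can be sketched as follows. The directories used here are temporary stand-ins created for illustration only; substitute your actual PC GAMESS/Firefly home and working directories on each node:

```shell
#!/bin/sh
# Illustration of the Linux P2P library placement rule.
# /tmp/demo/home and /tmp/demo/work are hypothetical stand-ins for the
# PC GAMESS/Firefly home and working directories -- adapt to your setup.
mkdir -p /tmp/demo/home /tmp/demo/work
touch /tmp/demo/home/pcgp2p.ex                 # library as shipped in the distribution
# On Linux the library must go into the WORKING directory, all lowercase:
cp /tmp/demo/home/pcgp2p.ex /tmp/demo/work/pcgp2p.ex
ls /tmp/demo/work
```

On Windows, by contrast, pcgp2p.dll stays in the home directory on each node, so no copy step into the working directory is needed.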
Starting from PC GAMESS v. 6.4, DLB (dynamic load balancing) functionality was added to the P2P interface. To activate DLB, add the following line to your input:
$p2p p2p=.t. dlb=.t. $end
Many of the parallel-aware parts of PC GAMESS/Firefly transparently use DLB over the P2P interface when DLB is enabled, including the two-electron part of direct SCF and DFT, among others.
For parallel MP2 method=1 runs, you may find that the extended DLB model ($p2p p2p=.t. xdlb=.t. $end) gives slightly better performance than the standard DLB model.
Details on the MP2 energy code and p2p interface implementation in the PC GAMESS/Firefly can be found here.
Information on how the DLB affects performance can be found on this page.
Windows-specific: the file pcgp2psm.dll contains an implementation of the P2P interface specific to shared-memory SMP/multicore systems. If you run PC GAMESS/Firefly on a standalone SMP/multicore system, rename this file to pcgp2p.dll and replace the default P2P library in the PC GAMESS/Firefly home directory. This provides better performance than the default library (which uses TCP/IP rather than shared memory).
For better efficiency of the shared-memory implementation of P2P, it is recommended to add the following P2P setting to the input files:
$p2p mxbuf=2048 $end
Below is a sample PC GAMESS/Firefly input file that uses P2P for DLB-driven direct SCF calculations in parallel mode:
 $CONTRL SCFTYP=RHF RUNTYP=ENERGY UNITS=ANGS $END
 $SYSTEM TIMLIM=600 MEMORY=3000000 $END
! to activate P2P interface and DLB:
 $P2P P2P=.T. DLB=.T. $END
 $BASIS GBASIS=N31 NGAUSS=6 NDFUNC=1 $END
! to speed up Huckel guess:
 $GUESS GUESS=HUCKEL KDIAG=0 $END
 $SCF DIRSCF=.T. $END
 $DATA
6-31G*//RHF/3-21G* Silacyclobutane
CS

SILICON    14.0  -0.081722   1.055710
CARBON      6.0  -0.081722  -0.395331   1.217568
CARBON      6.0   0.319935  -1.329102
HYDROGEN    1.0  -1.222554   1.998369
HYDROGEN    1.0   1.168317   1.848237
HYDROGEN    1.0   0.604981  -0.419640   2.052727
HYDROGEN    1.0  -1.077445  -0.641554   1.572232
HYDROGEN    1.0   1.388834  -1.500162
HYDROGEN    1.0  -0.184517  -2.285408
 $END
The sample input below is a PC GAMESS/Firefly job that uses P2P for parallel MP2 energy calculations:
 $CONTRL SCFTYP=RHF RUNTYP=ENERGY UNITS=ANGS MPLEVL=2 $END
 $SYSTEM TIMLIM=600 MEMORY=3000000 $END
! to activate P2P interface and extended DLB:
 $P2P P2P=.T. XDLB=.T. $END
 $BASIS GBASIS=N31 NGAUSS=6 NDFUNC=1 $END
! to speed up Huckel guess:
 $GUESS GUESS=HUCKEL KDIAG=0 $END
! to select MP2 method # 1:
 $MP2 METHOD=1 $END
 $DATA
MP2/6-31G*//RHF/3-21G* Silacyclobutane
CS

SILICON    14.0  -0.081722   1.055710
CARBON      6.0  -0.081722  -0.395331   1.217568
CARBON      6.0   0.319935  -1.329102
HYDROGEN    1.0  -1.222554   1.998369
HYDROGEN    1.0   1.168317   1.848237
HYDROGEN    1.0   0.604981  -0.419640   2.052727
HYDROGEN    1.0  -1.077445  -0.641554   1.572232
HYDROGEN    1.0   1.388834  -1.500162
HYDROGEN    1.0  -0.184517  -2.285408
 $END
The amount of disk storage needed by the parallel method=1 code can be roughly estimated as 4*n*(n+1)*N*(N+1) bytes. This amount is distributed (almost evenly) across all nodes. Here, n is the number of active occupied orbitals (i.e., not counting frozen core occupied orbitals) for RMP2, and is the total number of active occupied alpha and beta orbitals for ROHF- and UHF-based MP2; N is the number of Cartesian AOs. Hence, larger jobs require more nodes to fit within per-node disk limits.
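As a worked example of the estimate above (the values of n, N, and the node count below are illustrative, not taken from any particular job):

```python
def mp2_disk_bytes(n, N):
    """Rough total disk estimate for the parallel MP2 method=1 code:
    4*n*(n+1)*N*(N+1) bytes, distributed almost evenly across all nodes.
    n = active occupied orbitals, N = Cartesian AOs."""
    return 4 * n * (n + 1) * N * (N + 1)

# Hypothetical RMP2 job: 20 active occupied orbitals, 200 Cartesian AOs.
total = mp2_disk_bytes(20, 200)
print(total)                    # 67536000 bytes, about 64 MiB in total
print(total / 4 / 2**20)        # per-node share on 4 nodes, in MiB
```

Because the storage is split across nodes, doubling the node count roughly halves the per-node disk requirement, which is why larger jobs need more nodes.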
Linux PC GAMESS/Firefly versions may hang on an out-of-disk event in the MP2 code; this is a known problem that will be fixed in future releases.
Last updated: March 18, 2009