Firefly and PC GAMESS/Firefly LARGE-SCALE PARALLEL MCSCF CODE DOCUMENTATION
Throughout this document, "large-scale" denotes any CI, MCSCF, MCQDPT2, or
XMCQDPT2 calculation on a system with a large number of basis functions
(e.g., 1500) and a relatively small active space (e.g., 12 electrons in 12 orbitals).
In this case, the overall cost of the calculation is dominated by the
integral transformation and effective Fock matrix construction steps.
To speed up these stages, PC GAMESS/Firefly includes special fast direct and
conventional integral transformation algorithms based on the fastints/gencon
code. They have excellent parallel scalability and very modest memory requirements.
They are available for FORS (CAS)-type CI (both GUGA & ALDET), SOSCF/FOCAS MCSCF,
MCQDPT2, and XMCQDPT2 calculations only.
Below is a summary of the most relevant options for high-performance large-scale
CI/MCSCF/MCQDPT2/XMCQDPT2 jobs:
0. $contrl fstint=.t. gencon=.t. $end Use the fastints/gencon code.
1. $system kdiag=0 nojac=100 $end Instructs PC GAMESS/Firefly to use fast
diagonalization routines if available and never use Jacobi diagonalization
for matrices of size 100x100 and above.
2. $p2p p2p=.t. dlb=.t. $end Instructs PC GAMESS/Firefly to use dynamic
load balancing over the p2p interface during parallel runs. This is the best
strategy, although static load balancing will work as well.
3. $trans mptran=2 dirtrf=.t. aoints=dist altpar=.t. mode=gsm $end
Instructs PC GAMESS/Firefly to select the fast alternative algorithm for integral
transformation, using either its direct variant (preferred) or its conventional
variant with 2-e integrals distributed over nodes, and selects the alternative
(more scalable) parallel strategy for MCSCF runs. gsm is a 3-digit decimal number
defining the details of the direct parallel transformation code to be used. g can
be either 0 or 1, and means either to use (1) or not to use (0) the gencon version
of the fastints code. s can be either 0 or 1, and means either to use
(1) or not to use (0) approximate Schwarz integral screening (note that even
if approximate screening is disabled, the exact Schwarz screening will
nevertheless be in effect by default). Finally, m can be 0, 1, or 2, and means
small (0), medium (1), or large (2) active space. Thus, mode=112 is the most
appropriate for most runs, but for very small active spaces, mode=110 or 111
will perform faster.
4. $ciinp castrf=.t. $end Selects the fast MCSCF-like integral transformation
for standalone CI runs.
$transt castrf=.t. $end Selects the fast MCSCF-like integral transformation
for standalone CI transition moment/spin-orbit runs.
5. $gugem pack2=.t. $end Selects additional packing of the GUGA CI matrix for
GUGA-style CI or MCSCF.
6. $mcscf fullnr=.f. soscf=.t. $end or
$mcscf fullnr=.f. soscf=.f. focas=.t. $end Selects the supported fast MCSCF
algorithms (note that fullnr does not presently support fastints/gencon).
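The mode=gsm encoding from item 3 can be unpacked digit by digit. The short
Python sketch below is only an illustration and is not part of PC GAMESS/Firefly:
the names TransMode, decode_mode, and encode_mode are hypothetical helpers, and
the only facts assumed are the digit meanings stated in item 3 (g and s are 0 or 1,
m is 0, 1, or 2).

```python
from typing import NamedTuple

# Labels for the m digit, in the order given in item 3 (0, 1, 2).
ACTIVE_SPACE_LABELS = ("small", "medium", "large")


class TransMode(NamedTuple):
    """Hypothetical container for the three digits of $trans mode=gsm."""
    gencon: bool          # g: use the gencon version of the fastints code
    approx_schwarz: bool  # s: use approximate Schwarz integral screening
    active_space: str     # m: "small", "medium", or "large"


def decode_mode(mode: int) -> TransMode:
    """Split a 3-digit decimal mode value into its g, s, and m digits."""
    g, s, m = mode // 100, (mode // 10) % 10, mode % 10
    if g not in (0, 1) or s not in (0, 1) or m not in (0, 1, 2):
        raise ValueError(f"mode={mode} is not a valid gsm value")
    return TransMode(bool(g), bool(s), ACTIVE_SPACE_LABELS[m])


def encode_mode(gencon: bool, approx_schwarz: bool, active_space: str) -> int:
    """Build the decimal mode value from the three settings."""
    if active_space not in ACTIVE_SPACE_LABELS:
        raise ValueError(f"unknown active space label: {active_space!r}")
    m = ACTIVE_SPACE_LABELS.index(active_space)
    return 100 * int(gencon) + 10 * int(approx_schwarz) + m


if __name__ == "__main__":
    # mode=112 is the value item 3 recommends for most runs.
    print(decode_mode(112))
    # For a very small active space, item 3 suggests mode=110 or 111.
    print(encode_mode(True, True, "small"))
```

Under these assumptions, decode_mode(112) reports gencon on, approximate Schwarz
screening on, and a large active space, while encode_mode(True, True, "small")
gives the value 110 suggested for very small active spaces.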
Last updated: February 7, 2010