Firefly/PC GAMESS DOCUMENTATION - LARGE SCALE CI, MCSCF, and QDPT2


   Firefly and PC GAMESS/Firefly LARGE-SCALE PARALLEL MCSCF CODE DOCUMENTATION

   Throughout this document, "large-scale" denotes any CI, MCSCF, MCQDPT2, 
or XMCQDPT2 calculation on a system with a large number of basis functions 
(e.g., 1500) and a relatively small active space (e.g., 12 electrons in 12 
orbitals). In this case, the overall cost of the calculation is dominated by 
the integral transformation and effective Fock matrix construction steps.

   To speed up these stages, PC GAMESS/Firefly includes special fast direct and 
conventional integral transformation algorithms based on the fastints/gencon 
code. They have excellent parallel scalability and very modest memory requirements. 
They are available only for FORS (CAS)-type CI (both GUGA & ALDET), SOSCF/FOCAS 
MCSCF, MCQDPT2, and XMCQDPT2 calculations.

Below is a summary of the most relevant options for high-performance large-scale
CI/MCSCF/MCQDPT2/XMCQDPT2 jobs:

   0. $contrl fstint=.t. gencon=.t. $end   Use fastints/gencon code

   1. $system kdiag=0 nojac=100 $end  Instructs PC GAMESS/Firefly to use fast
diagonalization routines if available and never use Jacobi diagonalization
for matrices of size 100x100 and above.

   2. $p2p p2p=.t. dlb=.t. $end    Instructs PC GAMESS/Firefly to use dynamic
load balancing over the p2p interface during parallel runs. This is the best
strategy, although static load balancing will work as well.

   3. $trans mptran=2 dirtrf=.t. aoints=dist altpar=.t. mode=gsm $end
Instructs PC GAMESS/Firefly to select the fast alternative algorithm for integral
transformation, using either its direct variant (preferred) or the conventional
one with 2-e integrals distributed over nodes, and selects the alternative (more
scalable) parallel strategy for MCSCF runs. gsm is a three-digit decimal number
defining the details of the direct parallel transformation code to be used. g can
be either 0 or 1 and selects whether to use (1) or not to use (0) the gencon
version of the fastints code. s can be either 0 or 1 and selects whether to use
(1) or not to use (0) approximate Schwarz integral screening (note that even
if approximate screening is disabled, exact Schwarz screening nevertheless
remains in effect by default). Finally, m can be 0, 1, or 2 and denotes a
small (0), medium (1), or large (2) active space. Thus, mode=112 is
appropriate for most runs, but for very small active spaces, mode=110 or 111
will perform faster.

   4. $ciinp  castrf=.t. $end   Selects fast MCSCF-like integral transformation
for standalone CI runs.

      $transt castrf=.t. $end   Selects fast MCSCF-like integral transformation
for standalone CI transition moment/spin-orbit runs.

   5. $gugem pack2=.t.   $end   Selects additional packing of the GUGA CI matrix
for GUGA-style CI or MCSCF.

   6. $mcscf fullnr=.f. soscf=.t. $end  or
      $mcscf fullnr=.f. soscf=.f. focas=.t. $end   Selects supported fast MCSCF
algorithms (note fullnr does not presently support fastints/gencon).
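
   As an illustration, the options listed above could be combined for a
large-scale SOSCF MCSCF job roughly as follows. This is only a sketch: every
keyword and value is copied from items 0-3 and 6, while the choice of mode=112
and of the SOSCF (rather than FOCAS) variant is an assumption made for the
example. The comment lines starting with "!" lie outside any input group and
are not part of the options themselves. The groups a real job also needs
(molecule, basis set, active space definition, and so on) are not shown.

```text
! Illustrative sketch only; values are taken from items 0-3 and 6 above.
! mode=112 and the SOSCF variant are example choices, not requirements.
 $contrl fstint=.t. gencon=.t. $end
 $system kdiag=0 nojac=100 $end
 $p2p    p2p=.t. dlb=.t. $end
 $trans  mptran=2 dirtrf=.t. aoints=dist altpar=.t. mode=112 $end
 $mcscf  fullnr=.f. soscf=.t. $end
```

   For a very small active space, mode=110 or mode=111 could be substituted in
$trans, as described in item 3. For a standalone CI run, the castrf=.t. option
of item 4 would be set in $ciinp (or in $transt for transition moment/spin-orbit
runs), and pack2=.t. of item 5 would be set in $gugem for GUGA-style runs.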
See also: