Alex Granovsky
gran@classic.chem.msu.su
Thanks so much for the very detailed and useful instructions!
I'd only add my two cents:
With Intel MPI over Infiniband, the following additional input
should speed things up:
 $system mxbcst=-1 $end
 $mpi mxgsum=1048576 $end
 $mpi mnpdot=1000000 $end
The second option sets the maximum message size for collective
MPI operations to 8 megabytes (1 MegaWord, i.e. 1048576 8-byte
words) and allocates a dedicated buffer for these messages. The
default size is much smaller; however, with fast interconnects
and a robust MPI implementation such as Intel MPI, larger sizes
are beneficial. You can experiment a bit with this option to
fine-tune performance.
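The relation between the quoted numbers is just a words-to-bytes conversion; a quick sketch of the arithmetic (the 8-byte word size is the usual double-precision word assumed here):

```python
# mxgsum is given in words; Firefly-style codes count in 8-byte
# (double-precision) words, so 1048576 words is exactly 8 MiB.
mxgsum_words = 1048576
bytes_per_word = 8
size_mib = mxgsum_words * bytes_per_word / 2**20
print(size_mib)  # 8.0 MiB, i.e. "8 megabytes (1 MegaWord)"
```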
The last option sets a lower bound on the size of "large" vectors,
in words. In Firefly, "large" means the vector is big enough that
BLAS level 1 operations on it (scalar products, etc.) are run in
parallel, rather than duplicated on every node.
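The idea behind such a threshold can be sketched as follows. This is an illustrative toy, not Firefly source: the function name, the node count, and the slicing scheme are all hypothetical, and the final combine step stands in for what a real code would do with MPI_Allreduce.

```python
MNPDOT = 1_000_000  # threshold in words, as set in the $mpi group above


def dot(x, y, nodes=4, mnpdot=MNPDOT):
    """Return a dot product plus the strategy a parallel code might pick."""
    n = len(x)
    if n < mnpdot:
        # Small vector: every node computes the full product itself,
        # which is cheaper than paying communication latency.
        strategy = "duplicated"
        total = sum(a * b for a, b in zip(x, y))
    else:
        # Large vector: each node handles one contiguous slice, and the
        # partial sums are then combined (MPI_Allreduce in a real code).
        strategy = "parallel"
        chunk = (n + nodes - 1) // chunk_safe(nodes)
        partials = [
            sum(a * b for a, b in zip(x[i:i + chunk], y[i:i + chunk]))
            for i in range(0, n, chunk)
        ]
        total = sum(partials)
    return total, strategy


def chunk_safe(nodes):
    # Guard against a degenerate node count in this toy example.
    return max(nodes, 1)
```

Raising mnpdot therefore pushes more of the short vectors onto the duplicated path, trading a little redundant arithmetic for fewer small, latency-bound collective operations.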
Regards,
Alex Granovsky