Firefly and PC GAMESS-related discussion club


 
 



Re^3: Firefly on many nodes using MPI

Alex Granovsky
gran@classic.chem.msu.su


Dear Ilia,

>I am using the (4,3) active space.
I see.

>I will try to provide you with a benchmarking curve next week, once I finish traveling.

I don't think this makes much sense with your current Firefly
setup, but it will be useful once the Firefly installation on your
cluster is configured for optimal performance.

>>The hangs you described looks like MPI-related problems.
>>What is the MPI implementation you are using? Is it OpenMPI?
>I am using OpenMPI version.
That was exactly my suspicion. OpenMPI is slow and buggy.
More precisely, its implementation of collective operations
is buggy and can cause Firefly to hang. In general, OpenMPI
is the worst available option for Firefly. I'd suggest using
mvapich or Intel MPI instead.


>>something else? Does it really run over Infiniband?
>>If it does not, this could explain a limited scalability.  
>I am using infiniband definitely.
Sorry, this needs to be carefully checked.
OpenMPI can use many different transports,
so it may not be obvious which particular
transport is in use. For instance, if the 32-bit OpenIB
libraries are not properly installed on your system, you may
be using Ethernet or, in the best case, IPoIB.

You can try to run Firefly using the following command line:

mpirun --mca btl ^tcp ...other options ...

This will disable the TCP component.

Alternatively, you can try

mpirun --mca btl self,sm,openib ...

This will enable only the loopback, shared memory, and OpenIB transports.

(you can find more information here)
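
Before trying the two mpirun commands above, it may help to check whether your OpenMPI build contains the openib component at all. The following is only a rough sketch: it assumes that ompi_info (the utility shipped with OpenMPI) is on your PATH, that it belongs to the same 32-bit OpenMPI installation Firefly is launched with, and that it prints one "MCA btl: name (...)" line per component. The helper names list_btl_components and has_btl are made up for this illustration.

```shell
#!/bin/sh
# Sketch: check which OpenMPI transports (BTL components) are built in.
# Assumption: ompi_info prints lines such as
#   "MCA btl: openib (MCA v2.0, API v2.0, Component v1.6.5)"

# Read ompi_info output on stdin, print one BTL component name per line.
list_btl_components() {
    sed -n 's/^ *MCA btl: \([A-Za-z0-9_]*\).*/\1/p'
}

# Read ompi_info output on stdin; succeed only if component $1 is listed.
has_btl() {
    list_btl_components | grep -qx "$1"
}

if command -v ompi_info >/dev/null 2>&1; then
    if ompi_info | has_btl openib; then
        echo "openib BTL is present in this OpenMPI build"
    else
        echo "openib BTL is missing: native InfiniBand cannot be used"
    fi
else
    echo "ompi_info not found on PATH"
fi
```

Note that a listed component only means it was built, not that it is selected at run time. Adding --mca btl_base_verbose 30 to the mpirun command line should make OpenMPI report which BTL components it actually opens, although the exact output differs between versions.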

>>Where the Firefly's temporary files are located?
>they are stored in the same folder as my output files.

Is this a fast distributed filesystem intended to
serve multiple concurrent I/O requests efficiently?
Can you provide some details?

>>What is the version and build number of Firefly you are using?
>Firefly version 8.0.0, build number 6695

You may need to upgrade to the latest RC binaries. You can find them here:

http://classic.chem.msu.su/gran/firefly/firefly8_rc.html


Kind regards,
Alex Granovsky


Fri Mar 8 '13 11:20pm