Firefly and PC GAMESS-related discussion club





Re: How to start running Firefly on computer cluster?

Alex Granovsky
gran@classic.chem.msu.su


Hi Olga,

perhaps the most serious problem here is the Cleo job manager itself;
it is a bit strange, at least to me. :-)
Did your cluster administration ever consider migrating to PBS?

>a) I create 3 work directories: on the master node and on each compute node, with the same name, e.g. mywork/home_ff.

That's a bad idea.

1. Your home directory is most likely shared across all nodes
(and most likely under the same path), so you do not need to
create extra copies of the FF and *.ex files, as Sanya has
already pointed out.
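
You can check this quickly (a small sketch; run it on any of the nodes):

 # if $HOME sits on a network file system (e.g. nfs), it is shared,
 # and a single copy of the binaries is visible from every node
 df -T $HOME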

The working directories should usually be on the local
file system. E.g., you can use /tmp as the scratch directory.
Some clusters may have additional dedicated scratch dirs
(e.g. /scratch or whatever); ask your system administrator.

In most cases, it is much more convenient to use the -t switch
to specify the root prefix for the working directories. This way,
you do not need to worry about creating them manually and then
passing the list to Firefly properly. See the documentation on
command line options for details.
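
For example (a minimal sketch; the paths and the /tmp/run_ff prefix
are just illustrations):

 # -t tells Firefly to create one working directory per MPI process
 # (/tmp/run_ff.0, /tmp/run_ff.1, ...) on each node's local disk
 mpirun -np 8 ~/ff/firefly -r -f -p -i ~/job.inp -o ~/job.out -t /tmp/run_ff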


>b) Then I put the FF binaries into each of them.

That's not needed actually.

> Beforehand, I copy the corresponding file from the BINDINGS folder into the installation directory and rename it to mpibind.dll.

This is not needed under Linux; the bindings are specific to the Windows version of Firefly.

>Q.1: Here I have my first question: on the cluster there are several MPI libraries, specifically openmpi v. 1.3 (not v. 1.2!), mvapich-1.1.0, and mvapich2-1.2p1. Which of them is best to use?

Use mvapich-1.1.0. You'll need the 32-bit shared libraries.
Perhaps mpi-selector will be helpful for switching between
the different MPIs.
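
For example (a sketch, assuming your cluster ships the OFED
mpi-selector tool; the exact name of the mvapich entry will
differ per installation):

 # list the MPI stacks registered on this cluster
 mpi-selector --list
 # select mvapich for your account (takes effect in new shells)
 mpi-selector --set mvapich_gcc-1.1.0 --user
 # verify the current selection
 mpi-selector --query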

>mpirun -np 4 /mywork/home_ff/pcgamess -r -f -p -i ./bench01.inp -o /my_work/outs/bench01.out -ex /mywork/home_ff -t /tmp/run_ff

If you want to run on 16 cores, you need to specify (with Cleo):

 mpirun -np 16...


E.g.:

 mpirun -np 16 /home/gran/ff/firefly -r -f -p -i /home/gran/inputs/bench01.inp -o /home/gran/outs/bench01.out -ex /home/gran/ff -t /tmp/run_ff
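
As a batch script for the queue, the same run might look like this
(a minimal sketch; the hard-coded 16 is an assumption - replace it
with whatever matches your actual allocation, as Cleo installations
differ in what they export to the job environment):

 #!/bin/sh
 # batch_file submitted to the queue; all paths are illustrative
 NP=16
 mpirun -np $NP /home/gran/ff/firefly -r -f -p \
   -i /home/gran/inputs/bench01.inp \
   -o /home/gran/outs/bench01.out \
   -ex /home/gran/ff -t /tmp/run_ff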

>Q.5: Into which directory does the -ex option copy the fastdiag.ex and pcgp2p.ex files? Into /tmp/run_ff?

Into /tmp/run_ff.0, /tmp/run_ff.1, etc...
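
For instance, while a job is running you can list them on a
compute node with:

 # per-process working directories created by Firefly
 ls -d /tmp/run_ff.*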

>Q.6: If I receive one of the nodes only partially for a run (just one processor rather than the whole node), what should I change in this case?

You do not need to do anything special - it is the responsibility
of the batch queuing system to handle this properly.

Regards,
Alex Granovsky


On Fri Mar 5 '10 12:43pm, Olga wrote
------------------------------------
>Hi!
>I'm a student at Petrozavodsk University, and I use FF for my work. Now I have the opportunity to use the computer cluster of the Karelian Branch of RAS (http://cluster.krc.karelia.ru/) for my tasks. The cluster has 10 nodes, each with 2 quad-core Intel Xeon 5430 CPUs; the OS is SuSE Linux Enterprise Server 10 with the Cleo job manager. The number of nodes free for use with Firefly changes from run to run. I hope to use about 2-5 nodes. At the end of each run I must delete the "traces of my stay" on the compute nodes.

>Please help me run my task for the first time. After studying the tutorials, I have made the following plan for starting. What do you think about it?
>     Assume I have 2 nodes for my task (4 processors, 16 cores).
>a) I create 3 work directories: on the master node and on each compute node, with the same name, e.g. mywork/home_ff.
>b) Then I put the FF binaries into each of them. Beforehand, I copy the corresponding file from the BINDINGS folder into the installation directory and rename it to mpibind.dll.
>Q.1: Here I have my first question: on the cluster there are several MPI libraries, specifically openmpi v. 1.3 (not v. 1.2!), mvapich-1.1.0, and mvapich2-1.2p1. Which of them is best to use?
>Q.2: Which dll from the BINDINGS folder is suitable for OpenMPI?
>copy the appropriate dll from BINDINGS folder into the PC GAMESS installation directory and rename it to mpibind.dll
>c) After that I create the command line for the FF run in a batch_file placed in the my_work/inputs directory, where the input file also resides:
>mpirun -np 4 /mywork/home_ff/pcgamess -r -f -p -i ./bench01.inp -o /my_work/outs/bench01.out -ex /mywork/home_ff -t /tmp/run_ff
>d) Finally I run the batch_file from the Cleo job manager…
>Q.3: What is incorrect here?
>Q.4: Is the -np 4 parameter set correctly? Will FF use all 16 cores for the calculation?
>Q.5: Into which directory does the -ex option copy the fastdiag.ex and pcgp2p.ex files? Into /tmp/run_ff?
>Q.6: If I receive one of the nodes only partially for a run (just one processor rather than the whole node), what should I change in this case?
>Thanks in advance!



