Alex Granovsky
gran@classic.chem.msu.su
What is the output of the "ulimit -a" command on the nodes you are using?
In standard mode, P2P uses two sockets per peer,
while in XDLB mode it uses four sockets per peer.
Thus, if you run Firefly in parallel on 256 nodes, each Firefly instance will need either 512 or 1024 open sockets (file descriptors).
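The arithmetic above can be written down as a small helper. This is only a sketch of the estimate, not Firefly code: the names SOCKETS_PER_PEER and sockets_needed are made up for illustration, and the count follows the rule of thumb above (sockets per peer times the number of instances) while ignoring descriptors used for ordinary files.

```python
# Sockets opened per peer in each P2P mode, as stated above.
SOCKETS_PER_PEER = {"standard": 2, "xdlb": 4}


def sockets_needed(n_instances: int, mode: str) -> int:
    """Estimate the sockets one instance needs to reach all its peers."""
    return SOCKETS_PER_PEER[mode] * n_instances


# Print the estimate for a 256-instance run in both modes.
for mode in ("standard", "xdlb"):
    print(mode, sockets_needed(256, mode))
```

For 256 instances this gives 512 sockets in standard mode and 1024 in XDLB mode.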
The typical default limit on most Linux installations is 1024 open file descriptors per process.
You may need to check the system-wide limit as well.
These limits can be raised as explained e.g. at:
http://www.cs.uwaterloo.ca/~brecht/servers/openfiles.html
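If Python is available on the nodes, the same limits can also be read programmatically. The sketch below assumes a Linux system: it uses the standard resource module for the per-process limit and reads /proc/sys/fs/file-max for the system-wide one. The function names are illustrative only.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) per-process limit on open file descriptors."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def system_wide_limit(path="/proc/sys/fs/file-max"):
    """Return the system-wide open-file limit, or None if it cannot be read."""
    try:
        with open(path) as f:
            return int(f.read().split()[0])
    except (OSError, ValueError, IndexError):
        return None


soft, hard = nofile_limits()
print("per-process soft limit:", soft)
print("per-process hard limit:", hard)
print("system-wide limit:", system_wide_limit())
```

The soft limit is the one that "ulimit -n" reports and that a running job actually hits; it can be raised up to the hard limit without root privileges.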
Internally, current builds of Firefly are limited to 2048 sockets,
but we can recompile it for a larger number if required.
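To see roughly how far a given limit goes, the estimate can be inverted. This is a back-of-the-envelope sketch, not Firefly code: max_instances and the reserved parameter are hypothetical, and the numbers are upper bounds, since every process also needs a few descriptors for standard streams and files.

```python
# Sockets opened per peer in each P2P mode (two in standard mode, four in XDLB).
SOCKETS_PER_PEER = {"standard": 2, "xdlb": 4}


def max_instances(limit: int, mode: str, reserved: int = 0) -> int:
    """Largest instance count whose sockets fit under a descriptor limit.

    reserved is a hypothetical allowance for descriptors used for
    other purposes (standard streams, input and scratch files).
    """
    return (limit - reserved) // SOCKETS_PER_PEER[mode]


# Compare the common 1024 OS default with the 2048 internal Firefly limit.
for limit in (1024, 2048):
    for mode in ("standard", "xdlb"):
        print(limit, mode, max_instances(limit, mode))
```

With a limit of 1024 this gives at most 512 instances in standard mode and 256 in XDLB mode; with the internal limit of 2048 the bounds are 1024 and 512.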
Hope this helps,
Alex
On Thu Jul 2 '09 6:59pm, Pasquale Morvillo wrote
------------------------------------------------
>Hi,
>the crash is after "..... DONE SETTING UP THE RUN ....."
>The test file for the MP2 method is the one found in the performance section (test3.inp)
>regards
>
>
>On Thu Jul 2 '09 5:45pm, Igor Polyakov wrote
>--------------------------------------------
>>Pasquale, hello,
>>Actually I just wanted to ask what kind of crash you experienced: a crash at startup or hangs during the calculation? Maybe you could supply us with the input/output files used...
>>Best regards, Igor
>>
>>
>>On Wed Jul 1 '09 9:40pm, Pasquale Morvillo wrote
>>------------------------------------------------
>>>Dear Alex,
>>>I ran some tests (MP2 method) with Firefly (OpenMPI) on our cluster; each node has four quad-core Xeon E7330 (Tigerton) processors.
>>>All the tests using up to 256 CPUs ran fine.
>>>With more than 256 CPUs, the application crashes.
>>>All the tests were done using:
>>>$p2p p2p=.t. dlb=.t. xdlb=.t. $end
>>>With xdlb set to false, the tests ran with more than 256 CPUs (I was able to use more than 400 CPUs).
>>>Any hints?
>>>regards