Roman Kroik
chemistnn@gmail.com
mpiexec -n 140 Firefly8.exe -r -f -p -stdext -daf 2 -prof -i test.inp -o test.out,
i.e., I start my calculations through this web interface.
Here is a list of the problems I encounter:
1. My calculations often crash because of errors on some nodes, for example:
job aborted:
[ranks] message
[0-40] terminated
[41] process exited without calling finalize
[42] terminated
[43] process exited without calling finalize
[44-103] terminated
[104-105] process exited without calling finalize
[106] terminated
[107] process exited without calling finalize
[108-139] terminated
---- error analysis -----
[41,43] on S-CW-NODE15
\\s-cw-head\metacluster_tasks\185\Firefly8.exe ended prematurely and may have crashed. exit code -1
[104-105,107] on S-CW-NODE41
\\s-cw-head\metacluster_tasks\185\Firefly8.exe ended prematurely and may have crashed. exit code -1
---- error analysis -----
2. I see a problem with calculation speed. For example, a calculation that takes 5 min on an Intel Core i7 CPU with HT enabled takes 20-30 min on the HPC 2008 cluster with 40-140 Xeon CPU cores. I think this is a problem with parallelization.
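To make it easier to see which nodes are involved in the crashes from item 1, the abort output can be summarized per node. This is only a minimal Python sketch, assuming the output keeps the "[ranks] on NODE" form shown above; the function names are hypothetical and are not part of Firefly or of the MPI launcher.

```python
import re
from collections import defaultdict

# Excerpt of the abort output from item 1 (raw string keeps the backslashes).
SAMPLE_LOG = r"""job aborted:
[ranks] message
[0-40] terminated
[41] process exited without calling finalize
---- error analysis -----
[41,43] on S-CW-NODE15
\\s-cw-head\metacluster_tasks\185\Firefly8.exe ended prematurely and may have crashed. exit code -1
[104-105,107] on S-CW-NODE41
\\s-cw-head\metacluster_tasks\185\Firefly8.exe ended prematurely and may have crashed. exit code -1
---- error analysis -----
"""


def expand_ranks(spec):
    """Expand a rank spec such as '104-105,107' into [104, 105, 107]."""
    ranks = []
    for part in spec.split(","):
        if "-" in part:
            low, high = part.split("-")
            ranks.extend(range(int(low), int(high) + 1))
        else:
            ranks.append(int(part))
    return ranks


def failed_ranks_by_node(log_text):
    """Map each node name to the ranks reported for it in the error analysis."""
    # Only lines of the form "[ranks] on NODE" match; "[ranks] terminated" does not.
    pattern = re.compile(r"^\[([\d,\-]+)\] on (\S+)\s*$", re.MULTILINE)
    nodes = defaultdict(list)
    for spec, node in pattern.findall(log_text):
        nodes[node].extend(expand_ranks(spec))
    return dict(nodes)


result = failed_ranks_by_node(SAMPLE_LOG)
for node, ranks in sorted(result.items()):
    print(node, ranks)
```

If the same node names keep appearing across several failed jobs, that would point to those particular nodes rather than to the input file.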
What do you think about these problems? Are they caused by the cluster's settings or by wrong Firefly settings?
I am attaching an example input file that I am trying to run on the cluster.
This message contains the 3 kb attachment [ test_2.inp ]
[ This message was edited on Mon Apr 8 '13 at 11:08pm by the author ]