pruthvish
pruthvi.19@gmail.com
I am sorry for not being able to reply earlier. I would like to thank Alex for his valuable advice on my problem, which was very helpful. My systems are indeed scaling better now.
Thank you,
regards,
Pruthvish R
On Sat May 12 '12 0:43am, Alex Granovsky wrote
----------------------------------------------
>Hi again,
>sorry, I did not realize that you are using different input files for
>the serial and parallel runs, so P2P is already enabled in the
>parallel run. Additional examination of your outputs has revealed some
>really weird things.
>E.g., at first geometry:
>
>Serial job
>
> ----------------- DENSITY CONVERGED -----------------
> TIME TO FORM FOCK OPERATORS= 603.2 SECONDS ( 46.4 SEC/ITER)
> OF THE ABOVE TIME, DFT PART= 146.9 SECONDS ( 11.3 SEC/ITER)
> FOCK TIME ON FIRST ITERATION= 61.3, LAST ITERATION= 30.6
> TIME TO SOLVE SCF EQUATIONS= 2.9 SECONDS ( 0.2 SEC/ITER)
> FINAL ENERGY IS -803.1125408004 AFTER 13 ITERATIONS
> DFT EXCHANGE + CORRELATION ENERGY IS -91.4084876724
> INTEGRATED TOTAL ELECTRON NUMBER IS 129.9999704942
>
>Parallel job
>
> ----------------- DENSITY CONVERGED -----------------
> TIME TO FORM FOCK OPERATORS= 199.5 SECONDS ( 15.3 SEC/ITER)
> OF THE ABOVE TIME, DFT PART= 28.1 SECONDS ( 2.2 SEC/ITER)
> FOCK TIME ON FIRST ITERATION= 17.8, LAST ITERATION= 12.9
> TIME TO SOLVE SCF EQUATIONS= 151.1 SECONDS ( 11.6 SEC/ITER)
> FINAL ENERGY IS -803.1125408004 AFTER 13 ITERATIONS
> DFT EXCHANGE + CORRELATION ENERGY IS -91.4084876724
> INTEGRATED TOTAL ELECTRON NUMBER IS 129.9999704942
>
>For instance, let's look at DFT times:
>Serial: DFT PART= 146.9 SECONDS ( 11.3 SEC/ITER)
>Parallel: DFT PART= 28.1 SECONDS ( 2.2 SEC/ITER)
>And these numbers are quite reasonable.
>At the same time:
>Serial: TIME TO SOLVE SCF EQUATIONS= 2.9 SECONDS (0.2 SEC/ITER)
>Parallel: TIME TO SOLVE SCF EQUATIONS= 151.1 SECONDS (11.6 SEC/ITER)
>and this is really weird. This step is basically the work performed
>by the fastdiag.ex extension. The slowdown at this stage is the real
>reason why you are getting such poor scalability.
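>
>To make the comparison concrete, the summary lines from the two outputs
>above can be turned into per-stage speedups. The following is only a
>minimal sketch: the helper names (parse_times, speedups) are
>hypothetical, and the regular expression assumes the summary lines keep
>exactly the "LABEL= N SECONDS" layout shown in this thread.

```python
import re

# Summary lines copied from the serial and parallel outputs quoted above.
SERIAL = """\
TIME TO FORM FOCK OPERATORS= 603.2 SECONDS ( 46.4 SEC/ITER)
OF THE ABOVE TIME, DFT PART= 146.9 SECONDS ( 11.3 SEC/ITER)
TIME TO SOLVE SCF EQUATIONS= 2.9 SECONDS ( 0.2 SEC/ITER)
"""
PARALLEL = """\
TIME TO FORM FOCK OPERATORS= 199.5 SECONDS ( 15.3 SEC/ITER)
OF THE ABOVE TIME, DFT PART= 28.1 SECONDS ( 2.2 SEC/ITER)
TIME TO SOLVE SCF EQUATIONS= 151.1 SECONDS ( 11.6 SEC/ITER)
"""

# Assumed layout: "<LABEL>= <total> SECONDS ...". Captures label and total.
LINE = re.compile(r"^\s*(.+?)=\s*([\d.]+)\s+SECONDS", re.MULTILINE)


def parse_times(text):
    """Map each timing label to its total time in seconds."""
    return {label.strip(): float(sec) for label, sec in LINE.findall(text)}


def speedups(serial, parallel):
    """Serial time divided by parallel time, for labels present in both."""
    return {k: serial[k] / parallel[k] for k in serial if k in parallel}


if __name__ == "__main__":
    result = speedups(parse_times(SERIAL), parse_times(PARALLEL))
    for label, factor in result.items():
        # A factor below 1 means the parallel run was slower for that stage.
        print(f"{label:30s} {factor:8.3f}x")
```

>With the numbers above, Fock formation speeds up roughly 3 times and the
>DFT part roughly 5 times, while the SCF solve step takes about 50 times
>longer in parallel than in serial.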
>These numbers suggest that fastdiag.ex is probably missing on at
>least one of the slave nodes, or that the wrong path to the
>extension files directory was specified with the -ex command line switch.
>I'd suggest double-checking this and running Firefly again with the -prof
>command line option added. This will profile Firefly in real time and
>provide useful information on the time spent in diagonalization and
>other parts of the code, including communications.
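>
>One way to double-check the extension files is to run a small script on
>every node against the directory passed to -ex. This is a hedged sketch,
>not part of Firefly: the function name missing_extensions is
>hypothetical, only fastdiag.ex is taken from this thread, and the exact
>file name (including its case) on your system is an assumption.

```python
import os
import sys

# Extension file named in this thread; extend the tuple for other *.ex files.
REQUIRED = ("fastdiag.ex",)


def missing_extensions(ext_dir, required=REQUIRED):
    """Return the required file names that are not regular files in ext_dir."""
    return [
        name
        for name in required
        if not os.path.isfile(os.path.join(ext_dir, name))
    ]


def main(argv):
    # First argument: the directory given to Firefly via -ex (default: cwd).
    ext_dir = argv[1] if len(argv) > 1 else "."
    missing = missing_extensions(ext_dir)
    if missing:
        print(f"{ext_dir}: missing {', '.join(missing)}")
    else:
        print(f"{ext_dir}: all required extension files found")


if __name__ == "__main__":
    main(sys.argv)
```

>Running it on each node with the same path used for -ex should show
>quickly whether any node lacks the file.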
>Kind regards,
>Alex Granovsky
>
>
>
>On Thu May 10 '12 1:29am, Alex Granovsky wrote
>----------------------------------------------
>>Hi,
>>add
>>
>> $p2p p2p=.t. dlb=.t. $end
>>
>>to your input file. What is the interconnect between nodes?
>>Kind regards,
>>Alex Granovsky
>>
>>
>>
>>On Wed May 9 '12 11:40pm, pruthvish wrote
>>-----------------------------------------
>>>Hello,
>>>Dear Sanya and Alex, thank you for the kind replies. I followed your suggestions and saw an improvement in the performance of the system, but I am still not satisfied with it. For a large molecule, the reduction in run time from a serial run to a parallel run on four nodes with two cores each, with your suggestions applied, was less than half. I am attaching the output file of the sample run. I would be grateful if anyone could advise me on how to further improve
>>>the performance of the existing system, or whether I should simply increase the number of nodes to improve the calculation speed.
>>>Thanking you in advance.
>>>With best regards,
>>>Pruthvish.
>>>On Wed May 9 '12 8:01pm, Alex Granovsky wrote
>>>---------------------------------------------
>>>>Hi,
>>>>Sanya is absolutely correct that the *.ex files (which were not used
>>>>in your sample, so that Firefly complains about the missing fastdiag and
>>>>tuned dgemm) are important for good scalability, as is the use
>>>>of dynamic load balancing over the P2P interface.
>>>>Another important point is that your test job is simply by far
>>>>too small to run in parallel on more than, say, two or four cores.
>>>>Try running a larger job.
>>>>Kind regards,
>>>>Alex Granovsky
>>>>
>>>>
>>>>On Wed May 9 '12 4:49pm, sanya wrote
>>>>------------------------------------
>>>>>Probably, the problem is in the following diagnostics:
>>>>>Processor-specific dynamic link DGEMM library code not loaded.
>>>>> Using built-in DGEMM code instead.
>>>>> Warning: running without fastdiag runtime extension!
>>>>>Probably, adding $SMP and $P2P groups may help