Firefly and PC GAMESS-related discussion club





Re^3: Geometry optimizations using the SAME input file produce slightly different results when running in parallel mode

Alex Granovsky
gran@classic.chem.msu.su


Dear Davide,

consider the following:

1. The algorithms used in parallel runs are somewhat different from
those used in serial runs.

2. When running in parallel with dynamic load balancing, each run is
unique because the exact execution sequence generally varies.

3. Even with static load balancing, two parallel runs can still
produce somewhat different results, because the MPI standard does not
impose any particular requirements on how collective operations must
be implemented. Quantities such as global sums can therefore, in
theory, be computed somewhat differently (i.e., using a different
order of summation) depending on the particular MPI implementation.
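To see why the summation order matters at all, recall that floating-point addition is not associative, so the grouping chosen by a reduction tree can change the low-order bits of a global sum. Here is a minimal, generic illustration in plain Python (this is not Firefly or MPI code, just the underlying arithmetic effect):

```python
# Floating-point addition is not associative: two groupings of the
# same three numbers give results that differ in the last bits.
a, b, c = 0.1, 0.2, 0.3

left = (a + b) + c   # one possible reduction order
right = a + (b + c)  # another possible reduction order

print(left)           # 0.6000000000000001
print(right)          # 0.6
print(left == right)  # False
```

An MPI reduction over thousands of integral contributions faces exactly this effect, only on a much larger scale.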

4. The SCF energy is fully variational, which means it is very stable
with respect to the accumulation of round-off errors. Quantities such
as the MP2 energy and any energy gradients are much more sensitive
because they are not variational.
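The stability of a variational quantity can be sketched with a toy model (this is not Firefly code; the one-parameter "energy" below is purely illustrative): near a minimum, an error eps in the optimized parameter shifts the energy only at second order, O(eps**2), while a gradient-like quantity shifts at first order, O(eps).

```python
def energy(c):
    # model "variational" energy, minimized at c = 1.0
    return (c - 1.0) ** 2

def gradient(c):
    # derivative dE/dc, a non-variational quantity
    return 2.0 * (c - 1.0)

eps = 1e-6                       # a small round-off-like error in c
c_exact, c_noisy = 1.0, 1.0 + eps

dE = abs(energy(c_noisy) - energy(c_exact))      # ~ eps**2, i.e. ~1e-12
dg = abs(gradient(c_noisy) - gradient(c_exact))  # ~ 2*eps,  i.e. ~2e-6

print(dE, dg)
```

The same eps that is invisible in the energy (12th decimal place) shows up six orders of magnitude larger in the gradient, which is why gradients and non-variational energies are where run-to-run differences surface first.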

5. Geometry optimization is the most sensitive part: on a complex
PES, even a minor difference in the computed gradients may result in
a seriously different step taken by the optimizer. Hence, there are
indeed cases where the same job converges to different minima on the
PES when run with a different number of processes.
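A toy sketch of this effect (not Firefly's optimizer; a plain steepest-descent walk on an invented one-dimensional surface): the double well V(x) = (x**2 - 1)**2 has minima at x = -1 and x = +1, and a perturbation of just 1e-12 in the evaluated gradient decides which minimum a walk starting at x = 0 falls into.

```python
def grad(x):
    # analytic gradient of the double well V(x) = (x**2 - 1)**2
    return 4.0 * x * (x * x - 1.0)

def optimize(x, noise, lr=0.05, steps=500):
    # plain steepest descent; 'noise' mimics a run-to-run
    # round-off difference in the evaluated gradient
    for _ in range(steps):
        x -= lr * (grad(x) + noise)
    return x

run_a = optimize(0.0, +1e-12)  # drifts negative, settles near x = -1
run_b = optimize(0.0, -1e-12)  # drifts positive, settles near x = +1

print(run_a, run_b)
```

Both runs see "the same" surface to 12 decimal places, yet they end in different minima. On a real, high-dimensional PES the effect is rarer but entirely possible.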

6. Finally, in some rare and exotic situations there is still some
probability of getting slightly different answers even when running
the same job repeatedly in serial, although this is usually an
unlikely event. One reason is that Firefly uses run-time
calibration/selection of some code (e.g., dynamic selection of the
two-electron integrals code) to achieve the best possible
performance, and this calibration can sometimes give different
results depending on the other processes executed by the OS as well
as on the overall OS load.

Hope this helps.

Regards,
Alex



On Mon Jan 11 '10 3:51pm, Davide Vanossi wrote
----------------------------------------------
>Dear Pedro, actually I’m not worried at all :-).
>I am just curious, considering that running on a single core leads to identical output files (for different runs started with the same input file). The round-off errors seem to manifest themselves only in parallel runs when runtyp=optimize or runtyp=gradient. For other types of tasks I didn't find any differences, even when running in parallel mode.
>Anyway thank you again for your kind reply.
>Regards.

>   Davide Vanossi

>On Mon Jan 11 '10 2:21pm, Pedro Silva wrote
>-------------------------------------------
>>An energy difference in the 9th decimal place is not relevant, and neither is a difference in the 4th decimal place of the dipole moment. I guess those differences are simply caused by round-off errors, and I would certainly not worry about them.

>>Pedro S.
>>

[ This message was edited on Tue Jan 12 '10 at 11:52am by the author ]

