Thomas Patko
tpatko@gmail.com
On Wed Dec 24 '08 5:55pm, Pasquale Morvillo wrote
-------------------------------------------------
>Hi,
>I used the mpich version (statically linked) with 8, 16 and 32 cpus (8 cpus per node). Going from 8 to 16 and 32 cpu, often the execution time increases!
How is your scaling from 4 cpu to 8 cpu when running on a single node?
Have you tried to run just 4 cpu per node to see if this gives better scaling?
This issue has been reported for some types of dual quad-core systems with other software (such as AMBER) when memory and/or FSB bandwidth is too constrained for the number of processing cores per node.
What type of Xeons or Opterons are you running? I assume that your interconnects are Gigabit.
I assume that you are not hitting any memory swapping issues when using 8 cpu per node (this would certainly be an explanation for the performance degradation).
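If it helps to compare the runs, here is a minimal sketch of how the speedup and parallel efficiency could be tabulated from wall-clock times. The function name `scaling_report` and the timings in the example are made up for illustration; they are not measurements from your cluster, so substitute your own numbers.

```python
def scaling_report(timings, base_cores=None):
    """Return (cores, speedup, efficiency) rows relative to a baseline run.

    timings: dict mapping core count -> wall-clock time in seconds.
    base_cores: core count used as the baseline (defaults to the smallest).
    """
    cores = sorted(timings)
    base = base_cores if base_cores is not None else cores[0]
    t_base = timings[base]
    rows = []
    for n in cores:
        speedup = t_base / timings[n]      # how much faster than the baseline
        efficiency = speedup * base / n    # 1.0 means perfect scaling
        rows.append((n, speedup, efficiency))
    return rows


# Hypothetical timings in seconds, for illustration only.
example = {4: 1000.0, 8: 600.0, 16: 650.0, 32: 700.0}
for n, speedup, efficiency in scaling_report(example):
    print(f"{n:3d} cpu  speedup {speedup:5.2f}  efficiency {efficiency:5.2f}")
```

An efficiency that already drops sharply between 4 and 8 cpu on one node would point at memory/FSB contention within the node, while a drop that only appears beyond 8 cpu would point more towards the interconnect.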
Cheers,
Thomas