PC GAMESS SMP scaling results for a model MP4-SDQ job
 
PC GAMESS v. 5.2 SMP scaling results for a model MP4-SDQ job with 290 AOs in total, 111 occupied orbitals, 52 core orbitals, 179 virtual orbitals, and no symmetry. A four-CPU system (Pentium II Xeon 450 MHz with 1 MB L2 cache, 4 GB RAM, 30 GB RAID volume) running Windows NT Server 4.0 SP3 was used for testing. We are very grateful to Bruce S. Greer, who kindly provided us with access to this system.

 
| Stage | Comments | 1 CPU: time, s | 1 CPU: MFlops | 1 CPU: scaling | 2 CPUs: time, s | 2 CPUs: MFlops | 2 CPUs: scaling | 3 CPUs: time, s | 3 CPUs: MFlops | 3 CPUs: scaling | 4 CPUs: time, s | 4 CPUs: MFlops | 4 CPUs: scaling |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ACCD terms | a | 30.61 | | 1.00 | 32.21 | | 1.00 | 38.31 | | 1.00 | 40.76 | | 1.00 |
| AO integrals | a | 629.59 | | 1.00 | 629.53 | | 1.00 | 629.36 | | 1.00 | 630.13 | | 1.00 |
| Integral transformation | b, d | 5943.78 | | 1.00 | 4484.83 | | 1.33 | 4022.10 | | 1.48 | 3862.58 | | 1.54 |
| Q and G terms | c, d, e | 20683.25 | 261 | 1.00 | 11509.79 | 469 | 1.80 | 8465.59 | 638 | 2.44 | 7307.93 | 739 | 2.83 |
| Non ACCD terms | c, d, e | 29331.36 | 264 | 1.00 | 15988.64 | 484 | 1.83 | 11385.81 | 680 | 2.58 | 9510.07 | 814 | 3.08 |
| External exchange | c, e | 79700.25 | 82 | 1.00 | 40977.98 | 159 | 1.94 | 28191.78 | 232 | 2.82 | 21852.32 | 300 | 3.65 |
| Total for N^6 steps | | 129714.86 | 152 | 1.00 | 68476.41 | 288 | 1.89 | 48043.18 | 410 | 2.70 | 38670.32 | 509 | 3.35 |
| Total for all stages | f | 136318.84 | > 145 | 1.00 | 73622.98 | > 268 | 1.85 | 52732.95 | > 375 | 2.59 | 43203.79 | > 458 | 3.16 |


Comments:
  a. This step is currently performed using only one CPU.
  b. This stage is partially limited by sequential I/O performed on a single CPU and hence does not parallelize well.
  c. These are the most demanding steps, with sixth-order (N^6) scaling.
  d. MKL-level parallelism is used.
  e. Non-MKL-level parallelism is used.
  f. MFlops estimates were calculated using the flop count for the N^6 steps only.
  g. The test input file is available upon request.
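
The derived columns of the table can be reproduced from the timings. The minimal Python sketch below assumes that "scaling" is the single-CPU time divided by the N-CPU time, and that the "> ..." MFlops figure in the last row is the flop count of the sixth-order steps divided by the total run time. Both assumptions are consistent with the tabulated numbers but are not stated explicitly in the source. The parallel efficiency (scaling divided by the number of CPUs) is an extra illustrative quantity that does not appear in the table, and all function names are hypothetical.

```python
# Times in seconds, copied from the "Total for all stages" row of the table.
total_time_s = {1: 136318.84, 2: 73622.98, 3: 52732.95, 4: 43203.79}

# Single-CPU values from the row totalling the sixth-order steps.
n6_time_1cpu_s = 129714.86
n6_mflops_1cpu = 152  # already rounded in the source


def scaling(times, n_cpus):
    """Assumed definition of the 'scaling' column: T(1 CPU) / T(n CPUs)."""
    return times[1] / times[n_cpus]


def efficiency(times, n_cpus):
    """Illustrative extra: scaling divided by the number of CPUs."""
    return scaling(times, n_cpus) / n_cpus


def total_mflops_lower_bound_1cpu():
    """Lower bound on the overall single-CPU rate.

    Only the flops of the sixth-order steps are counted, but they are
    divided by the time of all stages, so the true rate is higher.
    """
    n6_mflop_count = n6_mflops_1cpu * n6_time_1cpu_s  # millions of flops
    return n6_mflop_count / total_time_s[1]


for n in sorted(total_time_s):
    print(f"{n} CPU(s): scaling {scaling(total_time_s, n):.2f}, "
          f"efficiency {efficiency(total_time_s, n):.0%}")

print(f"1 CPU, all stages: > {total_mflops_lower_bound_1cpu():.0f} MFlops")
```

Because the 152 MFlops input is itself rounded, the same estimate applied to the multi-CPU columns can differ from the tabulated lower bounds by a few MFlops.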

Copyright © 1999 by Alex A. Granovsky