Davide Vanossi
vanossi.davide@unimore.it
Test1 HTT on and mklnp=1
Test2 HTT on and mklnp=4
Test3 HTT off and mklnp=1
Test4 HTT off and mklnp=4
A reliable test input for drawing conclusions is the one that corresponds to Test 4 in the performance section. I attach this file to the present message.
Best Regards
Davide Vanossi
On Wed Apr 21 '10 10:27am, Stefan Boresch wrote
-----------------------------------------------
>I am using firefly 7.1.G, "Serial/parallel Linux binaries linked with MPICH, optimized for Pentium 4, Pentium D, Xeon, Intel Core 2 (Conroe/Merom/Woodcrest/Clovertown etc..., Penryn/Harpertown etc...), Intel Core i7 (Nehalem etc..) processors, as well as for AMD Phenom (tri- and four-core)/AMD Barcelona (four-core Opterons) processors."
>The OS is Ubuntu 9.10,
>Linux loop 2.6.31-20-generic #58-Ubuntu SMP Fri Mar 12 04:38:19 UTC 2010 x86_64 GNU/Linux
>Any standard jobs work fine. I have now tried to "activate" multiple
>cores, but when I set mklnp=4, the performance slows down to a crawl.
>(Geometry optimization of water, 6-31G, takes minutes instead of a
>fraction of a second on a single core; indeed, in top I see 4 "fireflies", each with 20-30% CPU utilization ...)
>I also tried the standard (MPI) parallel mode (-np 2 on the command line); but with the small system it's not clear whether I run on more than one core ...
>Obviously, I am doing something wrong, but I am also wondering whether
>the documentation is up to date / fully accurate for 7.1.G? E.g., one readme advises setting
>$smp call64=.t. $end
>It seems to me, however, that even in the absence of this statement
>the 64 bit is being used (I add relevant output from a run without any special options set).
>The real applications we have in mind have appr. 80-100 electrons, to
>be handled with 6-31G(d) or (slightly) better, and we'll work our way through plain SCF, B3LYP up to MP2. Thus, getting the most out of our quadcores would be nice.
>(For what it's worth, the machines in our cluster, where the real work will be done, are not core i7, but Intel(R) Core(TM)2 Quad CPU Q9550 @ 2.83GHz)
>Thanks in advance,
>Stefan Boresch
>Plain input file (which runs as expected):
>
 $CONTRL SCFTYP=RHF RUNTYP=OPTIMIZE COORD=UNIQUE MAXIT=100 $END
 $BASIS GBASIS=N31 NGAUSS=6 $END
 $DATA
WATER, cart. coord.
C1
OXYGEN 8.0 0.0000000000 0.0000000000 0.0000000000
HYDROGEN 1.0 1.4324122987 0.0000000000 1.0299006633
HYDROGEN 1.0 -1.4324122987 0.0000000000 1.0299006633
 $END
>Relevant output:
>
******************************************************
*Firefly (PC GAMESS) version 7.1.G, build number 5618*
* Compiled on Thursday, 26-11-2009, 20:43:46 *
*Code development and Intel/AMD specific optimization*
* Copyright (c) 1994, 2009 by Alex A. Granovsky, *
* Firefly Project, Moscow, Russia. *
* Some parts of this program include code due to *
* work of Jim Kress, Peter Burger, and Robert Ponec. *
******************************************************
* Firefly Project homepage: *
* http://classic.chem.msu.su/gran/firefly/index.html *
* e-mail: *
* gran@classic.chem.msu.su *
* This program may not be redistributed without *
* the specific, written permission of its developers.*
******************************************************
******************************************************
* PARTIALLY BASED ON GAMESS (US) VERSION 6 JUN 1999, *
* GAMESS (US) VERSIONS 6 SEP 2001 AND 12 DEC 2003 *
* FROM IOWA STATE UNIVERSITY *
* M.W.SCHMIDT, K.K.BALDRIDGE, J.A.BOATZ, S.T.ELBERT, *
* M.S.GORDON, J.H.JENSEN, S.KOSEKI, N.MATSUNAGA, *
* K.A.NGUYEN, S.J.SU, T.L.WINDUS, *
* TOGETHER WITH M.DUPUIS, J.A.MONTGOMERY *
* J.COMPUT.CHEM. 14, 1347-1363(1993) *
******************************************************
Core i7 / Linux Firefly version running under Linux.
Running on Intel CPU: Brand ID 0, Family 6, Model 26, Stepping 5
CPU Brand String : Intel(R) Core(TM) i7 CPU 920 @ 2.67GHz
CPU Features : CMOV, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, HTT, MWAIT, EM64T
Data cache size : L1 32 KB, L2 256 KB, L3 8192 KB
max # of cores/package : 8
max # of threads/package : 16
max cache sharing level : 16
actual # of cores/package : 4
actual # of threads/package : 8
actual # of threads/core : 2
Operating System successfully passed SSE support test.
PARALLEL VERSION (MPICH) RUNNING IN SERIAL MODE USING SINGLE PROCESS
EXECUTION OF FIREFLY BEGUN 12:56:15 LT 20-APR-2010
[snip]
Warning: HTT is enabled, bitmask of physically unique cores is 0x000000F0
SMT aware parts of program will use 2 threads.
Creating thread pool to serve up to 128 threads.
Activating Call64 option.
Using 64-bit DGEMM by default.
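A side note on the HTT warning in the output above: the bitmask 0x000000F0 can be decoded by hand to see which logical CPUs are meant. The short Python sketch below does this. It assumes that each set bit stands for one logical CPU index, as in a conventional affinity mask; whether Firefly numbers the bits exactly this way is my assumption, and the helper name cpus_in_mask is made up for illustration.

```python
def cpus_in_mask(mask):
    """Return the positions of the bits set in an affinity-style bitmask.

    Assumption: bit i set means logical CPU i is selected (the usual
    affinity-mask convention, not confirmed for Firefly).
    """
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


mask = 0x000000F0  # value copied from the Firefly warning
cores = cpus_in_mask(mask)
print(cores)       # [4, 5, 6, 7]
print(len(cores))  # 4
```

Under that assumption the mask selects four logical CPUs (4 to 7), which agrees with the "actual # of cores/package : 4" line in the same output.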
This message contains the 129 kb attachment [ Test4.inp ]