I'm sorry for the delay on my side as well. I have a few comments
concerning your observations.
First, you are absolutely correct that the use of separate HDDs
is important only for disk-intensive jobs.
Second, if you have disabled httfix, it is better to run Firefly
in parallel using all logical cores.
Finally, it is better to leave MKLNP, NP, and HTTNP at their default
values. Setting these variables explicitly is important mainly
for MP4 and MCQDPT2/XMCQDPT2 computations. Otherwise, setting them
may degrade performance rather than improve it.
In particular, if you leave them at their defaults, you will most likely
get a speedup close to 4x on your test system.
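To compare the two figures: a wall-clock time of less than a third of the
single-node time corresponds to a speedup above 3x on four processes, while
4x would be ideal scaling. Below is a minimal sketch of that arithmetic in Python.
The function names are my own, and the times are normalized placeholders,
not measured data.

```python
def speedup(t_single, t_parallel):
    """Ratio of single-process wall-clock time to parallel wall-clock time."""
    if t_single <= 0 or t_parallel <= 0:
        raise ValueError("wall-clock times must be positive")
    return t_single / t_parallel


def efficiency(t_single, t_parallel, n_procs):
    """Speedup divided by the number of processes (1.0 means ideal scaling)."""
    if n_procs < 1:
        raise ValueError("n_procs must be at least 1")
    return speedup(t_single, t_parallel) / n_procs


if __name__ == "__main__":
    # Normalized times: single-node run = 1.0 (placeholder, not a measurement).
    t1 = 1.0
    n = 4
    # "One third of the single-node time" is the bound reported in the quote.
    reported_bound = speedup(t1, t1 / 3.0)
    # "Close to 4x" would mean the parallel run takes about a quarter of the time.
    ideal = speedup(t1, t1 / 4.0)
    print(f"reported bound: speedup {reported_bound:.2f}, "
          f"efficiency {efficiency(t1, t1 / 3.0, n):.2f}")
    print(f"ideal on {n} processes: speedup {ideal:.2f}, "
          f"efficiency {efficiency(t1, t1 / 4.0, n):.2f}")
```

Substituting the actual wall-clock times from the output files for t1 and the
parallel time gives the real speedup for a given run.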
On Thu Jun 27 '13 5:48pm, Fumihito Mohri wrote
>Very sorry for the delay
>In order to test MKLNP, HTTNP and httfix for the case of dirscf=.t.,
>I used Firefly71G and a PC (Win7) with the following CPU (this PC is different from the one
> in the thread titled "8 cores +HTT").
>CPU Brand String : Intel(R) Core(TM) i7-3820 CPU @ 3.60GHz
> CPU Features : CMOV, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2,
>HTT, MWAIT, EM64T
>actual # of cores/package : 4
>actual # of threads/package : 8
>actual # of threads/core : 2
>From this list, the number of physical cores is 4,
>so I set MKLNP=4 and HTTNP=2. Moreover, I chose httfix=.f. Next, I made four
> nodes on the C drive.
> c:\node1 c:\node2 c:\node3 c:\node4
>Under the above conditions, I confirmed that the wall-clock time was reduced to
> less than a third (i.e. 1/3) of that of the single-node calculation, although
>the four nodes were created on a single HD. This was unexpected
>for me, because I had understood that parallel calculations with Firefly
>require a separate HD for each node. Perhaps a separate HD is necessary
> for the case of dirscf=.f. (i.e. disk I/O mode).