MKLNP controls the number of additional dedicated threads used by
DGEMM and other BLAS level 3 code. This code is not necessarily
MKL-based, so the name MKLNP is retained mainly for historical reasons.
For example, MKLNP=2 creates a single dedicated thread ("MKL thread")
to be used by DGEMM. Each call to DGEMM will use two threads
(Firefly's main thread and the dedicated "MKL thread").
Similarly, MKLNP=4 creates three dedicated "MKL threads".
Each call to DGEMM will then use four threads (Firefly's main
thread and three dedicated "MKL threads").
Outside of DGEMM or other BLAS level 3 code, "MKL threads" are
suspended and do not interact with Firefly's execution.
As Firefly's code does not use huge matrices, it is not recommended
to use large values for MKLNP. Normally, MKLNP=2 or 4 is enough.
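To make the thread arithmetic above concrete, here is a minimal Python sketch. It is not Firefly code; the function name dgemm_threads is hypothetical, and it assumes the "MKLNP minus one" pattern from the two examples (MKLNP=2 and MKLNP=4) holds for other positive values as well.

```python
def dgemm_threads(mklnp: int) -> tuple[int, int]:
    """Return (dedicated "MKL threads", total threads per DGEMM call).

    Assumption: the pattern described for MKLNP=2 and MKLNP=4
    (one main thread plus MKLNP-1 dedicated threads) generalizes.
    """
    if mklnp < 1:
        raise ValueError("mklnp must be a positive integer")
    dedicated = mklnp - 1  # threads created in addition to the main thread
    total = mklnp          # Firefly's main thread + dedicated threads
    return dedicated, total


for value in (2, 4):
    dedicated, total = dgemm_threads(value)
    print(f"MKLNP={value}: {dedicated} dedicated, {total} per DGEMM call")
```

For MKLNP=2 this gives one dedicated thread and two threads per DGEMM call; for MKLNP=4, three dedicated threads and four per call, matching the description above.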
NP controls the number of additional working threads to be used by
threaded sections of Firefly's code. The most important of these
sections are the triples part of the MP4 code and the summation of the
PT series in the (X)MCQDPT2 code. Note that there is no OMP-based code
in Firefly, as Firefly uses the OS threading API directly.
Unlike with MKLNP, it is usually beneficial to use large values for NP,
up to the total number of physical CPU cores available. (If
there are several logical cores per physical core, check
the description of the httnp variable in the $SMP group. The QDPT code
benefits from the use of additional logical cores.)
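As a rough illustration of the guidance above, the following Python sketch picks starting values from a physical core count. It is only a heuristic restating this post, not anything Firefly does itself: the function name suggest_threads is hypothetical, the core count has to be supplied by the user, and the choice of MKLNP=1 on a single-core machine is my own assumption. httnp is deliberately left out; see its description in $SMP.

```python
def suggest_threads(physical_cores: int) -> dict[str, int]:
    """Suggest starting NP and MKLNP values for a given physical core count.

    Heuristic only: NP goes up to the number of physical cores,
    while MKLNP stays small (2 or 4), as recommended in the post.
    """
    if physical_cores < 1:
        raise ValueError("physical_cores must be a positive integer")

    np_value = physical_cores  # large NP is usually beneficial

    if physical_cores >= 4:
        mklnp = 4
    elif physical_cores >= 2:
        mklnp = 2
    else:
        mklnp = 1  # assumption: no extra DGEMM threads on a single core

    return {"NP": np_value, "MKLNP": mklnp}


print(suggest_threads(8))
```

These are starting points to benchmark against, not optimal settings for any particular job.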
On Tue Mar 1 '16 3:39pm, GrEv wrote
>Could someone, who knows, give some more details about the
>MKLNP and NP options, possibly using some example.
>Does MKLNP control the number of MKL threads and NP define the
>number of OMP threads? What is the best MKLNP:NP ratio in
>the case of XMCQDPT2 calculations?