Firefly and PC GAMESS-related discussion club


 
 



MP4(SDTQ) with CUDA

Ivan Fedyanin
octy@xrlab.ineos.ac.ru


Dear all,

I'd like to test Firefly/CUDA performance in an MP4(SDTQ) calculation on a Linux Core i7 workstation with 6 cores (+6 via HT?), 24 GiB RAM and three GTX 480 cards. I've already done a CPU-only calculation with memory=450000000 and 6 MPI processes; it took approximately 2 hours with the latest 8.x beta version of Firefly.
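For reference, here is a rough estimate of how much RAM that CPU-only run asks for in total. This is only a sketch under two assumptions that are not confirmed above: that memory= is counted in 8-byte words, and that each of the 6 MPI processes allocates its own full amount. The function name is just illustrative.

```python
# Rough total-memory estimate for the CPU-only MPI run.
# ASSUMPTION: "memory" is given in 8-byte words and every MPI process
# allocates that amount independently.
BYTES_PER_WORD = 8

def total_memory_gib(words_per_process: int, n_processes: int) -> float:
    """Return the combined allocation of all processes in GiB."""
    total_bytes = words_per_process * BYTES_PER_WORD * n_processes
    return total_bytes / 2**30

total = total_memory_gib(450_000_000, 6)
print(f"{total:.1f} GiB requested out of 24 GiB installed")
```

If those assumptions hold, the six processes together request about 20 GiB, which is already most of the 24 GiB in the machine.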

Now I'm in doubt about what I should use for mklnp, np and/or other keys, even undocumented ones, to get better performance and to use the GPU cards in shared mode(?). As far as I understand, some parts of the input should look like

$cuda cumask=7 $end
$smp httnp=1(2?) cuda=.t. $end
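My reading of cumask=7 is that it is a bitmask in which bit i enables CUDA device i, so 7 (binary 111) would select all three cards. That interpretation is an assumption on my part, not something taken from the documentation. A small sketch of the decoding, with a hypothetical helper name:

```python
# Decode a device bitmask into a list of device indices.
# ASSUMPTION: cumask is a plain bitmask where bit i selects device i.
def devices_from_mask(mask: int) -> list[int]:
    """Return the indices of the bits that are set in mask."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]

print(devices_from_mask(7))  # [0, 1, 2]
```

Under the same assumption, cumask=5 would select only devices 0 and 2.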

Also, I believe that parallel execution is crucial here, because more memory can be allocated in total and the integral transformations can be done in fewer passes. Is this true? If I use mklnp=9 np=6 and run in serial mode, CPU usage is indeed 600%, but I still had no energy output after waiting for 4 hours.

The system has NCORE= 10 NOCC= 38 NAOS=  366 if it matters.
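To put those numbers in perspective, the sizes of the correlated orbital spaces can be worked out as below. This is a sketch that assumes NOCC counts doubly occupied orbitals including the frozen core, and that no basis functions are dropped, so the number of virtuals is simply NAOS minus NOCC. The amplitude count is for a full, unsymmetrized doubles-like array and is only meant as an order-of-magnitude figure.

```python
# Orbital-space sizes from the values quoted above.
# ASSUMPTIONS: NOCC includes the NCORE frozen orbitals; the number of
# molecular orbitals equals NAOS (no linear dependencies removed).
NCORE, NOCC, NAOS = 10, 38, 366

n_active_occ = NOCC - NCORE   # correlated occupied orbitals
n_virt = NAOS - NOCC          # virtual orbitals

# Elements in a full (occ, occ, virt, virt) array, stored as 8-byte words.
n_t2 = n_active_occ**2 * n_virt**2
t2_gib = n_t2 * 8 / 2**30

print(f"active occupied = {n_active_occ}, virtual = {n_virt}")
print(f"doubles-like array: {n_t2} words, about {t2_gib:.2f} GiB")
```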


Mon Apr 22 '13 4:31pm