Firefly and PC GAMESS-related discussion club


 
 



Re^3: Memory allocation problem of XMCQDPT2 in Firefly8

Dawid
dawid.grabarek@pwr.edu.pl


Dear Firefly Users,

I would like to come back to this thread. I use Firefly 8.2.0 on a
Scientific Linux system. I encounter an issue similar to the one Panwang
described; however, neither preallocation nor using heap memory
helped.

I contacted my computing centre administrator, and he replied that
our batch queueing system automatically preallocates memory to check
whether the required memory is actually available, in order to
avoid problems like the ones described. He says that, for Firefly
and my input, this preallocation worked correctly, so this may be a
Firefly-related issue. Is there a way to make Firefly
print more details about this memory allocation error?

Best wishes,
Dawid Grabarek

On Thu Oct 24 '13 10:08am, Panwang Zhou wrote
---------------------------------------------
>Dear Alex,

>Thanks for your reply.

>I have run the test jobs with the command line option "-prealloc:485" and it works, thanks.
>The number of AOs in my systems is 359. It is good news that the beta of Firefly v. 8.0.1 has somewhat reduced memory demands for the XMCQDPT2 code.

>On Wed Oct 23 '13 1:34am, Alex Granovsky wrote
>----------------------------------------------
>>Dear Panwang,

>>This is not a bug.

>>Actually, 480 MW is rather close to the limit Firefly can allocate
>>when running under Linux. Note that the memory allocated by Firefly
>>must form a single contiguous address range in the virtual
>>address space. It is not always possible to allocate such a huge
>>piece of contiguous memory, as there is some randomness in the way
>>Linux loads shared libraries and their data segments.
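
To put the 480 MW figure in perspective, here is a rough back-of-the-envelope sketch. It assumes that one word is 8 bytes, that 1 MW means 10^6 words, and that the process has a 32-bit (4 GiB) virtual address space. None of these is stated in this thread, so treat them as assumptions.

```python
# Assumptions (not stated in the thread): 8-byte words, 1 MW = 10**6 words,
# and a 32-bit process with a 4 GiB virtual address space.
BYTES_PER_WORD = 8
WORDS_PER_MW = 10**6
ADDRESS_SPACE_BYTES = 2**32


def mwords_to_bytes(mwords: int) -> int:
    """Convert a size in MW to bytes under the assumptions above."""
    return mwords * WORDS_PER_MW * BYTES_PER_WORD


def fraction_of_address_space(mwords: int) -> float:
    """Share of the assumed 4 GiB address space taken by one block."""
    return mwords_to_bytes(mwords) / ADDRESS_SPACE_BYTES


if __name__ == "__main__":
    for mw in (479, 480, 485):
        print(f"{mw} MW = {mwords_to_bytes(mw) / 2**30:.2f} GiB, "
              f"{fraction_of_address_space(mw):.1%} of 4 GiB")
```

Under these assumptions a single 480 MW block takes roughly 89% of the address space, which leaves little room for the executable, shared libraries and stacks to sit anywhere but at the edges.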

>>In addition, Firefly normally allocates memory after MPI
>>initialization, so any memory fragmentation resulting from MPI init
>>can have a negative impact on the largest amount of memory available to
>>Firefly.
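
The effect of fragmentation on the largest contiguous block can be illustrated with a small sketch that reads a Linux memory map in the `/proc/<pid>/maps` text format and reports the largest unmapped gap. This is only an illustration and not something Firefly itself does. The function name `largest_free_gap` and the 4 GiB default `limit` are assumptions made for the example, and the sketch ignores any regions the kernel reserves.

```python
import re

# Each line of /proc/<pid>/maps starts with "start-end" in hexadecimal.
_RANGE = re.compile(r"^([0-9a-fA-F]+)-([0-9a-fA-F]+)\s")


def largest_free_gap(maps_text: str, limit: int = 2**32) -> int:
    """Return the size in bytes of the largest unmapped gap below `limit`."""
    ranges = []
    for line in maps_text.splitlines():
        m = _RANGE.match(line)
        if m:
            start, end = int(m.group(1), 16), int(m.group(2), 16)
            if start < limit:
                ranges.append((start, min(end, limit)))
    ranges.sort()
    best, cursor = 0, 0
    for start, end in ranges:
        if start > cursor:
            # Unmapped hole between the previous mapping and this one.
            best = max(best, start - cursor)
        cursor = max(cursor, end)
    # Also consider the hole between the last mapping and the limit.
    return max(best, limit - cursor)
```

For a running process the input could come from `open("/proc/self/maps").read()`. A single library mapped near the middle of the address space is enough to roughly halve the largest gap, which is why the load order matters so much for one huge allocation.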

>>I have no idea why you do not see this effect with Firefly RC 40;
>>one possible explanation could be that the smaller size
>>of the older Firefly's executable images increases the probability
>>of allocating exactly 480 MW.

>>With Firefly v. 8.0.0, the following command line option may help:

>>./firefly8 -prealloc:485   other options

>>This will try to pre-allocate 485 MW in the virtual address space
>>at the very beginning of the job initialization. There is no guarantee
>>that the pre-allocation will be successful, though.

>>Another way is to use a bit less memory, say 479 MWords or so.

>>Finally, what is the typical number of AOs in the systems
>>you are modeling? If it is large, I can provide you with the current
>>beta of Firefly v. 8.0.1, which has somewhat reduced memory
>>demands for the XMCQDPT2 code.

>>Kind regards,
>>Alex Granovsky
>>
>>
>>
>>On Mon Oct 21 '13 9:16am, Panwang Zhou wrote
>>--------------------------------------------
>>>Dear Alex,

>>>It seems that there is a bug in memory allocation for XMCQDPT2 calculations in Firefly version 8.0.0 (Linux/MPICH2, dynamically linked version).

>>>I have run a series of jobs with XMCQDPT2 using Firefly 8.0.0 with the following input:
>>> $CONTRL SCFTYP=MCSCF RUNTYP=ENERGY EXETYP=RUN MAXIT=50 ICHARG=-1
>>>    MULT=1 FSTINT=.T. GENCON=.T. INTTYP=HONDO NOSYM=1 COORD=ZMT
>>>    ICUT=11 ITOL=30 WIDE=.T. MPLEVL=2 $END
>>> $SYSTEM MWORDS=480 TIMLIM=60000.0 KDIAG=0 NOJAC=100 $END
>>> $SYSTEM MKLNP=1 NP=12 $END
>>> $SMP SMPPAR=.T. HTTNP=1 $END
>>> $SCF DIRSCF=.T. FDIFF=.F. NCONV=8 $END
>>> $P2P P2P=.T. DLB=.T. $END
>>> $TRANS MPTRAN=2 DIRTRF=.T. AOINTS=DIST ALTPAR=.T. MODE=112 $END
>>> $MCSCF CISTEP=ALDET FULLNR=.F. SOSCF=.T. MAXIT=100 $END
>>> $MCSCF IFORB=.T. $END
>>> $DET NCORE=49 NACT=14 NELS=16 NSTATE=6 WSTATE(1)=1,1 DISTCI=12 $END
>>> $XMCQDPT NSTATE=2 EDSHFT=0.02 THRGEN=1D-12 MXBASE=90 $END
>>> $XMCQDPT HALLOC=.T. $END
>>> $XMCQDPT IFORB(1)=-1,1,1 WSTATE(1)=1,1,-0 AVECOE(1)=1,1,-0 $END
>>> $BASIS GBASIS=N31 NGAUSS=6 NDFUNC=1 NPFUNC=1 DIFFSP=.TRUE. $END
>>> $GUESS GUESS=MOREAD NORB=359 $END
>>> $MCQFIT $END

>>>However, the jobs terminated randomly with the following errors:

>>> FATAL ERROR:      4 PROCESS(ES) FAILED TO ALLOCATE MEMORY
>>> PROCESS     1 FAILED TO ALLOCATE MEMORY, ERROR CODE:       14
>>> PROCESS     5 FAILED TO ALLOCATE MEMORY, ERROR CODE:       14
>>> PROCESS     7 FAILED TO ALLOCATE MEMORY, ERROR CODE:       14
>>> PROCESS     9 FAILED TO ALLOCATE MEMORY, ERROR CODE:       14
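
When comparing several failed runs, it can help to extract the failing process numbers and error codes from the output automatically. The following is a minimal sketch that relies only on the message layout shown above; the function name `failed_allocations` is made up for this example.

```python
import re
from typing import Dict

# Matches the per-process lines, e.g.
# "PROCESS     1 FAILED TO ALLOCATE MEMORY, ERROR CODE:       14".
# The "FATAL ERROR: ... PROCESS(ES) ..." summary line does not match.
_FAILED = re.compile(
    r"PROCESS\s+(\d+)\s+FAILED TO ALLOCATE MEMORY, ERROR CODE:\s+(\d+)"
)


def failed_allocations(output: str) -> Dict[int, int]:
    """Map each failing process number to its reported error code."""
    return {int(rank): int(code) for rank, code in _FAILED.findall(output)}
```

Applied to the four per-process lines quoted above, this would return the processes 1, 5, 7 and 9, each with error code 14.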

>>>The number of failed processes is random (4, 3, 2 or 1), and the job can terminate normally after I resubmit it several times.
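
Since resubmitting sometimes succeeds, a crude workaround is to retry automatically until the allocation error no longer appears. The sketch below assumes a hypothetical callable `run_job` that launches the job by whatever means the site uses and returns the job output as text. It does not address the underlying cause.

```python
from typing import Callable

# Text taken from the error messages quoted in this thread.
MARKER = "FAILED TO ALLOCATE MEMORY"


def run_until_allocated(run_job: Callable[[], str], max_attempts: int = 5) -> int:
    """Call run_job until its output lacks the allocation error.

    run_job is a hypothetical user-supplied function that runs the job
    and returns its output. Returns the number of attempts used and
    raises RuntimeError if every attempt fails.
    """
    for attempt in range(1, max_attempts + 1):
        if MARKER not in run_job():
            return attempt
    raise RuntimeError(f"allocation failed in all {max_attempts} attempts")
```

Keeping `max_attempts` small avoids flooding the queue when the requested memory simply cannot be allocated on a given node.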

>>>However, when I run these jobs with Firefly 8 Beta 40, no errors occur and all the jobs terminate normally.
>>>


Mon Aug 21 '17 5:30pm