Re^5: Error in calculation on two processors

Alex Granovsky


It seems you are using the plain MPICH-linked Firefly binaries,
whereas you need the MPICH-MX-linked ones.

Alex Granovsky

On Tue Nov 9 '10 0:10am, Eldar Mamin wrote
>I do have permission, because I can run on 1 processor.
>cleanipcs was helpful this time.
>So the job runs, but only on 1 processor instead of the 2 that I
>request on the command line. Why?

>I attached the inp, out, dat, and irc files, a screenshot of the commands,
>and a screenshot of my working directory.
>On Sun Nov 7 '10 11:50pm, Alex Granovsky wrote

>>This is a very rough Linux equivalent of the following Win32 error:

>>//  The process cannot access the file because
>>//  it is being used by another process.
>>#define ERROR_SHARING_VIOLATION          32L

>>Check write permissions of /home, /home/14082010.1407,
>>and /home/14082010.1407/storefiles. Most likely,
>>you do not have permissions to create /home/14082010.1407 etc...
>>If you do, check for zombie or dead Firefly processes on the node
>>that reports this error. The file can still be locked by one of them.
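
The two checks suggested above can be sketched as a pair of small shell helpers. This is only a rough illustration: the function names `check_writable` and `list_firefly_procs` are hypothetical, and matching processes on the command name "firefly" is an assumption based on the binary path used in this thread.

```shell
#!/bin/sh
# Report whether each directory given as an argument exists and is writable
# by the current user. Returns non-zero if any directory fails the check.
check_writable() {
    rc=0
    for dir in "$@"; do
        if [ ! -d "$dir" ]; then
            echo "missing: $dir"
            rc=1
        elif [ -w "$dir" ] && [ -x "$dir" ]; then
            echo "writable: $dir"
        else
            echo "not writable: $dir"
            rc=1
        fi
    done
    return $rc
}

# List Firefly processes with their state; a STAT value starting with "Z"
# marks a zombie. Matching on the command name "firefly" is an assumption.
list_firefly_procs() {
    ps -eo pid,stat,user,comm | awk 'NR == 1 || tolower($4) ~ /firefly/'
}

# Example usage on the node that reports the error (paths from this thread):
#   check_writable /home /home/14082010.1407 /home/14082010.1407/storefiles
#   list_firefly_procs
```

Both helpers only inspect state. Any stale process they reveal still has to be killed by hand before rerunning the job.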

>>Alex Granovsky

>>On Sun Nov 7 '10 9:06pm, Eldar Mamin wrote
>>>Thanks, it helped.

>>>But then the error

>>> "FSF: fatal error no. 0x00000020 in sub SEQOPN on unit  10"

>>>arose. And as far as I can see, it runs on 1 processor.

>>>The out file is attached.
>>>On Sat Nov 6 '10 10:22am, Alex Granovsky wrote

>>>>cleanipcs is your friend :)
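
The `semget failed` error in the original post usually indicates that System V semaphore sets left behind by earlier, abnormally terminated runs are still allocated on the node. The cleanipcs script removes such leftovers. As a rough sketch of the same idea for semaphores only, the standard `ipcs` and `ipcrm` tools can be used. The helper names `own_semids` and `remove_own_semaphores` are hypothetical, and the parsing assumes the Linux `ipcs -s` column layout (key, semid, owner, perms, nsems).

```shell
# Print the ids of semaphore sets owned by the given user, reading
# `ipcs -s` output on stdin. Assumes the Linux column layout:
#   key  semid  owner  perms  nsems
own_semids() {
    awk -v user="$1" '$3 == user && $2 ~ /^[0-9]+$/ { print $2 }'
}

# Remove every semaphore set owned by the current user.
# Only run this when none of your MPI jobs are active on the node.
remove_own_semaphores() {
    ipcs -s | own_semids "$(id -un)" | while read -r semid; do
        ipcrm -s "$semid"
    done
}

# Example usage:
#   ipcs -s                    # inspect leftover semaphore sets
#   remove_own_semaphores      # hypothetical helper defined above
```

Splitting the parsing from the removal makes it possible to review the list of ids before anything is deleted.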

>>>>Alex Granovsky

>>>>On Thu Nov 4 '10 7:50pm, Eldar Mamin wrote
>>>>>Dear FireFly users,

>>>>>I perform calculations on a cluster.
>>>>>Processor: Intel(R) Xeon(R) CPU X5355 @ 2.66GHz

>>>>>My Firefly version is 7.1.G, Linux MPICH-MX, dynamically linked NPTL-based version

>>>>>MPICH is mpich-mx-gcc-1.2.7

>>>>>I reserve resources with the command qsub -I -l walltime=00:30:00,nodes=1:ppn=2

>>>>>And then

>>>>> mpirun -np 2 -machinefile $PBS_NODEFILE /home/14082010.1407/firefly -i  /home/14082010.1407/De_1411sad_Zmat.inp -o /home/14082010.1407/De_1411sad_Zmat.OUT -t /home/14082010.1407/storefiles -ex /home/14082010.1407/ex/ -r -p -stdext
>>>>>p0_10215:  p4_error: semget failed for setnum: 0
>>>>>p0_10214:  p4_error: semget failed for setnum: 0

>>>>>What does it mean?

