PC GAMESS/Firefly-related discussion club





Re^5: "ifconfig -a" output on nodes running firefly with P2P on

Alex Granovsky
gran@classic.chem.msu.su


Hi,

Thanks for sending the output.

Note that it cannot be exactly the same on both nodes: at the very least, the IP and hardware addresses must differ.
It would be interesting to look at the output from the second node as well.
Anyway, your network configuration is indeed nontrivial.
What is important is that you have IPoIB (IP over InfiniBand) support on your nodes.
You can direct p2p to use the ib0 adapter by setting:

 $p2p net=0a000000 mask=ff000000 $end

By default, p2p uses eth0, which does not appear to be in use on your systems (in your output it lacks the RUNNING flag and its RX/TX packet counters are zero).
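
To see why these values pick ib0, here is a minimal Python sketch. It assumes that net and mask are simply the IPv4 network address and netmask written as 8-digit hexadecimal words, so that 0a000000 is 10.0.0.0 and ff000000 is 255.0.0.0; this reading is inferred from the ib0 address in your output and is not taken from the Firefly documentation. The helper names (hex_to_ip, matches) are hypothetical and exist only for this illustration.

```python
import ipaddress


def hex_to_ip(hex_word):
    """Convert an 8-digit hex string such as '0a000000' to an IPv4Address."""
    return ipaddress.IPv4Address(int(hex_word, 16))


def matches(addr, net_hex, mask_hex):
    """Return True if (addr AND mask) equals net, i.e. addr lies in that subnet."""
    mask = int(mask_hex, 16)
    return (int(ipaddress.IPv4Address(addr)) & mask) == int(net_hex, 16)


# IPv4 addresses copied from the "ifconfig -a" output quoted in this thread.
interfaces = {
    "eth0": "172.19.1.28",
    "eth1": "172.18.1.28",
    "eth2": "172.16.1.28",
    "ib0": "10.10.5.28",
}

# Assumed interpretation of: $p2p net=0a000000 mask=ff000000 $end
selected = [name for name, addr in interfaces.items()
            if matches(addr, "0a000000", "ff000000")]

print(hex_to_ip("0a000000"), hex_to_ip("ff000000"), selected)
```

Under that assumption, only ib0 (10.10.5.28) falls inside 10.0.0.0/255.0.0.0, while the three Ethernet adapters sit in 172.16-19.x.x networks and are not matched.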


Regards,
Alex

On Fri Jan 16 '09 4:03pm, Pasquale Morvillo wrote
-------------------------------------------------
>Hi,
>I have run firefly using 16 processors on 2 nodes (8 cpus per node).
>this is the "ifconfig -a" output (it's the same on both nodes):

>eth0      Link encap:Ethernet  HWaddr 00:1A:64:36:07:E6  
>          inet addr:172.19.1.28  Bcast:172.19.255.255  Mask:255.255.0.0
>          UP BROADCAST MULTICAST  MTU:1500  Metric:1
>          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
>          collisions:0 txqueuelen:1000
>          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
>          Interrupt:225 Memory:ea000000-ea012100

>eth1      Link encap:Ethernet  HWaddr 00:1A:64:36:07:E8  
>          inet addr:172.18.1.28  Bcast:172.18.255.255  Mask:255.255.0.0
>          inet6 addr: fe80::21a:64ff:fe36:7e8/64 Scope:Link
>          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>          RX packets:294457228 errors:0 dropped:0 overruns:0 frame:0
>          TX packets:272017025 errors:0 dropped:0 overruns:0 carrier:0
>          collisions:0 txqueuelen:1000
>          RX bytes:88274582462 (82.2 GiB)  TX bytes:68662040912 (63.9 GiB)
>          Interrupt:193 Memory:ec000000-ec012100

>eth2      Link encap:Ethernet  HWaddr 00:10:18:25:60:E8  
>          inet addr:172.16.1.28  Bcast:172.16.255.255  Mask:255.255.0.0
>          inet6 addr: fe80::210:18ff:fe25:60e8/64 Scope:Link
>          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>          RX packets:16137403 errors:0 dropped:0 overruns:0 frame:0
>          TX packets:211846 errors:0 dropped:0 overruns:0 carrier:0
>          collisions:0 txqueuelen:1000
>          RX bytes:1149771856 (1.0 GiB)  TX bytes:21408286 (20.4 MiB)
>          Interrupt:233 Memory:f0000000-f0012100

>ib0       Link encap:InfiniBand  HWaddr 80:00:04:04:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00  
>          inet addr:10.10.5.28  Bcast:10.255.255.255  Mask:255.0.0.0
>          inet6 addr: fe80::205:ad00:c:1605/64 Scope:Link
>          UP BROADCAST RUNNING MULTICAST  MTU:65520  Metric:1
>          RX packets:4085321 errors:0 dropped:0 overruns:0 frame:0
>          TX packets:3529427 errors:0 dropped:230 overruns:0 carrier:0
>          collisions:0 txqueuelen:256
>          RX bytes:127251462842 (118.5 GiB)  TX bytes:83322085457 (77.5 GiB)

>lo        Link encap:Local Loopback  
>          inet addr:127.0.0.1  Mask:255.0.0.0
>          inet6 addr: ::1/128 Scope:Host
>          UP LOOPBACK RUNNING  MTU:16436  Metric:1
>          RX packets:38931408 errors:0 dropped:0 overruns:0 frame:0
>          TX packets:38931408 errors:0 dropped:0 overruns:0 carrier:0
>          collisions:0 txqueuelen:0
>          RX bytes:2330651332 (2.1 GiB)  TX bytes:2330651332 (2.1 GiB)

>sit0      Link encap:IPv6-in-IPv4  
>          NOARP  MTU:1480  Metric:1
>          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
>          collisions:0 txqueuelen:0
>          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
>
>
>-------------------------------------------------------------------------------------------

>regards
>
>
>On Fri Jan 16 '09 2:31pm, Alex Granovsky wrote
>----------------------------------------------
>>Hi,

>>most often, p2p init failures are caused either by
>>network mis-configuration problems on some nodes,
>>or by significant differences in network configuration
>>across nodes. These problems can be easily solved using
>>advanced p2p options. However, it is necessary to look
>>at your network setup. I'd ask you to attach "ifconfig -a"
>>output on the nodes you are trying to run Firefly with
>>p2p turned on, to this thread.  

>>Regards,
>>Alex Granovsky
>>
>>
>>On Thu Jan 15 '09 5:10pm, Pasquale Morvillo wrote
>>-------------------------------------------------
>>>Hi,

>>>>1. Check if there are some other processes consuming CPU resources.

>>>No other processes running.

>>>>2. Use of dynamic load balancing over P2P interface is a MUST!

>>>When I use the P2P interface ($P2P P2P=.T. DLB=.T. $END) I got the following error:

>>> Loading P2P interface library... loaded successfully (version 1.A).
>>> Initializing P2P interface... init failed!

>>>and the execution is stopped.
>>>
>>>
>>>regards


Fri Jan 16 '09 4:25pm