[Lustre-discuss] How do you make an MGS/OSS listen on 2 NICs?

Herb Wartens wartens2 at llnl.gov
Thu Jan 17 13:59:46 PST 2008


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

Isaac,
My mistake.  I thought this issue was similar to an lctl problem that I have been seeing for
quite a while now, but as you say, that is not the case here, since the node has only a single NID.  I
oversimplified the problem.  The case I was referring to is what I believe is a bug in the
lctl ping code.

Here is an example of what I was referring to:

Node1:
ilc6 is a lustre server that has two separate ethernet devices, eth2 and eth3

# ilc6 /root > cat /etc/modprobe.conf
options lnet networks="tcp0(eth2,eth3)" \
        routes="elan0 172.16.3.[4-6]@tcp0"

# ilc6 /root > lctl list_nids
172.16.101.6@tcp

Node2:
adev4 is a lustre router that has two separate ethernet devices and an elan device

# adev4 /root > cat /etc/modprobe.conf
options lnet networks="tcp0(eth0,eth1),elan0" \
             forwarding="enabled"

# adev4 /root > lctl list_nids
172.16.3.4@tcp
4@elan

Node3:
adev3 is a lustre client with only an elan device

# adev3 /root > lctl list_nids
3@elan
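
To illustrate why ilc6 reports a single tcp NID even though two interfaces are listed, here is a minimal Python sketch of the rule Isaac describes in the quoted reply further down: one NID per tcp network, built from the IP of the first interface listed. The function names and the interface-to-IP map are hypothetical (172.16.102.6 for eth3 is an assumption, it does not appear in the output above), and this is only a model of the observed behaviour, not how lnet is implemented.

```python
import re

def canonical_net(net):
    """Drop a network number of 0, matching the list_nids output (tcp0 -> tcp)."""
    match = re.fullmatch(r"([a-z]+)(\d*)", net)
    if match is None:
        return net
    name, num = match.group(1), match.group(2)
    return name if num in ("", "0") else f"{name}{int(num)}"

def derive_tcp_nids(networks, iface_ips):
    """Return one NID per tcp network, using the first listed interface's IP."""
    nids = []
    # Entries look like 'tcp0(eth2,eth3)' or a bare name such as 'elan0'.
    for net, ifaces in re.findall(r"(\w+)(?:\(([^)]*)\))?", networks):
        if not net.startswith("tcp") or not ifaces:
            continue  # only tcp networks with explicit interfaces are modelled
        first_iface = ifaces.split(",")[0].strip()
        nids.append(f"{iface_ips[first_iface]}@{canonical_net(net)}")
    return nids

def is_local_nid(target, local_nids):
    """True if the target NID is one of the node's own NIDs."""
    addr, _, net = target.partition("@")
    return f"{addr}@{canonical_net(net)}" in local_nids

# Hypothetical address map for ilc6; the eth3 address is assumed.
ilc6_ips = {"eth2": "172.16.101.6", "eth3": "172.16.102.6"}
ilc6_nids = derive_tcp_nids("tcp0(eth2,eth3)", ilc6_ips)
print(ilc6_nids)
print(is_local_nid("172.16.101.6@tcp0", ilc6_nids))
print(is_local_nid("172.16.102.6@tcp0", ilc6_nids))
```

Under this model the second interface never becomes a NID of its own, which is consistent with a ping addressed to the second interface's IP being rejected.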


Now the actual problem here is that
1) ilc6 can only successfully lctl ping the router (adev4) at its tcp NID, even though it knows
   how to get to the elan0 network.
2) adev3 can only successfully lctl ping the router (adev4) at its elan NID, even though it knows
   how to get to the tcp0 network.

FROM Node1:
# ilc6 /root > lctl ping 172.16.3.4@tcp0
12345-0@lo
12345-172.16.3.4@tcp
12345-4@elan

# ilc6 /root > lctl ping 3@elan
12345-0@lo
12345-3@elan

ERROR:
# ilc6 /root > lctl ping 4@elan
failed to ping 4@elan: Input/output error

FROM Node3:
# adev3 /root > lctl ping 4@elan
12345-0@lo
12345-172.16.3.4@tcp
12345-4@elan

# adev3 /root > lctl ping 172.16.101.6@tcp
12345-0@lo
12345-172.16.101.6@tcp

ERROR:
# adev3 /root > lctl ping 172.16.3.4@tcp
failed to ping 172.16.3.4@tcp: Input/output error

This is the error I was mistakenly trying to describe yesterday.
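
For anyone wanting to compare results like the ones above across several nodes, here is a small Python sketch that parses captured `lctl ping` output into the list of NIDs the peer reported, or the error text on failure. The function names are hypothetical, and the line formats are inferred only from the transcripts in this message. It also tolerates the " at " spelling that some list archives substitute for "@".

```python
import re

# Formats inferred from the transcripts: '12345-4@elan' on success,
# 'failed to ping 4@elan: Input/output error' on failure.
NID_LINE = re.compile(r"^\d+-(\S+@\S+)$")
FAIL_LINE = re.compile(r"^failed to ping (\S+): (.+)$")

def parse_lctl_ping(output):
    """Return (nids, error): NIDs reported by the peer, or the error message."""
    nids, error = [], None
    for raw in output.splitlines():
        # Undo the archive's ' at ' substitution before matching.
        line = raw.strip().replace(" at ", "@")
        nid = NID_LINE.match(line)
        fail = FAIL_LINE.match(line)
        if nid:
            nids.append(nid.group(1))
        elif fail:
            error = fail.group(2)
    return nids, error

def nets_reported(nids):
    """Return the sorted set of network names found in a NID list."""
    return sorted({nid.split("@", 1)[1] for nid in nids})

ok_output = "12345-0 at lo\n12345-172.16.3.4 at tcp\n12345-4 at elan\n"
bad_output = "failed to ping 4 at elan: Input/output error\n"

print(parse_lctl_ping(ok_output))
print(parse_lctl_ping(bad_output))
print(nets_reported(parse_lctl_ping(ok_output)[0]))
```

Feeding it the four router pings above shows that the router answers at 172.16.3.4@tcp from ilc6 and at 4@elan from adev3, reporting the same three NIDs each time, while pinging its NID on the far network fails from both sides.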

Isaac Huang wrote:
> On Wed, Jan 16, 2008 at 11:23:30AM -0800, Herb Wartens wrote:
> Andrew,
> I have not used lustre-1.6.4.X yet, but in previous versions (and most likely the version you are using)
> Lustre actually listens on all interfaces no matter what you specify in the modprobe.conf.  You can verify this
> by looking at the netstat output for port 988 and look for what ports you are listening on.  We here at LLNL
> regularly use multiple interfaces.
> I believe that the issue you are referring to is a bug in the lctl ping code where the ping only responds
> over the first network device specified for a particular lnd.  As long as you have properly configured your
> host routes so that you can ping both interfaces from the other node you should be fine.  IMHO this should
> just be fixed in lnet so you can do an lctl ping from any endpoint to any other endpoint.
> 
>> I don't think it's a lctl ping bug.
> 
> # ilc6 /root > cat /etc/modprobe.conf
> options lnet networks="tcp0(eth2,eth3)"
> 
>> This config gives the node only one NID: ip_of_eth2@tcp0. You can
>> verify it by 'lctl list_nids' on the node.
> 
> # ilc6 /root > netstat -a -t -n | grep 988 | grep LIST
> tcp        0      0 0.0.0.0:988                 0.0.0.0:*                   LISTEN
> 
> # ilc6 /root > cat /etc/hosts | grep ilc7
> 172.16.101.7     ilc7-lnet0   ilc7-eth2
> 172.16.102.7     ilc7-lnet1   ilc7-eth3
> 
> # ilc6 /root > lctl ping 172.16.101.7@tcp0
> 12345-0@lo
> 12345-172.16.101.7@tcp
> 
>> When you lctl ping a node at any one of its NIDs, the ping reply
>> contains a list of all NIDs of the node. As can be seen from the reply
>> above, 172.16.101.7@tcp0 has two NIDs: 0@lo and 172.16.101.7@tcp.
> 
>> So when you tried 'lctl ping 172.16.102.7@tcp0', the ping request
>> could reach 172.16.102.7, but it was rejected since 172.16.102.7@tcp0
>> was not one of the node's NIDs.
> 
>> The socklnd does interface bonding transparently from lnet's
>> perspective. It exchanges a list of IPs of all NICs under a lnet NID
>> with peers, and creates connections to all IPs of a peer and thus
>> aggregates bandwidth. Lnet has no knowledge of this - all it sees is
>> just one NID, i.e. ip_of_1st_nic@tcp.
> 
>> Isaac
> 
> # ilc6 /root > lctl ping 172.16.102.7@tcp0
> failed to ping 172.16.102.7@tcp: Input/output error
> 
> # ilc6 /root > ping -c 1 172.16.101.7
> PING 172.16.101.7 (172.16.101.7) 56(84) bytes of data.
> 64 bytes from 172.16.101.7: icmp_seq=1 ttl=64 time=0.143 ms
> 
> --- 172.16.101.7 ping statistics ---
> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> rtt min/avg/max/mdev = 0.143/0.143/0.143/0.000 ms
> # ilc6 /root > ping -c 1 172.16.102.7
> PING 172.16.102.7 (172.16.102.7) 56(84) bytes of data.
> 64 bytes from 172.16.102.7: icmp_seq=1 ttl=64 time=0.094 ms
> 
> --- 172.16.102.7 ping statistics ---
> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> rtt min/avg/max/mdev = 0.094/0.094/0.094/0.000 ms
> 
> 
> Lundgren, Andrew wrote:
>>>> So the only way to use two nics at once is to bond?  I am more for redundancy rather than increased throughput.
>>>>
>>>>> -----Original Message-----
>>>>> From: He.Huang at Sun.COM [mailto:He.Huang at Sun.COM]
>>>>> Sent: Wednesday, January 16, 2008 6:34 AM
>>>>> To: Lundgren, Andrew
>>>>> Cc: 'Lustre-discuss at clusterfs.com'
>>>>> Subject: Re: [Lustre-discuss] How do you make an MGS/OSS
>>>>> listen on 2 NICs?
>>>>>
>>>>> On Tue, Jan 15, 2008 at 10:28:33AM -0700, Lundgren, Andrew wrote:
>>>>>>    I am running on CentOS 5 distribution without adding any updates from
>>>>>>    CentOS. I am using the lustre 1.6.4.1 kernel and software.
>>>>>>
>>>>>>
>>>>>>
>>>>>>    I have two NICs that run though different switches.
>>>>>>
>>>>>>
>>>>>>
>>>>>>    I have the lustre options in my modprobe.conf to look like this:
>>>>>>
>>>>>>
>>>>>>
>>>>>>    options lnet networks=tcp0(eth1,eth0)
>>>>>>
>>>>> This way of interface bonding is now a deprecated lnet
>>>>> feature. Please refer to:
>>>>> http://manual.lustre.org/manual/LustreManual16_HTML/DynamicHTML-13-1.html
>>>>>
>>>>> Isaac
>>>>>
>>>> _______________________________________________
>>>> Lustre-discuss mailing list
>>>> Lustre-discuss at clusterfs.com
>>>> https://mail.clusterfs.com/mailman/listinfo/lustre-discuss
>>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org

iD8DBQFHj8/SP/62XqEEbMYRCtrrAKC4q4EWSdmjKmLaR9itrEoa4gdd0gCgn32S
OI4G8yg8Czvy1lsLNYHqBcY=
=R7ZB
-----END PGP SIGNATURE-----



