[Lustre-discuss] Dual NICs issue -- How to enforce Lustre to use the second NIC

Daneil Goodman daneil.goodman at gmail.com
Thu Nov 12 08:21:41 PST 2009


On Wed, Nov 11, 2009 at 9:20 PM, Isaac Huang <He.Huang at sun.com> wrote:

> On Wed, Nov 11, 2009 at 04:07:39PM -0600, Daneil Goodman wrote:
> >    Hello list,
> >    By searching the archive, I found a similar message dated back in
> >    January 2008 -- How do you make an MGS/OSS listen on 2 NICs? It looks
> >    like there was no final solution. I am facing a similar situation and
> >    need your help.
> >    I am running CentOS 5 on both the server (MGS, MDS and OSS are on the
> >    same node) and the clients: 2.6.18-128.1.6.el5_lustre.1.8.0.1smp. To
> >    simplify the issue, suppose the network consists of one Lustre server
> >    node and two Lustre client nodes. The server node has two NICs:
> >    eth0 (100Mb) and eth1 (1Gb); each client node has only one NIC: eth0.
> >    The network layout is as below.
> >    Server node eth0: 72.203.10.1 (Public network)    <==> Switch1 <==>
> >    Public node eth0:  72.203.10.2 (Public network)
> >    Server node eth1: 192.168.10.1 (Internal network) <==> Switch2 <==>
> >    Private node eth0: 192.168.10.2 (Internal network)
> >    Both SELinux and the firewall are turned off. The public node does not
> >    know the private node, but the private node does know the public node.
> >    The modprobe.conf looks like the following:
> >    On server: options lnet networks="tcp0(eth0),tcp1(eth1)"
> >    On clients: options lnet networks=tcp  <--- since there is only one
>
> I think you'd need to make clients in the 72.203.10.* network use tcp0
> and clients in the 192.168.10.* tcp1. To create a uniform module
>

It did the trick!  Thanks!

> option that works across the whole cluster, 'ip2nets' is your friend:
>
> options lnet 'ip2nets="tcp0(eth0) 72.203.10.*; tcp1(eth1)
> 192.168.10.[1-10]; tcp1(eth0) 192.168.10.[100-200]"'
>
> (assuming that servers are 192.168.10.[1-10] and clients are
> 192.168.10.[100-200].)
>
>
I ran into three small issues with ip2nets:

1. It looks like LNET does not like the single quotes in 'ip2nets="tcp0(eth0)
72.203.10.*; tcp1(eth1) 192.168.10.*"'. It reports:

lnet: Unknown parameter `'ip2nets'

After removing the single quotes, I can load the lnet module.

2. From what I observed, on the public network, mounting /data with the
ip2nets option above is slower than with the networks option.

3. On the private network node, I cannot start LNET using the ip2nets option:
[root@private ~]# lsmod |grep lnet
lnet                  273084  1 ksocklnd
libcfs                136180  2 ksocklnd,lnet
[root@private ~]# lctl network configure
LNET configure error 100: Network is down

/var/log/messages shows:
LustreError: 31927:0:(socklnd.c:2545:ksocknal_startup()) Interface eth1 is down
LustreError: 105-4: Error -100 starting up LNI tcp

But with the networks option (options lnet networks=tcp1), it works fine.
What do you think the problem is?
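
One possible reading of issue 3, sketched in Python. This is a simplified
model of ip2nets matching, not LNET's actual parser: the function names and
the ALT_RULES string are made up for illustration, it assumes the first rule
that matches a local IP wins for each network, and it only handles `*`, plain
numbers and `[lo-hi]` octets. Under those assumptions the private client at
192.168.10.2 matches the tcp1(eth1) rule, which would fit the "Interface eth1
is down" message on a node that only has eth0.

```python
import re

def octet_matches(pattern, value):
    """Match one octet against '*', a plain number, or '[lo-hi]'."""
    if pattern == "*":
        return True
    m = re.fullmatch(r"\[(\d+)-(\d+)\]", pattern)
    if m:
        return int(m.group(1)) <= value <= int(m.group(2))
    return pattern.isdigit() and int(pattern) == value

def ip_matches(pattern, ip):
    """Match a dotted-quad IP against a dotted-quad pattern, octet by octet."""
    pats, octets = pattern.split("."), ip.split(".")
    return len(pats) == 4 and len(octets) == 4 and all(
        octet_matches(p, int(o)) for p, o in zip(pats, octets))

def select_networks(ip2nets, local_ips):
    """Return {network: interface}, keeping the first matching rule per network.

    Assumption: this mirrors the 'first match for a network wins' behaviour;
    it is not a reimplementation of the real LNET code.
    """
    chosen = {}
    for rule in ip2nets.split(";"):
        parts = rule.split()
        if not parts:
            continue  # tolerate a trailing ';'
        net_spec, patterns = parts[0], parts[1:]
        m = re.fullmatch(r"(\w+)(?:\((\w+)\))?", net_spec)
        net, iface = m.group(1), m.group(2)
        if net in chosen:
            continue  # an earlier rule already claimed this network
        if any(ip_matches(p, ip) for p in patterns for ip in local_ips):
            chosen[net] = iface
    return chosen

# Rule string from issue 1 above.
POSTED_RULES = "tcp0(eth0) 72.203.10.*; tcp1(eth1) 192.168.10.*"
# Hypothetical alternative: name the server address first, then a catch-all.
ALT_RULES = ("tcp0(eth0) 72.203.10.*; tcp1(eth1) 192.168.10.1; "
             "tcp1(eth0) 192.168.10.*")

print(select_networks(POSTED_RULES, ["192.168.10.2"]))  # private client
print(select_networks(ALT_RULES, ["192.168.10.2"]))     # private client
print(select_networks(ALT_RULES, ["72.203.10.1", "192.168.10.1"]))  # server
```

If that reading is right, a rule set shaped like ALT_RULES would let the
private client pick eth0 while the server still uses eth1 for tcp1. This has
not been tested against a real LNET setup.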

Thanks,
Goodman

> Isaac
>