[Lustre-discuss] Dual NICs issue -- How to enforce Lustre to use the second NIC
Daneil Goodman
daneil.goodman at gmail.com
Thu Nov 12 08:21:41 PST 2009
On Wed, Nov 11, 2009 at 9:20 PM, Isaac Huang <He.Huang at sun.com> wrote:
> On Wed, Nov 11, 2009 at 04:07:39PM -0600, Daneil Goodman wrote:
> > Hello list,
> > By searching the archive, I found a similar message dated back in
> > January 2008 -- How do you make an MGS/OSS listen on 2 NICs? Looks like
> > there is no final solution, and I am facing a similar situation and
> > need your help.
> > I am running CentOS 5 on both the server (MGS, MDS and OSS are on the
> > same node) and the clients: 2.6.18-128.1.6.el5_lustre.1.8.0.1smp. To
> > simplify the issue, suppose the network consists of one lustre server
> > node and two lustre client nodes. The server node has two NICs:
> > eth0 (100Mb) and eth1 (1Gb); each client node has only one NIC: eth0.
> > The network layout is as below.
> > Server node eth0: 72.203.10.1 (Public network) <==> Switch1 <==>
> > Public node eth0: 72.203.10.2 (Public network)
> > Server node eth1: 192.168.10.1 (Internal network) <==> Switch2 <==>
> > Private node eth0: 192.168.10.2 (Internal network)
> > Both SELinux and the firewall are turned off. Public node does not know
> > Private node, but Private node does know Public node.
> > The modprobe.conf looks like the following:
> > On server: options lnet networks="tcp0(eth0),tcp1(eth1)"
> > On clients: options lnet networks=tcp <--- since there is only one
>
> I think you'd need to make clients in the 72.203.10.* network use tcp0
> and clients in the 192.168.10.* tcp1.
>
It did the trick! Thanks!
> To create a uniform module option that works across the whole cluster,
> 'ip2nets' is your friend:
>
> options lnet 'ip2nets="tcp0(eth0) 72.203.10.*; tcp1(eth1)
> 192.168.10.[1-10]; tcp1(eth0) 192.168.10.[100-200]"'
>
> (assuming that servers are 192.168.10.[1-10] and clients are
> 192.168.10.[100-200].)
>
>
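To make sure I read the ip2nets rules above correctly, here is a rough Python model of the matching. It is my own sketch, not LNet's parser: the function names are made up, it only handles plain numbers, `*` and `[a-b]` ranges, and it assumes that each rule is compared against the node's local IP addresses and that the first matching rule for a given network wins.

```python
import re

def octet_matches(pattern, octet):
    """Return True if one dotted field of a rule matches an octet value."""
    if pattern == "*":
        return True
    m = re.fullmatch(r"\[(\d+)-(\d+)\]", pattern)
    if m:
        return int(m.group(1)) <= octet <= int(m.group(2))
    return pattern.isdigit() and int(pattern) == octet

def ip_matches(pattern, ip):
    """Match an address such as 192.168.10.2 against 192.168.10.[1-10]."""
    fields = pattern.split(".")
    octets = [int(o) for o in ip.split(".")]
    return len(fields) == 4 and all(
        octet_matches(f, o) for f, o in zip(fields, octets)
    )

def select_networks(ip2nets, local_ips):
    """Return the net specs this node would pick (assumed first match per net)."""
    selected = {}
    for rule in ip2nets.split(";"):
        parts = rule.split()
        if len(parts) < 2:
            continue  # skip empty or malformed rules
        net_spec, patterns = parts[0], parts[1:]
        net = net_spec.split("(")[0]  # "tcp1(eth1)" -> "tcp1"
        if net in selected:
            continue  # an earlier rule already claimed this network
        if any(ip_matches(p, ip) for p in patterns for ip in local_ips):
            selected[net] = net_spec
    return list(selected.values())

RULES = ("tcp0(eth0) 72.203.10.*; tcp1(eth1) 192.168.10.[1-10]; "
         "tcp1(eth0) 192.168.10.[100-200]")

print(select_networks(RULES, ["72.203.10.1", "192.168.10.1"]))  # server
print(select_networks(RULES, ["192.168.10.2"]))                 # private node
```

Under this reading, a node whose only address is 192.168.10.2 falls into the [1-10] range and selects tcp1(eth1), while an address in 192.168.10.[100-200] would select tcp1(eth0).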
There are three small issues with ip2nets:
1. Looks like LNET does not like the single quotes in 'ip2nets="tcp0(eth0)
72.203.10.*; tcp1(eth1) 192.168.10.*"'. It says
lnet: Unknown parameter `'ip2nets'
After removing the single quotes, I can load the lnet module.
2. From what I observed, on the public network, mounting /data with the
above ip2nets option is slower than with the networks option.
3. On the private network node, I cannot start LNET using the ip2nets option:
[root@private ~]# lsmod |grep lnet
lnet 273084 1 ksocklnd
libcfs 136180 2 ksocklnd,lnet
[root@private ~]# lctl network configure
LNET configure error 100: Network is down
/var/log/messages shows:
LustreError: 31927:0:(socklnd.c:2545:ksocknal_startup()) Interface eth1 is down
LustreError: 105-4: Error -100 starting up LNI tcp
But if I use the networks option (options lnet networks=tcp1), it works well.
What do you think the problem is?
Thanks,
Goodman
> Isaac
>