[Lustre-discuss] Which NID to use?

White, Cliff cliff.white at intel.com
Mon Mar 3 10:01:33 PST 2014


From: "Chan Ching Yu, Patrick" <cychan at clustertech.com>
Date: Saturday, March 1, 2014 at 4:26 PM
To: Cliff White <cliff.white at intel.com>
Cc: "Mohr Jr, Richard Frank (Rick Mohr)" <rmohr at utk.edu>, <lustre-discuss at lists.lustre.org>
Subject: Re: [Lustre-discuss] Which NID to use?


Hi White,

Call me Cliff.

tcp0(eth0) and tcp1(eth1) are connected to different segments (two separate virtual bridges in KVM).



Hi all,

In the old Lustre manual (version 1.8), I found that the order of LNET networks in /etc/modprobe.d/lustre.conf does matter:

(Quoted in https://wiki.lustre.org/manual/LustreManual18_HTML/MoreComplicatedConfigurations.html)

"The order of LNET lines in modprobe.conf is important when configuring multi-homed servers. If a server node can be reached using more than one network, the first network specified in modprobe.conf will be used."

That makes me more confused. Someone told me the order doesn't matter and that the file just lists all the available LNET devices to use.

Does the order matter ONLY in old versions of Lustre?

I believe this only applies when the two interfaces use the same network type. In other words, the case of two @tcp interfaces is different from the case where one interface is @tcp and the other is @o2ib. Again, if you have only one o2ib LNET network and you specify an @o2ib NID, traffic will only use IB. If you had two IB networks, and the host was reachable on both networks, the order in modprobe.conf would apply.
IB is a separate LNET network type, and if you specify @o2ib NIDs, traffic will stay on IB. That is a different situation from having two TCP/IP interfaces, or two IB interfaces.
As Keith mentions, if you do have multiple Ethernet interfaces, bonding is the preferred solution.
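
To make the ordering point concrete, here is a minimal Python sketch (not Lustre code) that parses a simple 'networks' string into an ordered list. The name parse_networks and the regular expression are illustrative assumptions, and the real option accepts more syntax than this handles. It only shows that the string is an ordered list, so "the first network specified" is well defined.

```python
import re

# Matches one entry such as "tcp0(eth0)" or "o2ib0", followed by a comma
# or the end of the string. Hypothetical helper, simplified syntax only.
_ENTRY = re.compile(r"\s*([a-z0-9]+)\s*(?:\(([^)]*)\))?\s*(?:,|$)")


def parse_networks(value):
    """Return [(network, [interfaces])] in the order the networks are listed."""
    entries = []
    pos = 0
    while pos < len(value):
        match = _ENTRY.match(value, pos)
        if match is None:
            raise ValueError(f"cannot parse networks string at offset {pos}: {value!r}")
        net, ifaces = match.group(1), match.group(2)
        names = [name.strip() for name in ifaces.split(",")] if ifaces else []
        entries.append((net, names))
        pos = match.end()
    return entries


nets = parse_networks("tcp0(eth0),tcp1(eth1)")
print(nets[0][0])  # the first-listed network
```

Swapping the two entries in the string changes which network comes first, which is the situation the 1.8 manual text describes for two networks of the same type.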

Best,
cliffw




Regards,

Patrick







On Fri, 28 Feb 2014 21:20:58 +0000, White, Cliff wrote:

On 2/28/14, 1:17 AM, "Chan Ching Yu Patrick" <cychan at clustertech.com>
wrote:


Hi Mohr,

The reason why I made this setup is I'm not sure how Lustre selects the
interface in a multi-rail environment.

Especially when all nodes have Infiniband and Ethernet, how can I ensure
Infiniband is used between client and OSS?


The LNET 'networks' option is used to specify networks by interface. For example,
where your Infiniband interface is 'ib0' you would
add this to your modprobe.conf or equivalent:
-------
options lnet networks="o2ib0(ib0)"
-------

That will define IB (the interface denoted by ib0, to be specific). Client
mounts using @o2ib0 NIDs will only use IB, regardless of other interfaces
present.
See the Lustre manual for details on the LNET 'networks' option.

In your case, I would suspect that the two TCP/IP interfaces are
equivalent in TCP/IP routing terms, perhaps on the same segment.
When that happens, TCP/IP routing takes over. Basically, you can
control which interface you send from, but if the receiver sees two equal
TCP/IP paths back, you can't control which path it chooses to take. This has
nothing to do with LNET or Lustre.
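
A quick way to check whether two interface addresses fall in the same IP subnet is sketched below in Python, using the standard ipaddress module. The helper name same_subnet is made up for illustration, and the /24 prefix length is an assumption, since the thread does not give netmasks. Being in different subnets does not by itself prove the interfaces are on different physical segments, so treat this as a first check only.

```python
import ipaddress


def same_subnet(iface_a, iface_b):
    """True if two 'address/prefix' strings fall in the same IP network."""
    net_a = ipaddress.ip_interface(iface_a).network
    net_b = ipaddress.ip_interface(iface_b).network
    return net_a == net_b


# Client addresses from the thread; the /24 prefix length is an assumption.
print(same_subnet("192.168.122.70/24", "192.168.100.102/24"))  # False
# Two addresses in one subnet, for contrast (the second address is made up).
print(same_subnet("192.168.122.70/24", "192.168.122.71/24"))  # True
```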

In the case where the network hardware is dissimilar, you don't have this
problem. Connections starting on IB stay on IB.
If you only have one IB network, using the IB NID will ensure all clients
use only IB.

cliffw




Regards,
Patrick



On 02/27/2014 12:28 PM, Mohr Jr, Richard Frank (Rick Mohr) wrote:

On Feb 26, 2014, at 7:14 PM, "Chan Ching Yu,
Patrick" <cychan at clustertech.com>
wrote:


[root@mds1 ~]# lctl list_nids
192.168.122.240@tcp
192.168.100.100@tcp1

[root@oss1 ~]# lctl list_nids
192.168.122.194@tcp
192.168.100.101@tcp1

[root@client ~]# lctl list_nids
192.168.122.70@tcp
192.168.100.102@tcp1


On the Lustre client, I intentionally mounted it using tcp1:

[root@client ~]# mount | grep lustre
192.168.100.100@tcp1:/data on /lustre type lustre (rw)


Now I dd a file onto the Lustre filesystem, and you can see that tcp0 is used
when writing to the OST.
Why?

I am not an expert on the inner workings of lustre, but as far as I
understand it, when oss1 connects to the mgs, it will report the nids it
has available.  When the client connects to mgs to get info about the
oss1 server, it will receive a list of all the oss1 nids.  The client
then steps through that list and compares the oss1 nids with its local
nids to find a match (i.e. - nids that are on the same lnet network).
If it matches tcp0 first, then that is the connection it uses.  The lnet
network used to connect to the mgs is irrelevant at that point.
However, I do not know if there are any guarantees about the ordering of
the nids that the mgs will report (ie - will tcp0 always be the first
nid?).
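
The matching step described above can be sketched in Python as follows. This is only a model of the behaviour as described in this message, not code from Lustre; the function names lnet_network and pick_server_nid are hypothetical, and the order of the server NID list is an input here because the thread leaves open what order the MGS reports. The sketch treats 'tcp' as shorthand for 'tcp0'.

```python
def lnet_network(nid):
    """Return the LNet network part of a NID, e.g. 'tcp1' for '192.168.100.101@tcp1'."""
    addr, sep, net = nid.partition("@")
    if not sep or not addr or not net:
        raise ValueError(f"not a NID: {nid!r}")
    # A network name without a number means network 0 ('tcp' is 'tcp0').
    return net if net[-1].isdigit() else net + "0"


def pick_server_nid(local_nids, server_nids):
    """Step through server_nids in order; return the first one on a local network."""
    local_nets = {lnet_network(nid) for nid in local_nids}
    for nid in server_nids:
        if lnet_network(nid) in local_nets:
            return nid
    return None  # no LNet network in common


# NIDs taken from the lctl list_nids output earlier in the thread.
client_nids = ["192.168.122.70@tcp", "192.168.100.102@tcp1"]
oss1_nids = ["192.168.122.194@tcp", "192.168.100.101@tcp1"]

print(pick_server_nid(client_nids, oss1_nids))        # 192.168.122.194@tcp
print(pick_server_nid(client_nids, oss1_nids[::-1]))  # 192.168.100.101@tcp1
```

Under this model the result depends only on the order of the server's NID list, not on the NID used in the mount command, which would be consistent with the tcp0 traffic seen above.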

If there is an error in my description, hopefully a lustre developer
will point out the flaw.

It is not clear what you are trying to accomplish with this multi rail
setup.  Are you trying to force mds traffic over one client link and oss
traffic over the other?  Or are you trying to utilize both links
simultaneously for all traffic?




_______________________________________________
Lustre-discuss mailing list
Lustre-discuss at lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss





