<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">Hi, Ron-<br>
<br>
Thanks for sharing your config with me. I tried tweaking ours, and
it's still a no go. I think the main difference here is that it's
our client (not the servers) that is multi-homed.<br>
<br>
The client needs to access:<br>
<ol>
<li>one (eventually more) Lustre filesystem(s) via direct
attached InfiniBand.</li>
<li>one Lustre file system via TCP (no TCP->IB routing)</li>
<li>several Lustre file systems via routed InfiniBand
(TCP->IB)</li>
</ol>
I can't get #1 and #3 working together. I can get one or the
other working, depending on how I've configured the LNet networks
in modprobe.d/lustre.conf, but not both. (#2 works either way.)<br>
<br>
Does anyone else have ideas on this?<br>
<br>
Thanks!<br>
<br>
John<br>
<br>
<br>
On 3/24/14, 4:13 PM, Jerome, Ron wrote:<br>
</div>
<blockquote
cite="mid:37B44587125CDC47942EF3600DD8539F8E3AF5AC1A@NRCCENMB2.nrc.ca"
type="cite">
<pre wrap="">Hi John,
Don't know if you got this working, but I can tell you that I have more or less the same setup working. Basically, I have a client on a public TCP network connecting to an LNET router (via TCP), which then forwards via IB to the Lustre cluster. (All the Lustre servers are multi-homed and have a tcp0 network internally, thus the "tcp1" for the external TCP network.)
The external client config is... (where 132.246.x.x is the TCP address of the router)
---------------------------------
options lnet networks=tcp1(eth0) routes="o2ib0 132.246.x.x@tcp1"
...</pre>
</blockquote>
<blockquote
cite="mid:37B44587125CDC47942EF3600DD8539F8E3AF5AC1A@NRCCENMB2.nrc.ca"
type="cite">
<pre wrap="">
Ron.
-----Original Message-----
From: <a class="moz-txt-link-abbreviated" href="mailto:lustre-discuss-bounces@lists.lustre.org">lustre-discuss-bounces@lists.lustre.org</a> [<a class="moz-txt-link-freetext" href="mailto:lustre-discuss-bounces@lists.lustre.org">mailto:lustre-discuss-bounces@lists.lustre.org</a>] On Behalf Of John Lalande
Sent: March 21, 2014 3:56 PM
To: <a class="moz-txt-link-abbreviated" href="mailto:lustre-discuss@lists.lustre.org">lustre-discuss@lists.lustre.org</a>
Subject: [Lustre-discuss] multi-homed lustre with both IB and TCP
Hi-
I am trying to set up a robinhood policy engine server that will watch
several different Lustre file systems -- one of which will have a direct
InfiniBand connection, one via TCP without an intermediate Lustre router
and several other Lustre file systems via TCP through Lustre routers.
I can mount filesystems via IB and direct TCP, but not the routed ones.
(I am able to mount the routed ones if I take out the config for o2ib0(ib0).)
My modprobe.conf looks like this:
options lnet networks="o2ib0(ib0),tcp0(em1.497)" routes="o2ib1 ROUTER1_IP@tcp0; o2ib1 ROUTER2_IP@tcp0; o2ib1 ROUTER3_IP@tcp0"
where ROUTER1_IP, ROUTER2_IP, etc. are actual IP addresses on our
University's subnet that I don't want to publish here.
/etc/fstab looks like this:
172.17.1.5@o2ib0:/ib_filesystem /ib_filesystem lustre defaults,_netdev,user_xattr 0 0
172.16.24.5@o2ib1:/routedfs1 /fs1 lustre defaults,_netdev,user_xattr 0 0
172.16.23.14@o2ib1:/routedfs2 /fs2 lustre defaults,_netdev,user_xattr 0 0
172.16.25.189@o2ib1:/routedfs3 /fs3 lustre defaults,_netdev,user_xattr 0 0
172.16.25.241@o2ib1:/routedfs4 /fs4 lustre defaults,_netdev,user_xattr 0 0
128.104.X.X@tcp:/tcpfs1 /tcpfs1 lustre defaults,_netdev 0 0
In dmesg, I see:
Lustre: 6923:0:(client.c:1868:ptlrpc_expire_one_request()) @@@ Request
sent has timed out for slow reply: [sent 1395431267/real 1395431267]
req@ffff880c2aa04800 x1463215106031860/t0(0)
o250->MGC172.16.24.5@o2ib1@172.16.24.5@o2ib1:26/25 lens 400/544 e 0 to 1
dl 1395431272 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
LustreError: 7239:0:(client.c:1052:ptlrpc_import_delay_req()) @@@ send
limit expired req@ffff880c2aa04000 x1463215106031864/t0(0)
o101->MGC172.16.24.5@o2ib1@172.16.24.5@o2ib1:26/25 lens 328/344 e 0 to 0
dl 0 ref 2 fl Rpc:W/0/ffffffff rc 0/-1
LustreError: 7230:0:(client.c:1052:ptlrpc_import_delay_req()) @@@ send
limit expired req@ffff88182b1fac00 x1463215106031872/t0(0)
o101->MGC172.16.24.5@o2ib1@172.16.24.5@o2ib1:26/25 lens 328/344 e 0 to 0
dl 0 ref 2 fl Rpc:W/0/ffffffff rc 0/-1
LustreError: 7230:0:(client.c:1052:ptlrpc_import_delay_req()) @@@ send
limit expired req@ffff88182a1ab000 x1463215106031876/t0(0)
o101->MGC172.16.24.5@o2ib1@172.16.24.5@o2ib1:26/25 lens 328/344 e 0 to 0
dl 0 ref 2 fl Rpc:W/0/ffffffff rc 0/-1
Lustre: 6923:0:(client.c:1868:ptlrpc_expire_one_request()) @@@ Request
sent has timed out for slow reply: [sent 1395431292/real 1395431292]
req@ffff88182a2a3400 x1463215106031976/t0(0)
o250->MGC172.16.24.5@o2ib1@172.16.24.5@o2ib1:26/25 lens 400/544 e 0 to 1
dl 1395431302 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
LustreError: 7239:0:(client.c:1052:ptlrpc_import_delay_req()) @@@ send
limit expired req@ffff880c2aa04000 x1463215106031868/t0(0)
o101->MGC172.16.24.5@o2ib1@172.16.24.5@o2ib1:26/25 lens 328/344 e 0 to 0
dl 0 ref 2 fl Rpc:W/0/ffffffff rc 0/-1
So ... is what we're trying to do here possible, and I'm just mangling
the config, or is Lustre over IB + Lustre via IB router not possible?
Thanks for any help!
John
</pre>
</blockquote>
<br>
<br>
<pre class="moz-signature" cols="72">--
John Lalande
Space Science &amp; Engineering Center
University of Wisconsin - Madison
<a class="moz-txt-link-abbreviated" href="mailto:john.lalande@ssec.wisc.edu">john.lalande@ssec.wisc.edu</a> / 608-263-2268</pre>
</body>
</html>