[Lustre-discuss] multi-homed lustre with both IB and TCP
John Lalande
john.lalande at ssec.wisc.edu
Fri Mar 21 12:56:10 PDT 2014
Hi-
I am trying to set up a Robinhood policy engine server that will watch
several Lustre file systems: one over a direct InfiniBand connection, one
over TCP with no intermediate Lustre router, and several others over TCP
through Lustre routers.
I can mount the file systems that use IB and direct TCP, but not the routed ones.
(I can mount the routed ones if I remove o2ib0(ib0) from the networks setting.)
My modprobe.conf looks like this:
options lnet networks="o2ib0(ib0),tcp0(em1.497)" routes="o2ib1 ROUTER1_IP@tcp0; o2ib1 ROUTER2_IP@tcp0; o2ib1 ROUTER3_IP@tcp0"
where ROUTER1_IP, ROUTER2_IP, etc. stand for actual IP addresses on our
university's subnet that I don't want to publish here.
/etc/fstab looks like this:
172.17.1.5@o2ib0:/ib_filesystem /ib_filesystem lustre defaults,_netdev,user_xattr 0 0
172.16.24.5@o2ib1:/routedfs1 /fs1 lustre defaults,_netdev,user_xattr 0 0
172.16.23.14@o2ib1:/routedfs2 /fs2 lustre defaults,_netdev,user_xattr 0 0
172.16.25.189@o2ib1:/routedfs3 /fs3 lustre defaults,_netdev,user_xattr 0 0
172.16.25.241@o2ib1:/routedfs4 /fs4 lustre defaults,_netdev,user_xattr 0 0
128.104.X.X@tcp:/tcpfs1 /tcpfs1 lustre defaults,_netdev 0 0
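One way to sanity-check a configuration like the one above is to confirm that every network named in /etc/fstab is either configured locally or covered by a route. The following is only a minimal sketch in Python: the helper names (canonical_net, local_networks, routed_networks, classify) are hypothetical, it assumes the simple "remote_net gateway_NID" form of the routes string, and it assumes a bare network name such as "tcp" means "tcp0". It checks naming consistency only and says nothing about whether the routers actually forward traffic.

```python
import re

# Values copied from the post; ROUTERn_IP and 128.104.X.X are the poster's placeholders.
LNET_NETWORKS = "o2ib0(ib0),tcp0(em1.497)"
LNET_ROUTES = "o2ib1 ROUTER1_IP@tcp0; o2ib1 ROUTER2_IP@tcp0; o2ib1 ROUTER3_IP@tcp0"
FSTAB_DEVICES = [
    "172.17.1.5@o2ib0:/ib_filesystem",
    "172.16.24.5@o2ib1:/routedfs1",
    "172.16.23.14@o2ib1:/routedfs2",
    "172.16.25.189@o2ib1:/routedfs3",
    "172.16.25.241@o2ib1:/routedfs4",
    "128.104.X.X@tcp:/tcpfs1",
]


def canonical_net(name):
    """Append '0' to a network name with no trailing number (assumption: tcp == tcp0)."""
    return name if re.search(r"\d$", name) else name + "0"


def local_networks(networks):
    """Return the set of network names in a networks string such as 'o2ib0(ib0),tcp0(em1.497)'."""
    nets = set()
    for entry in networks.split(","):
        net = entry.strip().split("(", 1)[0]
        if net:
            nets.add(canonical_net(net))
    return nets


def routed_networks(routes):
    """Map each remote network to its gateway NIDs (simple 'net gateway' entries only)."""
    table = {}
    for entry in routes.split(";"):
        fields = entry.split()
        if len(fields) < 2:
            continue  # skip empty or malformed entries
        remote, gateway = fields[0], fields[-1]
        table.setdefault(canonical_net(remote), []).append(gateway)
    return table


def classify(device, local, routed):
    """Return (network, status) for an fstab device such as '172.16.24.5@o2ib1:/routedfs1'."""
    nid = device.split(":", 1)[0]
    net = canonical_net(nid.split("@", 1)[1])
    if net in local:
        return net, "local"
    if net in routed:
        return net, "routed"
    return net, "unreachable"


local = local_networks(LNET_NETWORKS)
routed = routed_networks(LNET_ROUTES)

for device in FSTAB_DEVICES:
    net, status = classify(device, local, routed)
    print(f"{device:40s} {net:6s} {status}")

# Each gateway NID should itself sit on a locally configured network.
for remote, gateways in routed.items():
    for gw in gateways:
        gw_net = canonical_net(gw.split("@", 1)[1])
        print(f"route to {remote} via {gw}: gateway network configured = {gw_net in local}")
```

For the settings quoted in this post, the sketch classifies every mount as local or routed, so the network names themselves line up. That suggests looking at the routers and their view of this client's NIDs rather than at a typo in the names.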
In dmesg, I see:
Lustre: 6923:0:(client.c:1868:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1395431267/real 1395431267] req@ffff880c2aa04800 x1463215106031860/t0(0) o250->MGC172.16.24.5@o2ib1@172.16.24.5@o2ib1:26/25 lens 400/544 e 0 to 1 dl 1395431272 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
LustreError: 7239:0:(client.c:1052:ptlrpc_import_delay_req()) @@@ send limit expired req@ffff880c2aa04000 x1463215106031864/t0(0) o101->MGC172.16.24.5@o2ib1@172.16.24.5@o2ib1:26/25 lens 328/344 e 0 to 0 dl 0 ref 2 fl Rpc:W/0/ffffffff rc 0/-1
LustreError: 7230:0:(client.c:1052:ptlrpc_import_delay_req()) @@@ send limit expired req@ffff88182b1fac00 x1463215106031872/t0(0) o101->MGC172.16.24.5@o2ib1@172.16.24.5@o2ib1:26/25 lens 328/344 e 0 to 0 dl 0 ref 2 fl Rpc:W/0/ffffffff rc 0/-1
LustreError: 7230:0:(client.c:1052:ptlrpc_import_delay_req()) @@@ send limit expired req@ffff88182a1ab000 x1463215106031876/t0(0) o101->MGC172.16.24.5@o2ib1@172.16.24.5@o2ib1:26/25 lens 328/344 e 0 to 0 dl 0 ref 2 fl Rpc:W/0/ffffffff rc 0/-1
Lustre: 6923:0:(client.c:1868:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1395431292/real 1395431292] req@ffff88182a2a3400 x1463215106031976/t0(0) o250->MGC172.16.24.5@o2ib1@172.16.24.5@o2ib1:26/25 lens 400/544 e 0 to 1 dl 1395431302 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
LustreError: 7239:0:(client.c:1052:ptlrpc_import_delay_req()) @@@ send limit expired req@ffff880c2aa04000 x1463215106031868/t0(0) o101->MGC172.16.24.5@o2ib1@172.16.24.5@o2ib1:26/25 lens 328/344 e 0 to 0 dl 0 ref 2 fl Rpc:W/0/ffffffff rc 0/-1
So, is what we're trying to do here possible and I'm just mangling the
config, or is combining Lustre over IB with Lustre via an IB router on one
client not possible?
Thanks for any help!
John
--
John Lalande
Space Science & Engineering Center
University of Wisconsin - Madison
john.lalande at ssec.wisc.edu / 608-263-2268