[Lustre-discuss] Problem with LNET configuration

Kevin Van Maren kevin.van.maren at oracle.com
Mon Jul 12 08:05:06 PDT 2010


Stefano Elmopi wrote:
>
> Hi,
>
> I have a Lustre file system, consisting of an MGS/MDS and two OSSs, 
> interconnected with InfiniBand.
> The Lustre version is 1.8.3 and the OS on the servers is CentOS 5.4, 
> and I used
> the following commands to format them:
>
> MGS/MDS:
> mkfs.lustre --mgs /dev/mpath/mpath1
> mount -t lustre /dev/mpath/mpath1 /MGS
> mkfs.lustre --mdt --fsname=lustre01 
> --mgsnode=172.16.100.111@tcp0,192.168.150.1@o2ib0 
> --mgsnode=172.16.100.121@tcp0,192.168.150.11@o2ib0 
> --failnode=172.16.100.121@tcp0,192.168.150.11@o2ib0 /dev/mpath/mpath2
> mount -t lustre /dev/mpath/mpath2 /MDS_1/
>
> OSS_1
> mkfs.lustre --ost --fsname=lustre01 
> --failnode=172.16.100.122@tcp0,192.168.150.12@o2ib0 
> --mgsnode=172.16.100.111@tcp0,192.168.150.1@o2ib0 
> --mgsnode=172.16.100.121@tcp0,192.168.150.11@o2ib0 /dev/mpath/mpath1
> mount -t lustre /dev/mpath/mpath1 /LUSTRE_1
>
> OSS_2
> mkfs.lustre --ost --fsname=lustre01 
> --failnode=172.16.100.121@tcp0,192.168.150.11@o2ib0 
> --mgsnode=172.16.100.111@tcp0,192.168.150.1@o2ib0 
> --mgsnode=172.16.100.121@tcp0,192.168.150.11@o2ib0 /dev/mpath/mpath2
> mount -t lustre /dev/mpath/mpath2 /LUSTRE_1
>
> and then there are two clients mounted, one on Ethernet and one on IB.
> I disconnected the IB cable to simulate the failure of the IB card on 
> OSS_2.
> I modified modprobe.conf to start LNET with only the Ethernet 
> card and then mounted the Lustre
> filesystem. The operation seems successful: the Ethernet 
> client can see the entire filesystem.
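For reference, the modprobe.conf change described above would look something like this on a Lustre 1.8 node (a sketch; the interface names eth0/ib0 are assumptions and depend on the hardware):

```shell
# /etc/modprobe.conf on OSS_2 -- sketch only; eth0/ib0 are assumed names.
# Original line, with both networks configured:
#   options lnet networks=o2ib0(ib0),tcp0(eth0)
# Ethernet-only, after removing the failed IB interface:
options lnet networks=tcp0(eth0)
```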

You modified OSS_2 to be Ethernet only, right?  (As opposed to the client)

> The problem comes when I try to force a write to OSS_2: the write 
> crashes, and the operation fails.
Yes, because the MDS is using InfiniBand, and is trying to access the 
OST over IB.  Since the OST has an IB NID, the MDS is trying to use that 
NID to talk to it: you would have to disable IB on the MDS node as well.
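One way to see which NIDs each node has configured, and whether a peer is reachable, is with lctl (a sketch; the NID below is the one that appears in the log messages later in this message, adjust for your setup):

```shell
# On the MDS: list the NIDs LNET has brought up locally.
lctl list_nids
# Test reachability of the OST's Ethernet NID (address taken from the
# logs below; substitute your own).
lctl ping 172.16.100.121@tcp0
```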

> Log on MGS/MDS:
>
> Jul 12 15:04:59 mdt01prdpom kernel: LustreError: 
> 4238:0:(events.c:66:request_out_callback()) @@@ type 4, status -113 
>  req at ffff81013ea52000 x1340531260082684/t0 
> o8->lustre01-OST0001_UUID@172.16.100.121@tcp:28/4 lens 368/584 e 0 to 
> 1 dl 1278939908 ref 2 fl Rpc:N/0/0 rc 0/0
> Jul 12 15:04:59 mdt01prdpom kernel: LustreError: 
> 4238:0:(events.c:66:request_out_callback()) Skipped 16 previous 
> similar messages
> Jul 12 15:06:07 mdt01prdpom kernel: LustreError: 
> 4237:0:(lov_request.c:690:lov_update_create_set()) error creating fid 
> 0x10f8004 sub-object on OST idx 1/1: rc = -11
> Jul 12 15:06:07 mdt01prdpom kernel: LustreError: 
> 4237:0:(lov_request.c:690:lov_update_create_set()) Skipped 1 previous 
> similar message
> Jul 12 15:06:07 mdt01prdpom kernel: LustreError: 
> 4408:0:(mds_open.c:441:mds_create_objects()) error creating objects 
> for inode 17793028: rc = -5
> Jul 12 15:06:07 mdt01prdpom kernel: LustreError: 
> 4408:0:(mds_open.c:826:mds_finish_open()) mds_create_objects: rc = -5
>
>
> My question is:
>
> Can I mount the server OSS_2 so that it provides service over the 
> Ethernet card?
> If yes, what should I do?
>
>
> Thanks
>
>
>
> Ing. Stefano Elmopi
> Gruppo Darco - Resp. ICT Sistemi
> Via Ostiense 131/L Corpo B, 00154 Roma

Remember that when any node accesses another node, it will always use 
the "best" NID they have in common, even if that NID doesn't work 
(Lustre assumes all networks on a server will always work -- the 
resource will be failed over to a healthy server).

If you really want to try this, see the example here: 
https://bugzilla.lustre.org/show_bug.cgi?id=19854 for a hack that 
specifies the NIDs as belonging to different servers.  Note that the 
servers only track a single NID, so they will not be able to do 
callbacks if the network path to the client goes down (i.e., they will 
evict the clients, although the clients can reconnect over the other 
network).

The "better" approach is generally to use bonding to provide multiple 
physical links that look like a single network to Lustre.  Ethernet 
bonding works without additional patches.  See also bugs 20153 and 
20288 for more patches for Lustre with ib-bonding.  There is an 
additional non-landed patch in private bug 22065.
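As a sketch, Ethernet bonding presented to LNET as a single tcp network might look like this in modprobe.conf (interface names, bonding mode, and miimon value are all assumptions; bond0's slave interfaces are configured in the distro's network scripts):

```shell
# /etc/modprobe.conf -- two Ethernet links aggregated into bond0,
# which LNET then sees as one tcp0 network.  Values are illustrative.
alias bond0 bonding
options bonding mode=active-backup miimon=100
options lnet networks=tcp0(bond0)
```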

Kevin




