[lustre-discuss] Multiple MGS interfaces config
Chris Hunter
chris.hunter at yale.edu
Thu Sep 24 08:33:19 PDT 2015
> My environment has both TCP and IB clients, so my Lustre config has to
> accommodate both, but I'm having a hard time figuring out the proper
> syntax
> for it. Theoretically, I should be able to use comma-separated interfaces
> in the mgsnode parameter like this:
>
> --mgsnode=192.168.10.1@tcp0,172.16.10.1@o2ib
> --mgsnode=192.168.10.2@tcp0,172.16.10.2@o2ib
>
> The problem is, this doesn't work for all clients all the time ...
> randomly. It would work, then it wouldn't. Googling, I found some known
> defects saying that the comma delimiter didn't work as per the manual and
> recommending alternate syntaxes like using the colon instead of a
> comma. I
> know what the manuals *say* about the syntax, I'm just having trouble
> getting it to work.
>
> This seems to affect only the TCP clients; at least I haven't seen it
> affect any of the IB clients. It may be a comma parsing problem or
> something else.
>
> I have two questions for the group:
>
> 1. Is there a known-working method for using both TCP and IB interface
> NIDs for the MGS in this manner?
I quoted the comma-delimited list when formatting the OSTs, e.g.:
mkfs.lustre --verbose \
    --ost --index=0 --fsname="testfs" \
    --mgsnode="172.16.10.1@o2ib0,192.168.10.1@tcp0" <OST_DEV>
When mounting on a multi-homed client, you can list both MGS addresses to
give some failover support:
mount -v -t lustre 172.16.10.1@o2ib0,192.168.10.1@tcp0:/testfs /mnt/testfs
FYI, I also have dual-homed OSS servers, so I also use a comma-delimited
list for the --servicenode parameter in mkfs.lustre.
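As a rough sketch of the --servicenode usage mentioned above: the OSS
addresses here are made up for illustration (they are not from this
thread), and join_nids is just a hypothetical helper for building the
comma-delimited list. The command is only printed, not run, so it can be
checked before formatting anything.

```shell
# Join the NIDs of one multi-homed server into a comma-delimited list.
join_nids() {
    local IFS=,
    printf '%s\n' "$*"
}

# MGS NIDs as used earlier in this message.
MGS_NIDS=$(join_nids 172.16.10.1@o2ib0 192.168.10.1@tcp0)
# Hypothetical NIDs for a dual-homed OSS (assumed, not from the thread).
OSS_NIDS=$(join_nids 172.16.10.11@o2ib0 192.168.10.11@tcp0)

# Print the command instead of running it, so it can be reviewed first.
echo mkfs.lustre --ost --index=0 --fsname=testfs \
    "--mgsnode=$MGS_NIDS" "--servicenode=$OSS_NIDS" "<OST_DEV>"
```

Once the printed line looks right, drop the echo and substitute the real
OST device.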
> 2. What's the best way to trace the TCP client interactions to see where
> it's breaking down?
If LNet is running on the client, you can try "lctl ping", e.g.:
lctl ping 172.16.10.1@o2ib
I believe a Lustre mount uses IPoIB for the initial handshake with the
MDS's o2ib interfaces. You should make sure a regular ping over IPoIB is
working before mounting Lustre.
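A minimal sketch of how those checks could be scripted on a client.
check_nids is a hypothetical helper, not part of Lustre; it just runs
"lctl ping" against each NID given and reports which ones answer. The
LCTL variable is an assumption added so the helper can be dry-run with a
stand-in command.

```shell
# Hypothetical helper: try "lctl ping" against each NID and report the result.
# LCTL can be overridden (e.g. for a dry run); it defaults to the real lctl.
check_nids() {
    rc=0
    for nid in "$@"; do
        if ${LCTL:-lctl} ping "$nid" >/dev/null 2>&1; then
            echo "OK   $nid"
        else
            echo "FAIL $nid"
            rc=1
        fi
    done
    return $rc
}

# Usage on a client (MGS NIDs taken from this thread):
#   lctl list_nids                  # NIDs this client has configured
#   check_nids 172.16.10.1@o2ib0 192.168.10.1@tcp0
#   ping -c 3 172.16.10.1           # plain IP ping over the IPoIB interface
```

If a TCP-only client shows FAIL for the tcp0 NID, that points at LNet or
the network rather than at the mount syntax.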
> Versions in use:
> kernel: 2.6.32-504.23.4.el6.x86_64
> lustre: lustre-2.7.58-2.6.32_504.23.4.el6.x86_64_g051c25b.x86_64
> zfs: zfs-0.6.4-76_g87abfcb.el6.x86_64
>
> My lustre.conf contents:
> options lnet networks="o2ib0(ib1),tcp0(ixgbe1)"
chris hunter
chris.hunter at yale.edu