[lustre-discuss] lustre client mount issues

Mohr Jr, Richard Frank (Rick Mohr) rmohr at utk.edu
Mon Aug 1 07:42:46 PDT 2016


> On Jul 28, 2016, at 9:54 PM, sohamm <sohamm at gmail.com> wrote:
> 
> Client is configured for IB interface. 

In that case, it looks like something may be wrong with the LNet configuration on the client.  In the output of the “lctl ping” that you ran from the server, the client reported a NID only on the tcp network.
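One way to confirm this on the client is to look at the NIDs it actually has configured.  The following is only a minimal sketch: the helper name has_nid_on_net and the sample NID are made up for illustration, and on a real client the input would come from “lctl list_nids” rather than a hard-coded string.

```shell
# Return success if any NID read from stdin is on the given LNet network.
# $1: network name exactly as it appears after the "@" (e.g. o2ib or tcp).
has_nid_on_net() {
    grep -q -- "@$1\$"
}

# Hypothetical sample output; on a real client use:  lctl list_nids
sample_nids='192.168.1.20@tcp'

if printf '%s\n' "$sample_nids" | has_nid_on_net o2ib; then
    echo "client has an o2ib NID"
else
    echo "client has no o2ib NID; check the lnet module options"
fi
```

If the o2ib NID is missing, the lnet module options are the usual place to look, for example a line such as options lnet networks="o2ib0(ib0)" in a modprobe configuration file (the interface name ib0 is an assumption here).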

> In my understanding, I can specify the network of choice in the mount command. I tried both tcp and ib.

That is true, but when the client and server both have interfaces on two different networks (like Ethernet and IB), there can be some subtle issues.  When you specify the NID of the MGS to mount the file system, the client retrieves information about the MDS/OSS servers from that MGS.  This information includes the NIDs on which the MDS/OSS servers listen for requests.  If a client sees that a server has a NID on tcp0 and a NID on o2ib0, and the client also has NIDs on tcp0 and o2ib0, then the client sees two paths to the same server and will just pick one of them (which might not be the one you want).  And if the path it chooses happens to be down, it won’t matter that the other path is up.

(Now, I should make a disclaimer about the above statements.  I believe that is how it worked on Lustre versions like 1.8 and 2.4.  I have not tried this with newer Lustre versions, so the behavior could be different.  I also have not experimented with anything like specifying weights for LNet routes, so I don’t know if that could be used to prefer one interface over another.)
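To make the two-path situation concrete, here is a small sketch that lists the LNet networks a client and a server have in common.  It does not reproduce how Lustre chooses a path; it only shows when more than one path exists.  The function names and all of the NIDs are hypothetical.

```shell
# Print the network part (the text after "@") of each NID on stdin, sorted and unique.
nets_of() {
    sed 's/.*@//' | sort -u
}

# Print every LNet network that appears in both NID lists.
# $1: client NIDs, $2: server NIDs (each a whitespace-separated string).
shared_nets() {
    # $1 and $2 are left unquoted on purpose so each NID becomes its own line.
    client_nets=$(printf '%s\n' $1 | nets_of)
    server_nets=$(printf '%s\n' $2 | nets_of)
    for net in $client_nets; do
        for s in $server_nets; do
            if [ "$net" = "$s" ]; then
                echo "$net"
            fi
        done
    done
}

# Hypothetical NIDs for a dual-homed client and server.
client='192.168.1.20@tcp0 10.10.0.20@o2ib0'
server='192.168.1.1@tcp0 10.10.0.1@o2ib0'

count=$(shared_nets "$client" "$server" | wc -l)
if [ "$count" -gt 1 ]; then
    echo "more than one shared network: the client may not use the one you expect"
fi
```

If the client had a NID on only one of the two networks, shared_nets would print a single network and there would be no ambiguity about the path.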

--
Rick Mohr
Senior HPC System Administrator
National Institute for Computational Sciences
http://www.nics.tennessee.edu


