[Lustre-discuss] [HPDD-discuss] Failure to connect to some OST from a client machine

Kris Howard khoward at eng.utah.edu
Thu Sep 5 13:01:19 PDT 2013


You might check "lctl show_route" and look for routes that are down.
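
A minimal sketch of that check, assuming show_route prints one route per line with the state (up or down) as the last field. The down_routes helper name is made up for illustration, and the line format should be verified against the output on your own Lustre version:

```shell
#!/bin/sh
# Hypothetical helper: keep only the route lines whose last field is "down".
# Assumption: each route line printed by "lctl show_route" ends with its
# state (up or down). Check this against your own output before relying on it.
down_routes() {
    awk '$NF == "down"'
}

# On a node with LNet routes configured (usually needs root):
#   lctl show_route | down_routes
```

If this prints nothing, no route is reported as down, and the problem is more likely somewhere other than LNet routing.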


On Thu, Sep 5, 2013 at 12:56 PM, Bob Ball <ball at umich.edu> wrote:

> We are running Lustre 2.1.6 on Scientific Linux 6.4, kernel
> 2.6.32-358.11.1.el6.x86_64.  This was an upgrade from Lustre 1.8.4 on SL5.
>
> We have had a few situations lately where a client stops talking to some
> subset of the OSTs (about 58 of these in total across 8 OSSes, nearly 500TB
> altogether).  I have a couple of questions.
>
> 1. "lctl dl" on the OSS shows a smaller count on the affected servers; on
> the client, all OSS show UP in "lctl dl".  Today, I first tried rebooting
> the affected OSS, but that did not change the situation.  I ended up
> rebooting the client before I could get full connectivity.  Is there any
> way to force a reconnect from the client, short of rebooting it?
>
> 2. Under Lustre 1.8.4 I could run "lfs df -h" on the client and see all
> OSTs, even those whose connection was not working for whatever reason.
> That is no longer the case: the lfs command now stops at the first
> non-talking OST.  This seems more like a bug than a feature.  Is there some
> other way to see a list of non-communicating OSTs on a client?
>
> Thanks in advance for any help offered.
>
> bob