[Lustre-discuss] [HPDD-discuss] Failure to connect to some OST from a client machine
Kris Howard
khoward at eng.utah.edu
Thu Sep 5 13:01:19 PDT 2013
You might check "lctl show_route" and look for downed routes.
On Thu, Sep 5, 2013 at 12:56 PM, Bob Ball <ball at umich.edu> wrote:
> We are running Lustre 2.1.6 on Scientific Linux 6.4, kernel
> 2.6.32-358.11.1.el6.x86_64. This was an upgrade from Lustre 1.8.4 on SL5.
>
> We have had a few situations lately where a client stops talking to some
> subset of the OSTs (about 58 of these in total on 8 OSSes, nearly 500 TB
> altogether). I have a couple of questions.
>
> 1. "lctl dl" on the OSS shows a smaller count on the affected servers; on
> the client, all OSSes showed UP in "lctl dl". Today, I first tried rebooting
> the affected OSS, but that did not change the situation. I ended up
> rebooting the client before I could get full connectivity. Is there any way
> to force the reconnect from the client, short of rebooting that client?
>
> 2. Under Lustre 1.8.4, I could run "lfs df -h" on the client and see all
> OSTs, even those where the connection was not working for whatever reason.
> That is no longer the case; the lfs command now stops at the first
> non-talking OST. This seems more like a bug than a feature. Is there some
> other way to see a list of non-communicating OSTs on a client?
>
> Thanks in advance for any help offered.
>
> bob
>
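On the second question, here is a minimal sketch of one way to list the OSTs a client is not fully connected to. It assumes that "lctl get_param osc.*.ost_server_uuid" on the client prints one line per OSC in the form "<param>=<target UUID> <state>", with FULL meaning a healthy connection. That output format is an assumption and should be checked against your Lustre version. The helper names and the sample text are hypothetical, for illustration only.

```python
import subprocess


def find_unconnected_osts(text):
    """Return (target, state) pairs whose import state is not FULL.

    `text` is assumed to look like the output of
    `lctl get_param osc.*.ost_server_uuid`, one "<param>=<uuid> <state>"
    entry per line. Lines that do not match that shape are skipped.
    """
    problems = []
    for line in text.splitlines():
        line = line.strip()
        if "=" not in line:
            continue
        _, value = line.split("=", 1)
        fields = value.split()
        if len(fields) < 2:
            continue
        target, state = fields[0], fields[1]
        if state != "FULL":
            problems.append((target, state))
    return problems


def read_import_states():
    """Run lctl on a Lustre client and return its raw output (not called here)."""
    result = subprocess.run(
        ["lctl", "get_param", "osc.*.ost_server_uuid"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Hypothetical sample text, not taken from a real system.
SAMPLE = """\
osc.lustre-OST0000-osc-ffff880037a1c000.ost_server_uuid=lustre-OST0000_UUID\tFULL
osc.lustre-OST0001-osc-ffff880037a1c000.ost_server_uuid=lustre-OST0001_UUID\tDISCONN
"""

for target, state in find_unconnected_osts(SAMPLE):
    print(target, state)
```

On a real client, the sample would be replaced by find_unconnected_osts(read_import_states()). Because it reads the per-OSC state rather than sending requests to each OST, this kind of check should not stall at the first unresponsive target the way "lfs df -h" is described as doing above.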