[lustre-discuss] "Not on preferred path" error

Lewis Hyatt lhyatt at gmail.com
Tue Sep 20 11:33:55 PDT 2016


Thanks so much for the information; we will look into this ASAP.
Forgive my ignorance, but does multipath here refer to some Lustre-specific
or InfiniBand-related process? I'm not familiar with it in this context. Thanks again.

-lewis


On 9/20/16 2:24 PM, Ben Evans wrote:
> Lewis,
>
> Yes, "Not on preferred path" is something that bubbles up through the TS
> gui from multipath.
>
> A simple thing you can check is running 'multipath -ll' on the OSS (and its
> peer) in question and seeing whether it reports that one or more paths are
> down. If it's just on one OSS, try running 'multipath -r'.  If the path
> doesn't come back and look OK, then it's most likely a cable issue, and you
> can try re-seating the cable to see if it helps.  It's been a long time
> since I diagnosed this, and I can't remember the details of how to
> associate cables with paths, but there should be indicator lights on the
> back of everything, and the light for the path that is down should be red.
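
For reference, the check described above could be scripted roughly as
follows. This is only a sketch: down_paths is a made-up helper name, and it
assumes that unhealthy paths show up in the 'multipath -ll' output marked
"failed" or "faulty". The exact wording and layout differ between
multipath-tools versions, so the pattern may need adjusting on an older
system.

```shell
#!/bin/sh
# Sketch only: reduce `multipath -ll` output (read from stdin) to the path
# lines that report a problem.
# Assumption: unhealthy paths are marked "failed" or "faulty"; adjust the
# pattern if this multipath-tools version words it differently.
down_paths() {
    grep -E 'failed|faulty'
}

# Hypothetical usage on the affected OSS and its peer (as root):
#   multipath -ll | down_paths
# If any paths are listed, reload the maps as suggested above and re-check:
#   multipath -r
#   multipath -ll | down_paths
```

No output from the filter would suggest that every path is up, in which case
the cabling is less likely to be the culprit.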
>
> The high load is probably associated with the cable issue, since you're
> putting more strain on one path.
>
> -Ben Evans
>
> On 9/20/16, 12:21 PM, "lustre-discuss on behalf of Lewis Hyatt"
> <lustre-discuss-bounces at lists.lustre.org on behalf of lhyatt at gmail.com>
> wrote:
>
>> Hello-
>>
>> I am having an issue with a lustre 1.8 array that I have little hope
>> of figuring out on my own, so I thought I would try here to see if
>> anyone might know what this warning/error means. Our array was built
>> by Terascala, which no longer exists, so we have no support for it and
>> little documentation (and not much in-house knowledge). I see this
>> complaint "Not on preferred path" on the GUI that we have, which I
>> assume was something custom made by Terascala, and I am not sure even
>> what path it is referring to; we use infiniband for all connections
>> and it could relate to this, but not sure. We see this error on 3 of
>> the 12 OSTs. More specifically, we have 2 OSSs, each handling 6 OSTs,
>> and all 3 of the "not on optimal path" OSTs are on the same OSS.
>>
>> We do not know if it's related, but this same OSS is in a very bad
>> state, with very high load average (200), very high I/O wait time, and
>> taking many seconds to respond to each read request, making the array
>> more or less unusable. That's the problem we are trying to fix.
>>
>> I realize there's not much hope for anyone to help us with that given
>> how little information I am able to provide. But I was hoping someone
>> out there might know what this "not on optimal path" error means, and
>> if it matters for anything or not, so we have somewhere to start.
>> Thanks very much!
>>
>> I could provide screen shots of the management GUI we have, if it
>> would be informative.
>>
>> -Lewis
>> _______________________________________________
>> lustre-discuss mailing list
>> lustre-discuss at lists.lustre.org
>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>

