[lustre-discuss] "Not on preferred path" error

Tao, Zhiqi zhiqi.tao at intel.com
Wed Sep 21 10:37:35 PDT 2016


It appears that there is only one SAS path to the back-end storage, which explains why some of the LUNs show as not being on their preferred path.
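
One quick way to confirm this from the 'multipath -ll' output further down
the thread is to count the path lines under each map. With both SAS
connections cabled you would expect two "H:C:T:L sdX" lines per LUN; in your
output each map shows only one. A small awk sketch (it just assumes the
friendly names start with "map", as they do in your output):

  multipath -ll | awk '
      /^map/                           { map = $1 }      # map header lines, e.g. "map03 (3600...)"
      /[0-9]+:[0-9]+:[0-9]+:[0-9]+ sd/ { paths[map]++ }  # H:C:T:L path lines, e.g. "\_ 3:0:1:3 sdk"
      END { for (m in paths) print m, paths[m], "path(s)" }'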

Typically we recommend having two SAS connections from each OSS to the storage: one to the upper controller and one to the lower controller. The LUNs are then distributed between the two controllers. In the event of a SAS connection failure, all LUNs fail over to one controller, and the ones that used to go through the other controller report that they are not on the preferred path. Because this kind of failover happens at the multipath layer, it is transparent to Lustre, and the file system continues to run, as you observed.
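
Once the second SAS connection is in place, the preferred-controller
behaviour is usually expressed with a group_by_prio multipath configuration.
The sketch below is only a starting point for an LSI/RDAC-style
dual-controller array; the vendor/product strings and the rdac handler are
assumptions, so please verify them against the array documentation before
using anything like it:

  # Sketch of /etc/multipath.conf for an LSI/RDAC-style dual-controller array.
  devices {
      device {
          vendor                "LSI"
          product               "VirtualDisk"
          # group paths by controller ownership and prefer the owning controller
          path_grouping_policy  group_by_prio
          prio                  rdac
          hardware_handler      "1 rdac"
          # return to the preferred controller as soon as its path comes back
          failback              immediate
          no_path_retry         30
      }
  }

With that in place, 'multipath -ll' should show two path groups per LUN, and
a controller or cable failure only moves I/O to the non-preferred group
rather than taking the LUN offline.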

Best Regards,
Zhiqi

-----Original Message-----
From: lustre-discuss [mailto:lustre-discuss-bounces at lists.lustre.org] On Behalf Of Lewis Hyatt
Sent: Tuesday, September 20, 2016 12:53 PM
To: Ben Evans <bevans at cray.com>; lustre-discuss at lists.lustre.org
Subject: Re: [lustre-discuss] "Not on preferred path" error

I see, thanks. This is what we see from running the multipath commands... I don't see anything that means anything to me, but FWIW it looks the same as on our other OSS that is working OK.

$multipath -ll
map03 (360080e50002ee5100000023f50092c6c) dm-13 LSI,VirtualDisk [size=15T][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=1][active]
  \_ 3:0:1:3 sdk 8:160 [active][ready]
map02 (360080e50002ee4100000024250092c11) dm-12 LSI,VirtualDisk [size=15T][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=1][active]
  \_ 3:0:1:2 sdj 8:144 [active][ready]
map01 (360080e50002ee5100000023b50092c4c) dm-11 LSI,VirtualDisk [size=15T][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=1][enabled]
  \_ 3:0:1:1 sdi 8:128 [active][ready]
map00 (360080e50002ee4100000023e50092bf2) dm-10 LSI,VirtualDisk [size=15T][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=1][enabled]
  \_ 3:0:1:0 sdh 8:112 [active][ready]
map09 (360080e50002ee4dc000002f250092c62) dm-7 LSI,VirtualDisk [size=15T][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=1][enabled]
  \_ 3:0:0:3 sde 8:64  [active][ready]
map11 (360080e50002ee4dc000002f650092c84) dm-9 LSI,VirtualDisk [size=15T][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=1][active]
  \_ 3:0:0:5 sdg 8:96  [active][ready]
map08 (360080e50002ec890000002e550092a07) dm-6 LSI,VirtualDisk [size=15T][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=1][enabled]
  \_ 3:0:0:2 sdd 8:48  [active][ready]
map10 (360080e50002ec890000002e950092a27) dm-8 LSI,VirtualDisk [size=15T][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=1][active]
  \_ 3:0:0:4 sdf 8:80  [active][ready]
map07 (360080e50002ee4dc000002ee50092c44) dm-5 LSI,VirtualDisk [size=15T][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=1][active]
  \_ 3:0:0:1 sdc 8:32  [active][ready]
map06 (360080e50002ec890000002e1500929e9) dm-4 LSI,VirtualDisk [size=15T][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=1][active]
  \_ 3:0:0:0 sdb 8:16  [active][ready]
map05 (360080e50002ee5100000024350092c8c) dm-15 LSI,VirtualDisk [size=15T][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=1][enabled]
  \_ 3:0:1:5 sdm 8:192 [active][ready]
map04 (360080e50002ee4100000024650092c31) dm-14 LSI,VirtualDisk [size=15T][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=1][enabled]
  \_ 3:0:1:4 sdl 8:176 [active][ready]

===========

$multipath -r
reload: map06 (360080e50002ec890000002e1500929e9)  LSI,VirtualDisk [size=15T][features=0][hwhandler=0][n/a]
\_ round-robin 0 [prio=1][undef]
  \_ 3:0:0:0 sdb 8:16  [active][ready]
reload: map07 (360080e50002ee4dc000002ee50092c44)  LSI,VirtualDisk [size=15T][features=0][hwhandler=0][n/a]
\_ round-robin 0 [prio=1][undef]
  \_ 3:0:0:1 sdc 8:32  [active][ready]
reload: map08 (360080e50002ec890000002e550092a07)  LSI,VirtualDisk [size=15T][features=0][hwhandler=0][n/a]
\_ round-robin 0 [prio=1][undef]
  \_ 3:0:0:2 sdd 8:48  [active][ready]
reload: map09 (360080e50002ee4dc000002f250092c62)  LSI,VirtualDisk [size=15T][features=0][hwhandler=0][n/a]
\_ round-robin 0 [prio=1][undef]
  \_ 3:0:0:3 sde 8:64  [active][ready]
reload: map10 (360080e50002ec890000002e950092a27)  LSI,VirtualDisk [size=15T][features=0][hwhandler=0][n/a]
\_ round-robin 0 [prio=1][undef]
  \_ 3:0:0:4 sdf 8:80  [active][ready]
reload: map11 (360080e50002ee4dc000002f650092c84)  LSI,VirtualDisk [size=15T][features=0][hwhandler=0][n/a]
\_ round-robin 0 [prio=1][undef]
  \_ 3:0:0:5 sdg 8:96  [active][ready]
reload: map00 (360080e50002ee4100000023e50092bf2)  LSI,VirtualDisk [size=15T][features=0][hwhandler=0][n/a]
\_ round-robin 0 [prio=1][undef]
  \_ 3:0:1:0 sdh 8:112 [active][ready]
reload: map01 (360080e50002ee5100000023b50092c4c)  LSI,VirtualDisk [size=15T][features=0][hwhandler=0][n/a]
\_ round-robin 0 [prio=1][undef]
  \_ 3:0:1:1 sdi 8:128 [active][ready]
reload: map02 (360080e50002ee4100000024250092c11)  LSI,VirtualDisk [size=15T][features=0][hwhandler=0][n/a]
\_ round-robin 0 [prio=1][undef]
  \_ 3:0:1:2 sdj 8:144 [active][ready]
reload: map03 (360080e50002ee5100000023f50092c6c)  LSI,VirtualDisk [size=15T][features=0][hwhandler=0][n/a]
\_ round-robin 0 [prio=1][undef]
  \_ 3:0:1:3 sdk 8:160 [active][ready]
reload: map04 (360080e50002ee4100000024650092c31)  LSI,VirtualDisk [size=15T][features=0][hwhandler=0][n/a]
\_ round-robin 0 [prio=1][undef]
  \_ 3:0:1:4 sdl 8:176 [active][ready]
reload: map05 (360080e50002ee5100000024350092c8c)  LSI,VirtualDisk [size=15T][features=0][hwhandler=0][n/a]
\_ round-robin 0 [prio=1][undef]
  \_ 3:0:1:5 sdm 8:192 [active][ready]

Thanks again for the assistance all, I really appreciate it.

-lewis


On 9/20/16 2:48 PM, Ben Evans wrote:
> multipath is a Linux utility which handles communications from the
> server to the disk array.  It is independent of Lustre or InfiniBand.
> Each OSS had 2 connections to each storage array it communicated with,
> and usually there was a pair of arrays per OSS pair (except for a rare
> handful of our systems which had 1).
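>
> If you want to see what multipath is managing on a node, a couple of
> stock device-mapper commands are enough (dm-13 and map03 here are just
> example names; use whatever 'multipath -ll' reports on your OSS):
>
>   ls /sys/block/dm-13/slaves/    # the sdX path device(s) backing that map
>   dmsetup table map03            # the multipath target line and its paths
>   multipathd -k'show paths'      # live path state as multipathd sees it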
>
> -Ben Evans
>
> On 9/20/16, 2:33 PM, "lustre-discuss on behalf of Lewis Hyatt"
> <lustre-discuss-bounces at lists.lustre.org on behalf of 
> lhyatt at gmail.com>
> wrote:
>
>> Thanks so much for the information, we will look into this ASAP.
>> Forgive my ignorance, but is multipath here referring to some
>> Lustre-specific or InfiniBand-related process? I'm not familiar with
>> it in this context.
>> Thanks again.
>>
>> -lewis
>>
>>
>> On 9/20/16 2:24 PM, Ben Evans wrote:
>>> Lewis,
>>>
>>> Yes, "Not on preferred path" is something that bubbles up through
>>> the TS GUI from multipath.
>>>
>>> A simple thing you can check is running 'multipath -ll' on the OSS
>>> (and its peer) in question and seeing if it reports that one or more
>>> paths are down.  If it's just on one OSS, try running 'multipath -r'.
>>> If it doesn't come back and look OK, then it's most likely a cable
>>> issue, and you can try re-seating it to see if that helps.  It's been
>>> a long time since I diagnosed this, though, and I can't remember the
>>> details of how to associate cables with paths; there should be
>>> indicator lights on the back of everything, and the path that is
>>> down should be red.
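>>>
>>> If 'multipath -r' still only shows a single path per LUN, it can also
>>> be worth checking whether the second SAS port sees the array at the
>>> SCSI layer at all before blaming multipath (host3 is just an example
>>> host number, and the grep pattern is a guess for an LSI SAS driver):
>>>
>>>   ls /sys/class/scsi_host/                         # SAS HBA host numbers on the OSS
>>>   echo "- - -" > /sys/class/scsi_host/host3/scan   # force a rescan of that host
>>>   dmesg | grep -iE 'sas|mpt'                       # look for link or phy errors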
>>>
>>> The high load is probably associated with the cable issue, since 
>>> you're putting more strain on one path.
>>>
>>> -Ben Evans
>>>
>>> On 9/20/16, 12:21 PM, "lustre-discuss on behalf of Lewis Hyatt"
>>> <lustre-discuss-bounces at lists.lustre.org on behalf of 
>>> lhyatt at gmail.com>
>>> wrote:
>>>
>>>> Hello-
>>>>
>>>> I am having an issue with a Lustre 1.8 array that I have little
>>>> hope of figuring out on my own, so I thought I would try here to
>>>> see if anyone might know what this warning/error means. Our array
>>>> was built by Terascala, which no longer exists, so we have no
>>>> support for it and little documentation (and not much in-house
>>>> knowledge). I see the complaint "Not on preferred path" on the GUI
>>>> that we have, which I assume was something custom made by
>>>> Terascala, and I am not even sure what path it is referring to; we
>>>> use InfiniBand for all connections, so it could relate to that, but
>>>> I'm not sure. We see this error on 3 of the 12 OSTs. More
>>>> specifically, we have 2 OSSs, each handling 6 OSTs, and all 3 of
>>>> the "not on optimal path" OSTs are on the same OSS.
>>>>
>>>> We do not know if it's related, but this same OSS is in a very bad 
>>>> state, with very high load average (200), very high I/O wait time, 
>>>> and taking many seconds to respond to each read request, making the 
>>>> array more or less unusable. That's the problem we are trying to fix.
>>>>
>>>> I realize there's not much hope for anyone to help us with that 
>>>> given how little information I am able to provide. But I was hoping 
>>>> someone out there might know what this "not on optimal path" error 
>>>> means, and if it matters for anything or not, so we have somewhere to start.
>>>> Thanks very much!
>>>>
>>>> I could provide screen shots of the management GUI we have, if it 
>>>> would be informative.
>>>>
>>>> -Lewis
>>>
>
_______________________________________________
lustre-discuss mailing list
lustre-discuss at lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org

