[lustre-discuss] "Not on preferred path" error

Joe Landman landman at scalableinformatics.com
Tue Sep 20 10:51:08 PDT 2016


On 09/20/2016 01:39 PM, Lewis Hyatt wrote:
> Thanks very much for the suggestions. dmesg output is here:
> http://pastebin.com/jCafCZiZ
> We don't see any disk-related stuff there, and also our GUI shows all
> the RAID arrays as being fine.

Hmmm ... I rarely trust GUIs for RAID.  Do you have underlying CLI 
tools you can use for a sanity check?
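
As one example of such a check: if the arrays happen to be Linux md 
software RAID (an assumption on my part; a hardware controller would need 
its vendor's CLI instead), the kernel's own view is in /proc/mdstat.  A 
minimal sketch, with degraded_md_arrays being a made-up helper name, that 
prints any array whose member-status field shows a missing disk:

```shell
# Hypothetical helper: list md arrays whose status field (e.g. [U_])
# contains "_", meaning a member disk is missing or failed.
# Reads mdstat-formatted text from the file in $1 (default /proc/mdstat).
degraded_md_arrays() {
    awk '
        # Remember the array name from lines like "md0 : active raid1 ..."
        /^md[0-9a-zA-Z_]* :/ { name = $1 }
        # The status field contains only U (up) and _ (down) characters
        match($0, /\[[U_]+\]/) {
            if (index(substr($0, RSTART, RLENGTH), "_") > 0) print name
        }
    ' "${1:-/proc/mdstat}"
}
```

Running degraded_md_arrays with no arguments on the OSS reads the live 
/proc/mdstat.  No output means md considers every array complete, which 
still says nothing about a disk that is slow rather than failed.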

> If anything in there jumps out at you, I'd really appreciate your
> thoughts! We are almost certainly going to reboot the affected OSS later
> today to see how that goes.

Not seeing anything leap out, other than that two particular targets, 
twlstr-OST000b and twlstr-OST0006, appear to be "slow".  This looks like 
what is causing the client evictions, lock bits, etc.

The question is why these two OSTs are slow.  What is the underlying 
RAID, how many operations are queued up, etc.?
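
For the queue depth question, a rough sketch assuming the OSTs sit on 
ordinary Linux block devices (I don't know which devices back those two 
OSTs, so that mapping is left to you).  The 12th column of 
/proc/diskstats is the number of I/Os currently in progress on each 
device; busy_devices is just a name I picked:

```shell
# Hypothetical helper: print "device in_flight" for every block device
# that has I/O in progress right now.
# Reads diskstats-formatted text from the file in $1 (default /proc/diskstats).
busy_devices() {
    # Column 3 is the device name, column 12 the I/Os currently in progress
    awk '$12 > 0 { print $3, $12 }' "${1:-/proc/diskstats}"
}
```

This is an instantaneous snapshot, so run it a few times in a row.  A 
device whose count stays high while the others drain is worth a closer 
look.  iostat -x 1 from sysstat gives a similar picture as a running 
average.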

A tool we recommend for a (nearly instantaneous) holistic view of a 
system is glances, which you can install via pip

	pip install glances

then run it as

	glances -t 1

to get a second-by-second view of your system.  dstat is also good.

Dumb question ... what does

	swapon -s

report?  I am assuming you aren't swapping (and don't have swap enabled 
on the system), but it never hurts to ask.
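
If you want a single number out of that, here is a small sketch that 
totals the Used column.  swapon -s prints the same table as /proc/swaps, 
and swap_used_kb is a made-up name:

```shell
# Hypothetical helper: total swap in use, in KiB, summed over all swap areas.
# Reads swaps-formatted text from the file in $1 (default /proc/swaps).
swap_used_kb() {
    # Skip the header line; column 4 is "Used". Adding 0 forces a numeric
    # result, so a header-only table (no swap configured) prints 0.
    awk 'NR > 1 { used += $4 } END { print used + 0 }' "${1:-/proc/swaps}"
}
```

A nonzero total only means something was swapped out at some point.  To 
see whether the box is swapping right now, watch the si and so columns 
of vmstat 1.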

-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics, Inc.
e: landman at scalableinformatics.com
w: http://scalableinformatics.com
t: @scalableinfo
p: +1 734 786 8423 x121
c: +1 734 612 4615

