[Lustre-discuss] o2ib possible network problems -- solved
Ms. Megan Larko
dobsonunit at gmail.com
Mon Sep 22 13:17:27 PDT 2008
Hello All,
I honestly do not know how it happened, but the value in
/proc/sys/lustre/timeout on the OSS box was set to 100. All other
systems were set to 1000.
I changed the value on the OSS to 1000, and the error messages on all
of the related systems stopped. I got the idea to re-check it from an
e-mail message by Brian Murrell, archived on os-dir, that refers to
bug 16237. Brian listed this timeout setting as another thing to check.
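For anyone wanting to repeat the check on each node, here is a minimal
sketch. The helper name check_lustre_timeout is made up for illustration
and is not part of Lustre; the path and the value 1000 are the ones from
this message, and other sites may use different values.

```shell
# Hypothetical helper: compare this node's Lustre timeout with the value
# the other nodes are expected to use.
check_lustre_timeout() {
    expected=$1
    # Default path is the one named in this message; a second argument
    # overrides it (useful for testing against a plain file).
    file=${2:-/proc/sys/lustre/timeout}
    if [ ! -r "$file" ]; then
        echo "cannot read $file" >&2
        return 2
    fi
    actual=$(cat "$file")
    if [ "$actual" = "$expected" ]; then
        echo "OK: timeout=$actual"
    else
        echo "MISMATCH: timeout=$actual, expected $expected"
        return 1
    fi
}
```

Running "check_lustre_timeout 1000" on every server and client should
point out a node that disagrees with the rest, as the OSS did here.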
Interestingly enough, the readahead (blockdev --report /dev/sdX) on
the same OSS was set to 672. I have no idea where that came from
either. All of the other systems report a readahead value of 256.
I changed the readahead value on the OSS box first (blockdev
--setra 256 /dev/sdX), but the error messages did not stop until I
fixed the value in /proc/sys/lustre/timeout.
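The readahead comparison can be sketched the same way. The helper name
check_readahead is hypothetical. It uses "blockdev --getra", which
prints the readahead in 512-byte sectors and normally needs root;
/dev/sdX is the placeholder device name used above.

```shell
# Hypothetical helper: compare a block device's readahead with the value
# used on the other systems, and print the command that would fix it.
check_readahead() {
    dev=$1
    expected=$2
    # blockdev --getra prints the readahead in 512-byte sectors.
    actual=$(blockdev --getra "$dev") || return 2
    if [ "$actual" -eq "$expected" ]; then
        echo "OK: $dev readahead=$actual"
    else
        echo "MISMATCH: $dev readahead=$actual, expected $expected"
        echo "to fix: blockdev --setra $expected $dev"
        return 1
    fi
}
```

For example, "check_readahead /dev/sdX 256" would have flagged the 672
on this OSS. The helper only reports; it does not change the setting.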
How could my /proc have such odd values in it?
I will see if the change holds for now. I may have to do something
to make it persist across future reboots.
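One possible approach to persistence, offered as an assumption rather
than something tested on these systems: store the timeout in the
filesystem configuration with "lctl conf_param" on the MGS, and reapply
the readahead from a boot script on the OSS. The helper below is
hypothetical and only prints the commands, so nothing changes until
they have been reviewed. The filesystem name and the boot script
location are site-specific placeholders.

```shell
# Hypothetical dry-run helper: prints, but does not run, commands that
# could make the two settings survive a reboot.
print_persist_commands() {
    fsname=$1    # Lustre filesystem name (site-specific placeholder)
    timeout=$2   # e.g. 1000, the value from this message
    dev=$3       # OSS block device, e.g. /dev/sdX
    ra=$4        # readahead in 512-byte sectors, e.g. 256
    # Intended to be run on the MGS.
    echo "lctl conf_param ${fsname}.sys.timeout=${timeout}"
    # Intended for a boot script on the OSS (for example /etc/rc.local).
    echo "blockdev --setra ${ra} ${dev}"
}
```

Calling "print_persist_commands testfs 1000 /dev/sdX 256" prints the two
commands for a filesystem named testfs, which is an example name only.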
Cheers!
megan