[Lustre-discuss] Another server question.

Oleg Drokin Oleg.Drokin at Sun.COM
Tue Feb 3 11:22:59 PST 2009


Hello!

On Feb 3, 2009, at 12:21 PM, Charles Taylor wrote:
>>> Many here
>>> have noted that it should be ok, with the exception of files that
>>> were
>>> stored on the downed server,
> Again, not in our experience.    We are currently running 1.6.4.2 and
> have never seen this work.    Losing a single OSS renders the file
> system pretty much unusable until the OSS has recovered.    We could
> be doing something wrong, I suppose but I'm not sure what.

After one of the OSSes goes down, what sort of error messages do you get
on stuck clients that are not trying to access files from that OSS?
Is it anything about problems contacting the MDS, by any chance?
There were some bugs fixed in 1.6.6 and 1.6.7 that could ease this
situation.
E.g. see bugs 13375 and 16006.
So perhaps consider upgrading your system and let us know if it still
does not work for you.

Bye,
     Oleg


