[Lustre-discuss] 1.6.4.1 - active client evicted

Oleg Drokin Oleg.Drokin at Sun.COM
Fri Jan 11 06:49:33 PST 2008


Hello!

On Jan 11, 2008, at 4:00 AM, Niklas Edmundsson wrote:

>> As for your original message - hard to tell what caused it. We can see
>> that the servers decided the client was unresponsive.
>> Could a lost network packet have caused it, for example?
>> Weren't there any other messages on the client at around 12:20 (that's
>> when it was evicted) or before that?
>> Because 12:40 is already 20 minutes past the eviction.
> That's the weird thing - there's nothing lustre-related logged before
> that on the client that day! The client seems oblivious to the fact
> that it's been evicted, and this was while it was doing IO... Also the
> clocks are synced by ntp, and thus not off by much...

The clocks are not off at all, because there is a message in the server
log corresponding to the first error in the client log you provided.
The fact that there were no prior messages in the client log is very
strange.

> I could accept network errors etc as an explanation, but then I would
> have assumed that the client would have logged stuff, tried
> reconnecting etc... As it was it was simply dead in the water until I
> rebooted the thing.

That's true.

> What mechanism does Lustre use to check if a peer is up? Since lctl
> ping worked between all nodes I suspect it uses something more
> involved. Can I trigger the same check using lctl?

Lustre sends periodic PING messages to servers with which it has had no
communication for some time. Any filesystem activity that triggers
network traffic toward the servers also works as a health check.
Since we did not see any timeouts in the client logs, it looks like there
was no traffic from the client to the servers for those 20 minutes at all,
not even Lustre-generated pings, which is pretty strange.
Too bad it is too late to ask for Lustre debug logs at this point.
If you can reproduce the problem somehow, it would be interesting to
get the lctl dk output.
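
To illustrate the idle-triggered ping idea described above, here is a toy sketch in Python. It is not Lustre's actual code: the class name, method names, server names and the 25-second interval are all hypothetical, chosen only to show that regular traffic resets a per-server idle timer and that a PING is sent only to servers that have been quiet for too long.

```python
from dataclasses import dataclass, field


@dataclass
class IdlePinger:
    """Toy model of an idle-triggered health check (not Lustre's actual code)."""
    ping_interval: float  # hypothetical idle threshold, in seconds
    last_contact: dict = field(default_factory=dict)  # server name -> timestamp

    def record_traffic(self, server: str, now: float) -> None:
        # Any request sent to a server counts as contact with it.
        self.last_contact[server] = now

    def due_pings(self, now: float) -> list:
        # Servers idle for at least ping_interval are due for a PING.
        return sorted(s for s, t in self.last_contact.items()
                      if now - t >= self.ping_interval)

    def tick(self, now: float) -> list:
        # "Send" a PING to every idle server; here that only resets its timer.
        pinged = self.due_pings(now)
        for server in pinged:
            self.record_traffic(server, now)
        return pinged


pinger = IdlePinger(ping_interval=25.0)
pinger.record_traffic("mds", now=0.0)
pinger.record_traffic("ost1", now=0.0)
pinger.record_traffic("ost1", now=20.0)  # regular I/O to ost1 resets its idle timer
print(pinger.tick(now=30.0))             # only "mds" has been idle long enough
```

In this model a client that neither does I/O nor calls tick() sends nothing at all, which is the kind of silence the server would eventually treat as an unresponsive client.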

Bye,
     Oleg



