[Lustre-discuss] Lustre I/O server (OSS) goes down, another I/O server takes the service overloads itself and performance goes down.

Joan Marc joanmarcriera at gmail.com
Thu Aug 25 09:37:11 PDT 2011


Hello,

I'm seeing something that I cannot understand. I'm quite new to Lustre.


n4 and n5 are two I/O servers, each one taking care of different datasets.

When one of them goes down, the other is supposed to take over the service
as the backup node. But n4 goes down without being under heavy load, and
when n5 takes over the service its CPU and memory usage go to the top and
performance goes to the bottom.

Can someone tell me which lines to look at, so that I can start
investigating this issue?

Here is a 30-minute syslog excerpt with Lustre-related messages from n4:
http://pastebin.com/q1iGwDxw

Here is almost the same 30 minutes from n5:
http://pastebin.com/4Bg5repa


Many thanks.

Marc