[Lustre-discuss] Lustre crashes periodically
Arya Mazaheri
aryanet at gmail.com
Wed Oct 9 01:48:33 PDT 2013
Sorry, I have to correct this: "the nodes CANNOT mount the storage and I can't access the Lustre server machine either".
On Wednesday 17 July 1392 at 11:21, Arya Mazaheri wrote:
> Hi everyone,
> I have recently had a problem with our Lustre 1.8 deployment. It crashes periodically in such a way that the nodes can mount the storage and I can't access the Lustre server machine either. I have to restart the machine manually every time to get everything back to normal. I checked the logs, memory usage, and lock counts to see whether any of them might be the cause, but I don't think they account for this issue.
> One interesting symptom I see every time this problem happens is that the network activity lights on the InfiniBand switch blink very fast. I suspect that heavy traffic on the InfiniBand network to the Lustre server may be causing the server to crash. Does this connection seem plausible?
>
> Anyway, I hope some of you have experienced this problem before and can help me understand what is happening and how to keep the server from crashing again!
>
> Thanks,