[Lustre-discuss] How to debug a client's eviction.

Theodoros Stylianos Kondylis kondil at gmail.com
Mon Feb 4 09:06:06 PST 2013


Hello everyone,

We are facing a problem on our production system. A user's application
creates 12,000 files concurrently (containing the solution), but for some
reason one of the user's compute nodes gets evicted because of a timeout
before the writes complete, so the files are not written properly.

I am trying to debug this situation, so I did the following:

>> echo 1 > /proc/sys/lustre/dump_on_eviction
>> echo 1 > /proc/sys/lustre/dump_on_timeout

The /proc/sys/lnet/debug file currently contains:

ioctl neterror warning error emerg ha config console
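
For reference, the two settings above can be wrapped in a small shell
function, so that they are reapplied and the debug mask is recorded in one
step. This is only a sketch: the function name enable_eviction_dumps and its
optional root-directory argument are hypothetical, added so the function can
be tried against a scratch directory. Only the three /proc paths quoted
above are assumed to exist on the client.

```shell
#!/bin/sh
# Sketch: set the two dump switches quoted in this message and print the
# current debug mask. The optional first argument (default /proc/sys) is a
# hypothetical convenience for dry runs; it is not a Lustre interface.
enable_eviction_dumps() {
    proc_root="${1:-/proc/sys}"

    for name in dump_on_eviction dump_on_timeout; do
        f="$proc_root/lustre/$name"
        if [ -w "$f" ]; then
            # Same effect as: echo 1 > /proc/sys/lustre/<name>
            echo 1 > "$f"
            echo "$name = $(cat "$f")"
        else
            echo "skipped $name: $f is not writable" >&2
        fi
    done

    # Print the mask so it can be saved next to any dump that is produced.
    if [ -r "$proc_root/lnet/debug" ]; then
        echo "debug mask: $(cat "$proc_root/lnet/debug")"
    fi
}
```

On the affected client this would be called as root with no argument
(enable_eviction_dumps), which operates on /proc/sys.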

Is there any other debug flag I can enable that would help me debug this
situation?

Thank you in advance for any reply,
Stelios.