[Lustre-discuss] Fast error reporting

Ms. Megan Larko dobsonunit at gmail.com
Sat Mar 8 13:21:40 PST 2014


Just my $0.02 here.

I am in agreement with Mr. A. Dilger.  I vote in favor of the present
Lustre default behavior.  The pausing of operations is a good Lustre
feature for us.  I have worked with various systems in which a network
hiccup will not crash the job.  With the present Lustre behavior, the job
will just pause for a bit (a configurable amount of time, if I recall
correctly).  We have left the default value in place.  It prevents us from
having jobs fail because of momentary (one minute or less) holds in the
network traffic.

If Yao wishes for a shorter time before the job fails, I think he should
have the freedom to configure the value that works for him.

My opinion, YMMV.
Cheers,
megan