[Lustre-discuss] Recovery timeouts

Thomas Roth t.roth at gsi.de
Thu Mar 5 13:30:32 PST 2009


Hi,

yet another question has come up due to our recent bad luck with our MDS:
When a restarted MDS goes into recovery, it reports the ETA in /proc/fs/lustre/mds/Name/recovery_status.

How is this time calculated?

I'm asking because in our recent cases, recovery started with ETAs of 15000 to 21000 sec.
We are using Lustre 1.6.5.1 with adaptive timeouts enabled, and I wonder how the min/max values
set for them affect the recovery times.

Our old test cluster does not use adaptive timeouts, only the notorious static timeout value of 1000s.
Recovery of that MDS took ~3000s the last time.
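
To put these figures in perspective, here is a minimal sketch that converts such an ETA into hours and minutes. It assumes recovery_status is plain text made of "key: value" lines and that the remaining time appears in seconds under a field called time_remaining; both the layout and the field name are assumptions, and the sample content is made up for illustration.

```python
def parse_status(text):
    """Parse 'key: value' lines into a dict (assumed file layout)."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a colon
            fields[key.strip()] = value.strip()
    return fields


def format_eta(seconds):
    """Render a duration given in seconds as 'Hh MMm SSs'."""
    hours, rest = divmod(int(seconds), 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours}h {minutes:02d}m {secs:02d}s"


# Hypothetical sample content; the real file may look different.
SAMPLE = """status: RECOVERING
time_remaining: 15000
"""

if __name__ == "__main__":
    status = parse_status(SAMPLE)
    print(format_eta(status["time_remaining"]))
```

With the numbers above, 15000 to 21000 sec corresponds to roughly 4 to 6 hours, compared with about 50 minutes for the ~3000s recovery on the test cluster.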

Regards,
Thomas