[Lustre-discuss] speedy server shutdown

Andreas Dilger adilger at Sun.COM
Mon Feb 9 22:42:09 PST 2009


On Feb 08, 2009  23:28 -0500, Robin Humble wrote:
> when shutting down our OSS's and then MDS's we often wait 330s for each
> set of umount's to finish eg.
>   Feb  2 03:20:06 xemds2 kernel: Lustre: Mount still busy with 68 refs, waiting for 330 secs...
>   Feb  2 03:20:11 xemds2 kernel: Lustre: Mount still busy with 68 refs, waiting for 325 secs...
>   ...
> is there a way to speed this up?

Please search Bugzilla for this; I think a related bug was fixed in more
recent versions.
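
In the meantime, it may help to put a number on the delay. The following is only a rough sketch, not anything shipped with Lustre: it parses the "Mount still busy" console messages quoted above and estimates the worst-case shutdown time. The names parse_busy_line and worst_case_shutdown_secs are hypothetical, and the sketch assumes the OSS and MDS stages run strictly one after the other and that each waits the full 330 secs.

```python
import re

# Matches the console message quoted above, e.g.
# "Lustre: Mount still busy with 68 refs, waiting for 330 secs..."
BUSY_RE = re.compile(r"Mount still busy with (\d+) refs, waiting for (\d+) secs")


def parse_busy_line(line):
    """Return (refs, secs_remaining) for a matching log line, else None."""
    m = BUSY_RE.search(line)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2))


def worst_case_shutdown_secs(wait_per_stage, stages=2):
    """Assumption: stages (OSS umounts, then MDS umounts) are sequential
    and each one waits the full timeout before giving up."""
    return wait_per_stage * stages


if __name__ == "__main__":
    sample = ("Feb  2 03:20:06 xemds2 kernel: Lustre: Mount still busy "
              "with 68 refs, waiting for 330 secs...")
    refs, wait = parse_busy_line(sample)
    total = worst_case_shutdown_secs(wait)
    # Compare the estimate against the 10 minute UPS budget from the question.
    print(f"refs={refs} wait={wait}s total={total}s ({total / 60:.1f} min)")
```

Under those assumptions, two full 330 sec waits add up to 660 secs, or 11 minutes, which is already over the 10 minute target before counting any other shutdown work.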

> we're interested in the (perhaps unusual) case where all clients are gone
> because the power has failed, and the Lustre servers are running on UPS
> and need to be shut down ASAP.
> 
> the tangible reward for a quick shutdown is that we can buy a lower
> capacity (cheaper) UPS if we can reliably and cleanly shutdown all the
> Lustre servers in <10mins, and preferably <3 minutes. if we're tweaking
> timeouts to do this then hopefully we can tweak them just before the
> shutdown and avoid running short timeouts in normal operation.
> 
> I'm probably missing something obvious, but I have looked through a
> bunch of /proc/{fs/lustre,sys/lnet,sys/lustre} entries and the
> Operations Manual and I can't actually see where the default 330s comes
> from... ???
> it seems to be quite repeatable for both OSS's and MDS's.
> 
> we're using Lustre 1.6.6 or 1.6.5.1 on servers and patchless 1.6.4.3 on
> clients with x86_64 RHEL 5.2 everywhere.
> thanks for any help!
> 
> cheers,
> robin

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.



