[Lustre-devel] imperative recovery
Andreas Dilger
adilger at sun.com
Fri Jan 9 16:50:16 PST 2009
On Jan 09, 2009 09:04 -0800, Robert Read wrote:
> On Jan 9, 2009, at 07:27 , Nicholas Henke wrote:
> > This would be a great enhancement for OSS failover or reboot; it is
> > really the only way we'll get to recovery times under ~2.5 x obd_timeout.
> >
> > I do think this will miss a significant case: combo MGS+MDS. A
> > majority of our customers are deploying with this configuration.
> > Perhaps exposing this mechanism on the clients via a /proc file
> > would be enough - that way a failover framework
> > could manually trigger the timeout and/or nid switching.
>
> Yes, exactly what I was thinking. Exposing this feature via proc (or
> lctl) on the clients is the first step. It has minimal impact,
> requires no changes to the server, and should integrate well with
> existing failover frameworks. We also need to get the server to end
> recovery sooner (without waiting for all the stale exports), but VBR
> should help with that.
Hey, wouldn't (essentially) "lctl --device $foo recover" do the trick
today?
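
A minimal sketch of how a failover framework might apply that command on a
client, assuming `lctl dl` prints one device per line with the columns
index, state, type, name, uuid, refcount. The helper name `recover_cmds` is
hypothetical and not part of Lustre; it only prints the commands so they can
be reviewed before anything is run.

```shell
#!/bin/sh
# Sketch only: read `lctl dl`-style output on stdin and print one
# "lctl --device <name> recover" command per device of the given type.
# Assumption: column 3 is the device type and column 4 is the device name.
# recover_cmds is a hypothetical helper, not a Lustre tool.
recover_cmds() {
    dev_type="$1"
    awk -v t="$dev_type" '$3 == t { print "lctl --device " $4 " recover" }'
}
```

For example, `lctl dl | recover_cmds osc` would list the commands for the
OSC devices; piping that list to `sh` would execute them. Whether this
shortens recovery in practice is the open question above.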
Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.