[Lustre-devel] imperative recovery
Nicholas Henke
nic at cray.com
Fri Jan 9 07:27:53 PST 2009
Nathaniel Rutman wrote:
> Eric Barton wrote:
>>> Other options I've thought of to explore this idea:
>>>
>>> - MGS notifies clients (somehow) after a server has restarted.
>>>
> This seems like a no-brainer easy win today, and doesn't depend on any
> advanced features like message priority. The only scalability issue
> would seem to be the broadcast of the message to all clients, but this
> is no different than the current broadcast mechanism the MGS employs to
> update client configs. The message from the MGS would be taken as a
> suggestion, "Why don't y'all time out all your current RPCs since I
> noticed OST0004 restarted. Oh, and use failover nid #2." Current
> replay/recovery need not be touched.
This would be a great enhancement for OSS failover or reboot; it is really the
only way we'll get recovery times under ~2.5 x obd_timeout. Adaptive Timeouts
really aren't buying us much here: at scale and under load we are seeing the
timeouts approach the usual static obd_timeout of 300s. It only takes one client
with a higher timeout to push the recovery time out.
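
A rough sketch of that arithmetic, in Python. The model is a simplifying
assumption (recovery is gated by the slowest client, scaled by the ~2.5 x
factor quoted above), not what the recovery code actually computes, and
recovery_window is a made-up name:

```python
def recovery_window(client_timeouts, factor=2.5):
    """Rough estimate of the recovery time in seconds.

    ASSUMPTION: recovery has to wait for the client with the largest
    timeout, multiplied by the ~2.5 x factor mentioned in this thread.
    """
    return factor * max(client_timeouts)


# 999 clients whose adaptive timeout settled at 100s, plus one at 300s.
mostly_fast = [100] * 999 + [300]
print(recovery_window(mostly_fast))     # 750.0
print(recovery_window([100] * 1000))    # 250.0
```

Under this model a single client at 300s costs the same as every client
sitting at the static obd_timeout.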
I do think this will miss a significant case: a combined MGS+MDS, where the MGS
goes down along with the MDS and so can't notify anyone. A majority of our
customers are deploying with this configuration. Perhaps exposing this mechanism
on the clients via a /proc file would be enough - that way a failover framework
could manually trigger the timeout and/or nid switching.
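
To make the /proc idea concrete, here is a minimal Python sketch of what a
failover framework might run on each client. Everything in it is hypothetical:
no such file exists today, so the path is left to the caller, and the
"<target> [<nid>]" line format, the function names and the example nid are
invented for illustration only.

```python
def format_hint(target, failover_nid=None):
    """Build the one-line hint. HYPOTHETICAL format: '<target> [<nid>]'."""
    fields = [target]
    if failover_nid is not None:
        fields.append(failover_nid)
    return " ".join(fields) + "\n"


def notify_client(proc_path, target, failover_nid=None):
    """Write the hint to the client-side trigger file.

    proc_path is whatever file the client would expose (HYPOTHETICAL).
    The intent is that the client times out its outstanding RPCs to
    `target` and, if a nid is given, reconnects using that nid.
    """
    with open(proc_path, "w") as f:
        f.write(format_hint(target, failover_nid))
```

The framework would call something like
notify_client(path, "OST0004", "10.0.0.2@tcp") on every client once it sees the
server come back, with the nid here being a made-up example. As with the MGS
message, the client would treat it as a suggestion, so replay/recovery itself
stays untouched.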
Nic