[Lustre-devel] Recovering opens by reconstruction

Tue Jul 7 09:03:17 PDT 2009

On Tue, Jul 07, 2009 at 01:56:52PM +0400, Alex Zhuravlev wrote:
> I think it'd be slightly easier to introduce two notions of replay:

As I understand it, 'replay' has a very specific meaning: re-send an RPC
with the 'replay' bit set in the ptlrpc header.

> [...]
> my old thought was that instead of introducing special new open-by-fid
> RPC we should try to implement open in terms of LDLM locks because
> it's in-core state (though with specific tracking of unlinked files).
> given this we'd automatically get single mechanism for all in-core
> states and we'd get rid of special paths for open replays.

Hmmm, but open by FID gives the MDS a chance to check capabilities.
Yes, that's probably not terribly important as long as the OSSes also
check capabilities.

Also, there's the unlink issue to worry about.  Mikhail's proposal for
that is to defer unlinks until after open state recovery (in this case:
until after DLM recovery).  That would work, I think.  Also, you could
have the kind of DLM locks used for open state tracking recovered first,
then transactions, then all other types of locks.
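
To make the second ordering concrete, here is a rough sketch, not Lustre
code. The phase names, the event tuples and the recover() helper are all
made up for illustration. The only point is that if the open-tracking
locks are recovered before transactions are replayed, a replayed unlink
can see that the file is still open and orphan it instead of destroying
it.

```python
from enum import IntEnum


class Phase(IntEnum):
    """Hypothetical recovery phases, in the order suggested above."""
    OPEN_LOCKS = 0     # DLM locks used for open state tracking
    TRANSACTIONS = 1   # replayed transactions, including unlinks
    OTHER_LOCKS = 2    # all other types of locks


def recover(events):
    """Process (phase, op, fid) tuples in phase order.

    Returns (destroyed, orphaned): FIDs whose unlink could complete, and
    FIDs whose unlink had to keep the object because it is still open.
    """
    open_fids = set()
    destroyed, orphaned = [], []
    # sorted() is stable, so events keep their arrival order within a phase.
    for phase, op, fid in sorted(events, key=lambda e: e[0]):
        if phase == Phase.OPEN_LOCKS and op == "open_lock":
            open_fids.add(fid)
        elif phase == Phase.TRANSACTIONS and op == "unlink":
            (orphaned if fid in open_fids else destroyed).append(fid)
        # Other locks carry no open state in this sketch, so nothing to do.
    return destroyed, orphaned


# Events arrive in arbitrary order; the unlink of fid1 arrives before the
# lock that says fid1 is open.
events = [
    (Phase.TRANSACTIONS, "unlink", "fid1"),
    (Phase.OPEN_LOCKS, "open_lock", "fid1"),
    (Phase.TRANSACTIONS, "unlink", "fid2"),
    (Phase.OTHER_LOCKS, "lock", "fid3"),
]
destroyed, orphaned = recover(events)
```

Here fid1 ends up orphaned and fid2 destroyed, even though the unlink of
fid1 was received first. Deferring unlinks until after DLM recovery, as
Mikhail proposes, would get the same result with a different ordering.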

Here's a question: what consumes more memory on the MDS: open state or a
DLM lock?

Nico
--