[Lustre-devel] Recovering opens by reconstruction
    Alex Zhuravlev 
    Alex.Zhuravlev at Sun.COM
       
    Tue Jul  7 02:56:52 PDT 2009
    
    
  
Nicolas Williams wrote:
> Also, as Oleg explained to me, most open state is for files whose opens
> committed long ago, so most open state is recovered before other
> transactions.  Which means we already have a separate open state
> recovery phase -- it just isn't explicit.  So the only thing that
> changes in my proposal is that all committed open state will be
> recovered by anonymous open by FID reconstruction instead of by replay,
> while all other transactions (including as-yet uncommitted opens) will be
> recovered by replay.
I think it'd be slightly easier to introduce two notions of replay:
1) on-disk replay -- we try to recover some on-disk state from the client's cache:
    regular requests like mkdir, unlink, rename, setattr, etc.
2) in-core replay -- we try to recover some in-core state from the client's cache:
    ldlm locks, open files
the thing is that open(2) is quite interesting in this regard because it does
(1) *and* (2). I believe this is why we used the mechanism for (1) to recover (2) as well.
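
to make the split concrete, here is a minimal sketch in Python. it is only an illustration: the ReplayKind type, the REPLAY_KIND table and the needs_both() helper are hypothetical names, not Lustre identifiers, and the table lists only the operations mentioned above.

```python
from enum import Flag, auto


class ReplayKind(Flag):
    ON_DISK = auto()  # (1) state that has to be re-applied to the server's disk
    IN_CORE = auto()  # (2) state that lives only in the server's memory


# Hypothetical classification of the operations named in this mail.
REPLAY_KIND = {
    "mkdir": ReplayKind.ON_DISK,
    "unlink": ReplayKind.ON_DISK,
    "rename": ReplayKind.ON_DISK,
    "setattr": ReplayKind.ON_DISK,
    "ldlm_lock": ReplayKind.IN_CORE,
    # open is the odd one: it may change the disk and it creates in-core state
    "open": ReplayKind.ON_DISK | ReplayKind.IN_CORE,
}


def needs_both(op: str) -> bool:
    """Return True if recovering `op` involves both notions of replay."""
    kind = REPLAY_KIND[op]
    return ReplayKind.ON_DISK in kind and ReplayKind.IN_CORE in kind


both = [op for op in REPLAY_KIND if needs_both(op)]
print(both)
```

in this toy table open is the only entry that carries both flags, which is the reason it ends up on the on-disk replay path today.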
my old thought was that instead of introducing a special new open-by-FID RPC we
should try to implement open in terms of LDLM locks, because it's in-core state
(though with specific tracking of unlinked files). given this we'd automatically
get a single mechanism for all in-core state and we'd get rid of the special paths
for open replays.
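
a rough sketch of what a single mechanism could look like, again in Python and again purely as an assumption: InCoreState, Server, recover() and the "lock"/"open" tags are made up for illustration and do not correspond to the real LDLM API. the server rebuilds all in-core state through one path and only adds extra bookkeeping for opens on files that no longer have a name.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class InCoreState:
    kind: str  # hypothetical tag: "lock" or "open"
    fid: str   # identifier of the file the state refers to
    mode: str  # lock mode or open mode, kept opaque here


@dataclass
class Server:
    unlinked: set = field(default_factory=set)  # FIDs with no name left on disk
    states: set = field(default_factory=set)    # recovered in-core state
    orphans: set = field(default_factory=set)   # unlinked FIDs that are still open

    def recover(self, client_states):
        """Rebuild in-core state from what a client still holds in its cache."""
        for st in client_states:
            # one path for locks and opens alike
            self.states.add(st)
            # the only open-specific step: keep unlinked-but-open files alive
            if st.kind == "open" and st.fid in self.unlinked:
                self.orphans.add(st.fid)


server = Server(unlinked={"fid-2"})
server.recover([
    InCoreState("lock", "fid-1", "PR"),
    InCoreState("open", "fid-2", "O_RDWR"),
])
print(len(server.states), sorted(server.orphans))
```

the point of the sketch is only that recover() has no separate replay path for opens; the unlinked-file tracking is the one special case left.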
thanks, Alex
    
    