[Lustre-devel] SOM Recovery of open files

Mon Feb 23 06:56:24 PST 2009

Please also consider the security implications.  Can all client
actions be checked without extra message passing?  Are any
special capabilities required?  To what extent must clients
be trusted?  What will go wrong if this trust is abused?

    Cheers,
              Eric

> -----Original Message-----
> From: lustre-devel-bounces at lists.lustre.org [mailto:lustre-devel-bounces at lists.lustre.org] On Behalf Of Andreas Dilger
> Sent: 21 February 2009 12:21 AM
> To: Vitaly Fertman
> Cc: Oleg Drokin; Lustre Development Mailing List
> Subject: Re: [Lustre-devel] SOM Recovery of open files
> 
> On Feb 01, 2009  20:24 +0300, Vitaly Fertman wrote:
> > On Feb 1, 2009, at 5:45 PM, Vitaly Fertman wrote:
> >> Thus the only problem here is a stale fh on a client, which may let the
> >> client write to the file after the SOM cache has been re-obtained on
> >> the MDS. The problem consists of 2 parts:
> >>
> >> - the ability of a client to write to an open file without a
> >>   connection to the MDS;
> 
> With the layout lock this would not be possible.  The client would be
> required to have the layout lock (hence be connected to the MDS) in
> order to generate a new write.
> 
> >> - an absence of file re-opening on re-connection.
> >
> > I forgot to mention truncate (locked & lockless) and lockless IO.
> >
> > The MDS must be aware of the open IOEpoch for truncate as well; otherwise
> > obd_punches must be blocked. The situation is pretty rare, as we do not
> > cache punches on clients and they go away right after md_setattr
> > completes. But consider what happens if, at the time of the client's
> > eviction from the MDS, the connection between this client and an OST is
> > unstable, so that punches hang in the re-send list for a while, long
> > enough for another client to modify the file,
> 
> If a second client is trying to modify the file while the first one is
> having OST connection problems, then the first client would either
> succeed in flushing its cache, or be evicted by the OST before the second
> client can get the extent locks needed to truncate the file.
> 
> The same is true whether the truncate is from a remote client (with
> client lock) or a lockless truncate (OST holds lock).
> 
> > the MDS gets a new SOM cache, and a later punch modifies the file.
> >
> > The same for lockless IO.
> >
> > Locked truncate is also involved, as it could hang in the re-send list
> > with the lock enqueue, so that enqueue+punch happens after the MDS
> > re-validates the SOM cache.
> 
> In this case the client will not even begin to send the truncate RPC
> until the lock enqueue has succeeded.
> 
> > Thus:
> > - block truncate and lockless IO;
> > - "re-open" truncate on re-connection as well as regularly opened files.
> >
> > This must happen even if SOM is disabled but the client already supports
> > it (clients are upgraded first). Otherwise, the interoperability will be
> > broken.
> 
> It isn't clear to me why the done_writing RPC needs to be sent separately
> for each truncate.  The client is already sending an RPC to the MDS for
> each truncate to update the size there, if the file is not open (and
> currently has no objects), and to verify file write permission (to avoid
> truncating in-use executables).
> 
> Now, if this only happens on recovery I don't have a huge objection.  If
> the "done_writing" RPC needs to be sent to the MDS for every single truncate,
> then that is a major performance concern.
> 
> Cheers, Andreas
> --
> Andreas Dilger
> Sr. Staff Engineer, Lustre Group
> Sun Microsystems of Canada, Inc.
> 
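
The ordering argument in the reply above (a truncate is only sent once the
lock enqueue has succeeded, and a client with OST connection problems is
either flushed or evicted before another client gets the extent lock) can be
sketched with a toy model. This is only an illustration under stated
assumptions: ToyOST, locked_truncate and the single whole-file lock are
hypothetical simplifications, not Lustre APIs or the actual DLM behaviour.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class ToyOST:
    """Hypothetical stand-in for an OST with one whole-file extent lock."""
    size: int = 0
    lock_holder: Optional[str] = None
    evicted: Set[str] = field(default_factory=set)

    def enqueue(self, client: str) -> bool:
        """Grant the lock only if it is free (or already ours) and we are not evicted."""
        if client in self.evicted:
            return False
        if self.lock_holder not in (None, client):
            return False
        self.lock_holder = client
        return True

    def evict(self, client: str) -> None:
        """Evict an unresponsive client and drop any lock it holds."""
        self.evicted.add(client)
        if self.lock_holder == client:
            self.lock_holder = None

    def punch(self, client: str, new_size: int) -> bool:
        """Apply a truncate only for the current, non-evicted lock holder."""
        if client in self.evicted or self.lock_holder != client:
            return False
        self.size = new_size
        return True


def locked_truncate(ost: ToyOST, client: str, new_size: int) -> bool:
    """Send the punch only after the lock enqueue has succeeded."""
    if not ost.enqueue(client):
        return False  # no lock granted, so no truncate is sent at all
    return ost.punch(client, new_size)
```

In this model, while client "A" holds the lock, locked_truncate(ost, "B", 10)
returns False and the size is unchanged. After ost.evict("A"), the same call
from "B" succeeds, and a late punch from "A" is rejected, so a stale punch
cannot land after the second client has modified the file.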