[Lustre-devel] Queries regarding LDLM_ENQUEUE

Nicolas Williams Nicolas.Williams at Oracle.com
Wed Oct 20 10:13:49 PDT 2010


On Wed, Oct 20, 2010 at 11:00:53AM -0600, Andreas Dilger wrote:
> On 2010-10-20, at 10:46, Paul Nowoczynski <pauln at psc.edu> wrote:
> >> The name_to_handle() only needs to be called on a single node, and
> >> open_by_handle() is called on the other nodes. I agree that this
> >> doesn't avoid the full O(n) RPCs for the open itself, but at least
> >> it does avoid the full path traversal from every client and on the
> >> MDS (replacing it with an MPI broadcast of the handle).
> > 
> > Excuse my ignorance, but why does open_by_handle() need to issue an
> > RPC?  If it's to obtain the layout, couldn't the layout be encoded
> > into the 'handle'?
> 
> In theory, yes. Practically, there is a size limit on the handle, and
> in large filesystems the layout is larger than this limit. 
> 
> Also, it depends on whether we want the MDS to have consistent
> behavior with the resulting open file descriptor or not.
> 
> I suppose in many cases it would be possible to fake out an open file
> on the client without telling the MDS, but then there will be strange
> problems in some cases (e.g. stat() of the file, errors on close,
> etc.) that would result since the MDS won't know anything about the
> other openers. Maybe that is acceptable, I don't know. 

Well, if we're going to add openg() (or whatever it ends up being
called), we might as well add variants of stat() that don't fetch the
size when the app doesn't need it, and forget about SOM (Size on MDS),
or forget about SOM when we know that a file might be open by unknown
clients (recovery issues here).

Another possibility is that the handle encodes the current size, and
that to write past that size requires an RPC to establish open state,
but this ignores truncation.
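
A rough sketch of that size-in-the-handle idea, to make the truncation
hole concrete. Everything here is hypothetical (Handle, FakeMDS, Client
and establish_open() are illustration names, not Lustre interfaces):
the handle carries the size seen when it was created, writes inside
that size skip the MDS, and the first write past it costs one RPC.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Handle:
    fid: int    # hypothetical file identifier
    size: int   # file size at the time the handle was created


class FakeMDS:
    """Stand-in for the MDS; it only counts the open RPCs it receives."""

    def __init__(self, size):
        self.size = size
        self.open_rpcs = 0

    def establish_open(self, handle):
        self.open_rpcs += 1
        return self.size  # current size, which may differ from handle.size


class Client:
    def __init__(self, mds, handle):
        self.mds = mds
        self.known_size = handle.size
        self.handle = handle
        self.has_open_state = False

    def write(self, offset, length):
        if offset + length > self.known_size and not self.has_open_state:
            # Writing past the encoded size: establish open state first.
            self.known_size = self.mds.establish_open(self.handle)
            self.has_open_state = True
        return self.has_open_state


mds = FakeMDS(size=4096)
handle = Handle(fid=1, size=4096)

writer = Client(mds, handle)
writer.write(0, 1024)        # inside the encoded size: no RPC
writer.write(4096, 1024)     # past the encoded size: one RPC
rpcs_after_writer = mds.open_rpcs

mds.size = 0                 # some other client truncates the file
late = Client(mds, handle)   # its handle still says size=4096
late_has_state = late.write(0, 1024)   # no RPC, truncation goes unnoticed
```

The last three lines are the problem case: the late client trusts the
stale size in its handle and never talks to the MDS.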

Another possibility is to say that a handle is only good as long as the
original file descriptor remains open (recovery issues here), and that
the client holding that descriptor can tell the MDS that it will be
sharing its handle with other clients.  Or that client could tell the
MDS exactly which clients will share the handle (recovery issues here
too).
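
As a minimal sketch of both variants, assuming a made-up MDS interface
(share_handle(), close_original() and open_by_handle() below are
hypothetical names, and recovery is not modeled at all): passing no
client list means "any client may use this handle", passing a list
restricts it, and closing the original descriptor kills the handle.

```python
class StaleHandle(Exception):
    """The handle is no longer backed by an open file descriptor."""


class FakeMDS:
    """Stand-in for the MDS: tracks which handles are shared, and with whom."""

    def __init__(self):
        # handle -> set of client names, or None meaning "any client"
        self.shared = {}

    def share_handle(self, handle, clients=None):
        # Called by the client that holds the original file descriptor.
        self.shared[handle] = None if clients is None else set(clients)

    def close_original(self, handle):
        # Closing the original descriptor invalidates the handle.
        self.shared.pop(handle, None)

    def open_by_handle(self, handle, client):
        if handle not in self.shared:
            raise StaleHandle(handle)
        allowed = self.shared[handle]
        if allowed is not None and client not in allowed:
            raise PermissionError(client)
        return (handle, client)  # placeholder for real open state


mds = FakeMDS()
mds.share_handle("h1", clients=["node1", "node2"])
opened = mds.open_by_handle("h1", "node2")   # listed client: allowed

try:
    mds.open_by_handle("h1", "node9")        # not on the list
    rejected = False
except PermissionError:
    rejected = True

mds.close_original("h1")
try:
    mds.open_by_handle("h1", "node1")
    stale = False
except StaleHandle:
    stale = True   # the handle died with the original descriptor
```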

Some sort of additional RPC seems hard to avoid here, but maybe it could
be async for clients opening by handle.

Nico