[Lustre-devel] Sub Tree lock ideas.
Oleg.Drokin at Sun.COM
Tue Feb 3 01:39:59 PST 2009
On Feb 3, 2009, at 4:04 AM, Andreas Dilger wrote:
>> It would be a lot of batching in many common use cases like "untar a
>> file", "create working files for applications, all in the same dir/
>> dir tree".
> Maybe I misunderstand, but all of this batching is in the case of a
> client that is doing operations to send to the MDS. What I was saying
> would be a rare case is batching from the server to the client when
> a bunch of clients independently open a bunch of files that are in a
> directory for which a client holds a STL.
Right. I am speaking about aggregation at the client level, to send batched
requests to the server (e.g. tons of creates).
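To make the client-side aggregation concrete, here is a minimal sketch in
Python. It is only an illustration: CreateBatcher, max_batch and sent_rpcs
are hypothetical names, not Lustre code, and a real client would also flush
on a timer or when its lock is called back.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CreateBatcher:
    """Hypothetical client-side aggregator for create operations."""
    max_batch: int = 64                                   # assumed batch limit
    pending: List[str] = field(default_factory=list)      # creates not yet sent
    sent_rpcs: List[List[str]] = field(default_factory=list)  # stand-in for the wire

    def create(self, path: str) -> None:
        # Queue the create locally instead of sending one RPC per file.
        self.pending.append(path)
        if len(self.pending) >= self.max_batch:
            self.flush()

    def flush(self) -> None:
        # Send everything queued so far as a single batched request.
        if self.pending:
            self.sent_rpcs.append(self.pending)
            self.pending = []


batcher = CreateBatcher(max_batch=3)
for i in range(7):
    batcher.create(f"dir/f{i}")
batcher.flush()  # push out the partial last batch
```

With max_batch=3, the seven creates above go out as three requests rather
than seven.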
> In the latter case, since all of the RPCs are coming from different
> clients, it is much harder for the server to group them together into a
> single RPC
> to send to the STL client.
Indeed, this is much harder (but still possible if it is just one
client that does readdir+ and we do a batched glimpse to a client holding
some files in that dir).
>> From the above my conclusion is we do not necessarily need SubTree
>> for efficient metadata write cache, but we do need it for other
>> scenarios (memory conservation). There are some similarities in the
>> functionality too, but also some differences.
>> One particular complexity I see with multiple read-only STLs is every
>> modifying metadata operation would need to traverse the metadata tree
>> all the way back to the root of the fs in order to notify all
>> clients holding STL locks about the change about to be made.
> Sorry, I was only considering the case of a 1-deep STL (e.g. a DIR
> lock, not the arbitrary-depth STL you originally described). In that case,
> there is no requirement for more than a single level of STL to be
> checked/cancelled if a client is doing some modifying operation.
> This is no different than e.g. if a bunch of clients are holding the
> LOOKUP lock on a directory that has a new entry in it.
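The difference between the two cases discussed above (arbitrary-depth STLs
needing a walk back to the root, versus a 1-deep STL needing only the parent
directory to be checked) can be sketched roughly as follows. This is an
illustration only: stl_holders_to_notify, the stl_holders map and max_depth
are hypothetical names, and paths stand in for whatever object identifiers
the server would really use.

```python
from typing import Dict, Optional, Set, Tuple


def stl_holders_to_notify(
    path: str,
    stl_holders: Dict[str, Set[str]],
    max_depth: Optional[int] = None,
) -> Tuple[Set[str], int]:
    """Return (clients to notify, number of directories checked).

    max_depth=None models arbitrary-depth STLs: every ancestor up to the
    root is checked. max_depth=1 models a 1-deep STL: only the parent is.
    """
    parts = [p for p in path.strip("/").split("/") if p]
    # Ancestor directories, nearest first, ending with the root "/".
    ancestors = ["/" + "/".join(parts[:i]) for i in range(len(parts) - 1, -1, -1)]
    if max_depth is not None:
        ancestors = ancestors[:max_depth]
    clients: Set[str] = set()
    for directory in ancestors:
        clients |= stl_holders.get(directory, set())
    return clients, len(ancestors)


holders = {"/": {"c1"}, "/a/b": {"c2"}, "/a/b/c": {"c3"}}
deep = stl_holders_to_notify("/a/b/c/file", holders)
shallow = stl_holders_to_notify("/a/b/c/file", holders, max_depth=1)
```

In this toy example the arbitrary-depth walk checks four directories and
finds three clients, while the 1-deep variant checks one directory and finds
only the client holding the lock on the parent.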
The problem in this case then becomes that if we operate within a tree
16 entries deep, we have consumed 10% of our lock capacity (getting a lock
on every subdir in the process). If we have several apps going on, then
> Eric also had a proposal that the DIR lock would be a "hash extent"
> instead of a single bit, so that it would be possible (via lock
> to avoid cancelling all of the entries cached on a client when a
> new file is being added. Only the hash range of the entry being added
> would need to be removed from the lock, either via a 3-piece lock
> (middle extent being cancelled) or via a 2-piece lock split (smallest
> extent being cancelled).
Yes, this is also possible and would be beneficial even with a WRITE
lock on a dir.
But this really is a completely orthogonal issue.
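For reference, the two split variants mentioned in the quoted proposal could
look roughly like this. It is a sketch under assumptions: the lock is
modelled as an inclusive integer hash range, split_three and split_two are
hypothetical names, and "smallest extent being cancelled" is read here as
"cancel whichever side of the new entry's hash is smaller".

```python
from typing import List, Tuple

Extent = Tuple[int, int]  # inclusive (start, end) hash range


def split_three(start: int, end: int, h: int) -> Tuple[List[Extent], Extent]:
    """3-piece split: cancel only the hash h, keep both sides."""
    assert start <= h <= end
    kept: List[Extent] = []
    if h > start:
        kept.append((start, h - 1))
    if h < end:
        kept.append((h + 1, end))
    return kept, (h, h)


def split_two(start: int, end: int, h: int) -> Tuple[List[Extent], Extent]:
    """2-piece split: cancel the smaller side containing h, keep the other."""
    assert start <= h <= end
    if h - start <= end - h:
        # The left side is the smaller one: cancel [start, h].
        kept = [(h + 1, end)] if h < end else []
        return kept, (start, h)
    # The right side is the smaller one: cancel [h, end].
    return [(start, h - 1)], (h, end)


kept3, cancelled3 = split_three(0, 99, 40)
kept2, cancelled2 = split_two(0, 99, 40)
```

The 3-piece form keeps the most cached entries but leaves the client with
two extents to track; the 2-piece form keeps a single extent at the cost of
dropping more of the cache.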