[Lustre-devel] Queries regarding LDLM_ENQUEUE
andreas.dilger at oracle.com
Wed Oct 20 11:27:25 MDT 2010
On 2010-10-20, at 11:18, bzzz.tomas at gmail.com wrote:
> On 10/20/10 9:11 PM, Paul Nowoczynski wrote:
>> I could be wrong but my guess is that the network congestion caused by
>> this communication pattern is a more serious problem. The mds should be
>> able to easily service lookup rpc's since only the first few necessitate
>> a read I/O from the disk.
> but then the network should be able to deal with storm of
> <max RPC in-flight> * <# clients> to read/write data?
> or it's a specific switch being the bottleneck to specific node?
I think there is definitely non-trivial overhead from the MDS threads descending into the filesystem to do path lookup and permission checking that would be avoided.
> because if it isn't network, but MDS being a real bottleneck,
> then proxy might be a solution like Eric said above. not sure
> is this important in your case, but this would allow to use
> existing apps.
> of course, distribution tree for a handle may scale better.
I don't think the actual distribution of the handle is a significant factor (this can be done via efficient broadcast in the MPI layer). If we want to keep the MDS state consistent with N openers of the file then that may take more effort. However, I also just thought of a partial solution to the MDS state issue - if the original client doing name_to_handle() also gets the MDS open lock, then it can act somewhat as a "proxy" for the remaining clients that are opening via handle.
The MDS will know that the client with the MDS open lock may be doing other opens, and if the handle also contains the layout as Paul proposed, then it seems possible to get at least a reasonable representation of the file on each client w/o having an additional MDS RPC from each one. Those clients may still have issues if they later contact the MDS for that file, but maybe not.
Actually implementing this is left as an exercise for the reader...
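That said, a toy model of the flow described above may make the RPC counting concrete. This is only a sketch in Python, not Lustre code: ToyMDS, ToyClient, Handle, broadcast(), open_by_handle(), the path, the FID and the stripe layout are all hypothetical names and values made up for illustration, and broadcast() is just a plain loop standing in for an MPI broadcast. The only thing it tries to show is that one client does name_to_handle() and takes the open lock, and the others open from the handle (which carries the layout) without sending anything to the MDS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Handle:
    """Hypothetical handle: a file identifier plus the layout, as Paul proposed."""
    fid: int
    layout: tuple  # made-up stripe layout, e.g. OST indices


class ToyMDS:
    """Stand-in for the MDS; it only counts RPCs and tracks the open lock holder."""

    def __init__(self):
        self.files = {}      # path -> Handle
        self.open_lock = {}  # fid -> id of the client holding the open lock
        self.rpcs = 0        # number of RPCs this toy MDS has served

    def create(self, path, fid, layout):
        # Setup helper for the toy model; not counted as a client RPC.
        self.files[path] = Handle(fid, tuple(layout))

    def name_to_handle(self, client_id, path):
        # One RPC: path lookup and permission check would happen here.
        self.rpcs += 1
        handle = self.files[path]
        # The first caller gets the open lock and acts as "proxy" for the rest.
        self.open_lock.setdefault(handle.fid, client_id)
        return handle


class ToyClient:
    def __init__(self, cid):
        self.cid = cid
        self.open_files = {}  # fid -> layout known locally

    def open_by_handle(self, handle):
        # No MDS RPC: the layout travels inside the handle.
        self.open_files[handle.fid] = handle.layout
        return handle.layout


def broadcast(handle, clients):
    # Plain loop standing in for an MPI broadcast of the handle.
    return [c.open_by_handle(handle) for c in clients]


def demo(n_clients=4):
    mds = ToyMDS()
    mds.create("/scratch/out.dat", fid=42, layout=[0, 1, 2, 3])
    clients = [ToyClient(i) for i in range(n_clients)]
    # Only client 0 talks to the MDS.
    handle = mds.name_to_handle(clients[0].cid, "/scratch/out.dat")
    layouts = broadcast(handle, clients)
    return mds, clients, layouts
```

Calling demo(4) leaves the toy MDS having served a single RPC regardless of the number of clients, with client 0 recorded as the open lock holder. What the sketch does not model is exactly the hard part mentioned above: what happens when one of the other clients later has to contact the MDS for that file.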
Lustre Technical Lead
Oracle Corporation Canada Inc.