[Lustre-devel] Queries regarding LDLM_ENQUEUE

Wed Oct 20 10:29:57 PDT 2010

On Wed, Oct 20, 2010 at 09:18:59PM +0400, bzzz.tomas at gmail.com wrote:
> On 10/20/10 9:11 PM, Paul Nowoczynski wrote:
> > I could be wrong but my guess is that the network congestion caused by
> > this communication pattern is a more serious problem. The mds should be
> > able to easily service lookup rpc's since only the first few necessitate
> > a read I/O from the disk.
> 
> but then the network should be able to deal with a storm of
> <max RPC in-flight> * <# clients> to read/write data?
> 
> or is a specific switch the bottleneck to a specific node?
> 
> because if it isn't the network, but the MDS being the real bottleneck,
> then a proxy might be a solution, like Eric said above. not sure
> whether this is important in your case, but this would allow
> existing apps to be used.

MDSes are typically CPU bound, so that's likely the issue.  The problem
though is that the MDS does need to track open file state for SOM and
for dealing with unlinks.  The semantics of open-by-handle might be such
that unlinks of files opened by handle can cause the file to disappear
and syscalls on FDs opened by handle could then return EBADF or EIO or
some new error code.  But if open-by-handle semantics don't allow for that,
then the MDS needs to track open file state, and it's hard to see how to
avoid RPCs to the MDS to establish that state (the original client could
tell the MDS about all the clients that will open-by-handle, but this
seems unlikely to perform so much better than N smaller RPCs as to
justify it, and the open-by-handle API suddenly gets much more complex).
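
To make the state-tracking point concrete, here is a rough toy model in
Python.  It is not Lustre code: ToyMDS, its methods, and the "RPC" counter
are all hypothetical names made up for illustration.  It assumes
POSIX-style semantics, where an unlinked file must survive until its last
open handle is closed, and it charges one RPC per client open.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMDS:
    """Toy metadata server. NOT Lustre code; every name here is hypothetical."""
    files: set = field(default_factory=set)          # files whose objects still exist
    open_counts: dict = field(default_factory=dict)  # file id -> number of open handles
    unlinked: set = field(default_factory=set)       # unlinked while still open
    rpcs: int = 0                                    # requests serviced so far

    def create(self, fid):
        self.rpcs += 1
        self.files.add(fid)

    def open(self, fid):
        # Each client that opens (by name or by handle) costs one RPC here.
        self.rpcs += 1
        if fid not in self.files:
            raise FileNotFoundError(fid)
        self.open_counts[fid] = self.open_counts.get(fid, 0) + 1

    def unlink(self, fid):
        self.rpcs += 1
        if self.open_counts.get(fid, 0) > 0:
            # Still open somewhere: defer destruction until the last close.
            self.unlinked.add(fid)
        else:
            self.files.discard(fid)

    def close(self, fid):
        self.rpcs += 1
        self.open_counts[fid] -= 1
        if self.open_counts[fid] == 0:
            del self.open_counts[fid]
            if fid in self.unlinked:
                # Last handle gone: now the file can really be destroyed.
                self.unlinked.discard(fid)
                self.files.discard(fid)


def demo(n_clients=4):
    """Return (open RPCs, file alive after unlink, file alive after closes)."""
    mds = ToyMDS()
    mds.create("f")
    for _ in range(n_clients):
        mds.open("f")
    open_rpcs = mds.rpcs - 1          # subtract the create
    mds.unlink("f")
    alive_after_unlink = "f" in mds.files
    for _ in range(n_clients):
        mds.close("f")
    alive_after_close = "f" in mds.files
    return open_rpcs, alive_after_unlink, alive_after_close
```

With four clients, demo(4) reports four open RPCs, the file still present
right after the unlink, and gone once the last client closes.  Dropping
the open_counts bookkeeping would remove those per-client RPCs, but only
by giving up the "survives until last close" behaviour, which is the
trade-off described above.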

Nico
--