[Lustre-devel] Lustre RPC visualization

di.wang di.wang at oracle.com
Mon May 3 12:52:57 PDT 2010

Michael Kluge wrote:
>>> One more question: RPC 1334380768266400 (in the log WangDi sent me)
>>> has on the client side only a "Sending RPC" message, thus missing the
>>> "Completed RPC". The server has all three (received, start work, done
>>> work). Has this RPC vanished on the way back to the client? There is
>>> no further indication of what happened. The last timestamp in the client
>>> log is:
>>> 1272565368.228628
>>> and the server says it finished the processing of the request at:
>>> 1272565281.379471
>>> So the client log has been recorded long enough to contain the
>>> "Completed RPC" message for this RPC if it ever arrived ...
>> Logically, yes. But in some cases debug log records are dropped for
>> one reason or another (actually, this is not rare). You probably need to
>> maintain an average time from server "Handled RPC" to client "Completed
>> RPC" and then estimate the client "Completed RPC" time in such a case.
> Oh my gosh ;) I don't want to start speculations about the helpfulness 
> of incomplete debug logs. Anyway, what can get lost? Any kind of message 
> on the servers and clients?  I think I'd like to know what cases have to 
> be handled while I try to track individual RPCs on their way.
Any record can get lost here. Unfortunately, there is no message
indicating that records went missing. :(
(Usually, I would check the time stamps in the log, i.e. look for a "long"
stretch with no records, for example several seconds, but this is not an
accurate method.)

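The time stamp check could look roughly like the sketch below. It is only
an illustration: the function name find_gaps and the 2-second threshold are
my assumptions, and it expects the time stamps to be already extracted from
the log as floating-point seconds. A reported gap is only a hint that
records may have been dropped; an idle node produces the same pattern.

```python
def find_gaps(timestamps, threshold=2.0):
    """Return (before, after) pairs of consecutive time stamps that are
    more than `threshold` seconds apart.

    `threshold` is an assumed value for "long"; tune it per workload.
    """
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > threshold]


if __name__ == "__main__":
    # Made-up time stamps, not taken from a real debug log.
    print(find_gaps([1.0, 1.5, 5.0, 5.2]))
```

RPCs whose lifetime overlaps one of the reported gaps are the ones most
likely to have lost a record.
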
I guess you can just ignore these incomplete records as a first step?
Let's see how the incomplete logs affect the profiling result, and then
we can decide how to deal with them.
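
A rough sketch of that first step, under some assumptions: the events are
already parsed into (xid, stage, timestamp) tuples, and the stage labels
"send", "received", "start", "done" and "completed" are hypothetical names
for the five messages discussed in this thread, not the actual log text.
It also computes the average server "done" to client "completed" delay
that could later be used to estimate a missing completion time, ignoring
any clock offset between client and server.

```python
from collections import defaultdict

# Hypothetical stage labels; the real log messages need their own parser.
REQUIRED = ("send", "received", "start", "done", "completed")


def group_by_xid(events):
    """Collect the time stamp of every stage seen for each RPC xid."""
    rpcs = defaultdict(dict)
    for xid, stage, ts in events:
        rpcs[xid][stage] = ts
    return rpcs


def split_complete(rpcs):
    """Separate RPCs that have all required stages from the rest."""
    complete = {x: s for x, s in rpcs.items() if all(k in s for k in REQUIRED)}
    incomplete = {x: s for x, s in rpcs.items() if x not in complete}
    return complete, incomplete


def mean_reply_delay(complete):
    """Average time from server 'done' to client 'completed', or None."""
    delays = [s["completed"] - s["done"] for s in complete.values()]
    return sum(delays) / len(delays) if delays else None
```

Only the complete set would go into the profiling at first; the size of
the incomplete set shows how much the lost records matter.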

> Regards, Michael
