[Lustre-devel] (no subject)

di.wang di.wang at oracle.com
Wed Apr 28 19:04:32 PDT 2010


Andreas Dilger wrote:
> AFAIK the Handling time is when request processing starts, not when it  
> first arrived at the server. We would need to check I'd there is an  
> RPCTRACE message from ptlrpc_handle_req_in() that we could use to  
> determine the initial request arrival time.
>   
Oh, you can also get the arriving time in "Handled RPC" msg
by Handled RPC_time - total time.

Thanks
WangDi
> Cheers, Andreas
>
> On 2010-04-28, at 19:25, "di.wang" <di.wang at oracle.com> wrote:
>
>   
>> Hello, Michael
>>
>> There is a logfile parser script in the attachment, which was  
>> written by Eric.
>>
>> This script is very simple, but it should help you understand how we  
>> retrieve time step information from lustre debug log. On the server  
>> side, if you enable rpc_trace log, whenever the request arrive/start  
>> being processed/end of processing, there will be corresponding  
>> records being written into the debug log. Basically, you can get all  
>> the time step information from these records (actually only two of  
>> these records would be enough).
>>
>>
>> a. 00000100:00100000:0.0:1272313858.472660:0:31581:0:(service.c: 
>> 1625:ptlrpc_server_handle_request()) Handling RPC pname:cluuid 
>> +ref:pid:xid:nid:opc ll_mgs_00:7d4fb15c-1b1c-295f-e466- 
>> ea7d77089b52+10:4055:x1334115493386242:12345-0 at lo:400
>>
>> This record means the req will being handled, so you can get the  
>> start time stamp(1272313858.472660) operation type (opc: 400, ping),  
>> xid (1334115493386242), client nid(12345-0 at lo) and so on.
>>
>> b. 00000100:00100000:0.0:1272313858.472687:0:31581:0:(service.c: 
>> 1672:ptlrpc_server_handle_request()) Handled RPC pname:cluuid 
>> +ref:pid:xid:nid:opc ll_mgs_00:7d4fb15c-1b1c-295f-e466- 
>> ea7d77089b52+9:4055:x1334115493386242:12345-0 at lo:400 Request  
>> procesed in 45us (77us total) trans 0 rc 0/0
>>
>> This record means the req is already being handled, so you can get  
>> the end time stamp(1272313858.472687), operation type (opc: 400,  
>> ping), xid (1334115493386242), client nid(12345-0 at lo) and so no.
>>
>> Note: (77us total) means how long it takes from the request arriving  
>> to the end of processing. so you can also get the request arriving  
>> time stamp here by (1272313858 - 77 = 1272312781).
>>
>>
>> So with these information you can draw the graph Eric mentioned in  
>> his email. If you have any questions, please let me know.
>>
>> Thanks
>> WangDi
>>
>>
>> Eric Barton wrote:
>>     
>>> --- 
>>> ---------------------------------------------------------------------
>>>
>>> Subject:
>>> RE: Visualising Lustre RPCs
>>> To:
>>> "Michael Kluge" <Michael.Kluge at tu-dresden.de>, "wangdi" <Tom.Wang at Sun.COM 
>>>       
>>> To:
>>> "Michael Kluge" <Michael.Kluge at tu-dresden.de>, "wangdi" <Tom.Wang at Sun.COM 
>>>       
>>> CC:
>>> <lustre-devel at lists.lustre.org>, "Galen M. Shipman" <gshipman at ornl.gov 
>>>       
>>> Michael,
>>>
>>> The previous Lustre RPC visualisation effort I mentioned at the LUG  
>>> used the
>>>
>>> Lustre debug log entries of type D_RPCTRACE. We disabled all but  
>>> these
>>>
>>> log messages and then used the Lustre debug daemon to collect them  
>>> while
>>>
>>> we ran I/O tests. We then ran a simple logfile parser which used  
>>> just the log entries
>>>
>>> for request arrival…
>>>
>>> , start of processing and end of processing to graph request
>>>
>>> queue depth (arrival->end) and the number of requests being  
>>> serviced by type
>>>
>>> over time – e.g…
>>>
>>> read3d
>>>
>>> …which shows request queue depth (vertical) over time (axis labell 
>>> ed 20-25) by
>>>
>>> server (axis labelled 0-80).
>>>
>>> *From:* Michael Kluge [mailto:Michael.Kluge at tu-dresden.de]
>>> *Sent:* 17 April 2010 6:26 AM
>>> *To:* Galen M. Shipman; Eric Barton
>>> *Subject:* Visualising Lustre RPCs
>>>
>>> Hi Galen, Eric,
>>>
>>> in order to get this little project started, I think what I need at  
>>> first to
>>>
>>> write a prototype for a converter are the following things:
>>>
>>> A set of test traces collected on maybe a handful of clients and  
>>> some servers
>>>
>>> is probably a good point to start with. It would be even better if  
>>> we know
>>>
>>> what is in this traces so that we have an expectation what kind of  
>>> things
>>>
>>> we want to see on the Vampir displays. This little program that  
>>> Eric mentioned
>>>
>>> that can read the trace file would be very helpful as well. And as  
>>> the last
>>>
>>> idea I have right now, a technical contact. I might come up with a  
>>> couple
>>>
>>> of question after I have taken the first look onto the original  
>>> trace data
>>>
>>> and before I start writing code.
>>>
>>> Regards, Michael
>>>
>>>
>>> -- 
>>>
>>> Michael Kluge, M.Sc.
>>>
>>> Technische Universität Dresden
>>> Center for Information Services and
>>> High Performance Computing (ZIH)
>>> D-01062 Dresden
>>> Germany
>>>
>>> Contact:
>>> Willersbau, Room WIL A 208
>>> Phone: (+49) 351 463-34217
>>> Fax: (+49) 351 463-37773
>>> e-mail: michael.kluge at tu-dresden.de <mailto:michael.kluge at tu-dresden.de 
>>>       
>>> WWW: http://www.tu-dresden.de/zih
>>>
>>>       
>> <rpc_parse.tgz>
>> _______________________________________________
>> Lustre-devel mailing list
>> Lustre-devel at lists.lustre.org
>> http://lists.lustre.org/mailman/listinfo/lustre-devel
>>     
> _______________________________________________
> Lustre-devel mailing list
> Lustre-devel at lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-devel
>   




More information about the lustre-devel mailing list