[lustre-discuss] [EXTERNAL] good ways to identify clients causing problems?

Raj rajgautam at gmail.com
Sat May 29 10:44:10 PDT 2021


Another option is to install xltop (https://github.com/jhammond/xltop)
and use the xltop client (an ncurses-based, top-like tool for Linux) to
watch for the clients issuing the most requests per second (xltop -k q h).
You can also use it to track jobs, but you may have to write your own
node-to-job mapping script (xltop-clusterd).

On Fri, May 28, 2021 at 4:21 PM Mohr, Rick via lustre-discuss
<lustre-discuss at lists.lustre.org> wrote:
>
> Bill,
>
> One option I have used in the past is to look at the RPC request history. For example, on an OSS, you can run:
>
> lctl get_param ost.OSS.ost_io.req_history
>
> and then extract the client NID for each request. Based on that, you can count the requests arriving at the server from each client and look for any clients whose counts are significantly higher than the others. Maybe something like:
>
> lctl get_param ost.OSS.ost_io.req_history | cut -d: -f3 | sort | uniq -c | sort -n
>
> I have used that approach in the past to identify misbehaving clients (the number of requests from such clients was usually one or two orders of magnitude higher than the others). If multiple clients show unusually high counts, you may be able to correlate the nodes with currently running jobs to identify a particular job (assuming you don't already have Lustre job stats enabled).
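
A rough sketch of the counting step above, extended to flag only the
clients that stand out. It assumes each req_history record is
colon-separated with the client NID in the third field, as in the cut
command above; the function names count_by_nid and flag_outliers, the
median baseline, and the default factor of 10 are illustrative choices,
not anything shipped with Lustre.

```shell
# Count requests per client NID from req_history text on stdin.
# Assumption: records look like seq:target_nid:client_nid:xid:length:phase ...
# Lines with fewer than 6 colon-separated fields (such as the
# "ost.OSS.ost_io.req_history=" header that lctl prints) are skipped.
count_by_nid() {
    awk -F: 'NF >= 6 { count[$3]++ }
             END { for (nid in count) printf "%d %s\n", count[nid], nid }' |
        sort -k1,1nr -k2,2
}

# Print only the clients whose request count is at least FACTOR times
# the median count (FACTOR defaults to 10, i.e. one order of magnitude).
flag_outliers() {
    factor="${1:-10}"
    count_by_nid | awk -v factor="$factor" '
        { n[NR] = $1; nid[NR] = $2 }
        END {
            if (NR == 0) exit
            # Input is sorted by count, descending, so the middle row
            # holds the median (the larger middle value for even NR).
            median = n[int((NR + 1) / 2)]
            for (i = 1; i <= NR; i++)
                if (n[i] >= factor * median) print n[i], nid[i]
        }'
}

# Possible usage on an OSS:
#   lctl get_param ost.OSS.ost_io.req_history | flag_outliers 10
```

The request history is a bounded buffer, so the counts describe recent
activity only. With very few clients the median is a weak baseline, and
the plain count_by_nid output is probably easier to judge by eye.
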
>
> -Rick
>
>
> On 5/4/21, 2:41 PM, "lustre-discuss on behalf of Bill Anderson via lustre-discuss" <lustre-discuss-bounces at lists.lustre.org on behalf of lustre-discuss at lists.lustre.org> wrote:
>
>
>        Hi All,
>
>        Can you recommend good ways to identify Lustre client hosts that might be causing stability or performance problems for the entire filesystem?
>
>        For example, if a user is inadvertently doing something that's creating an RPC storm, what are good ways to identify the client host that has triggered the storm?
>
>        Thank you!
>
>        Bill