[lustre-discuss] 2.10.0 CentOS6.9 ksoftirqd CPU load
Patrick Farrell
paf at cray.com
Wed Sep 27 09:56:08 PDT 2017
A guess for you to consider:
A very common cause of ksoftirqd load is a hypervisor putting memory pressure on a VM. At least VMware, and I think KVM and others, use IRQs to implement some of their memory management, and it can show up like this.
That would of course mean it's not really the ptlrpc module. I'm not sure how carefully you verified that it is the cause. (Your 'remove it, check, add it, check' method is sound, but if you only checked once or twice, you may have been misled by bad luck, or you could have been right at your limit of available memory.)
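One way to make that repeated check less dependent on luck is to measure ksoftirqd CPU time over a fixed interval instead of reading it off top. A minimal Python sketch, assuming a Linux /proc filesystem; the function names (cpu_ticks, snapshot, busy_percent) are illustrative only and do not come from any Lustre tool:

```python
import glob
import os


def comm(stat_line):
    """Return the command name from one /proc/<pid>/stat line."""
    return stat_line[stat_line.index("(") + 1:stat_line.rindex(")")]


def cpu_ticks(stat_line):
    """Return utime + stime (in clock ticks) from one /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may contain spaces,
    # so split after the last ')' rather than on whitespace alone.
    rest = stat_line[stat_line.rindex(")") + 1:].split()
    # rest[0] is field 3 (state); utime and stime are fields 14 and 15.
    return int(rest[11]) + int(rest[12])


def snapshot():
    """Map each ksoftirqd thread name to its accumulated CPU ticks."""
    ticks = {}
    for path in glob.glob("/proc/[0-9]*/stat"):
        try:
            with open(path) as f:
                line = f.read()
        except OSError:
            continue  # process exited between glob and open
        name = comm(line)
        if name.startswith("ksoftirqd/"):
            ticks[name] = cpu_ticks(line)
    return ticks


def busy_percent(before, after, interval, hz=None):
    """Percent of one CPU used by each thread over `interval` seconds."""
    if hz is None:
        hz = os.sysconf("SC_CLK_TCK")  # clock ticks per second
    return {
        name: 100.0 * (after[name] - before[name]) / (hz * interval)
        for name in after
        if name in before
    }
```

Taking a snapshot(), sleeping for a fixed interval (say 10 seconds), taking a second snapshot() and passing both to busy_percent(), several times each with and without ptlrpc loaded, gives numbers that can be compared directly.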
________________________________
From: lustre-discuss <lustre-discuss-bounces at lists.lustre.org> on behalf of Dilger, Andreas <andreas.dilger at intel.com>
Sent: Wednesday, September 27, 2017 11:50:03 AM
To: Hans Henrik Happe
Cc: Shehata, Amir; lustre-discuss; Olaf Weber
Subject: Re: [lustre-discuss] 2.10.0 CentOS6.9 ksoftirqd CPU load
On Sep 26, 2017, at 01:10, Hans Henrik Happe <happe at nbi.dk> wrote:
>
> Hi,
>
> Did anyone else experience CPU load from ksoftirqd after 'modprobe
> lustre'? On an otherwise idle node I see:
>
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>     9 root      20   0     0    0    0 S 28.5  0.0  2:05.58 ksoftirqd/1
>    57 root      20   0     0    0    0 R 23.9  0.0  2:22.91 ksoftirqd/13
>
> The sum of those two is about 50% CPU.
>
> I have narrowed it down to the ptlrpc module. When I remove that, it stops.
>
> I also tested 2.10.1-RC1, which behaves the same.
If you can run "echo l > /proc/sysrq-trigger" it will report the processes
that are currently running on the CPUs of your system to the console (and
also /var/log/messages, if it can write everything in time).
You might need to do this several times to get a representative sample of
the ksoftirqd process stacks to see what they are doing that is consuming
so much CPU.
Alternately, "echo t > /proc/sysrq-trigger" will report the stacks of all
processes to the console (and /var/log/messages), but there will be a lot of
them, and it is no more likely to catch what ksoftirqd is doing in the roughly
25% of the time that it is busy.
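As a complement to the sysrq stack dumps (a different technique, not one suggested in this thread), the per-type counters in /proc/softirqs can show which class of softirq (TIMER, NET_RX, TASKLET, and so on) is being raised while ksoftirqd is busy. A minimal Python sketch, assuming the usual /proc/softirqs layout of a CPU header line followed by "NAME: count count ..." rows; the function names are illustrative:

```python
def parse_softirqs(text):
    """Parse /proc/softirqs text into {softirq_name: total count over all CPUs}."""
    totals = {}
    lines = text.strip().splitlines()
    for line in lines[1:]:  # first line is the CPU0 CPU1 ... header
        name, _, counts = line.partition(":")
        totals[name.strip()] = sum(int(c) for c in counts.split())
    return totals


def softirq_deltas(before, after):
    """Return (name, count raised between snapshots) pairs, largest first."""
    deltas = {name: after[name] - before.get(name, 0) for name in after}
    return sorted(deltas.items(), key=lambda kv: kv[1], reverse=True)


def read_softirqs(path="/proc/softirqs"):
    """Read and parse the live counters (Linux only)."""
    with open(path) as f:
        return parse_softirqs(f.read())
```

Calling read_softirqs() twice a few seconds apart and passing the results to softirq_deltas() lists the softirq types in order of activity. This only narrows down the class of work; it does not by itself say whether ptlrpc is responsible.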
Cheers, Andreas
--
Andreas Dilger
Lustre Principal Architect
Intel Corporation