[lustre-discuss] 2.10.0 CentOS6.9 ksoftirqd CPU load

Patrick Farrell paf at cray.com
Wed Sep 27 09:56:08 PDT 2017

A guess for you to consider:

A very common cause of ksoftirqd load is a hypervisor putting memory pressure on a VM.  At least VMware, and I think KVM and others, use IRQs to implement some of their memory management, and it can show up like this.

That would of course mean it's not really the ptlrpc module; I'm not sure how carefully you verified that it is causing this.  (Obviously your 'remove it, check, add it, check' method is sound, but if you only checked once or twice, you may have been wrong through bad luck, or you could've been right at the limit of your available memory.)
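
To make repeated checks easier to compare, here is a minimal sketch of a helper that sums the ksoftirqd CPU share from one frame of top output. The function name sum_ksoftirqd_cpu is made up for illustration, and it assumes top's default column layout as in the listing quoted below (%CPU in field 9, command name in field 12); adjust the field numbers if your top is configured differently.

```shell
# Sum the %CPU column for all ksoftirqd threads in one frame of top
# output read from stdin.
# Assumption: default top column layout, so %CPU is field 9 and the
# command name is field 12.
sum_ksoftirqd_cpu() {
    awk '$12 ~ /^ksoftirqd/ { total += $9 }
         END { printf "%.1f\n", total }'
}
```

Something like "top -b -n 1 | sum_ksoftirqd_cpu", run several times with ptlrpc loaded and several times with it removed, should show whether the difference is consistent. The first frame top prints may not reflect the current interval, so taking a later frame may be more representative.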

From: lustre-discuss <lustre-discuss-bounces at lists.lustre.org> on behalf of Dilger, Andreas <andreas.dilger at intel.com>
Sent: Wednesday, September 27, 2017 11:50:03 AM
To: Hans Henrik Happe
Cc: Shehata, Amir; lustre-discuss; Olaf Weber
Subject: Re: [lustre-discuss] 2.10.0 CentOS6.9 ksoftirqd CPU load

On Sep 26, 2017, at 01:10, Hans Henrik Happe <happe at nbi.dk> wrote:
> Hi,
> Did anyone else experience CPU load from ksoftirqd after 'modprobe
> lustre'? On an otherwise idle node I see:
>    9 root      20   0     0    0    0 S 28.5  0.0  2:05.58 ksoftirqd/1
>   57 root      20   0     0    0    0 R 23.9  0.0  2:22.91 ksoftirqd/13
> The sum of those two is about 50% CPU.
> I have narrowed it down to the ptlrpc module. When I remove that, it stops.
> I also tested the 2.10.1-RC1, which is the same.

If you can run "echo l > /proc/sysrq-trigger" it will report the processes
that are currently running on the CPUs of your system to the console (and
also /var/log/messages, if it can write everything in time).

You might need to do this several times to get a representative sample of
the ksoftirqd process stacks to see what they are doing that is consuming
so much CPU.
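
As a rough sketch of taking several samples in a row, something like the following could be run as root. The function name sample_cpu_stacks and its arguments are hypothetical; the third argument only exists so the loop can be dry-run against an ordinary file instead of the real trigger.

```shell
# Write 'l' to the sysrq trigger COUNT times, INTERVAL seconds apart.
# Arguments (all optional, names are illustrative):
#   $1 = number of samples (default 5)
#   $2 = seconds between samples (default 1)
#   $3 = trigger file (default /proc/sysrq-trigger; override for a dry run)
sample_cpu_stacks() {
    count=${1:-5}
    interval=${2:-1}
    trigger=${3:-/proc/sysrq-trigger}
    i=1
    while [ "$i" -le "$count" ]; do
        # Each write asks the kernel to dump the stacks of the tasks
        # currently running on the CPUs.
        echo l > "$trigger" || return 1
        echo "sample $i of $count written"
        i=$((i + 1))
        # Pause between samples, but not after the last one.
        if [ "$i" -le "$count" ]; then
            sleep "$interval"
        fi
    done
}
```

For example, "sample_cpu_stacks 10 2" would take ten samples two seconds apart; the resulting stacks can then be read from the console, dmesg, or /var/log/messages.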

Alternately, "echo t > /proc/sysrq-trigger" will report the stacks of all
processes to the console (and /var/log/messages), but there will be a lot of them,
and no better chance that it catches what ksoftirqd is doing 25% of the time.

Cheers, Andreas
Andreas Dilger
Lustre Principal Architect
Intel Corporation
