[Lustre-discuss] Question about sleeping processes
Brian J. Murrell
Brian.Murrell at Sun.COM
Tue Oct 6 07:22:08 PDT 2009
On Tue, 2009-10-06 at 12:48 +0200, Michael Schwartzkopff wrote:
> Hi,
Hi,
> my system load shows that quite a number of processes are waiting.
Blocked. I guess the word waiting is similar.
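If you want to see which processes are blocked, here is a minimal sketch, assuming a Linux node with /proc mounted. It reads the state field of each /proc/<pid>/stat record and reports the ones in state D (uninterruptible sleep). The function names parse_stat and blocked_processes are made up for this illustration; they are not part of Lustre or any standard tool.

```python
import os

def parse_stat(stat_text):
    """Split one /proc/<pid>/stat record into (pid, comm, state)."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    open_paren = stat_text.index("(")
    close_paren = stat_text.rindex(")")
    pid = int(stat_text[:open_paren])
    comm = stat_text[open_paren + 1:close_paren]
    state = stat_text[close_paren + 1:].split()[0]
    return pid, comm, state

def blocked_processes(proc_root="/proc"):
    """Return (pid, comm) for every process currently in state D."""
    blocked = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "stat")) as stat_file:
                pid, comm, state = parse_stat(stat_file.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited while we were looking, or odd record
        if state == "D":
            blocked.append((pid, comm))
    return blocked
```

Calling blocked_processes() on the affected node should list the same threads that show up with a D in the watchdog dump, for example ll_mdt_35 with pid 28402.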
> My questions are:
> What causes the problem?
In this case, the thread has lbugged previously.
If you look in the syslog on the node with these processes, you should
find entries with LBUG and/or ASSERTION messages. These are the defects
that are causing the processes to get blocked (uninterruptible sleep).
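The syslog search can be sketched roughly as follows. This is only an illustration: it looks for the two keywords themselves and assumes nothing about the rest of the message layout, which varies between Lustre versions. The function name find_lbug_lines is hypothetical, and /var/log/messages in the usage comment is an assumption about where your syslog ends up.

```python
import re

# Match the two markers as whole words; deliberately case-sensitive so that
# symbol names such as lbug_with_loc in a call trace are not reported.
PATTERN = re.compile(r"\b(LBUG|ASSERTION)\b")

def find_lbug_lines(lines):
    """Return (line_number, text) pairs for lines mentioning LBUG or ASSERTION."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if PATTERN.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits

# Usage (the path is an assumption, adjust it for your distribution):
#
#   with open("/var/log/messages", errors="replace") as log:
#       for number, line in find_lbug_lines(log):
#           print(f"{number}: {line}")
```

The matching lines are the text to search for in bugzilla.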
> Can I kill the "hanging" processes?
Nope. Processes in uninterruptible sleep don't handle signals (not even
SIGKILL) until they wake up, so you have to reboot the node.
Please search bugzilla for the LBUG/ASSERTIONs you are getting and if
you don't find anything that matches, please file a new bug.
> Oct 5 10:28:03 sosmds2 kernel: Lustre: 0:0:(watchdog.c:181:lcw_cb()) Watchdog triggered for pid 28402: it was inactive for 200.00s
> Oct 5 10:28:03 sosmds2 kernel: ll_mdt_35 D ffff81000100c980 0 28402 1 28403 28388 (L-TLB)
> Oct 5 10:28:03 sosmds2 kernel: ffff81041c723810 0000000000000046 0000000000000000 7fffffffffffffff
> Oct 5 10:28:03 sosmds2 kernel: ffff81041c7237d0 0000000000000001 ffff81022f3e60c0 ffff81022f12e080
> Oct 5 10:28:03 sosmds2 kernel: 000177b2feff847c 00000000000014df ffff81022f3e62a8 000000010000028f
> Oct 5 10:28:03 sosmds2 kernel: Call Trace:
> Oct 5 10:28:03 sosmds2 kernel: [<ffffffff8008a3ef>] default_wake_function+0x0/0xe
> Oct 5 10:28:03 sosmds2 kernel: [<ffffffff885b1b26>] :libcfs:lbug_with_loc+0xc6/0xd0
Here's where you can see that the thread has lbugged.
b.