[Lustre-discuss] 1.8.1.1

Craig Prescott prescott at hpc.ufl.edu
Thu Nov 19 11:42:49 PST 2009


Papp Tamás wrote:
> The logs are full with this:
> 
> Nov 19 20:03:32 node1 kernel: BUG: soft lockup - CPU#3 stuck for 10s! [ll_ost_80:4894]
> Nov 19 20:03:32 node1 kernel: CPU 3:
<snip>
> Nov 19 20:03:34 node1 kernel: Lustre: Skipped 40339060 previous similar 
> messages 0; still busy with 3 active RPCs

We had the same problem with 1.8.x.x.

We set lnet.printk=0 on our OSS nodes and it has helped us dramatically: 
we have not seen the 'soft lockup' problem since.

sysctl -w lnet.printk=0

This will turn off all but 'emerg' messages from lnet on the console.
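
If you want to keep a record of the old mask before zeroing it, something 
like the following might do. It is only a rough sketch: the function name 
and the PRINTK_PATH variable are made up for illustration, and it assumes 
the lnet module is loaded so that /proc/sys/lnet/printk (the file behind 
the lnet.printk sysctl) exists and is writable by the caller.

```shell
#!/bin/sh
# Sketch only: quiet LNet console logging and print the previous mask.
# PRINTK_PATH is a hypothetical override; the default is the procfs
# entry that "sysctl -w lnet.printk=0" writes to.
PRINTK_PATH="${PRINTK_PATH:-/proc/sys/lnet/printk}"

quiet_lnet_console() {
    # Bail out if the entry is missing (module not loaded) or read-only.
    if [ ! -w "$PRINTK_PATH" ]; then
        echo "cannot write $PRINTK_PATH (lnet not loaded, or not root?)" >&2
        return 1
    fi
    # Show the current mask so it can be restored by hand later.
    old_mask=$(cat "$PRINTK_PATH")
    echo "previous lnet.printk: $old_mask"
    # Same effect as: sysctl -w lnet.printk=0
    echo 0 > "$PRINTK_PATH"
}
```

Run quiet_lnet_console as root on each OSS node. A value set this way does 
not survive a reboot, so it has to be reapplied once lnet is loaded again.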

It would be interesting to know if this avoided the lockups for you, too.

Cheers,
Craig




