[Lustre-discuss] Possible out of memory condition

Andreas Dilger adilger at sun.com
Mon Oct 27 12:38:53 PDT 2008


On Oct 27, 2008  10:16 -0600, Craig Tierney wrote:
> Andreas Dilger wrote:
>> Note that soft lockups are only a warning.  It shouldn't mean that the
>> node is completely dead, only that some thread was hogging the CPU.
>
> The two soft lockup messages (one in kswapd0 and the other in the user
> process convert_emiss) kept repeating for 6 hours before I rebooted
> the node.  I don't recall whether I could log in to the node or not.

Ah, then the constant spewing of the "warning" messages is likely what made
the node unusable :-(.  Console messages are printed with all interrupts
disabled, so a steady flood of them can keep the node from doing anything
else.  Unfortunately, this printing happens outside of the Lustre code, so
we can't fix it without patching the kernel.
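
As a rough illustration of which messages end up on the console at all,
here is a minimal sketch that interprets the kernel's printk log levels.
It is not something that was tried in this thread.  It assumes a Linux
node where /proc/sys/kernel/printk holds four whitespace-separated
integers, the first being the current console loglevel, and that a message
is printed to the console only when its level number is lower than that
value.  The names PrintkLevels, parse_printk, read_printk and
reaches_console are made up for this example.

```python
from typing import NamedTuple


class PrintkLevels(NamedTuple):
    """The four integers exposed by /proc/sys/kernel/printk, in order."""
    console_loglevel: int
    default_message_loglevel: int
    minimum_console_loglevel: int
    default_console_loglevel: int


def parse_printk(text: str) -> PrintkLevels:
    """Parse the contents of /proc/sys/kernel/printk into named fields."""
    fields = text.split()
    if len(fields) != 4:
        raise ValueError(f"expected 4 integers, got {len(fields)}: {text!r}")
    return PrintkLevels(*(int(field) for field in fields))


def read_printk(path: str = "/proc/sys/kernel/printk") -> PrintkLevels:
    """Read and parse the live settings (assumes a Linux /proc filesystem)."""
    with open(path) as handle:
        return parse_printk(handle.read())


def reaches_console(message_level: int, levels: PrintkLevels) -> bool:
    """Return True if a message at this level would go to the console.

    Lower numbers mean higher priority (0 is emergency, 7 is debug), and
    only messages with a level below console_loglevel are printed there.
    """
    return message_level < levels.console_loglevel


# Illustrative value only, not taken from the node discussed above.
sample = parse_printk("4\t4\t1\t7\n")
```

With the sample value, a level-3 message would reach the console and a
level-4 message would not.  Whether lowering the console loglevel would
have kept this particular node usable is a guess; the messages would still
be generated and logged, just not pushed out to the console.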

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.



