[Lustre-discuss] Possible out of memory condition
Andreas Dilger
adilger at sun.com
Mon Oct 27 12:38:53 PDT 2008
On Oct 27, 2008 10:16 -0600, Craig Tierney wrote:
> Andreas Dilger wrote:
>> Note that soft lockups are only a warning. It shouldn't mean that the
>> node is completely dead, only that some thread was hogging the CPU.
>
> The two soft lockup messages (one in kswapd0 and the other in the user
> process convert_emiss) repeated their messages for 6 hours before I rebooted
> the node. I don't recall if I could log in to the node or not.
Ah, then the spewing of the "warning" messages is likely what made the
node unusable :-(. Console messages are printed with all interrupts
disabled, so a constant stream of them can be a problem in itself.
Unfortunately, this printing is outside of the Lustre code, so we can't
fix it without patching the kernel.
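
For anyone who wants to check which messages actually reach the console on
a node, here is a minimal sketch. It is not something from this thread and
not a fix for the underlying problem. It assumes the usual four-field layout
of /proc/sys/kernel/printk; the helper names parse_printk and
reaches_console are made up for illustration. Which log level the soft
lockup message is printed at depends on the kernel version, so check that
separately before relying on it.

```python
# Sketch only: field names follow the documented /proc/sys/kernel/printk
# layout; parse_printk and reaches_console are hypothetical helpers.
from typing import NamedTuple


class PrintkLevels(NamedTuple):
    console_loglevel: int
    default_message_loglevel: int
    minimum_console_loglevel: int
    default_console_loglevel: int


def parse_printk(text: str) -> PrintkLevels:
    """Parse the four whitespace-separated integers from the printk sysctl."""
    fields = text.split()
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}")
    return PrintkLevels(*(int(field) for field in fields))


def reaches_console(message_level: int, levels: PrintkLevels) -> bool:
    """A message goes to the console only if its level is numerically
    lower (more urgent) than console_loglevel."""
    return message_level < levels.console_loglevel


if __name__ == "__main__":
    with open("/proc/sys/kernel/printk") as f:
        levels = parse_printk(f.read())
    print(levels)
    # 4 is KERN_WARNING, 3 is KERN_ERR
    print("level 4 reaches console:", reaches_console(4, levels))
    print("level 3 reaches console:", reaches_console(3, levels))
```

Lowering console_loglevel only keeps lower-priority messages off the
console; they are still recorded in the kernel log buffer.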
Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.