[Lustre-discuss] LBUG on lustre 1.8.0

Larry tsrjzq at gmail.com
Sun Nov 21 20:16:09 PST 2010


We added "options libcfs libcfs_panic_on_lbug=1" to modprobe.conf to
make the server kernel panic as soon as the LBUG happens. Is there some
way to make the server die a few seconds after the LBUG instead? We are
also puzzled by the messages that were lost when the LBUG happened.
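
For reference, the setting looks like this. It is only a minimal
fragment: the option line is the one quoted above, while the file
location (/etc/modprobe.conf) is an assumption and may differ on other
distributions.

```
# /etc/modprobe.conf  (path assumed; adjust for your distribution)
# Make the kernel panic immediately when libcfs hits an LBUG.
options libcfs libcfs_panic_on_lbug=1
```

The option is a module parameter, so it should only take effect the
next time the libcfs module is loaded.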

On Mon, Nov 22, 2010 at 10:42 AM, Kevin Van Maren
<Kevin.Van.Maren at oracle.com> wrote:
> Sure, but I think for engineering to make progress on this bug, they are
> going to want a crash dump.  If you can enable crash dumps and panic on lbug
> (and if HA, increase dead timeout so it can complete the dump before being
> shot in the head) it would provide more info for the bug report.
>
> That being said, there are quite a few other bugs that have been fixed since
> 1.8.0, so you really should upgrade ASAP to 1.8.4.
>
> Kevin
>
>
> On Nov 21, 2010, at 6:59 PM, Larry <tsrjzq at gmail.com> wrote:
>
>> We had an LBUG several days ago on our Lustre 1.8.0. One OSS reported
>>
>> kernel: LustreError:
>> 24669:0:(service.c:1311:ptlrpc_server_handle_request())
>> ASSERTION(atomic_read(&(export)->exp_refcount) < 0x5a5a5a) failed
>> kernel: LustreError:
>> 24669:0:(service.c:1311:ptlrpc_server_handle_request()) LBUG
>> kernel: Lustre: 24669:0:(linux-debug.c:222:libcfs_debug_dumpstack())
>> showing stack for process 24669
>> ......
>>
>> I googled this and found little information about it. It seems to
>> be a race condition on the OSS, right? Should I open a Bugzilla
>> ticket for this LBUG?
>> Thanks.
>> _______________________________________________
>> Lustre-discuss mailing list
>> Lustre-discuss at lists.lustre.org
>> http://lists.lustre.org/mailman/listinfo/lustre-discuss
>


