[Lustre-discuss] Softlockup issues. Lustre related?

Alex Lee alee at datadirectnet.com
Thu Aug 28 05:52:22 PDT 2008


Bernd Schubert wrote:
> Hello Alex,
>
> On Thursday 28 August 2008 05:20:47 Alex Lee wrote:
>   
>> Hello Folks,
>>
>> I have a few client nodes that are getting soft lockup errors. These are
>> patchless clients running Lustre 1.6.5.1 with kernel
>> 2.6.18-53.1.6.el5-PAPI, which is more or less stock RHEL 5.1 with the PAPI
>> patch added. The MDS and OSS are running Lustre 1.6.5.1 with the supplied
>> Lustre kernels and OFED 1.3.1.
>>
>> I remember there was an issue with __d_lookup in the past, but I thought it
>> was fixed in the newest release of Lustre, so I don't know if this is
>> related in any way. I don't see any other real Lustre error messages
>> on the client or the MDS/OSS at the time of the soft lockup. Also, wasn't
>> there a softirq issue? I don't think this is related to that...
>>     
>
> According to the traces, it looks like there is double locking.
> Unfortunately, this is hard to debug because of Lustre bug #12752.
> In your traces it also looks like it might have locked up
> at "rcu_read_lock();".
>
> Since you have compiled it yourself anyway, could you recompile it with
> debugging symbols and then resolve __d_lookup+0xd2 using gdb?
>
>
> Cheers,
> Bernd
>
> --
> Bernd Schubert
> Q-Leap Networks GmbH
>   
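The gdb step suggested above (resolving __d_lookup+0xd2 to a source line) could be sketched roughly as follows. This is only an illustration: the helper name gdb_commands is made up for this example, and it assumes a kernel image built with debugging symbols is available to load into gdb. It simply turns a "symbol+0xoffset" entry from a stack trace into the gdb commands that print the matching source line.

```python
import re

# Matches "symbol+0xOFFSET", optionally followed by "/0xSIZE",
# which is the form kernel stack traces use for each frame.
TRACE_ENTRY = re.compile(
    r"^(?P<symbol>[A-Za-z_][\w.]*)"
    r"\+(?P<offset>0x[0-9a-fA-F]+)"
    r"(?:/(?P<size>0x[0-9a-fA-F]+))?$"
)


def gdb_commands(entry):
    """Return gdb commands that resolve a trace entry to a source line.

    Hypothetical helper for illustration; it only builds command strings
    and does not run gdb itself.
    """
    match = TRACE_ENTRY.match(entry.strip())
    if match is None:
        raise ValueError("not a symbol+offset trace entry: %r" % entry)
    # gdb accepts an address expression after '*', e.g. *(func+0x10).
    location = "*({}+{})".format(match["symbol"], match["offset"])
    return ["info line " + location, "list " + location]


if __name__ == "__main__":
    # The frame quoted in this thread.
    for command in gdb_commands("__d_lookup+0xd2"):
        print(command)
```

The printed commands are meant to be typed at the gdb prompt after loading the uncompressed kernel image built with debugging symbols (for example "gdb vmlinux"; the exact path depends on the build tree and is an assumption here).
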
Someone found this bug for me, and it looks very similar:

https://bugzilla.lustre.org/show_bug.cgi?id=15975

Does this look anywhere close? I'm pretty clueless about debugging
kernel traces.

Thanks,
-Alex



