[Lustre-discuss] Softlockup issues. Lustre related?

Bernd Schubert bs at q-leap.de
Thu Aug 28 03:00:03 PDT 2008


Hello Alex,

On Thursday 28 August 2008 05:20:47 Alex Lee wrote:
> Hello Folks,
>
> I have a few client nodes that are getting soft lockup errors. These are
> patchless clients running Lustre 1.6.5.1 with kernel
> 2.6.18-53.1.6.el5-PAPI. More or less stock RHEL 5.1 with PAPI patch added
> on it. The MDS and OSS are running Lustre 1.6.5.1 with the supplied Lustre
> kernels and OFED 1.3.1.
>
> I remember there was an issue with __d_lookup in the past but I thought it
> was fixed with the newest release of Lustre. So I don't know if this is
> related in any way at all. I don't see any other real Lustre error messages
> on the client or the MDS/OSS at the time of the soft lockup. Also, wasn't
> there a softirq issue? I don't think this is related to that...

According to the traces, it looks like some kind of double locking is going on.
Unfortunately, this is hard to debug due to Lustre bug #12752.
In your traces it also looks like it might have locked up
at "rcu_read_lock();".

Since you compiled the kernel yourself anyway, could you recompile it with
debugging symbols and then resolve __d_lookup+0xd2 using gdb?
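
As a rough sketch of that step: gdb can map a symbol+offset to a source line
with "list *(symbol+offset)" when it is pointed at an uncompressed kernel
image that carries debug info and matches the running kernel. The small
Python helper below only builds that gdb command line from a trace frame; it
does not run gdb. The helper names (parse_frame, gdb_command) and the default
"vmlinux" path are assumptions made for illustration, not anything taken
from the traces in this thread.

```python
import re
import shlex

# Matches frames such as "__d_lookup+0xd2" or "__d_lookup+0xd2/0x120".
FRAME_RE = re.compile(
    r"^(?P<symbol>[A-Za-z_.][\w.]*)\+(?P<offset>0x[0-9a-fA-F]+)"
    r"(?:/0x[0-9a-fA-F]+)?$"
)


def parse_frame(frame):
    """Split a 'symbol+0xoffset[/0xsize]' trace frame into (symbol, offset)."""
    match = FRAME_RE.match(frame.strip())
    if match is None:
        raise ValueError(f"not a symbol+offset frame: {frame!r}")
    return match.group("symbol"), int(match.group("offset"), 16)


def gdb_command(frame, vmlinux="vmlinux"):
    """Build a gdb argv that lists the source around the given frame.

    'vmlinux' is a placeholder path (an assumption); point it at the
    kernel image that was built with debugging symbols.
    """
    symbol, offset = parse_frame(frame)
    return ["gdb", "-batch", "-ex", f"list *({symbol}+{offset:#x})", vmlinux]


if __name__ == "__main__":
    # Print a shell-quoted command that can be pasted into a terminal.
    print(shlex.join(gdb_command("__d_lookup+0xd2")))
```

Running the printed command in the kernel build directory should show the
source lines around the faulting instruction, provided the image really was
built with debug info.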


Cheers,
Bernd

-- 
Bernd Schubert
Q-Leap Networks GmbH


