[Lustre-discuss] Fwd: lustre 1.6.3 prod - client kernel panic

Wojciech Turek wjt27 at cam.ac.uk
Wed Nov 7 07:35:52 PST 2007


Hi,

Which kernel version are you using for your production Lustre? Is it the
pre-patched 2.6.9-55.0.9.EL_lustre.1.6.3smp?
We had a very similar problem with the bnx2 driver on an old 2.6.9 kernel.
Upgrading the kernel to the latest version helped. The bnx2 issue is
mentioned in the kernel's release notes.
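
A minimal sketch of how one might check whether a node is running a
Lustre-patched kernel and which Lustre version it carries. It assumes
release strings follow the shape of the one quoted above (base kernel,
then "_lustre.", then the Lustre version and an optional flavor suffix
such as "smp"). The function name parse_lustre_kernel and the regular
expression are illustrative only, not something shipped with Lustre.

```python
import platform
import re

# Assumed naming pattern, inferred only from the release string in this
# message: <base kernel>_lustre.<lustre version><optional flavor>
_LUSTRE_RELEASE = re.compile(
    r"^(?P<base>\d+\.\d+\.\d+-[\w.]+?)"
    r"_lustre\.(?P<lustre>\d+\.\d+\.\d+)"
    r"(?P<flavor>[A-Za-z]*)$"
)


def parse_lustre_kernel(release):
    """Return (base, lustre_version, flavor), or None for a non-Lustre kernel."""
    match = _LUSTRE_RELEASE.match(release)
    if match is None:
        return None
    return match.group("base"), match.group("lustre"), match.group("flavor")


if __name__ == "__main__":
    # platform.release() reports the same string as `uname -r`.
    running = platform.release()
    print(running, "->", parse_lustre_kernel(running))
```

For the kernel named above this returns ('2.6.9-55.0.9.EL', '1.6.3', 'smp');
a release string without the "_lustre." part returns None.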

Best regards,

Wojciech

On 7 Nov 2007, at 15:19, Matt wrote:

> I think you guys are spot on; this client is running on a Dell 1950.
>
> I'm in the process of rebuilding the box with the CentOS 5 kernel
> along with the source, so I can patch it with Lustre, build my own
> kernel, and build the bnx2 driver.
>
> I would rather just use the Lustre RPMs as I have been doing, but I
> was unable to build the bnx2 driver against the Lustre kernel source.
> I'm guessing that is because the source is not complete.
>
> If anyone knows differently, please say so.
>
> Cheers,
>
> Matt
>
> On 07/11/2007, Balagopal Pillai <pillai at mathstat.dal.ca> wrote:
> Hi,
>
> This looks like the Broadcom driver issue to me too. I had a similar
> kernel panic with the Lustre kernel on a Dell PE1950 running CentOS 4.5,
> and updating the bnx2 driver solved it.
> The interface also has a tendency to drop frames. Increasing the RX
> ring parameters with ethtool fixes that problem.
>
>
> Regards
> Balagopal
>
> Bernd Schubert wrote:
> > Hi Matt,
> >
> > On Wednesday 07 November 2007 12:50:32 Matt wrote:
> >
> >> Hi folks,
> >>
> >> Built servers with CentOS 5, installed the Lustre rhel5_x86 RPMs
> >> successfully, modified grub and booted into the new kernel.
> >>
> >> Configured an MGS/MDT, two OSTs and a client.
> >>
> >> Now when running iozone from the client I always receive a kernel
> >> panic along the lines of:
> >>
> >> Call trace: <IRQ> [<ffffffff88149e4b>] : bnx2:bnx2_start_xmt+0x49/0x4d8
> >> Code: 49 8b 85 e8 00 00 00
> >> RIP [<ffffffff88146e91>] : bnx2:bnx2_poll+0xf7/0xb75 RSP <......>
> >> CR2: 0.....
> >>  <0> Kernel Panic - not syncing : Fatal Exception
> >>
> >> I have attached a screenshot.
> >>
> >> When I bounce the client and remount I can access the Lustre fs fine,
> >> so it seems to be purely a client issue.
> >>
> >> Any ideas?
> >>
> >
> > Looks very much like a bug in the Broadcom nx2 driver. I would send it
> > to linux-netdev. As a first try, I would check what happens when you
> > disable MSI for this driver.
> >
> > Cheers,
> > Bernd
> >
> >
> >
> >
>
> _______________________________________________
> Lustre-discuss mailing list
> Lustre-discuss at clusterfs.com
> https://mail.clusterfs.com/mailman/listinfo/lustre-discuss

Mr Wojciech Turek
Assistant System Manager
University of Cambridge
High Performance Computing service
email: wjt27 at cam.ac.uk
tel. +441223763517


