[Lustre-discuss] Lustre 1.6.4.1 - client lockup

Kilian CAVALOTTI kilian at stanford.edu
Fri Jan 25 12:06:51 PST 2008


Hi Niklas,

On Friday 25 January 2008 07:10:47 am Niklas Edmundsson wrote:
> We're able to consistently kill the lustre client with bonnie in
> combination with striping. 

Out of curiosity, I tried to reproduce your experiment, and didn't 
encounter any problem. All the bonnie processes ran fine.

There are a lot of significant differences between our test 
environments, but I thought it might be useful to know the results of 
your test case on a different system.

> This is Lustre 1.6.4.1, Debian 2.6.18 
> amd64 kernel with lustre patches on both server and clients 

I used Lustre 1.6.4.1 on RHEL4, with the 
2.6.9-55.0.9.EL_lustre.1.6.4.1smp x86_64 kernel.

> All machines are dual opterons connected with GigE.

Ours are Intel quad-cores (E5345) connected with IB.

> We have 5 servers, 1 MDS with 1 MGS and 1 MDT target and 4 OSS's with
> 2 OST targets (~1.2TB) each.

We have 9 servers: 1 MDS with MGS and MDT, and 8 OSSs with 2 OSTs each.

> Jan 25 11:16:23 BUG: soft lockup detected on CPU#1!

> After 10-15 minutes it locks up, this time with a bunch of
> LustreErrors before the stack trace:

They look like a network interruption problem, but it's hard to tell if 
that's the cause or the consequence. Could it be that your Ethernet 
switches dropped some packets?
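
One rough way to look for packet loss from the host side is to compare 
the error and drop counters in /proc/net/dev before and after a bonnie 
run. The sketch below is only an illustration: the function names are 
made up for this example, and it assumes the usual Linux layout of 16 
counters per interface. It only shows what the NIC and kernel saw; drops 
inside the switch itself would have to be checked on the switch.

```python
def parse_net_dev(text):
    """Parse the contents of /proc/net/dev into per-interface counters.

    Assumes the usual Linux layout: two header lines, then one line per
    interface with 8 receive fields followed by 8 transmit fields.
    """
    counters = {}
    for line in text.splitlines()[2:]:  # skip the two header lines
        if ":" not in line:
            continue
        name, data = line.split(":", 1)
        fields = data.split()
        if len(fields) < 16:
            continue  # unexpected layout, skip rather than guess
        values = [int(x) for x in fields]
        counters[name.strip()] = {
            "rx_errs": values[2],   # receive errors
            "rx_drop": values[3],   # receive drops
            "tx_errs": values[10],  # transmit errors
            "tx_drop": values[11],  # transmit drops
        }
    return counters


def read_drop_counters(path="/proc/net/dev"):
    """Read and parse the counters from a live Linux system."""
    with open(path) as f:
        return parse_net_dev(f.read())


def diff_counters(before, after):
    """Return the per-interface increase of each counter between two snapshots."""
    delta = {}
    for iface, now in after.items():
        old = before.get(iface, {})
        delta[iface] = {key: now[key] - old.get(key, 0) for key in now}
    return delta
```

Taking one snapshot with read_drop_counters() before starting bonnie and 
another after the lockup, then passing both to diff_counters(), shows 
whether any counter grew during the run.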

Cheers,
-- 
Kilian


