[Lustre-discuss] slow journal/commitrw on OSTs lead to crash

Brian J. Murrell Brian.Murrell at Sun.COM
Wed Apr 8 11:49:11 PDT 2009


On Wed, 2009-04-08 at 13:16 -0500, Hendelman, Rob wrote:
> Hello,

Hi,

> I recently upgraded our 1.6.4.3 centos servers to 1.6.7.  Clients are
> still at 1.6.4.3.  We were going to upgrade to 1.6.7 but decided to wait
> until 1.6.7.1 comes out with the fix for the bug mentioned on the list
> late last week.

The corruption bug for which 1.6.7.1 will be released affects the MDT,
not the clients, so your already-upgraded 1.6.7 servers are still
exposed to it, even though you have not upgraded the clients.

> Apr  7 18:08:48 maglustre04 kernel: Lustre:
> 5228:0:(lustre_fsfilt.h:320:fsfilt_commit_wait()) fs01-OST0005: slow
> journal start 31s

This basically means the storage devices are taking exceptionally long
to process requests.  One common cause of this is over-subscription of
OST threads.  Did you run obdfilter-survey on your disk subsystem
before installing Lustre on it?  If so, it should tell you at what
thread count you saturate your storage.  You want to adjust your OST
thread count according to that number.
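
To see which OSTs are affected and how badly, it can help to pull these
"slow ..." messages out of syslog.  What follows is only a rough sketch
in Python, not a Lustre tool: the pattern is inferred from the single
message quoted above, so other Lustre versions may word the message
differently, and the function names are made up for illustration.

```python
import re
from typing import Dict, Iterable, Optional, Tuple

# Pattern inferred from the one sample message quoted in this thread.
# Assumption: other "slow ..." messages follow the same
# "<target>: slow <operation> <N>s" shape.
SLOW_RE = re.compile(
    r"(?P<target>\S+-OST[0-9a-fA-F]+): slow (?P<op>.+?) (?P<secs>\d+)s\b"
)


def parse_slow_line(line: str) -> Optional[Tuple[str, str, int]]:
    """Return (target, operation, seconds) for a slow message, else None."""
    m = SLOW_RE.search(line)
    if m is None:
        return None
    return m.group("target"), m.group("op"), int(m.group("secs"))


def worst_delay_per_target(lines: Iterable[str]) -> Dict[str, int]:
    """Map each OST name to the longest delay (in seconds) seen for it."""
    worst: Dict[str, int] = {}
    for line in lines:
        parsed = parse_slow_line(line)
        if parsed is None:
            continue  # not a slow message, skip it
        target, _op, secs = parsed
        worst[target] = max(secs, worst.get(target, 0))
    return worst


if __name__ == "__main__":
    # The message from this thread, joined back into one syslog line.
    sample = (
        "Apr  7 18:08:48 maglustre04 kernel: Lustre: "
        "5228:0:(lustre_fsfilt.h:320:fsfilt_commit_wait()) "
        "fs01-OST0005: slow journal start 31s"
    )
    print(parse_slow_line(sample))
```

Feeding it the lines of the OSS syslog should show whether the delays
cluster on one OST (pointing at one device) or are spread across all of
them (pointing more towards thread over-subscription).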

If you didn't run obdfilter-survey, you will have to do some trial and
error (e.g. a binary search) over OST thread counts to find a happy
place where you are no longer over-subscribed but are also not giving
up performance.
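
The binary search mentioned above could be sketched roughly like this
in Python.  It assumes over-subscription is monotonic in thread count
(once a count is too high, every higher count is too).  The
is_oversubscribed callback is hypothetical: in practice it stands for
"set the OST thread count to n, run a representative load, and check
whether the slow messages come back", which this sketch does not
automate.  The numbers in the example are invented.

```python
from typing import Callable


def largest_safe_thread_count(
    lo: int, hi: int, is_oversubscribed: Callable[[int], bool]
) -> int:
    """Return the largest n in [lo, hi] that is not over-subscribed.

    Assumes is_oversubscribed(n) is False up to some threshold and True
    above it.  Returns lo - 1 if even the lowest count is too high.
    """
    best = lo - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if is_oversubscribed(mid):
            hi = mid - 1  # too many threads, search the lower half
        else:
            best = mid    # this count is fine, try for a higher one
            lo = mid + 1
    return best


if __name__ == "__main__":
    # Invented stand-in for a real test run: pretend the storage
    # saturates above 96 threads.
    def pretend_probe(n: int) -> bool:
        return n > 96

    print(largest_safe_thread_count(8, 512, pretend_probe))
```

Each probe is a full test run against the real storage, so halving the
range every time keeps the number of runs small (about log2 of the
range) compared with stepping through counts one by one.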

Of course, OST thread count is relative to storage backend performance.
If that performance suffers, then thread counts which were once
sufficient could now be over-subscribed.

> Also: If anyone over there @ Sun has a contact of someone in sales I can
> call for a commercial support contract, this is something we are
> interested in looking at as well.

http://www.sun.com/software/products/lustre/support.xml

Let me know if that doesn't help you out in that endeavour.

b.
