[Lustre-discuss] help tracking down extremely high loads on OSSs

Peter Kjellstrom cap at nsc.liu.se
Mon Oct 18 10:49:50 PDT 2010


On Monday 18 October 2010, John White wrote:
> Hello Folks,
> 	A while back (say 3 weeks ago) we started noticing extremely high loads
> (load avg around 300 at times) on our OSSs when in production and serving
> IO.  This cluster was, at the time, on 1.8.2 (we have since upgraded to
> 1.8.4 but the problem remains).  The load increases fairly predictably as
> clients generate IO but even 2 clients can produce a load avg above 5.00. 

Does this impact performance or does it only show up as an unexpectedly high 
number on the OSSes?

/Peter
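
One way to approach the question above: on Linux the load average counts
tasks in uninterruptible sleep (state D, typically waiting on I/O) as well
as runnable ones, so a large number can come from blocked threads rather
than busy CPUs. The sketch below is only an illustration, not something
from the original thread. It assumes a Linux /proc filesystem, and the
helper names (parse_loadavg, task_state, summarize) are made up for this
example. It prints the load averages and which task names are currently
in state D.

```python
import glob
import os
from collections import Counter


def parse_loadavg(text):
    """Return the 1, 5 and 15 minute load averages from /proc/loadavg text."""
    fields = text.split()
    return tuple(float(x) for x in fields[:3])


def task_state(stat_text):
    """Return (comm, state) from the text of a /proc/<pid>/stat file.

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so split on the last ')'.
    """
    start = stat_text.index("(")
    end = stat_text.rindex(")")
    comm = stat_text[start + 1:end]
    state = stat_text[end + 1:].split()[0]
    return comm, state


def summarize(stat_texts):
    """Count tasks per state, and per command name for tasks in state D."""
    states = Counter()
    blocked = Counter()
    for text in stat_texts:
        comm, state = task_state(text)
        states[state] += 1
        if state == "D":
            blocked[comm] += 1
    return states, blocked


def read_stats():
    """Yield the stat text of every thread visible under /proc."""
    for path in glob.glob("/proc/[0-9]*/task/[0-9]*/stat"):
        try:
            with open(path) as f:
                yield f.read()
        except OSError:
            continue  # the task exited while we were scanning


def main():
    if not os.path.exists("/proc/loadavg"):
        print("no /proc/loadavg here; this sketch assumes Linux")
        return
    with open("/proc/loadavg") as f:
        print("load averages:", parse_loadavg(f.read()))
    states, blocked = summarize(read_stats())
    print("tasks per state:", dict(states))
    for comm, count in blocked.most_common(10):
        print(f"{count:5d} in state D: {comm}")


if __name__ == "__main__":
    main()
```

If most of the load turns out to be tasks in state D while the CPUs are
largely idle, the number reflects threads waiting on storage. Whether
that is the case on these OSSes is not something the thread establishes.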

> An identical file system of ours does not exhibit this behavior (sticks
> below load avg 1.00 under even the heaviest IO load).  I've looked around
> bugzilla and haven't found anything.  We've disabled heartbeat on the
> off-chance that it was generating the load (it's not), and we've tried a
> different client transport (o2ib->tcp), but this did not solve the issue.
> There doesn't appear to be any specific non-kernel thread causing the
> high-load.  The only info in dmesg/syslog pertains to sporadic client
> evictions or sporadic slow setattr due to heavy IO load (we've since tuned
> the number of OST threads).  We're basically out of ideas to try.
>
> For reference, this is a 1 MDS/4 OSS cluster backed by a DDN 9900 couplet
> (15 tiers, 1:1 lun mapping) running the lustre.org rpm build kernel for
> 1.8.4.  The MDS/OSSs are Dell R710s and the MDT is a Dell MD1000.  Is this
> a common problem or should a bug be filed?  Any info available upon
> request.  Thanks for your time.
> ----------------
> John White
> High Performance Computing Services (HPCS)
> (510) 486-7307
> One Cyclotron Rd, MS: 50B-3209C
> Lawrence Berkeley National Lab
> Berkeley, CA 94720