[Lustre-discuss] Frequent OSS Crashes with heavy load

Brian J. Murrell Brian.Murrell at Sun.COM
Mon Nov 10 08:55:53 PST 2008


On Mon, 2008-11-10 at 16:42 +0000, Wang lu wrote:
> I already have 512 (the maximum) I/O threads running. Some of them are in
> "Dead" status. Is it safe to conclude that the OSS is oversubscribed?

Until you do some analysis of your storage with the iokit, you cannot
really draw any conclusions.  However, if you are already at the maximum
number of OST threads, it is quite plausible that the OSS is
oversubscribed.

Try a simple experiment: halve the number to 256 and see if you have
any drop-off in throughput to the storage devices.  If not, then you can
safely assume that 512 was either too many or not necessary.  You can
repeat the halving if you wish.  If you get to a number of OST threads
where your throughput is lower than it should be, you've gone too low.
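
As a rough sketch of that experiment (not a tool that ships with Lustre),
the halving loop could look like the following.  The measure() callback is
hypothetical: it stands for whatever you do by hand to set the OST thread
count, rerun your workload, and read back the throughput.  The 5% tolerance
is likewise just an assumed threshold for "any drop off".

```python
def find_min_threads(measure, start=512, floor=1, tolerance=0.05):
    """Halve the thread count until throughput drops noticeably.

    measure(threads) is a hypothetical callback: it should reconfigure
    the OSS to use `threads` OST threads, run the workload, and return
    the observed throughput (any unit, as long as it is consistent).

    Returns the lowest thread count tested whose throughput stayed
    within `tolerance` of the throughput measured at `start`.
    """
    baseline = measure(start)      # throughput at the current setting
    best = start
    threads = start // 2
    while threads >= floor:
        if measure(threads) < baseline * (1 - tolerance):
            break                  # throughput fell off: gone too low
        best = threads             # no real drop, so this count is enough
        threads //= 2
    return best
```

For example, if throughput stops improving beyond 128 threads, passing a
measure() that reflects this makes the loop stop at 128 once the test at 64
shows a drop.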

But really, the iokit is the more efficient and accurate way to
determine this.

b.


