[Lustre-discuss] High CPU load, only on 1 OSS
Ronald K Long
rklong at usgs.gov
Tue Nov 16 12:41:00 PST 2010
The data on the file system is pretty transient. Files are created
and then moved off to other locations not on the Lustre system.
I did look at top and did not find one specific process that is hogging the
CPU. It looks pretty much the same across each OSS. Quite a few of these is
about it: ll_ost_io_01
The striping seems to be going across all of the OSTs correctly.
UUID                   bytes     Used  Available  Use%  Mounted on
lustre-MDT0000_UUID   726.2G     1.7G     683.0G    0%  /san[MDT:0]
lustre-OST0000_UUID     2.4T   742.2G       1.5T   30%  /san[OST:0]
lustre-OST0001_UUID     2.4T   696.6G       1.6T   28%  /san[OST:1]
lustre-OST0002_UUID     2.4T   729.9G       1.5T   30%  /san[OST:2]
lustre-OST0003_UUID     2.4T   736.1G       1.5T   30%  /san[OST:3]
lustre-OST0004_UUID     2.4T   757.1G       1.5T   31%  /san[OST:4]
lustre-OST0005_UUID     2.4T   784.7G       1.5T   32%  /san[OST:5]
lustre-OST0006_UUID     2.4T   898.8G       1.4T   37%  /san[OST:6]
lustre-OST0007_UUID     2.4T   762.2G       1.5T   31%  /san[OST:7]
filesystem summary:    18.9T     6.0T      12.0T   31%  /san
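As a rough check of how evenly space is spread, the "Used" column above can be summarized with a few lines of Python. This is only a sketch: the numbers are copied by hand from the listing, and the function name usage_spread is made up for illustration.

```python
# "Used" values copied from the listing above, keyed by OST index.
# Units are the "G" figures exactly as printed.
used_g = {
    0: 742.2, 1: 696.6, 2: 729.9, 3: 736.1,
    4: 757.1, 5: 784.7, 6: 898.8, 7: 762.2,
}


def usage_spread(used):
    """Return (mean, fullest OST, emptiest OST, largest deviation from mean in %)."""
    mean = sum(used.values()) / len(used)
    fullest = max(used, key=used.get)
    emptiest = min(used, key=used.get)
    max_dev_pct = max(abs(v - mean) for v in used.values()) / mean * 100
    return mean, fullest, emptiest, max_dev_pct


mean, fullest, emptiest, max_dev_pct = usage_spread(used_g)
print(f"mean used per OST: {mean:.1f}G")
print(f"fullest: OST{fullest}, emptiest: OST{emptiest}")
print(f"largest deviation from mean: {max_dev_pct:.1f}%")
```

Note that even space usage does not by itself show even I/O load, especially when files are short-lived, so this complements rather than replaces a look at where the in-use files live.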
Thanks Again.
Rocky
From: Wang Yibin <wang.yibin at oracle.com>
To: Ronald K Long <rklong at usgs.gov>
Cc: lustre-discuss at lists.lustre.org
Date: 11/16/2010 09:54 AM
Subject: Re: [Lustre-discuss] High CPU load, only on 1 OSS
Hello,
Normally, when stripe_offset is set to -1, the MDS will do load/space
balancing automatically.
What is your usage pattern of the filesystem?
It sounds like your applications are doing extensive I/O on that
particular OSS.
To find out why the load on the OSS is so high, please
- find what processes are hogging the CPUs using top(1).
- get the stripe info of your in-use files to see whether most of them
reside on the same OSS.
If the files in use are not distributed among the OSS servers, or your
file usage pattern is one-OSS bound, you may want to consider tuning the
stripe_count/stripe_size.
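One way to carry out the second check is to tally which OST index each in-use file's object lives on. The sketch below assumes the text layout that `lfs getstripe` prints on Lustre 1.8 (an "obdidx" header row followed by one row per object). The sample text, file names, and object IDs are hypothetical, and count_objects_per_ost is an illustrative helper, not part of Lustre.

```python
from collections import Counter

# Hypothetical sample in the assumed `lfs getstripe` layout; the paths and
# object IDs are invented for illustration only.
SAMPLE = """\
/san/example_a
lmm_stripe_count:   1
lmm_stripe_size:    1048576
lmm_stripe_offset:  0
    obdidx       objid      objid      group
         0         101       0x65          0

/san/example_b
lmm_stripe_count:   1
lmm_stripe_size:    1048576
lmm_stripe_offset:  0
    obdidx       objid      objid      group
         0         102       0x66          0

/san/example_c
lmm_stripe_count:   1
lmm_stripe_size:    1048576
lmm_stripe_offset:  5
    obdidx       objid      objid      group
         5          57       0x39          0
"""


def count_objects_per_ost(text):
    """Count object rows per OST index (first column under 'obdidx')."""
    counts = Counter()
    in_table = False
    for line in text.splitlines():
        fields = line.split()
        if fields and fields[0] == "obdidx":
            in_table = True          # object rows follow this header
            continue
        if in_table:
            if fields and fields[0].isdigit():
                counts[int(fields[0])] += 1
            else:
                in_table = False     # blank line or next file ends the table
    return counts


for ost, n in sorted(count_objects_per_ost(SAMPLE).items()):
    print(f"OST{ost}: {n} object(s)")
```

With stripe_count set to 1, each file has a single object, so the tally is effectively a count of files per OST. Which OSTs belong to which OSS is site-specific and is not given in this thread, so mapping the tally onto servers is left to the reader.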
On 2010-11-16, at 10:38 PM, Ronald K Long wrote:
We recently set up a Lustre configuration: 1 MDS and 4 OSSs. Everything is
running fine, except that on the first OSS we are experiencing very high CPU
load. The first OSS is running a CPU load in the high 50s. The other 3 OSSs
are steady at around 8. Everything is the same between all of the OSSs.
The stripe is set up as:
stripe_count: 1 stripe_offset: -1
Red Hat 5 64bit
kernel-2.6.18-194.3.1.el5_lustre.1.8.4
kernel-devel-2.6.18-194.3.1.el5_lustre.1.8.4
lustre-ldiskfs-3.1.3-2.6.18_194.3.1.el5_lustre.1.8.4
lustre-1.8.4-2.6.18_194.3.1.el5_lustre.1.8.4
lustre-modules-1.8.4-2.6.18_194.3.1.el5_lustre.1.8.4
Is there anything I can check on the problem OSS to rectify this issue?
Thank you in advance.
Rocky