[Lustre-discuss] Unbalanced load across OST's
Aaron Everett
aeverett at ForteDS.com
Fri Mar 20 08:14:56 PDT 2009
Hello, I tried the suggestion of using lfs setstripe, and it appears that everything is still being written only to OST0000. You mentioned the OST's may have been deactivated. Is it possible that the last time we restarted Lustre they came up in a deactivated or read-only state? Last week we brought our Lustre machines offline to swap out UPS's.
[root at englogin01 teststripe]# pwd
/lustre/work/aeverett/teststripe
[root at englogin01 aeverett]# mkdir teststripe
[root at englogin01 aeverett]# cd teststripe/
[root at englogin01 teststripe]# lfs setstripe -i -1 .
[root at englogin01 teststripe]# cp -R /home/aeverett/RHEL4WS_update/ .
[root at englogin01 teststripe]# lfs getstripe *
OBDS:
0: fortefs-OST0000_UUID ACTIVE
1: fortefs-OST0001_UUID ACTIVE
2: fortefs-OST0002_UUID ACTIVE
RHEL4WS_update
default stripe_count: 1 stripe_size: 1048576 stripe_offset: 0
RHEL4WS_update/rhn-packagesws.tgz
obdidx objid objid group
0 77095451 0x498621b 0
RHEL4WS_update/rhn-packages
default stripe_count: 1 stripe_size: 1048576 stripe_offset: 0
RHEL4WS_update/kernel
default stripe_count: 1 stripe_size: 1048576 stripe_offset: 0
RHEL4WS_update/tools.tgz
obdidx objid objid group
0 77096794 0x498675a 0
RHEL4WS_update/install
obdidx objid objid group
0 77096842 0x498678a 0
RHEL4WS_update/installlinks
obdidx objid objid group
0 77096843 0x498678b 0
RHEL4WS_update/ssh_config
obdidx objid objid group
0 77096844 0x498678c 0
RHEL4WS_update/sshd_config
obdidx objid objid group
0 77096845 0x498678d 0
.............. continues on like this for about 100 files with incrementing objid numbers and obdidx = 0 and group = 0.
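As a further check that nothing is deactivated, this is roughly what I understand we could run (a sketch only; the device names and grep pattern are my guesses based on the fortefs fsname above):

# on the MDS: list Lustre devices; an OSC that was deactivated should
# show state "IN" instead of "UP"
lctl dl | grep osc

# from a client: the OBDS header of "lfs getstripe" on the directory
# should list every OST as ACTIVE, as it does above
lfs getstripe .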
Thanks for all the help,
Aaron
-----Original Message-----
From: Kevin.Vanmaren at Sun.COM [mailto:Kevin.Vanmaren at Sun.COM]
Sent: Friday, March 20, 2009 8:57 AM
To: Aaron Everett
Cc: Brian J. Murrell; lustre-discuss at lists.lustre.org
Subject: Re: [Lustre-discuss] Unbalanced load across OST's
There are several things that could have been done. The most likely are:
1) you deactivated the OSTs on the MDS, using something like:
# lctl set_param ost.work-OST0001.active=0
# lctl set_param ost.work-OST0002.active=0
2) you set the file stripe on the directory to use only OST0, as with
# lfs setstripe -i 0 .
I would think that you'd remember #1, so my guess would be #2, which
could have happened when someone intended to do "lfs setstripe -c 0".
Do an "lfs getstripe ." A simple:
"lfs setstripe -i -1 ." in each directory
should clear it up going forward. Note that existing files will NOT be
re-striped, but new files will be balanced going forward.
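If you also want the existing files spread out, the usual approach I'm aware of (a rough sketch; verify on an unimportant file or a copy first) is to rewrite each file so it is re-created under the directory's new default striping:

# re-create a file so it picks up the current default striping;
# "somefile" is just a placeholder name
cp -a somefile somefile.restripe && mv somefile.restripe somefile

# confirm which OST the rewritten file landed on
lfs getstripe somefile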
Kevin
Aaron Everett wrote:
> Thanks for the reply.
>
> File sizes are all <1GB and most files are <1MB. For a test, I copied a typical result set from a non-Lustre mount to my Lustre directory. The total size of the test set is 42GB. I have included before/after results for lfs df and lfs df -i from a client.
>
> Before test:
> [root at englogin01 backups]# lfs df
> UUID 1K-blocks Used Available Use% Mounted on
> fortefs-MDT0000_UUID 1878903960 129326660 1749577300 6% /lustre/work[MDT:0]
> fortefs-OST0000_UUID 1264472876 701771484 562701392 55% /lustre/work[OST:0]
> fortefs-OST0001_UUID 1264472876 396097912 868374964 31% /lustre/work[OST:1]
> fortefs-OST0002_UUID 1264472876 393607384 870865492 31% /lustre/work[OST:2]
>
> filesystem summary: 3793418628 1491476780 2301941848 39% /lustre/work
>
> [root at englogin01 backups]# lfs df -i
> UUID Inodes IUsed IFree IUse% Mounted on
> fortefs-MDT0000_UUID 497433511 33195991 464237520 6% /lustre/work[MDT:0]
> fortefs-OST0000_UUID 80289792 13585653 66704139 16% /lustre/work[OST:0]
> fortefs-OST0001_UUID 80289792 7014185 73275607 8% /lustre/work[OST:1]
> fortefs-OST0002_UUID 80289792 7013859 73275933 8% /lustre/work[OST:2]
>
> filesystem summary: 497433511 33195991 464237520 6% /lustre/work
>
>
> After test:
>
> [aeverett at englogin01 ~]$ lfs df
> UUID 1K-blocks Used Available Use% Mounted on
> fortefs-MDT0000_UUID 1878903960 129425104 1749478856 6% /lustre/work[MDT:0]
> fortefs-OST0000_UUID 1264472876 759191664 505281212 60% /lustre/work[OST:0]
> fortefs-OST0001_UUID 1264472876 395929536 868543340 31% /lustre/work[OST:1]
> fortefs-OST0002_UUID 1264472876 393392924 871079952 31% /lustre/work[OST:2]
>
> filesystem summary: 3793418628 1548514124 2244904504 40% /lustre/work
>
> [aeverett at englogin01 ~]$ lfs df -i
> UUID Inodes IUsed IFree IUse% Mounted on
> fortefs-MDT0000_UUID 497511996 33298931 464213065 6% /lustre/work[MDT:0]
> fortefs-OST0000_UUID 80289792 13665028 66624764 17% /lustre/work[OST:0]
> fortefs-OST0001_UUID 80289792 7013783 73276009 8% /lustre/work[OST:1]
> fortefs-OST0002_UUID 80289792 7013456 73276336 8% /lustre/work[OST:2]
>
> filesystem summary: 497511996 33298931 464213065 6% /lustre/work
>
> -----Original Message-----
> From: lustre-discuss-bounces at lists.lustre.org [mailto:lustre-discuss-bounces at lists.lustre.org] On Behalf Of Brian J. Murrell
> Sent: Thursday, March 19, 2009 3:13 PM
> To: lustre-discuss at lists.lustre.org
> Subject: Re: [Lustre-discuss] Unbalanced load across OST's
>
> On Thu, 2009-03-19 at 14:33 -0400, Aaron Everett wrote:
>
>> Hello all,
>>
>
> Hi,
>
>
>> We are running 1.6.6 with a shared MGS/MDT and 3 OST's. We run a set
>> of tests that write heavily, then we review the results and delete the
>> data. Usually the load is evenly spread across all 3 OST's. I noticed
>> this afternoon that the load does not seem to be distributed.
>>
>
> Striping, as well as file count and size, affects OST distribution. Are any of the data involved striped? Are you writing very few large files before you measure distribution?
>
>
>> OST0000 has a load of 50+ with iowait of around 10%
>>
>> OST0001 has a load of <1 with >99% idle
>>
>> OST0002 has a load of <1 with >99% idle
>>
>
> What does lfs df say before and after such a test that produces the above results? Does it bear out even use amongst the OSTs before and after the test?
>
>
>> df confirms the lopsided writes:
>>
>
> lfs df [-i] from a client is usually more illustrative of use. As I said above, if you can quiesce the filesystem for the test above, do an "lfs df; lfs df -i" before and after the test. Assuming you were successful in quiescing, you should see the change to the OSTs that your test effected.
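>
> Something like the following (just a sketch; the mount point is taken
> from your output) captures a before/after snapshot you can diff:
>
> # snapshot usage before the test
> lfs df /lustre/work > /tmp/lfsdf.before
> lfs df -i /lustre/work >> /tmp/lfsdf.before
> # ... run the test, then snapshot again ...
> lfs df /lustre/work > /tmp/lfsdf.after
> lfs df -i /lustre/work >> /tmp/lfsdf.after
> diff /tmp/lfsdf.before /tmp/lfsdf.after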
>
>
>> OST0000:
>>
>> Filesystem Size Used Avail Use% Mounted on
>>
>> /dev/sdb1 1.2T 602G 544G 53% /mnt/fortefs/ost0
>>
>
> What's important is what it looked like before the test too. Your test could have, for example, written a single object (i.e. file) of nearly 300G for all we can tell from what you've posted so far.
>
> b.
>
>
> _______________________________________________
> Lustre-discuss mailing list
> Lustre-discuss at lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-discuss
>