[Lustre-discuss] Unbalanced load across OST's

Aaron Everett aeverett at ForteDS.com
Thu Mar 19 11:33:38 PDT 2009


Hello all,

 

We are running Lustre 1.6.6 with a shared MGS/MDT and 3 OSTs. We run a set of
tests that write heavily, then we review the results and delete the
data. Usually the load is spread evenly across all 3 OSTs. I noticed
this afternoon that the load is no longer evenly distributed:

 

OST0000 has a load of 50+ with iowait of around 10%
OST0001 has a load of <1 with >99% idle
OST0002 has a load of <1 with >99% idle

 

From a client, all 3 OSTs appear online:

 

[aeverett@englogin01 ~]$ lctl device_list
  0 UP mgc MGC172.16.14.10@tcp 19dde65d-8eba-22b0-b618-f59bfbd36cde 5
  1 UP lov fortefs-clilov-f7cc4800 c86e2947-f2bf-5e47-541f-6ff3f13af9a0 4
  2 UP mdc fortefs-MDT0000-mdc-f7cc4800 c86e2947-f2bf-5e47-541f-6ff3f13af9a0 5
  3 UP osc fortefs-OST0000-osc-f7cc4800 c86e2947-f2bf-5e47-541f-6ff3f13af9a0 5
  4 UP osc fortefs-OST0001-osc-f7cc4800 c86e2947-f2bf-5e47-541f-6ff3f13af9a0 5
  5 UP osc fortefs-OST0002-osc-f7cc4800 c86e2947-f2bf-5e47-541f-6ff3f13af9a0 5
[aeverett@englogin01 ~]$

 

The MGS/MDT reports that Lustre is healthy:

 

[aeverett@lustrefs ~]$ cat /proc/fs/lustre/health_check
healthy
[aeverett@lustrefs ~]$

 

df confirms the lopsided writes:

 

OST0000:
Filesystem            Size  Used Avail Use% Mounted on
/dev/sdb1             1.2T  602G  544G  53% /mnt/fortefs/ost0

OST0001:
Filesystem            Size  Used Avail Use% Mounted on
/dev/sdb1             1.2T  317G  828G  28% /mnt/fortefs/ost0

OST0002:
Filesystem            Size  Used Avail Use% Mounted on
/dev/sdb1             1.2T  315G  831G  28% /mnt/fortefs/ost0
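
To put a rough number on the imbalance, here is a minimal Python sketch that compares each OST's used space with the mean. The Used figures are copied from the df output above; the function name `imbalance` and the `used_gb` dictionary are only illustrative and are not part of any Lustre tool.

```python
# Used space per OST in GB, copied by hand from the df output above.
used_gb = {"OST0000": 602, "OST0001": 317, "OST0002": 315}


def imbalance(used):
    """Return (mean, {name: that OST's usage divided by the mean})."""
    mean = sum(used.values()) / len(used)
    return mean, {name: value / mean for name, value in used.items()}


mean_gb, ratios = imbalance(used_gb)
for name in sorted(ratios):
    # A ratio well above 1.0 marks an OST holding more than its share.
    print(f"{name}: {used_gb[name]} GB used, {ratios[name]:.2f}x the mean")
```

With these figures OST0000 holds roughly one and a half times the mean, while the other two sit at about three quarters of it.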

 

What else should I be checking? Has the MGS/MDT lost track of OST0001
and OST0002 somehow? Clients can still read data that is on OST0001 and
OST0002. I confirmed this by running lfs getstripe and cat'ing files on
those devices. If I edit one of those files, the file is then written to
OST0000.

 

Regards,
Aaron
