[Lustre-discuss] Unbalanced load across OST's
    Aaron Everett 
    aeverett at ForteDS.com
       
    Thu Mar 19 11:33:38 PDT 2009
    
    
  
Hello all,
 
We are running Lustre 1.6.6 with a shared MGS/MDT and 3 OSTs. We run a set
of tests that write heavily, then we review the results and delete the
data. Usually the load is spread evenly across all 3 OSTs. This afternoon
I noticed that the load is no longer evenly distributed:
 
OST0000 has a load of 50+ with iowait of around 10%
OST0001 has a load of <1 with >99% idle
OST0002 has a load of <1 with >99% idle
 
From a client, all 3 OSTs appear online:
 
[aeverett@englogin01 ~]$ lctl device_list
  0 UP mgc MGC172.16.14.10@tcp 19dde65d-8eba-22b0-b618-f59bfbd36cde 5
  1 UP lov fortefs-clilov-f7cc4800 c86e2947-f2bf-5e47-541f-6ff3f13af9a0 4
  2 UP mdc fortefs-MDT0000-mdc-f7cc4800 c86e2947-f2bf-5e47-541f-6ff3f13af9a0 5
  3 UP osc fortefs-OST0000-osc-f7cc4800 c86e2947-f2bf-5e47-541f-6ff3f13af9a0 5
  4 UP osc fortefs-OST0001-osc-f7cc4800 c86e2947-f2bf-5e47-541f-6ff3f13af9a0 5
  5 UP osc fortefs-OST0002-osc-f7cc4800 c86e2947-f2bf-5e47-541f-6ff3f13af9a0 5
[aeverett@englogin01 ~]$
 
The MGS/MDT reports that Lustre is healthy:
 
[aeverett@lustrefs ~]$ cat /proc/fs/lustre/health_check
healthy
[aeverett@lustrefs ~]$
 
df confirms the lopsided writes:
 
OST0000:
Filesystem            Size  Used Avail Use% Mounted on
/dev/sdb1             1.2T  602G  544G  53% /mnt/fortefs/ost0
 
OST0001:
Filesystem            Size  Used Avail Use% Mounted on
/dev/sdb1             1.2T  317G  828G  28% /mnt/fortefs/ost0
 
OST0002:
Filesystem            Size  Used Avail Use% Mounted on
/dev/sdb1             1.2T  315G  831G  28% /mnt/fortefs/ost0
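
To put a number on the imbalance, here is a rough Python sketch that turns the "Used" column above into each OST's share of the stored data. The figures are copied by hand from the df output; the names usage_shares and most_loaded are just illustrative helpers, not part of any Lustre tool.

```python
# "Used" values copied by hand from the df output above, in the G units df printed.
used_g = {"OST0000": 602, "OST0001": 317, "OST0002": 315}


def usage_shares(used):
    """Return each OST's fraction of the total used space."""
    total = sum(used.values())
    return {name: amount / total for name, amount in used.items()}


def most_loaded(used):
    """Return the name of the OST holding the most data."""
    return max(used, key=used.get)


shares = usage_shares(used_g)
for name, share in sorted(shares.items()):
    print(f"{name}: {share:.1%} of used space")
print("most loaded:", most_loaded(used_g))
```

With evenly balanced writes each OST would hold roughly a third of the data, so a share well above that for one OST matches what the load figures suggest.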
 
What else should I be checking? Has the MGS/MDT somehow lost track of
OST0001 and OST0002? Clients can still read data that is on OST0001 and
OST0002; I confirmed this by running lfs getstripe and cat on files stored
on those OSTs. However, if I edit such a file, it is written to OST0000.
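
One way to check where new files land is to tally the obdidx column of lfs getstripe output for a batch of freshly written files. The Python sketch below assumes the output contains an "obdidx objid objid group" table for each file, which may not match every Lustre release exactly; the sample text, paths, and object IDs are made up for illustration and are not taken from this filesystem.

```python
from collections import Counter


def count_objects_per_ost(getstripe_output):
    """Count how many objects sit on each OST index in getstripe-style text."""
    counts = Counter()
    in_table = False
    for line in getstripe_output.splitlines():
        fields = line.split()
        if fields and fields[0] == "obdidx":
            # Header row: the rows that follow list one object per line.
            in_table = True
            continue
        if in_table and fields and fields[0].isdigit():
            counts[int(fields[0])] += 1
        else:
            # Blank line or file name: the current table has ended.
            in_table = False
    return counts


# Hypothetical sample output (invented paths and object IDs).
sample = """\
/lustre/example/a.dat
    obdidx     objid     objid     group
         0      1001     0x3e9         0

/lustre/example/b.dat
    obdidx     objid     objid     group
         0      1002     0x3ea         0

/lustre/example/c.dat
    obdidx     objid     objid     group
         1      1003     0x3eb         0
"""

for ost_index, n in sorted(count_objects_per_ost(sample).items()):
    print(f"OST index {ost_index}: {n} object(s)")
```

If every newly created file reports obdidx 0, that would confirm that new objects are only being allocated on OST0000, even though the other two OSTs are still readable.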
 
Regards,
Aaron