[Lustre-discuss] Fragmented I/O

Kevin Hildebrand kevin at umd.edu
Wed May 11 17:07:14 PDT 2011


Hi, I'm having some performance issues on my Lustre filesystem and it 
looks to me like it's related to I/Os getting fragmented before being 
written to disk, but I can't figure out why.  This system is RHEL5, 
running Lustre 1.8.4.
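
(The tables below are from the per-OST brw_stats; something like this
should dump them on each OSS, the wildcard being the usual obdfilter
path -- adjust to taste:

# lctl get_param obdfilter.*.brw_stats

or the same data via /proc/fs/lustre/obdfilter/<OST name>/brw_stats.)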

All of my OSTs look pretty much the same-

                            read      |     write
pages per bulk r/w     rpcs  % cum % |  rpcs  % cum %
1:                   88811  38  38   | 46375  17  17
2:                    1497   0  38   | 7733   2  20
4:                    1161   0  39   | 1840   0  21
8:                    1168   0  39   | 7148   2  24
16:                    922   0  40   | 3297   1  25
32:                    979   0  40   | 7602   2  28
64:                   1576   0  41   | 9046   3  31
128:                  7063   3  44   | 16284   6  37
256:                129282  55 100   | 162090  62 100


                            read      |     write
disk fragmented I/Os   ios   % cum % |  ios   % cum %
0:                   51181  22  22   |    0   0   0
1:                   45280  19  42   | 82206  31  31
2:                   16615   7  49   | 29108  11  42
3:                    3425   1  50   | 17392   6  49
4:                  110445  48  98   | 129481  49  98
5:                    1661   0  99   | 2702   1  99

                            read      |     write
disk I/O size          ios   % cum % |  ios   % cum %
4K:                  45889   8   8   | 56240   7   7
8K:                   3658   0   8   | 6416   0   8
16K:                  7956   1  10   | 4703   0   9
32K:                  4527   0  11   | 11951   1  10
64K:                114369  20  31   | 134128  18  29
128K:                 5095   0  32   | 17229   2  31
256K:                 7164   1  33   | 30826   4  35
512K:               369512  66 100   | 465719  64 100

Oddly, there's no 1024K row in the disk I/O size table at all...

...and these values seem small to me as well, but I can't seem to 
change them.  Writing new values to either file doesn't change anything:

# cat /sys/block/sdb/queue/max_hw_sectors_kb
320
# cat /sys/block/sdb/queue/max_sectors_kb
320
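
For what it's worth, this is the kind of thing I've been trying (the 
1024 is just an example value); as far as I know the block layer won't 
let max_sectors_kb go above max_hw_sectors_kb, and max_hw_sectors_kb 
itself is read-only since it comes from the driver:

# echo 1024 > /sys/block/sdb/queue/max_sectors_kb
# cat /sys/block/sdb/queue/max_sectors_kb
320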

The hardware in question is Dell PERC 6/E and PERC H800 RAID 
controllers, with MD1000 and MD1200 arrays, respectively.


Any clues on where I should look next?

Thanks,

Kevin

Kevin Hildebrand
University of Maryland, College Park
Office of Information Technology


