[Lustre-discuss] disk fragmented I/Os

Lu Wang wanglu at ihep.ac.cn
Mon Apr 5 23:47:51 PDT 2010


Thanks for your explanation. I now understand much better what these parameters mean. I have been monitoring the /proc I/O tunables these days. On the Lustre clients there is a /proc entry called "offset_stats". From what I have observed, it seems to refresh itself every few seconds, because no matter how long I watch "offset_stats", the output always stays about the same size (several hundred lines). Example output from our application is below:

cat /proc/fs/lustre/llite/bes3fs-f7f34600/offset_stats |sort
  R      18245     1235222528     1236271104           1048576           1048576       -2097152
  R      18245     1235222528     1237319680           2097152           2097152       -3145728
  R      18245     1235222528     1238368256           3145728           3145728       -4194304
  R      18245     1235222528     1239416832           4194304           4194304       -5242880
  R      18245     1235222528     1240465408           5242880           5242880       -6291456
  R      18245     1241513984     1243611136           2097152           2097152        5242880
  R      18245     4207371795     4207935488            563693            563693              0
  R      31286     2344747242     2345359684            612442            612442              0
  R      31286      731906048      732954624           1048576           1048576       -2097152
  R      31286      731906048      734003200           2097152           2097152       -3145728
  R      31286      731906048      735051776           3145728           3145728       -4194304
  R      31286      731906048      736100352           4194304           4194304       -5242880
  R      31286      731906048      737148928           5242880           5242880       -6291456
  R      31286      738197504      740294656           2097152           2097152        5242880
R/W        PID    RANGE START      RANGE END   SMALLEST EXTENT    LARGEST EXTENT         OFFSET
snapshot_time:         1270536172.926198 (secs.usecs)
  W      17041      428613209      428867584            254375            254375              0
  W      17041     4544959606     4545039667             80061             80061        -617354
  W      17041     4546233710     4546421474            187764            187764        -391826
  W      17041     4546233710     4546625536            391826            391826              0
  W      17041     4546813300     4547362320            549020            549020        -860812
  W      17041     4546813300     4547674112            860812            860812         391826
  W      17041     4548223132     4548428237            205105            205105        -499556
  W      17041     4548223132     4548722688            499556            499556         860812
  W      17041     4548927793     4549517473            589680            589680         499556
  W      17621     4713119986     4713589595            469609            469609        -229134
  W      17621     4713818729     4713840455             21726             21726        -578967
  W      17621     4713818729     4714397696            578967            578967         229134
  W      17621     4714419422     4715446272             92387            934463         578967
  W      17621     4715353885     4715850698            496813            496813         -92387
  W      17621     4715943085     4716093291            150206            150206        -551763
  W      17621     4715943085     4716494848            551763            551763          92387
  W      18245     1235222528     1241513984           6291456           6291456              0
  W      18245     4202565256     4203013298            448042            448042        -127352
  W      18245     4203140650     4203741184             10424            590110         127352
  W      18245     4203730760     4204662379            931619            931619         -10424
  W      18245     4204672803     4204789760            116957            116957          10424
  W      18245     4204672803     4205255964            583161            583161        -116957
  W      18245     4205372921     4205838336            465415            465415         116957
  W      18245     4205372921     4206327006            954085            954085        -465415
  W      18245     4206792421     4206886912             94491             94491         465415
  W      18245     4207371795     4207548337            176542            176542       -1048576
  W      18245     4207371795     4208420371           1048576           1048576        -563693
  W      18245     4209160606     4209746813            586207            586207        1612269
  W      30800     2361474386     2361934283            459897            459897        -967342
  W      30800     2362901625     2363490304            588679            588679         967342
  W      30800     2362901625     2363743699            842074            842074        -588679
  W      30800     2364332378     2364538880            206502            206502         588679
  W      30800     2364332378     2364835485            503107            503107        -206502
  W      30800     2365041987     2365104344             62357             62357        -545469
  W      30800     2365041987     2365587456            545469            545469         206502
  W      30800     2365649813     2366636032            359449            626770         545469
  W      30800     2366276583     2366628031            351448            351448        -359449
  W      30854     1589868539     1589981263            112724            112724        -821253
  W      30854     1590802516     1591738368            337997            597855         821253
  W      30854     1591400371     1591451153             50782             50782       -1048576
  W      30854     1591400371     1592448947           1048576           1048576        -337997
  W      30854     1592837726     1593835520            415850            581944        1386573
  W      30854     1593419670     1593714126            294456            294456        -415850
  W      30854     1594129976     1594204276             74300             74300       -1048576
  W      30854     1594129976     1594884096            754120            754120         415850
  W      30854     1594129976     1595178552           1048576           1048576        -754120
  W      30854     1596006972     1596058929             51957             51957       -1048576
  W      30854     1596006972     1596981248            974276            974276        1802696
  W      30854     1596006972     1597055548           1048576           1048576        -974276
  W      30854     1598081781     1598542226            460445            460445        -996619
  W      30854     1598081781     1599078400            996619            996619        2022852
  W      30916     1838846205     1839110629            264424            264424        -356099
  W      30916     1839466728     1839610532            143804            143804        -784152
  W      30916     1839466728     1840250880            784152            784152         356099
  W      30916     1840394684     1840925257            530573            530573        -904772
  W      30916     1840394684     1841299456            904772            904772         784152
  W      30916     1841830029     1841946671            116642            116642        -518003
  W      30916     1841830029     1842348032            518003            518003         904772
  W      30916     1842464674     1843396608            213116            718818         518003
  W      30916     1843183492     1843432408            248916            248916        -213116
  W      30916     1843645524     1844279233            633709            633709         213116
  W      31286     2241375450     2241836184            460734            460734     -136642272
  W      31286     2345359680     2345649724            290044            290044        -304832
  W      31286     2345359680     2345664512            304832            304832             -4
  W      31286     2345954552     2346666941            712389            712389         304828
  W      31286     2346666937     2346713088             46151             46151             -4
  W      31286     2346666937     2347551880            884943            884943         -46151
  W      31286     2347598027     2347761664            163637            163637      -31816608
  W      31286     2347598027     2348055783            457756            457756        -163637
  W      31286     2377380988     2378017722            636734            636734        -789380
  W      31286     2377380988     2378170368            789380            789380       29829108
  W      31286     2378807102     2379218944            411842            411842      136970918
  W      31286     2378807102     2379414635            607533            607533        -411842
  W      31286     2379826477     2379975747            149270            149270        -441043
  W      31286     2379826477     2380267520            441043            441043       31770694
  W      31286     2380416790     2381128603            711813            711813         441043
  W      31286      731906048      738197504           6291456           6291456              0
  W      32016      183598180      184549376            951196            951196              0
  W      32016     2168397607     2168937339            539732            539732         -57561
  W      32016     2168994896     2169503744            508848            508848          57557
  W      32016     2168994896     2169906334            911438            911438        -508848
  W      32016     2201732201     2202009600            277399            277399       31825867
  W      32016     2201732201     2202165235            433034            433034        -277399
  W      32016     2203037998     2203058176             20178             20178              0
  W      32016     2203037998     2203802011            764013            764013         -20178
  W      32016     2203822189     2204106752            284563            284563          20178
  W      32016     2203822189     2204136741            314552            314552        -284563
  W      32016     2204421304     2205135835            714531            714531        -734024
  W      32016     2204421304     2205155328            734024            734024         284563
  W      32016     2205869859     2206203904            334045            334045         734024
  W      32016     2205869859     2206468833            598974            598974        -334045
  W      32016     2206802878     2206963345            160467            160467        -449602
  W      32016     2206802878     2207252480            449602            449602         334045
  W      32016     2207412947     2208126753            713806            713806         449602
  W       5313              0             63                63                63         -13977
  W        761              0             63                63                63         -13976
 
Does that mean our applications seek a lot when they are writing? I think RANGE is the position within a given file, and EXTENT is the size of each sequential I/O.

(SMALLEST EXTENT) = (LARGEST EXTENT) = (RANGE END - RANGE START); does that mean the application seeks every time after it writes something to a file?
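
To check that relation across the whole listing rather than by eye, a small script can tally the rows. This is only a rough sketch: it embeds four rows copied from the output above, the function names parse_offset_stats and summarize are made up for illustration, and it does not assume any particular meaning for the columns. It just counts how many rows satisfy SMALLEST EXTENT = LARGEST EXTENT = RANGE END - RANGE START, and how many have a non-zero OFFSET.

```python
# Four rows copied from the offset_stats output above, plus the two header lines.
SAMPLE = """\
R/W        PID    RANGE START      RANGE END   SMALLEST EXTENT    LARGEST EXTENT         OFFSET
snapshot_time:         1270536172.926198 (secs.usecs)
  R      18245     1235222528     1236271104           1048576           1048576       -2097152
  W      17041      428613209      428867584            254375            254375              0
  W      17621     4714419422     4715446272             92387            934463         578967
  W      18245     1235222528     1241513984           6291456           6291456              0
"""


def parse_offset_stats(text):
    """Return one dict per data row; header and snapshot_time lines are skipped."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have exactly 7 fields and start with R or W.
        if len(fields) != 7 or fields[0] not in ("R", "W"):
            continue
        pid, start, end, smallest, largest, offset = map(int, fields[1:])
        rows.append({"op": fields[0], "pid": pid, "start": start, "end": end,
                     "smallest": smallest, "largest": largest, "offset": offset})
    return rows


def summarize(rows):
    """Per R/W: total rows, rows where both extents equal the range length,
    and rows with a non-zero OFFSET column."""
    summary = {}
    for row in rows:
        s = summary.setdefault(
            row["op"], {"rows": 0, "single_extent": 0, "nonzero_offset": 0})
        s["rows"] += 1
        if row["smallest"] == row["largest"] == row["end"] - row["start"]:
            s["single_extent"] += 1
        if row["offset"] != 0:
            s["nonzero_offset"] += 1
    return summary


summary = summarize(parse_offset_stats(SAMPLE))
for op in sorted(summary):
    print(op, summary[op])
```

Feeding it the full contents of the offset_stats file instead of SAMPLE would show that the relation does not hold for every row; for example, the PID 17621 row above has different smallest and largest extents.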


----------------				 
Lu Wang
2010-04-06

-------------------------------------------------------------
From: Brian J. Murrell
Sent: 2010-04-05 21:01:33
To: lustre-discuss
Cc:
Subject: Re: [Lustre-discuss] disk fragmented I/Os

On Fri, 2010-04-02 at 17:45 +0800, Lu Wang wrote:
> Hi, 
> 	We set up a test file system with the same partition and the same hardware. When the system is empty, the disk I/O is less fragmented.

So, I think you now have confirmation as to what's causing your disk I/O
fragmentation problem on your production system, yes?

> However, the disk I/O in flight is still low (mostly at "1"). Is there any way to increase this value through configuration? 

I think you are chasing a red herring.  The number of disk I/Os in
flight is only an indicator as to what could be wrong when other things
are not working correctly.

But as you can see from your brw_stats, the only items that are not
absolutely *perfect* are that 2% of your disk I/Os were fragmented:

>                            read      |     write
> disk fragmented I/Os   ios   % cum % |  ios   % cum %
> 1:                       0   0   0   | 2289  97  97
> 2:                       0   0   0   |   69   2 100

and 4% of your disk I/Os were not a full 1M.

>                            read      |     write
> disk I/O size          ios   % cum % |  ios   % cum %
> 8K:                      0   0   0   |    1   0   0
> 16K:                     0   0   0   |    1   0   0
> 32K:                     0   0   0   |    0   0   0
> 64K:                     0   0   0   |    1   0   0
> 128K:                    0   0   0   |    4   0   0
> 256K:                    0   0   0   |   12   0   0
> 512K:                    0   0   0   |   58   2   3
> 1M:                      0   0   0   | 2350  96 100

I would think those two small deviations from perfect would be within
the realm of acceptable, yes?  If you agree then you need to stop
chasing the disk I/Os in flight.  One likely explanation is simply that
the disk(s) is(are) able to drain the I/Os in flight as fast as the
OST(s) is(are) able to push them -- which is good!
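
For reference, the two percentages can be recomputed from the raw write counts in the brw_stats excerpts quoted above. This is only an illustrative sketch: the dictionaries are hand-copied from those tables, and the helper name share is made up. The integer percentages in the tables look truncated rather than rounded, which would explain small differences from the exact values.

```python
# Write-side "ios" counts copied by hand from the quoted brw_stats excerpts.
fragmented_writes = {1: 2289, 2: 69}   # fragments per disk I/O -> count
write_sizes = {"8K": 1, "16K": 1, "32K": 0, "64K": 1,
               "128K": 4, "256K": 12, "512K": 58, "1M": 2350}


def share(part, total):
    """Percentage of part in total, or 0.0 when total is zero."""
    return 100.0 * part / total if total else 0.0


# Disk writes that were split into more than one fragment.
total_frag = sum(fragmented_writes.values())
fragmented_pct = share(
    sum(n for frags, n in fragmented_writes.items() if frags > 1), total_frag)

# Disk writes smaller than a full 1M.
total_sizes = sum(write_sizes.values())
below_1m_pct = share(total_sizes - write_sizes["1M"], total_sizes)

print(f"fragmented writes: {fragmented_pct:.2f}% of {total_frag}")
print(f"writes below 1M:   {below_1m_pct:.2f}% of {total_sizes}")
```

Both values come out at roughly 3%, in line with the point that the deviations are small.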

Cheers,
b.



