[lustre-discuss] [External] Re: obdfilter/mdt stats meaning ?

Louis Bailleul Louis.Bailleul at pgs.com
Tue Jul 16 08:49:04 PDT 2019


Hi Aurélien,

Thanks for the prompt reply.
For the OST stats, any idea what preprw and commitrw mean?
And why are there two entries with different values for statfs?

For brw_stats, even with the doc I still struggle to read the output.
For example, how do you make sense of the disk I/Os in flight?
                           read      |     write
disk I/Os in flight    ios   % cum % |  ios         % cum %
1:               211177215  61  61   | 29305564  97  97
2:                41332944  11  72   | 498260   1  99
[..]
Do these lines mean that,
since the last snapshot, there were 211177215x1 and 41332944x2 read I/Os in flight?
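
For what it is worth, the two percentage columns can be reproduced from the raw counts. Below is a minimal Python sketch, assuming that "%" is each bucket's share of the column total and "cum %" is the running share, both truncated to integers. That is an assumption, but it reproduces the figures printed for the write side of the "pages per bulk r/w" histogram in the original message quoted further down. The function name is made up for illustration.

```python
def histogram_percentages(counts):
    """Return (pct, cum_pct) per bucket, truncated to integers."""
    total = sum(counts)
    result = []
    running = 0
    for count in counts:
        running += count
        # Integer division truncates, e.g. 1.67% is reported as 1.
        pct = 100 * count // total if total else 0
        cum_pct = 100 * running // total if total else 0
        result.append((pct, cum_pct))
    return result


# Write column of "pages per bulk r/w", copied from the original message.
buckets = ["1", "2", "4", "8", "16", "32", "64", "128", "256", "512", "1K"]
write_rpcs = [180944, 322359, 5539716, 67837, 114546, 3099780,
              173166, 211512, 499978, 1029686, 18788597]

for bucket, (pct, cum) in zip(buckets, histogram_percentages(write_rpcs)):
    print(f"{bucket:>4}: {pct:3d} {cum:3d}")
```

Read this way, each row is a plain count of I/Os that fell into that bucket, not a count multiplied by the bucket value.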

Best regards,
Louis

On 16/07/2019 15:50, Degremont, Aurelien wrote:
Hi Louis,

About brw_stats, there is a bit of explanation in the Lustre manual (not that detailed, but still):
http://doc.lustre.org/lustre_manual.xhtml#dbdoclet.50438271_55057

> Last thing, is there any way to get the name of the filesystem an OST is part of by using lctl ?

I don't know exactly what you want, but the OST names are self-explanatory. They always look like fsname-OSTXXXX,
where fsname is the Lustre filesystem they are part of.
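
The naming convention can be checked mechanically. Below is a minimal Python sketch, assuming target names of the form fsname-OSTXXXX or fsname-MDTXXXX where XXXX is a four-digit hexadecimal index. The name "testfs" and the function name are made up for illustration, and nothing here calls lctl.

```python
import re

# Assumed pattern: <fsname>-<OST|MDT><4 hex digits>, e.g. "testfs-OST000a".
# The greedy fsname group means the split happens at the last "-OSTxxxx".
TARGET_RE = re.compile(r"^(.+)-(OST|MDT)([0-9a-fA-F]{4})$")


def parse_target_name(name):
    """Split a target name into (fsname, kind, index), or None if it does not match."""
    match = TARGET_RE.match(name)
    if match is None:
        return None
    fsname, kind, index = match.groups()
    return fsname, kind, int(index, 16)


print(parse_target_name("testfs-OST0014"))  # ('testfs', 'OST', 20)
```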

For obdfilter stats, these are mostly actions on OST objects or client connection management RPCs.

    setattr: changing an OST object's attributes (owner, group, ...)
    punch: mostly used for truncate (in theory it can punch holes in files, like a truncate with a start and a length)
    sync: straightforward, sync the OST to disk
    destroy: delete an OST object (mostly when a file is deleted)
    create: create an OST object
    statfs: like 'df' for this specific OST (used by 'lfs df', for example)
    (re)connect: when a client connects or reconnects to this OST
    ping: when a client pings this OST


Aurélien

From: lustre-discuss <lustre-discuss-bounces at lists.lustre.org> on behalf of Louis Bailleul <Louis.Bailleul at pgs.com>
Date: Tuesday, 16 July 2019 at 16:38
To: lustre-discuss <lustre-discuss at lists.lustre.org>
Subject: [lustre-discuss] obdfilter/mdt stats meaning ?

Hi all,

I am trying to make sense of some of the OST/MDT stats for 2.12.
Can anybody point me to the doc that explains what the metrics are?
The wiki only mentions read/write/get_info: http://wiki.lustre.org/Lustre_Monitoring_and_Statistics_Guide
But the list I get is quite different:
    obdfilter.OST001.stats=
    snapshot_time             1563285450.647120173 secs.nsecs
    read_bytes                340177708 samples [bytes] 4096 4194304 396712660910080
    write_bytes               30008856 samples [bytes] 24 4194304 78618271501667
    setattr                   1755 samples [reqs]
    punch                     73463 samples [reqs]
    sync                      50606 samples [reqs]
    destroy                   31990 samples [reqs]
    create                    956 samples [reqs]
    statfs                    75378743 samples [reqs]
    connect                   5798 samples [reqs]
    reconnect                 3242 samples [reqs]
    disconnect                5820 samples [reqs]
    statfs                    3737980 samples [reqs]
    preprw                    370186566 samples [reqs]
    commitrw                  370186557 samples [reqs]
    ping                      882096292 samples [reqs]
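
Output in this shape is easy to turn into a structure for further processing. Below is a minimal Python sketch, assuming each counter line reads as name, sample count, the word "samples", a bracketed unit and, for the byte counters, min, max and sum. The function name is made up for illustration. Because statfs appears twice in the output above, each name maps to a list of entries rather than a single value.

```python
def parse_stats(text):
    """Parse stats text into {name: [entry, ...]}; a name may occur more than once."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        # Counter lines look like: <name> <count> samples [<unit>] [<min> <max> <sum>]
        if len(fields) < 4 or fields[2] != "samples":
            continue  # skips the title line and snapshot_time
        entry = {"samples": int(fields[1]), "unit": fields[3].strip("[]")}
        if len(fields) >= 7:
            entry["min"], entry["max"], entry["sum"] = (int(f) for f in fields[4:7])
        stats.setdefault(fields[0], []).append(entry)
    return stats


# A few lines copied from the output above.
sample = """\
snapshot_time             1563285450.647120173 secs.nsecs
read_bytes                340177708 samples [bytes] 4096 4194304 396712660910080
write_bytes               30008856 samples [bytes] 24 4194304 78618271501667
statfs                    75378743 samples [reqs]
statfs                    3737980 samples [reqs]
"""

stats = parse_stats(sample)
read = stats["read_bytes"][0]
print("mean read size (bytes):", read["sum"] // read["samples"])
print("statfs entries:", [e["samples"] for e in stats["statfs"]])
```
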
For the MDT, most are pretty much self-explanatory, but I'd still be happy to be pointed to some doc.
mdt.MDT0000.md_stats=
snapshot_time             1563287416.006001068 secs.nsecs
open                      3174644054 samples [reqs]
close                     3174494603 samples [reqs]
mknod                     107564 samples [reqs]
unlink                    99625 samples [reqs]
mkdir                     199643 samples [reqs]
rmdir                     45021 samples [reqs]
rename                    12728 samples [reqs]
getattr                   50227431 samples [reqs]
setattr                   103435 samples [reqs]
getxattr                  9051470 samples [reqs]
setxattr                  14 samples [reqs]
statfs                    7525513 samples [reqs]
sync                      20597 samples [reqs]
samedir_rename            207 samples [reqs]
crossdir_rename           12521 samples [reqs]
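
Two quick sanity checks on the counters above, as a Python sketch. The values are copied from the output, and the dictionary is only an illustration. Reading the open/close difference as opens not yet matched by a close is an assumption.

```python
# Values copied from the md_stats output above.
md_stats = {
    "open": 3174644054,
    "close": 3174494603,
    "rename": 12728,
    "samedir_rename": 207,
    "crossdir_rename": 12521,
}

# The two rename sub-counters add up to the total rename counter.
split_total = md_stats["samedir_rename"] + md_stats["crossdir_rename"]
print("rename split matches total:", split_total == md_stats["rename"])  # True

# Difference between opens and closes at the time of the snapshot.
print("open - close:", md_stats["open"] - md_stats["close"])  # 149451
```
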
Does anyone know how to read the OST brw_stats?
obdfilter.OST0014.brw_stats=
snapshot_time:         1563287631.511085465 (secs.nsecs)

                           read      |     write
pages per bulk r/w     rpcs  % cum % |  rpcs        % cum %
1:               231699298  66  66   | 180944   0   0
2:                  855611   0  67   | 322359   1   1
4:                  541749   0  67   | 5539716  18  20
8:                 1281219   0  67   | 67837   0  20
16:                 637808   0  67   | 114546   0  20
32:                1342813   0  68   | 3099780  10  31
64:                1559834   0  68   | 173166   0  31
128:               1583127   0  69   | 211512   0  32
256:              10627583   3  72   | 499978   1  34
512:               3909601   1  73   | 1029686   3  37
1K:               92141161  26 100   | 18788597  62 100

                           read      |     write
discontiguous pages    rpcs  % cum % |  rpcs        % cum %
0:               346179839 100 100   | 180946   0   0
1:                       0   0 100   | 322363   1   1
2:                       0   0 100   | 5521062  18  20
3:                       0   0 100   | 18650   0  20
4:                       0   0 100   | 18159   0  20
5:                       0   0 100   | 26664   0  20
6:                       0   0 100   | 10830   0  20
7:                       0   0 100   | 12189   0  20
8:                       0   0 100   | 11365   0  20
9:                       0   0 100   | 10253   0  20
10:                      0   0 100   | 8810   0  20
11:                      0   0 100   | 9825   0  20
12:                      0   0 100   | 16740   0  20
13:                      0   0 100   | 14421   0  20
14:                      0   0 100   | 10513   0  20
15:                      0   0 100   | 32655   0  20
16:                      0   0 100   | 1418677   4  25
17:                      0   0 100   | 1477077   4  30
18:                      0   0 100   | 6227   0  30
19:                      0   0 100   | 7071   0  30
20:                      0   0 100   | 7297   0  30
21:                      0   0 100   | 8478   0  30
22:                      0   0 100   | 34591   0  30
23:                      0   0 100   | 35591   0  30
24:                      0   0 100   | 8378   0  30
25:                      0   0 100   | 8724   0  30
26:                      0   0 100   | 52300   0  30
27:                      0   0 100   | 14038   0  30
28:                      0   0 100   | 4734   0  30
29:                      0   0 100   | 4878   0  31
30:                      0   0 100   | 6232   0  31
31:                      0   0 100   | 20708383  68 100

                           read      |     write
disk I/Os in flight    ios   % cum % |  ios         % cum %
1:               211177215  61  61   | 29305564  97  97
2:                41332944  11  72   | 498260   1  99
3:                22250410   6  79   | 86831   0  99
4:                15524737   4  83   | 34513   0  99
5:                12049717   3  87   | 19442   0  99
6:                 8904108   2  89   | 13107   0  99
7:                 5955503   1  91   | 8748   0  99
8:                 3943444   1  92   | 6869   0  99
9:                 3115034   0  93   | 5447   0  99
10:                2553941   0  94   | 4593   0  99
11:                2121217   0  95   | 3828   0  99
12:                1709040   0  95   | 3264   0  99
13:                1418541   0  95   | 2800   0  99
14:                1184247   0  96   | 2454   0  99
15:                1047397   0  96   | 2153   0  99
16:                 875229   0  96   | 1871   0  99
17:                 752555   0  97   | 1643   0  99
18:                 656424   0  97   | 1531   0  99
19:                 584066   0  97   | 1375   0  99
20:                 529630   0  97   | 1267   0  99
21:                 477143   0  97   | 1144   0  99
22:                 426303   0  97   | 1067   0  99
23:                 385707   0  97   |  984   0  99
24:                 354584   0  98   |  959   0  99
25:                 328332   0  98   |  899   0  99
26:                 305886   0  98   |  828   0  99
27:                 281444   0  98   |  786   0  99
28:                 261958   0  98   |  734   0  99
29:                 242335   0  98   |  711   0  99
30:                 227010   0  98   |  692   0  99
31:                5203738   1 100   | 13757   0 100

                           read      |     write
I/O time (1/1000s)     ios   % cum % |  ios         % cum %
1:                34363647  26  26   |    0   0   0
2:                 9013233   7  33   |    0   0   0
4:                 3381561   2  36   |    0   0   0
8:                 2194196   1  38   |    0   0   0
16:                8767687   6  45   |    0   0   0
32:               25062401  19  64   |    0   0   0
64:               27196704  21  85   |    0   0   0
128:              10760610   8  94   |    0   0   0
256:               4203334   3  97   |    0   0   0
512:               2002196   1  99   |    0   0   0
1K:                 785539   0  99   |    0   0   0
2K:                 340525   0  99   |    0   0   0
4K:                 140336   0  99   |    0   0   0
8K:                   6875   0  99   |    0   0   0
16K:                   161   0 100   |    0   0   0

                           read      |     write
disk I/O size          ios   % cum % |  ios         % cum %
8:                       4   0   0   |    0   0   0
16:                      0   0   0   |    0   0   0
32:                      1   0   0   |    4   0   0
64:                      1   0   0   | 5703   0   0
128:                  3061   0   0   | 2853   0   0
256:                     1   0   0   | 3340   0   0
512:                     1   0   0   |  309   0   0
1K:                      0   0   0   | 3697   0   0
2K:                      2   0   0   | 38311   0   0
4K:              231696225  66  66   | 126727   0   0
8K:                 855613   0  67   | 322359   1   1
16K:                541749   0  67   | 5539716  18  20
32K:               1281219   0  67   | 67837   0  20
64K:                637808   0  67   | 114546   0  20
128K:              1342813   0  68   | 3099780  10  31
256K:              1559834   0  68   | 173166   0  31
512K:              1583127   0  69   | 211512   0  32
1M:               10627583   3  72   | 499978   1  34
2M:                3909601   1  73   | 1029686   3  37
4M:               92141161  26 100   | 18788597  62 100
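
The histograms above can be split into sections and rows with a small parser. Below is a minimal Python sketch, assuming the layout exactly as pasted in this message: a header line ending in "rpcs" or "ios" plus the percentage columns, then "bucket: read-count % cum% | write-count % cum%" rows. The function names and the 4096-byte page size are assumptions for illustration. The sample reuses the two largest buckets of "pages per bulk r/w" and "disk I/O size".

```python
import re

HEADER_RE = re.compile(r"^(.*?)\s+(?:rpcs|ios)\s+%\s+cum\s+%\s*\|")
ROW_RE = re.compile(r"^(\S+):\s+(\d+)\s+(\d+)\s+(\d+)\s*\|\s*(\d+)\s+(\d+)\s+(\d+)\s*$")
SUFFIXES = {"K": 1024, "M": 1024 ** 2}


def bucket_value(label):
    """Convert a bucket label such as '512' or '4M' to an integer."""
    if label[-1] in SUFFIXES:
        return int(label[:-1]) * SUFFIXES[label[-1]]
    return int(label)


def parse_brw_stats(text):
    """Return {section name: [(bucket, read_count, write_count), ...]}."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = HEADER_RE.match(line)
        if header:
            current = header.group(1)
            sections[current] = []
            continue
        row = ROW_RE.match(line)
        if row and current is not None:
            sections[current].append(
                (bucket_value(row.group(1)), int(row.group(2)), int(row.group(5))))
    return sections


sample = """\
                           read      |     write
pages per bulk r/w     rpcs  % cum % |  rpcs        % cum %
512:               3909601   1  73   | 1029686   3  37
1K:               92141161  26 100   | 18788597  62 100

                           read      |     write
disk I/O size          ios   % cum % |  ios         % cum %
2M:                3909601   1  73   | 1029686   3  37
4M:               92141161  26 100   | 18788597  62 100
"""

stats = parse_brw_stats(sample)
PAGE_SIZE = 4096  # assumed page size in bytes
for (pages, r, w), (size, r2, w2) in zip(stats["pages per bulk r/w"],
                                         stats["disk I/O size"]):
    print(pages * PAGE_SIZE == size, r == r2, w == w2)
```

For these two buckets the page counts multiplied by the assumed page size line up with the I/O size buckets, and the counts are identical. The smaller buckets in the full output differ slightly, so this is not a general rule.
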
Last thing, is there any way to get the name of the filesystem an OST is part of by using lctl ?

Best regards,
Louis

