[Lustre-discuss] iozone slow read for 64k record size 2.4 vs. 1.8.9
JS Landry
jean-sebastien.landry at calculquebec.ca
Tue Sep 3 12:50:12 PDT 2013
Hi, thanks for the patch, the error is still present in collectl 3.6.7.
JS
On 30/08/13 02:59 AM, Grégoire Pichon wrote:
> Hi,
>
> I found a small bug in collectl (I am using version V3.6.3-2) when it harvests read bandwidth.
> Maybe you are facing the same issue.
>
> Here is the patch
> $ diff -Nraup ~/bin/collectl/collectl.orig ~/bin/collectl/collectl
> --- /home_nfs/pichong/bin/collectl/collectl.orig 2012-10-23 17:53:22.000000000 +0200
> +++ /home_nfs/pichong/bin/collectl/collectl 2012-10-23 17:53:27.000000000 +0200
> @@ -3754,7 +3754,7 @@ sub getProc
> elsif ($type==11)
> {
> if ($line=~/^dirty/) { record(2, "$tag $line"); next; }
> - if ($line=~/^read/) { record(2, "$tag $line"); next; }
> + if ($line=~/^read_/) { record(2, "$tag $line"); next; }
> if ($line=~/^write_/) { record(2, "$tag $line"); next; }
> if ($line=~/^open/) { record(2, "$tag $line"); next; }
> if ($line=~/^close/) { record(2, "$tag $line"); next; }
>
> The stats file (/proc/fs/lustre/llite/fs-*/stats) contains two lines starting with the string "read".
> # grep "^read" /proc/fs/lustre/llite/fs1-ffff880472973400/stats
> read_bytes 3402752 samples [bytes] 1048576 16777216 4037269258240
> readdir 240 samples [regs]
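[Editorial note: a minimal illustration, in Python rather than collectl's Perl, of why the one-character patch above matters. The unanchored-suffix pattern `^read` matches both of the stats lines quoted above, so the `readdir` counter gets recorded as read traffic; the patched `^read_` matches only `read_bytes`. The sample lines are copied verbatim from the grep output.]

```python
import re

# The two lines from /proc/fs/lustre/llite/<fs>/stats quoted above.
lines = [
    "read_bytes 3402752 samples [bytes] 1048576 16777216 4037269258240",
    "readdir 240 samples [regs]",
]

# Original collectl pattern: matches BOTH lines, corrupting read stats.
loose = re.compile(r"^read")
# Patched pattern: the trailing underscore excludes "readdir".
strict = re.compile(r"^read_")

assert [bool(loose.match(l)) for l in lines] == [True, True]
assert [bool(strict.match(l)) for l in lines] == [True, False]
```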
>
>
> Regards,
> Grégoire.
> --
> Grégoire PICHON
> Parallel File Systems Engineer
> Bull Extreme Computing R&D
>
>
> -----Original Message-----
> From: lustre-discuss-bounces at lists.lustre.org [mailto:lustre-discuss-bounces at lists.lustre.org] On behalf of JS Landry
> Sent: Friday, August 30, 2013 02:19
> To: lustre-discuss at lists.lustre.org
> Subject: Re: [Lustre-discuss] iozone slow read for 64k record size 2.4 vs. 1.8.9
>
>
> On 29/08/13 07:03 PM, JS Landry wrote:
>> Hi, I'm testing Lustre 2.4 with iozone and I can't figure out why the read at
>> 64k record size (1 GB file) is so slow compared to the 1.8.9 client.
>>
>> 1.8.9 client
>> KB reclen write rewrite read reread
>> 1048576 64 677521 794456 6130161 6204552
>> 1048576 1024 709112 862278 7165733 7152088
>>
>> 2.4.0 client
>> KB reclen write rewrite read reread
>> 1048576 64 682344 897808 2334044 2331080
>> 1048576 1024 868466 1217273 4599784 4610098
>>
>
> I ran collectl -scml while running iozone, and I don't know what is
> going on with the Lustre "KBRead/Reads" columns
> stuck at 4G on the 2.4.0 client. (The KBRead/Reads columns return to 0
> when I unmount Lustre.)
> collectl works fine on 1.8.9 (same OS, same hardware).
>
>
> 2.6.32-358.6.2.el6.x86_64
> collectl-3.6.7-1.el6.noarch
> lustre: 2.4.0
> kernel: patchless_client
> build: 2.4.0-RC2-gd3f91c4-PRISTINE-2.6.32-358.6.2.el6.x86_64
>
>
>
> #<--------CPU--------><-----------Memory-----------><--------Lustre Client-------->
> #cpu sys inter ctxsw Free Buff Cach Inac Slab Map KBRead Reads KBWrite Writes
>    6   6  2687  6132  19G    0   2G 171M 153M 322M  4096M    4G  175232   2738
>   21  21  5939 16537  19G    0   3G 812M 329M 322M  4096M    4G  656704  10261
>    6   6  2380  5751  19G    0   3G   1G 386M 322M  4096M    4G  216640   3385
>    3   3  1329  2773  19G    0   3G 915M 384M 322M  4096M    4G  111168   1737
>   24  24  6678 21347  19G    0   3G  74M 389M 322M  4096M    4G  861376  13459
>    2   2  1107  2082  19G    0   3G 272K 390M 322M  4096M    4G   76032   1188
>    0   0   500   141  19G    0   3G 272K 390M 322M  4096M    4G       0      0
>    6   6  1253   161  19G    0   3G 272K 390M 322M  4096M    4G       0      0
>    5   5  1132   139  19G    0   3G 272K 390M 322M  4096M    4G       0      0
>    5   5   987   360  19G    0   2G 272K 206M 322M  4096M    4G       0      0
>    2   2  1031   353  19G    0   2G 272K 115M 322M  4096M    4G       0      0
>   12  12  3490 10766  19G    0   2G 417M 220M 322M  4096M    4G  427008    417
>   16  16  4741 15475  19G    0   3G   1G 383M 322M  4096M    4G  621568    607
>    0   0   537   155  19G    0   3G   1G 384M 322M  4096M    4G       0      0
>   24  24  7451 24128  19G    0   3G 440K 386M 322M  4096M    4G 1048576   1024
>    0   0   640   176  19G    0   3G 272K 387M 322M  4096M    4G       0      0
>    2   2   752   133  19G    0   3G 272K 387M 322M  4096M    4G       0      0
>
> on the 1.8.9 client
>
> 2.6.32-358.6.2.el6.x86_64
> collectl-3.6.7-1.el6.noarch
> lustre: 1.8.9
> kernel: patchless_client
> build: jenkins-wc1--PRISTINE-2.6.32-358.6.2.el6.x86_64
>
>
> #<--------CPU--------><-----------Memory-----------><--------Lustre Client-------->
> #cpu sys inter ctxsw Free Buff Cach Inac Slab Map KBRead Reads KBWrite Writes
>    3   3  1585  4267  19G    0   2G 824M 318M 231M       0     0  205824    201
>   14  14  5653 17881  19G    0   2G   1M 323M 231M       0     0  842752    823
>    0   0   745   181  19G    0   2G   1M 324M 231M       0     0       0      0
>    2   2   645   112  19G    0   2G   1M 323M 231M 1048580  1025       0      0
>    2   2   724   124  19G    0   2G   1M 323M 231M 1048580  1025       0      0
>   12  11  2044   859  20G    0   1G   1M 223M 233M       1     3       0      0
>   14  14  4807 13121  20G    0   2G 644M 282M 230M       0     0  658112  10283
>    8   8  3245  8196  19G    0   2G   1G 320M 230M       0     0  390464   6101
>    0   0   570   164  19G    0   2G   1G 320M 230M       0     0       0      0
>   11  11  4061 12627  19G    0   2G 421M 317M 230M       0     0  618496   9664
>    8   8  2953  9239  19G    0   2G   1M 321M 230M       0     0  430080   6720
>    2   2   699   127  19G    0   2G   1M 321M 230M 1048580   16K       0      0
>    0   0   576   148  19G    0   2G   1M 321M 230M       0     0       0      0
>    3   3   783   210  19G    0   2G   1M 282M 230M 1048580   16K       0      0
>    2   2   725   597  20G    0   1G  21M 219M 231M       0     0   20480     20
>   14  14  5253 14347  20G    0   2G 714M 287M 231M       0     0  709632    693
>
> Is this a known bug?
> JS
>
> _______________________________________________
> Lustre-discuss mailing list
> Lustre-discuss at lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-discuss
--
Jean-Sébastien Landry
Calcul Québec, Université Laval
Jean-Sebastien.Landry at calculquebec.ca