[Lustre-discuss] OST - low MB/s
Rafael David Tinoco
Rafael.Tinoco at Sun.COM
Thu Sep 10 14:33:14 PDT 2009
I think I've found the problem.
I was using multipathd on the devices underlying my "raid" arrays,
and getting around 200 MB/s on RAID6 with 10 disks.
Now, testing without multipath:
root@a02n00:~# mdadm --detail /dev/md20
/dev/md20:
Version : 00.90.03
Creation Time : Thu Sep 10 18:27:28 2009
Raid Level : raid6
Array Size : 7814099968 (7452.11 GiB 8001.64 GB)
Used Dev Size : 976762496 (931.51 GiB 1000.20 GB)
Raid Devices : 10
Total Devices : 10
Preferred Minor : 20
Persistence : Superblock is persistent
Update Time : Thu Sep 10 18:27:28 2009
State : clean
Active Devices : 10
Working Devices : 10
Failed Devices : 0
Spare Devices : 0
Chunk Size : 128K
UUID : 9cf9dd02:d53bc608:62e867a4:1df781ca
Events : 0.1
Number Major Minor RaidDevice State
0 66 144 0 active sync /dev/sdap
1 66 160 1 active sync /dev/sdaq
2 66 176 2 active sync /dev/sdar
3 66 192 3 active sync /dev/sdas
4 66 208 4 active sync /dev/sdat
5 66 224 5 active sync /dev/sdau
6 66 240 6 active sync /dev/sdav
7 67 0 7 active sync /dev/sdaw
8 8 16 8 active sync /dev/sdb
9 8 112 9 active sync /dev/sdh
root@a02n00:~# dd if=/dev/zero of=/dev/md20 bs=1024k count=10000
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 21.0579 seconds, 498 MB/s
root@a02n00:~# dd if=/dev/zero of=/dev/md20 bs=1024k count=99999
99999+0 records in
99999+0 records out
104856551424 bytes (105 GB) copied, 221.137 seconds, 474 MB/s
Much better :D
So basically, Linux + MPT Fusion + multipathd + mdadm is not a good combination for an OST!
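
For anyone checking the numbers in this thread, here is a minimal shell sketch of the arithmetic. The function names full_stripe_kib and per_disk_mbs are made up for illustration and are not part of mdadm or Lustre. The inputs are the 128K chunk size, the 10-disk RAID6 layout (8 data + 2 parity), and the dd results quoted in this thread. It assumes the dd throughput is shared evenly across the data disks and ignores parity writes, so treat the per-disk figures as rough estimates. Integer division rounds down.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration only; not part of mdadm or Lustre.

# full_stripe_kib CHUNK_KIB TOTAL_DISKS PARITY_DISKS
# Full-stripe size in KiB: chunk size times the number of data disks.
full_stripe_kib() {
    local chunk_kib=$1 total=$2 parity=$3
    echo $(( chunk_kib * (total - parity) ))
}

# per_disk_mbs AGGREGATE_MBS TOTAL_DISKS PARITY_DISKS
# Rough per-data-disk share of an aggregate throughput (rounded down).
# Assumption: load is spread evenly and parity traffic is ignored.
per_disk_mbs() {
    local aggregate=$1 total=$2 parity=$3
    echo $(( aggregate / (total - parity) ))
}

full_stripe_kib 128 10 2   # prints 1024 -> dd bs=1024k covers one full stripe
per_disk_mbs 216 10 2      # prints 27   -> with multipathd
per_disk_mbs 498 10 2      # prints 62   -> without multipathd
```

The first result matches the bs=1024k used in the dd runs above, and the last two show the per-disk rate going from below the 50 MB/s single-disk figure to above it.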
-----Original Message-----
From: Hung-Sheng.Tsao at Sun.COM [mailto:Hung-Sheng.Tsao at Sun.COM]
Sent: Thursday, September 10, 2009 6:25 PM
To: Rafael David Tinoco
Subject: Re: [Lustre-discuss] OST - low MB/s
So what is the output if you use bs=128k*8?
Rafael David Tinoco wrote:
> My journal device is:
>
> root@a01n00:~# mdadm --detail /dev/md10
> /dev/md10:
> Version : 00.90.03
> Creation Time : Thu Sep 10 17:49:07 2009
> Raid Level : raid1
> Array Size : 987840 (964.85 MiB 1011.55 MB)
> Used Dev Size : 987840 (964.85 MiB 1011.55 MB)
> Raid Devices : 2
> Total Devices : 2
> Preferred Minor : 10
> Persistence : Superblock is persistent
>
> Update Time : Thu Sep 10 17:49:07 2009
> State : clean
> Active Devices : 2
> Working Devices : 2
> Failed Devices : 0
> Spare Devices : 0
>
> UUID : e48152dd:adb1c505:137aa99c:1b3eece4
> Events : 0.1
>
> Number Major Minor RaidDevice State
> 0 253 17 0 active sync /dev/dm-17
> 1 253 14 1 active sync /dev/dm-14
>
> My OST device is:
>
> root@a01n00:~# mdadm --detail /dev/md20
> /dev/md20:
> Version : 00.90.03
> Creation Time : Thu Sep 10 17:49:23 2009
> Raid Level : raid6
> Array Size : 7814099968 (7452.11 GiB 8001.64 GB)
> Used Dev Size : 976762496 (931.51 GiB 1000.20 GB)
> Raid Devices : 10
> Total Devices : 10
> Preferred Minor : 20
> Persistence : Superblock is persistent
>
> Update Time : Thu Sep 10 18:06:20 2009
> State : clean
> Active Devices : 10
> Working Devices : 10
> Failed Devices : 0
> Spare Devices : 0
>
> Chunk Size : 128K
>
> UUID : b80fb16d:38c47a56:fdf2b5e9:9ff47af3
> Events : 0.2
>
> Number Major Minor RaidDevice State
> 0 253 11 0 active sync /dev/dm-11
> 1 253 12 1 active sync /dev/dm-12
> 2 253 13 2 active sync /dev/dm-13
> 3 253 15 3 active sync /dev/dm-15
> 4 253 16 4 active sync /dev/dm-16
> 5 253 18 5 active sync /dev/dm-18
> 6 253 19 6 active sync /dev/dm-19
> 7 253 20 7 active sync /dev/dm-20
> 8 253 1 8 active sync /dev/dm-1
> 9 253 21 9 active sync /dev/dm-21
>
> -----Original Message-----
> From: Hung-Sheng.Tsao at Sun.COM [mailto:Hung-Sheng.Tsao at Sun.COM]
> Sent: Thursday, September 10, 2009 6:19 PM
> To: Rafael David Tinoco
> Cc: lustre-discuss at lists.lustre.org
> Subject: Re: [Lustre-discuss] OST - low MB/s
>
> I'm not sure I understand your setup.
> Which one is the RAID6 LUN?
> Which are the individual HDs?
>
>
> Rafael David Tinoco wrote:
>
>> 216 MB/s using 8*128k (1024k) as bs. Too low for 8 active disks, right? That is around 27 MB/s per disk, versus 50 MB/s from the "real" disk.
>>
>> -----Original Message-----
>> From: lustre-discuss-bounces at lists.lustre.org [mailto:lustre-discuss-bounces at lists.lustre.org] On Behalf Of Dr. Hung-Sheng Tsao
>> (LaoTsao)
>> Sent: Thursday, September 10, 2009 5:50 PM
>> To: Rafael David Tinoco
>> Cc: lustre-discuss at lists.lustre.org
>> Subject: Re: [Lustre-discuss] OST - low MB/s
>>
>> With RAID6 and chunk size=128k, the full stripe size will be 128k*8 (for 10 disks,
>> 8+2 RAID6).
>> In your dd test you should use bs=128k*8, so that all 8 data HDDs are kept busy.
>> Regards
>>
>>
>> Rafael David Tinoco wrote:
>>
>>
>>> With this RAID6 configuration I'm getting:
>>>
>>> root@a02n00:~# dd if=/dev/zero of=/dev/md20 bs=128k count=10000
>>> 10000+0 records in
>>> 10000+0 records out
>>> 1310720000 bytes (1.3 GB) copied, 5.20774 seconds, 252 MB/s
>>>
>>> root@a02n00:~# dd if=/dev/zero of=/dev/md20 bs=128k count=10000
>>> 10000+0 records in
>>> 10000+0 records out
>>> 1310720000 bytes (1.3 GB) copied, 5.12 seconds, 256 MB/s
>>>
>>> So 80 MB/s when using these md20 devices as OSTs isn't quite right.
>>>
>>> *From:* lustre-discuss-bounces at lists.lustre.org
>>> [mailto:lustre-discuss-bounces at lists.lustre.org] *On Behalf Of *Rafael
>>> David Tinoco
>>> *Sent:* Thursday, September 10, 2009 4:26 PM
>>> *To:* lustre-discuss at lists.lustre.org
>>> *Subject:* [Lustre-discuss] OST - low MB/s
>>>
>>> Hello,
>>>
>>> I'm now having problems with the throughput of my OSTs.
>>>
>>> I have 4 OSSes, each with 2 OSTs. These OSTs are RAID6 with 10 disks and a
>>> chunk size of 128k.
>>>
>>> The disks are in a J4400 (JBOD), connected via multipath using multipathd.
>>>
>>> Each disk gives me 50 MB/s with dd.
>>>
>>> With Lustre, using IOR or dd, I can get only around 80 MB/s. With 8 active
>>> disks in the RAID I was expecting 8*50, i.e. something between 300 and
>>> 400 MB/s.
>>>
>>> avg-cpu:   %user   %nice %system %iowait  %steal   %idle
>>>             0.00    0.00    6.00    9.06    0.00   84.94
>>>
>>> Device: rrqm/s wrqm/s  r/s    w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
>>> md10      0.00   0.00 0.00 398.00  0.00  1.55     8.00     0.00  0.00  0.00  0.00
>>> md11      0.00   0.00 0.00 380.00  0.00  1.48     8.00     0.00  0.00  0.00  0.00
>>> md20      0.00   0.00 0.00 158.00  0.00 79.00  1024.00     0.00  0.00  0.00  0.00
>>> md21      0.00   0.00 0.00 159.00  0.00 79.50  1024.00     0.00  0.00  0.00  0.00
>>>
>>> avg-cpu:   %user   %nice %system %iowait  %steal   %idle
>>>             0.00    0.00    5.94    9.32    0.00   84.74
>>>
>>> Device: rrqm/s wrqm/s  r/s    w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
>>> md10      0.00   0.00 0.00 407.50  0.00  1.59     8.00     0.00  0.00  0.00  0.00
>>> md11      0.00   0.00 0.00 394.00  0.00  1.54     8.00     0.00  0.00  0.00  0.00
>>> md20      0.00   0.00 0.00 159.00  0.00 79.50  1024.00     0.00  0.00  0.00  0.00
>>> md21      0.00   0.00 0.00 158.00  0.00 79.00  1024.00     0.00  0.00  0.00  0.00
>>>
>>> avg-cpu:   %user   %nice %system %iowait  %steal   %idle
>>>             0.00    0.00    6.37    9.43    0.00   84.21
>>>
>>> Device: rrqm/s wrqm/s  r/s    w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
>>> md10      0.00   0.00 0.00 410.50  0.00  1.60     8.00     0.00  0.00  0.00  0.00
>>> md11      0.00   0.00 0.00 376.00  0.00  1.47     8.00     0.00  0.00  0.00  0.00
>>> md20      0.00   0.00 0.00 165.00  0.00 82.50  1024.00     0.00  0.00  0.00  0.00
>>> md21      0.00   0.00 0.00 165.00  0.00 82.50  1024.00     0.00  0.00  0.00  0.00
>>>
>>> Any clues?
>>>
>>> Rafael David Tinoco - Sun Microsystems
>>>
>>> Systems Engineer - High Performance Computing
>>>
>>> Rafael.Tinoco at Sun.COM - 55.11.5187.2194
>>>
>>> ------------------------------------------------------------------------
>>>
>>> _______________________________________________
>>> Lustre-discuss mailing list
>>> Lustre-discuss at lists.lustre.org
>>> http://lists.lustre.org/mailman/listinfo/lustre-discuss
>>>
>>>
>>>
>
>