<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html;charset=ISO-8859-1" http-equiv="Content-Type">
<title></title>
</head>
<body bgcolor="#ffffff" text="#000000">
You might also want to look at ticket 20533 to see if it is
related. There are kernel patches for RHEL 5.3 that improve
Lustre performance on RAID, but there are no plans to provide them for the CentOS kernel.<br>
<br>
<br>
Rafael David Tinoco wrote:
<blockquote cite="mid:011601ca325e$4f163310$ed429930$%25Tinoco@Sun.COM"
type="cite">
<pre wrap="">I think Ive discovered the problem.
I was using multipathd in my "raid" devices.
Getting arround 200MB/s in raid6 with 10 disks.
Now.. testing without the multipaths:
root@a02n00:~# mdadm --detail /dev/md20
/dev/md20:
        Version : 00.90.03
  Creation Time : Thu Sep 10 18:27:28 2009
     Raid Level : raid6
     Array Size : 7814099968 (7452.11 GiB 8001.64 GB)
  Used Dev Size : 976762496 (931.51 GiB 1000.20 GB)
   Raid Devices : 10
  Total Devices : 10
Preferred Minor : 20
    Persistence : Superblock is persistent
    Update Time : Thu Sep 10 18:27:28 2009
          State : clean
 Active Devices : 10
Working Devices : 10
 Failed Devices : 0
  Spare Devices : 0
     Chunk Size : 128K
           UUID : 9cf9dd02:d53bc608:62e867a4:1df781ca
         Events : 0.1

    Number   Major   Minor   RaidDevice State
       0      66     144         0      active sync   /dev/sdap
       1      66     160         1      active sync   /dev/sdaq
       2      66     176         2      active sync   /dev/sdar
       3      66     192         3      active sync   /dev/sdas
       4      66     208         4      active sync   /dev/sdat
       5      66     224         5      active sync   /dev/sdau
       6      66     240         6      active sync   /dev/sdav
       7      67       0         7      active sync   /dev/sdaw
       8       8      16         8      active sync   /dev/sdb
       9       8     112         9      active sync   /dev/sdh
root@a02n00:~# dd if=/dev/zero of=/dev/md20 bs=1024k count=10000
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 21.0579 seconds, 498 MB/s
root@a02n00:~# dd if=/dev/zero of=/dev/md20 bs=1024k count=99999
99999+0 records in
99999+0 records out
104856551424 bytes (105 GB) copied, 221.137 seconds, 474 MB/s
Much better :D
So basically Linux + MPT Fusion + multipathd + mdadm is not such a good option for an OST!
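In case anyone wants to reproduce the comparison, a minimal sketch (the device
names are only examples taken from the listings in this thread; the pairing of
dm-11 with a specific /dev/sdXX path is an assumption, not verified, and the
count values are arbitrary):

# Read through the dm-multipath device and through a raw path, bypassing
# the page cache so the two numbers are comparable:
dd if=/dev/dm-11 of=/dev/null bs=1024k count=2000 iflag=direct
dd if=/dev/sdap  of=/dev/null bs=1024k count=2000 iflag=direct

# Then rebuild md20 on the raw /dev/sdXX paths (as above) and rerun the
# sequential write test:
dd if=/dev/zero of=/dev/md20 bs=1024k count=10000 oflag=direct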
-----Original Message-----
From: <a class="moz-txt-link-abbreviated" href="mailto:Hung-Sheng.Tsao@Sun.COM">Hung-Sheng.Tsao@Sun.COM</a> [<a class="moz-txt-link-freetext" href="mailto:Hung-Sheng.Tsao@Sun.COM">mailto:Hung-Sheng.Tsao@Sun.COM</a>]
Sent: Thursday, September 10, 2009 6:25 PM
To: Rafael David Tinoco
Subject: Re: [Lustre-discuss] OST - low MB/s
So what is the output if you use bs=128k*8?
Rafael David Tinoco wrote:
</pre>
<blockquote type="cite">
<pre wrap="">My journal device is:
root@a01n00:~# mdadm --detail /dev/md10
/dev/md10:
        Version : 00.90.03
  Creation Time : Thu Sep 10 17:49:07 2009
     Raid Level : raid1
     Array Size : 987840 (964.85 MiB 1011.55 MB)
  Used Dev Size : 987840 (964.85 MiB 1011.55 MB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 10
    Persistence : Superblock is persistent
    Update Time : Thu Sep 10 17:49:07 2009
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0
           UUID : e48152dd:adb1c505:137aa99c:1b3eece4
         Events : 0.1

    Number   Major   Minor   RaidDevice State
       0     253      17         0      active sync   /dev/dm-17
       1     253      14         1      active sync   /dev/dm-14
My OST device is:
root@a01n00:~# mdadm --detail /dev/md20
/dev/md20:
        Version : 00.90.03
  Creation Time : Thu Sep 10 17:49:23 2009
     Raid Level : raid6
     Array Size : 7814099968 (7452.11 GiB 8001.64 GB)
  Used Dev Size : 976762496 (931.51 GiB 1000.20 GB)
   Raid Devices : 10
  Total Devices : 10
Preferred Minor : 20
    Persistence : Superblock is persistent
    Update Time : Thu Sep 10 18:06:20 2009
          State : clean
 Active Devices : 10
Working Devices : 10
 Failed Devices : 0
  Spare Devices : 0
     Chunk Size : 128K
           UUID : b80fb16d:38c47a56:fdf2b5e9:9ff47af3
         Events : 0.2

    Number   Major   Minor   RaidDevice State
       0     253      11         0      active sync   /dev/dm-11
       1     253      12         1      active sync   /dev/dm-12
       2     253      13         2      active sync   /dev/dm-13
       3     253      15         3      active sync   /dev/dm-15
       4     253      16         4      active sync   /dev/dm-16
       5     253      18         5      active sync   /dev/dm-18
       6     253      19         6      active sync   /dev/dm-19
       7     253      20         7      active sync   /dev/dm-20
       8     253       1         8      active sync   /dev/dm-1
       9     253      21         9      active sync   /dev/dm-21
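To see which physical /dev/sdXX paths sit behind each dm-* member, a quick
sketch (commands only; dm-11 is just one member picked as an example):

# List every multipath map and the /dev/sdXX paths grouped under it:
multipath -ll

# Or inspect a single array member through sysfs:
ls -l /sys/block/dm-11/slaves/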
-----Original Message-----
From: <a class="moz-txt-link-abbreviated" href="mailto:Hung-Sheng.Tsao@Sun.COM">Hung-Sheng.Tsao@Sun.COM</a> [<a class="moz-txt-link-freetext" href="mailto:Hung-Sheng.Tsao@Sun.COM">mailto:Hung-Sheng.Tsao@Sun.COM</a>]
Sent: Thursday, September 10, 2009 6:19 PM
To: Rafael David Tinoco
Cc: <a class="moz-txt-link-abbreviated" href="mailto:lustre-discuss@lists.lustre.org">lustre-discuss@lists.lustre.org</a>
Subject: Re: [Lustre-discuss] OST - low MB/s
Not sure I understand your setup.
Which one is the RAID6 LUN?
Which are the individual HDDs?
Rafael David Tinoco wrote:
</pre>
<blockquote type="cite">
<pre wrap="">216MB/s using 8*128 (1024k) as bs. Too low for 8 active disks.. right ? Arround 27MB/s.. from 50MB/s in the "real" disk.
-----Original Message-----
From: <a class="moz-txt-link-abbreviated" href="mailto:lustre-discuss-bounces@lists.lustre.org">lustre-discuss-bounces@lists.lustre.org</a> [<a class="moz-txt-link-freetext" href="mailto:lustre-discuss-bounces@lists.lustre.org">mailto:lustre-discuss-bounces@lists.lustre.org</a>] On Behalf Of Dr. Hung-Sheng Tsao
(LaoTsao)
Sent: Thursday, September 10, 2009 5:50 PM
To: Rafael David Tinoco
Cc: <a class="moz-txt-link-abbreviated" href="mailto:lustre-discuss@lists.lustre.org">lustre-discuss@lists.lustre.org</a>
Subject: Re: [Lustre-discuss] OST - low MB/s
With a RAID6 chunk size of 128k, the full stripe size will be 128k*8 (for 10 disks
in an 8+2 RAID6).
In your dd test you should use bs=128k*8, so that all 8 data HDDs are kept busy.
Regards
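For concreteness, a minimal sketch of that test (bs = 128k chunk * 8 data
disks = 1024k; the count value is just an example, and oflag=direct is
optional but keeps the page cache out of the measurement):

# Sequential write sized to one full RAID6 stripe per request:
dd if=/dev/zero of=/dev/md20 bs=1024k count=10000 oflag=direct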
Rafael David Tinoco wrote:
</pre>
<blockquote type="cite">
<pre wrap="">With this RAID5 configuration Im getting:
root@a02n00:~# dd if=/dev/zero of=/dev/md20 bs=128k count=10000
10000+0 records in
10000+0 records out
1310720000 bytes (1.3 GB) copied, 5.20774 seconds, 252 MB/s
root@a02n00:~# dd if=/dev/zero of=/dev/md20 bs=128k count=10000
10000+0 records in
10000+0 records out
1310720000 bytes (1.3 GB) copied, 5.12 seconds, 256 MB/s
So, 80 MB/s using these md20 devices as OSTs isn't quite right.
*From:* <a class="moz-txt-link-abbreviated" href="mailto:lustre-discuss-bounces@lists.lustre.org">lustre-discuss-bounces@lists.lustre.org</a>
[<a class="moz-txt-link-freetext" href="mailto:lustre-discuss-bounces@lists.lustre.org">mailto:lustre-discuss-bounces@lists.lustre.org</a>] *On Behalf Of *Rafael
David Tinoco
*Sent:* Thursday, September 10, 2009 4:26 PM
*To:* <a class="moz-txt-link-abbreviated" href="mailto:lustre-discuss@lists.lustre.org">lustre-discuss@lists.lustre.org</a>
*Subject:* [Lustre-discuss] OST - low MB/s
Hello,
I'm having problems now with the throughput of my OSTs.
I have 4 OSSes, each with 2 OSTs. These OSTs are RAID6 arrays of 10 disks
with a chunk size of 128k.
The disks come from a J4400 (JBOD) connected via multipath using multipathd.
Each individual disk gives me 50 MB/s with dd.
With Lustre, using IOR or dd, I can only get around 80 MB/s. With 8 active
data disks in the RAID I was expecting 8*50 = something between 300 and
400 MB/s.
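For reference, a sketch of how an array like the one described above would be
built (the member names /dev/sd[a-j] are placeholders, not the actual disks;
--chunk is given in KiB):

# 10-disk RAID6 (8 data + 2 parity) with a 128k chunk:
mdadm --create /dev/md20 --level=6 --raid-devices=10 --chunk=128 /dev/sd[a-j]

# Per-disk baseline, to confirm the ~50 MB/s figure before layering Lustre on top:
for d in /dev/sd[a-j]; do dd if=$d of=/dev/null bs=1024k count=1000 iflag=direct; done

The iostat numbers while Lustre is writing follow below.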
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    6.00    9.06    0.00   84.94

Device: rrqm/s wrqm/s   r/s    w/s  rMB/s  wMB/s avgrq-sz avgqu-sz await svctm %util
md10      0.00   0.00  0.00 398.00   0.00   1.55     8.00     0.00  0.00  0.00  0.00
md11      0.00   0.00  0.00 380.00   0.00   1.48     8.00     0.00  0.00  0.00  0.00
md20      0.00   0.00  0.00 158.00   0.00  79.00  1024.00     0.00  0.00  0.00  0.00
md21      0.00   0.00  0.00 159.00   0.00  79.50  1024.00     0.00  0.00  0.00  0.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    5.94    9.32    0.00   84.74

Device: rrqm/s wrqm/s   r/s    w/s  rMB/s  wMB/s avgrq-sz avgqu-sz await svctm %util
md10      0.00   0.00  0.00 407.50   0.00   1.59     8.00     0.00  0.00  0.00  0.00
md11      0.00   0.00  0.00 394.00   0.00   1.54     8.00     0.00  0.00  0.00  0.00
md20      0.00   0.00  0.00 159.00   0.00  79.50  1024.00     0.00  0.00  0.00  0.00
md21      0.00   0.00  0.00 158.00   0.00  79.00  1024.00     0.00  0.00  0.00  0.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    6.37    9.43    0.00   84.21

Device: rrqm/s wrqm/s   r/s    w/s  rMB/s  wMB/s avgrq-sz avgqu-sz await svctm %util
md10      0.00   0.00  0.00 410.50   0.00   1.60     8.00     0.00  0.00  0.00  0.00
md11      0.00   0.00  0.00 376.00   0.00   1.47     8.00     0.00  0.00  0.00  0.00
md20      0.00   0.00  0.00 165.00   0.00  82.50  1024.00     0.00  0.00  0.00  0.00
md21      0.00   0.00  0.00 165.00   0.00  82.50  1024.00     0.00  0.00  0.00  0.00
Any clues?
Rafael David Tinoco - Sun Microsystems
Systems Engineer - High Performance Computing
<a class="moz-txt-link-abbreviated" href="mailto:Rafael.Tinoco@Sun.COM">Rafael.Tinoco@Sun.COM</a> - 55.11.5187.2194
------------------------------------------------------------------------
_______________________________________________
Lustre-discuss mailing list
<a class="moz-txt-link-abbreviated" href="mailto:Lustre-discuss@lists.lustre.org">Lustre-discuss@lists.lustre.org</a>
<a class="moz-txt-link-freetext" href="http://lists.lustre.org/mailman/listinfo/lustre-discuss">http://lists.lustre.org/mailman/listinfo/lustre-discuss</a>
</pre>
</blockquote>
<pre wrap="">_______________________________________________
Lustre-discuss mailing list
<a class="moz-txt-link-abbreviated" href="mailto:Lustre-discuss@lists.lustre.org">Lustre-discuss@lists.lustre.org</a>
<a class="moz-txt-link-freetext" href="http://lists.lustre.org/mailman/listinfo/lustre-discuss">http://lists.lustre.org/mailman/listinfo/lustre-discuss</a>
</pre>
</blockquote>
<pre wrap="">
</pre>
</blockquote>
<pre wrap=""><!---->
_______________________________________________
Lustre-discuss mailing list
<a class="moz-txt-link-abbreviated" href="mailto:Lustre-discuss@lists.lustre.org">Lustre-discuss@lists.lustre.org</a>
<a class="moz-txt-link-freetext" href="http://lists.lustre.org/mailman/listinfo/lustre-discuss">http://lists.lustre.org/mailman/listinfo/lustre-discuss</a>
</pre>
</blockquote>
<br>
</body>
</html>