[Lustre-discuss] Fwd: Lustre performance issue (obdfilter_survey
lior amar
liororama at gmail.com
Wed Jul 6 13:04:21 PDT 2011
Hi,
I am installing a Lustre system and wanted to measure the OSS
performance.
I used obdfilter_survey and got very low performance at low
thread counts when using the case=network option.
System Configuration:
* Lustre 1.8.6-wc (compiled from the whamcloud git)
* CentOS 5.6
* InfiniBand (Mellanox cards), OpenIB stack from CentOS 5.6
* OSS - 2 quad core E5620 CPUS
* OSS - memory 48GB
* LSI 2965 raid card with 18 disks in raid 6 (16 data + 2). Raw
performance is good, both when testing the block device directly and
when testing over a file system with Bonnie++
* OSS uses ext4, and the mkfs parameters were set to reflect the stripe
size (-E stride=...)
The performance tests I ran:
1) obdfilter_survey case=disk -
OSS performance is OK (similar to raw disk performance).
With 1 thread and 1 object I get 966MB/sec.
2) obdfilter_survey case=network -
OSS performance is bad at low thread counts and gets better as
the number of threads increases.
With 1 thread and 1 object I get 88MB/sec.
3) obdfilter_survey case=netdisk -- same as the network case.
4) Running ost-survey, I also get low performance:
Read = 156 MB/sec, Write = ~350 MB/sec
5) Running the LNet self-test, I get much higher numbers.
Numbers obtained with concurrency = 1:
[LNet Rates of servers]
[R] Avg: 3556 RPC/s Min: 3556 RPC/s Max: 3556 RPC/s
[W] Avg: 4742 RPC/s Min: 4742 RPC/s Max: 4742 RPC/s
[LNet Bandwidth of servers]
[R] Avg: 1185.72 MB/s Min: 1185.72 MB/s Max: 1185.72 MB/s
[W] Avg: 1185.72 MB/s Min: 1185.72 MB/s Max: 1185.72 MB/s
Any ideas why a single thread over the network obtains 88MB/sec while the
same test run locally obtains 966MB/sec?
What else should I test/read/try?
10x
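One rough way to frame the gap: if each thread keeps only one 1024K
request in flight (an assumption, not something the survey output
states), the single-thread rate translates directly into a time per
request. The Python sketch below does only that arithmetic. The function
name is made up for illustration, and MB is treated as MiB.

```python
# Back-of-the-envelope: time per request implied by a single-thread rate.
# Assumption: one 1 MiB request in flight per thread (rsz 1024K).
RECORD_MIB = 1.0


def ms_per_request(rate_mb_s: float, record_mib: float = RECORD_MIB) -> float:
    """Milliseconds per request if requests are issued strictly one at a time."""
    return record_mib / rate_mb_s * 1000.0


# Single-thread write rates quoted in this post.
for label, rate in [("case=disk", 966.90), ("case=network", 87.99)]:
    print(f"{label}: {ms_per_request(rate):.2f} ms per 1 MiB request")
```

Under that assumption the network case spends roughly ten times longer
per request than the disk case, which points at per-request latency
rather than at link bandwidth.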
Below are the actual numbers:
===== obdfilter_survey case = disk ======
Wed Jul 6 13:24:57 IDT 2011 Obdfilter-survey for case=disk from oss1
ost 1 sz 16777216K rsz 1024K obj 1 thr 1 write 966.90 [ 644.40,1030.02] rewrite 1286.23 [1300.78,1315.77] read 8474.33 SHORT
ost 1 sz 16777216K rsz 1024K obj 1 thr 2 write 1577.95 [1533.57,1681.43] rewrite 1548.29 [1244.83,1718.42] read 11003.26 SHORT
ost 1 sz 16777216K rsz 1024K obj 1 thr 4 write 1465.68 [1354.73,1600.50] rewrite 1484.98 [1271.54,1584.52] read 16464.13 SHORT
ost 1 sz 16777216K rsz 1024K obj 1 thr 8 write 1267.39 [ 797.25,1476.48] rewrite 1350.28 [1283.80,1387.70] read 15353.69 SHORT
ost 1 sz 16777216K rsz 1024K obj 1 thr 16 write 1295.35 [1266.82,1408.70] rewrite 1332.59 [1315.61,1429.66] read 15001.67 SHORT
ost 1 sz 16777216K rsz 1024K obj 2 thr 2 write 1467.80 [1472.62,1691.42] rewrite 1218.88 [ 821.23,1338.74] read 13538.41 SHORT
ost 1 sz 16777216K rsz 1024K obj 2 thr 4 write 1561.09 [1521.57,1682.75] rewrite 1183.31 [ 959.10,1372.52] read 15955.31 SHORT
ost 1 sz 16777216K rsz 1024K obj 2 thr 8 write 1498.74 [1543.58,1704.41] rewrite 1116.19 [1001.06,1163.91] read 15523.22 SHORT
ost 1 sz 16777216K rsz 1024K obj 2 thr 16 write 1462.54 [ 985.08,1615.48] rewrite 1244.29 [1100.97,1444.80] read 15174.56 SHORT
ost 1 sz 16777216K rsz 1024K obj 4 thr 4 write 1483.42 [1497.88,1648.45] rewrite 1042.92 [ 801.25,1192.69] read 15997.30 SHORT
ost 1 sz 16777216K rsz 1024K obj 4 thr 8 write 1494.63 [1458.85,1624.13] rewrite 1041.81 [ 806.25,1183.89] read 15450.18 SHORT
ost 1 sz 16777216K rsz 1024K obj 4 thr 16 write 1469.96 [1450.65,1647.45] rewrite 1027.06 [ 645.50,1215.86] read 15543.46 SHORT
ost 1 sz 16777216K rsz 1024K obj 8 thr 8 write 1417.93 [1250.85,1520.58] rewrite 1007.45 [ 905.15,1130.82] read 15789.66 SHORT
ost 1 sz 16777216K rsz 1024K obj 8 thr 16 write 1324.28 [ 951.87,1518.26] rewrite 986.48 [ 855.21,1079.99] read 15510.70 SHORT
ost 1 sz 16777216K rsz 1024K obj 16 thr 16 write 1237.22 [ 989.07,1345.17] rewrite 915.56 [ 749.08,1033.03] read 15415.75 SHORT
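For anyone who wants to compare the survey runs side by side, the
result lines can be pulled into rows with a small Python sketch like the
one below. The regular expression is inferred only from the lines pasted
in this post, not from any format specification, and parse_survey is a
hypothetical helper name. It tolerates records that mail clients have
wrapped across several lines.

```python
import re

# Pattern inferred from the pasted output; \s+ also matches wrapped lines.
RECORD = re.compile(
    r"ost\s+(?P<ost>\d+)\s+sz\s+(?P<sz>\d+)K\s+rsz\s+(?P<rsz>\d+)K\s+"
    r"obj\s+(?P<obj>\d+)\s+thr\s+(?P<thr>\d+)\s+"
    r"write\s+(?P<write>[\d.]+)\s+\[[^\]]*\]\s+"
    r"rewrite\s+(?P<rewrite>[\d.]+)\s+\[[^\]]*\]\s+"
    r"read\s+(?P<read>[\d.]+)"
)


def parse_survey(text):
    """Return one dict per result record found in the survey output."""
    rows = []
    for m in RECORD.finditer(text):
        rows.append({
            "obj": int(m["obj"]),
            "thr": int(m["thr"]),
            "write": float(m["write"]),
            "rewrite": float(m["rewrite"]),
            "read": float(m["read"]),
        })
    return rows


# First case=disk record from this post, wrapped as it arrived by mail.
sample = """ost 1 sz 16777216K rsz 1024K obj 1 thr 1 write 966.90
[ 644.40,1030.02] rewrite 1286.23 [1300.78,1315.77] read
8474.33 SHORT"""

for row in parse_survey(sample):
    print(row)
```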
==============================
====== obdfilter_survey case = network ========================
Wed Jul 6 16:29:38 IDT 2011 Obdfilter-survey for case=network from oss6
ost 1 sz 16777216K rsz 1024K obj 1 thr 1 write 87.99 [ 86.92, 88.92] rewrite 87.98 [ 86.83, 88.92] read 88.09 [ 86.92, 88.92]
ost 1 sz 16777216K rsz 1024K obj 1 thr 2 write 175.76 [ 173.84, 176.83] rewrite 175.75 [ 174.84, 176.83] read 172.76 [ 171.67, 174.84]
ost 1 sz 16777216K rsz 1024K obj 1 thr 4 write 343.13 [ 327.69, 347.67] rewrite 344.64 [ 342.34, 347.67] read 331.20 [ 327.69, 337.77]
ost 1 sz 16777216K rsz 1024K obj 1 thr 8 write 638.44 [ 638.10, 653.39] rewrite 639.07 [ 627.75, 654.74] read 605.36 [ 598.84, 626.71]
ost 1 sz 16777216K rsz 1024K obj 1 thr 16 write 1257.67 [1216.88,1424.42] rewrite 1231.61 [1200.67,1316.77] read 1122.70 [1095.04,1187.64]
ost 1 sz 16777216K rsz 1024K obj 2 thr 2 write 175.69 [ 174.49, 176.83] rewrite 175.82 [ 174.79, 176.83] read 172.06 [ 169.67, 173.84]
ost 1 sz 16777216K rsz 1024K obj 2 thr 4 write 345.38 [ 343.68, 348.67] rewrite 344.40 [ 342.66, 348.32] read 331.19 [ 328.62, 337.68]
ost 1 sz 16777216K rsz 1024K obj 2 thr 8 write 638.29 [ 625.16, 676.37] rewrite 632.57 [ 619.43, 672.38] read 604.72 [ 601.69, 625.41]
ost 1 sz 16777216K rsz 1024K obj 2 thr 16 write 1247.19 [1212.38,1377.73] rewrite 1265.31 [1220.56,1396.71] read 1127.87 [1099.97,1187.67]
ost 1 sz 16777216K rsz 1024K obj 4 thr 4 write 343.96 [ 341.68, 347.67] rewrite 337.98 [ 324.70, 348.67] read 332.27 [ 327.69, 337.68]
ost 1 sz 16777216K rsz 1024K obj 4 thr 8 write 637.15 [ 626.89, 673.38] rewrite 636.47 [ 624.42, 675.37] read 605.98 [ 604.43, 620.64]
ost 1 sz 16777216K rsz 1024K obj 4 thr 16 write 1260.31 [1198.30,1419.70] rewrite 1289.95 [1235.05,1486.35] read 1119.08 [1081.16,1159.77]
ost 1 sz 16777216K rsz 1024K obj 8 thr 8 write 636.82 [ 628.41, 678.37] rewrite 634.36 [ 622.41, 671.38] read 607.59 [ 601.23, 627.79]
ost 1 sz 16777216K rsz 1024K obj 8 thr 16 write 1257.81 [1207.65,1405.00] rewrite 1267.45 [1233.43,1372.72] read 1125.58 [1114.65,1163.67]
ost 1 sz 16777216K rsz 1024K obj 16 thr 16 write 1247.34 [1215.70,1418.69] rewrite 1249.45 [1194.92,1372.73] read 1118.77 [1082.07,1171.94]
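Dividing each write rate in the obj 1 rows by its thread count makes the
scaling pattern in the network case explicit. The Python sketch below
uses the five write figures copied from the output above; per_thread is
a hypothetical helper name.

```python
# Write rates (MB/s) from the case=network run, obj 1 rows, keyed by thread count.
network_write = {1: 87.99, 2: 175.76, 4: 343.13, 8: 638.44, 16: 1257.67}


def per_thread(rates):
    """Map thread count -> aggregate rate divided by thread count."""
    return {thr: rate / thr for thr, rate in rates.items()}


for thr, mb_s in per_thread(network_write).items():
    print(f"thr {thr:2d}: {mb_s:6.2f} MB/s per thread")
```

Per-thread throughput stays between about 79 and 88 MB/s at every thread
count. That is consistent with a fixed per-request cost on the network
path, although it does not prove it.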
============================
======= OST Survey ==========
ost-survey -s 10000
Worst Read OST indx: 0 speed: 156.223264
Best Read OST indx: 4 speed: 172.706590
Read Average: 163.681117 +/- 5.299526 MB/s
Worst Write OST indx: 4 speed: 307.893338
Best Write OST indx: 2 speed: 370.923486
Write Average: 346.664793 +/- 20.849197 MB/s
Ost#  Read(MB/s)  Write(MB/s)  Read-time  Write-time
----------------------------------------------------
0     156.223     354.215      64.011     28.231
1     164.394     349.652      60.830     28.600
2     162.195     370.923      61.654     26.960
3     162.887     350.640      61.392     28.519
4     172.707     307.893      57.902     32.479
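The averages and the +/- figures in the summary can be reproduced from
the per-OST rows. The Python sketch below assumes the +/- value is a
population standard deviation. That is inferred from the numbers
matching, not taken from the ost-survey source.

```python
from statistics import mean, pstdev

# Per-OST rates (MB/s) copied from the table above, OST index 0..4.
read_mb_s = [156.223, 164.394, 162.195, 162.887, 172.707]
write_mb_s = [354.215, 349.652, 370.923, 350.640, 307.893]

for name, values in (("Read", read_mb_s), ("Write", write_mb_s)):
    # pstdev divides by N (population), not N-1 (sample).
    print(f"{name} average: {mean(values):.3f} +/- {pstdev(values):.3f} MB/s")
```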
10x
--lior
--
----------------------oo--o(:-:)o--oo----------------
Lior Amar, Ph.D.
Cluster Logic Ltd --> The Art of HPC
www.clusterlogic.net
----------------------------------------------------------