[Lustre-discuss] bad 1.6.3 striped write performance
Robin Humble
rjh+lustre at cita.utoronto.ca
Mon Nov 26 05:39:58 PST 2007
Hi,
I'm seeing what can only be described as dismal striped write
performance from lustre 1.6.3 clients :-/
1.6.2 and 1.6.1 clients are fine. 1.6.4rc3 clients (built from cvs a
couple of days ago) are also terrible.
the results below show that neither the client OS (centos4.5/5), the
fabric (gigE/IB), nor the lustre version on the servers matters - the
problem is with the 1.6.3 and 1.6.4rc3 client kernels and striped
writes (although un-striped writes are a tad slower too).
with 1M lustre stripes:

  client     client kernel                        dd write speed (MB/s)
  OS                                                a)    b)    c)    d)
  1.6.2:
  centos4.5  2.6.9-55.0.2.EL_lustre.1.6.2smp      202   270   118   117
  centos5    2.6.18-8.1.8.el5_lustre.1.6.2rjh     166   190   117   119
  1.6.3+:
  centos4.5  2.6.9-55.0.9.EL_lustre.1.6.3smp       32     9    30     9
  centos5    2.6.18-53.el5-lustre1.6.4rc3rjh       36    10    27    10
                                                       ^^^^        ^^^^
yes, that is really 9 MB/s. sigh.
with no lustre stripes:

  client     client kernel                        dd write speed (MB/s)
  OS                                                a)    c)
  1.6.2:
  centos4.5  2.6.9-55.0.2.EL_lustre.1.6.2smp      102    98
  centos5    2.6.18-8.1.8.el5_lustre.1.6.2rjh      84    77
  1.6.3+:
  centos4.5  2.6.9-55.0.9.EL_lustre.1.6.3smp       94    95
  centos5    2.6.18-53.el5-lustre1.6.4rc3rjh       73    67
a) servers centos5, 2.6.18-53.el5-lustre1.6.4rc3rjh, md raid5, fabric IB
b) servers centos4.5, 2.6.9-55.0.9.EL_lustre.1.6.3smp, "" , fabric IB
c) servers centos5, 2.6.18-8.1.14.el5_lustre.1.6.3smp, "" , fabric gigE
d) servers centos4.5, 2.6.9-55.0.9.EL_lustre.1.6.3smp, "" , fabric gigE
all runs have the same setup - two OSS's, each with a 16-disk FC md
raid5 OST. clients have 512m of ram, servers 8g, all x86_64. the test is
  dd if=/dev/zero of=/mnt/testfs/blah bs=1M count=5000
each test was run >=2 times. there are no errors from lustre or the kernels.
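for completeness, the whole procedure can be sketched as a script. this
is a sketch only - the mount point and striping match the setup above,
but the `lfs setstripe` option syntax can vary between lustre versions,
and the /dev/null fallback exists only so the dd invocation can be
checked on a box without lustre:

```shell
#!/bin/sh
# sketch of the write test above; assumes a lustre client mount at
# /mnt/testfs with 'lfs' on PATH, falling back to /dev/null elsewhere
MNT=/mnt/testfs
if [ -d "$MNT" ] && command -v lfs >/dev/null 2>&1; then
    OUT=$MNT/blah
    # 1 MB stripes across all OSTs (stripe count -1), matching the
    # default stripe_count/stripe_size shown by lfs getstripe below
    lfs setstripe -s 1048576 -c -1 "$OUT"
else
    OUT=/dev/null   # no lustre here; just exercise the dd command
fi
# the actual test: 5000 sequential 1 MB writes
dd if=/dev/zero of="$OUT" bs=1M count=5000
```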
I can't see anything relevant in bugzilla.
is anyone else seeing this?
seems weird that 1.6.3 has been out there for a while and nobody else
has reported it, but I can't think of any more testing variants to
try...
anyway, some more simple setup info:
% lfs getstripe /mnt/testfs/
OBDS:
0: testfs-OST0000_UUID ACTIVE
1: testfs-OST0001_UUID ACTIVE
/mnt/testfs/
default stripe_count: -1 stripe_size: 1048576 stripe_offset: -1
/mnt/testfs/blah
        obdidx   objid   objid   group
             1       3     0x3       0
             0       2     0x2       0
% lfs df
UUID 1K-blocks Used Available Use% Mounted on
testfs-MDT0000_UUID 1534832 306680 1228152 19% /mnt/testfs[MDT:0]
testfs-OST0000_UUID 15481840 3803284 11678556 24% /mnt/testfs[OST:0]
testfs-OST0001_UUID 15481840 3803284 11678556 24% /mnt/testfs[OST:1]
filesystem summary: 30963680 7606568 23357112 24% /mnt/testfs
cheers,
robin
ps. the 'rjh' series kernels are needed because the stock lustre rhel5
kernels don't have ko2iblnd support in them.