[Lustre-discuss] 1.8.1.1 write slow performance :/
Piotr Wadas
pwadas at dtpw.pl
Sun Nov 8 12:52:01 PST 2009
--
Linux aleft 2.6.27.29-0.1_lustre.1.8.1.1-default #1 SMP
drbd 8.3.5-(api:88/proto:86-91)
pacemaker 1.0.6-cebe2b6ff49b36b29a3bd7ada1c4701c7470febe
Lustre 1.8.1.1-20091009080716-PRISTINE-2.6.27.29-0.1_lustre.1.8.1.1-default
Well, I've set up everything using a 64-bit kernel; for now I have
~4 TB of usable space on one Lustre filesystem OST volume.
I just ran some speed tests between a client and the filesystem server
over a dedicated Gigabit Ethernet connection. I compared uploading via
a Lustre-mounted share with uploading to the same share mounted
as a loopback Lustre client on the filesystem server and re-exported via NFS.
The results are quite sad: writes directly to the remote Lustre filesystem
are dreadfully slow, while writes to the same filesystem re-exported via NFS
are at least 10 times faster. The client machine is a Xeon 2.4 with 4 GB RAM,
and the server is a Xeon 3.0 GHz with 8 GB RAM. I reviewed the tuning chapter
of the Lustre manual and tuned rx on the Ethernet interface with ethtool.
The Lustre volumes (MGS, MDT, OST) are set up on UpToDate (synchronized) DRBD
resources (synchronization has already finished, over a dedicated 1 Gbit link,
not the interface used to communicate with Lustre clients).
I'd blame DRBD for this, since some cost is expected with DRBD,
but the NFS-re-exported, locally mounted Lustre volume obviously goes through
the DRBD stack too! The DRBD resource is set up as the backend storage device
for Lustre, so it is not possible to read or write anything on Lustre
without going through the DRBD stack. The machines are otherwise idle.
It seems that, for a client-initiated write, the path
lustre client => lustre server => drbd resource "X"
is dramatically slower than
nfs client => nfs server => loopback lustre server => drbd resource "X".
This is definitely not expected. Example transfer rates are below.
Any ideas? Could it be, for example, that the gigabit switch in the middle
handles NFS and Lustre traffic differently?
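As a sanity check on the "at least 10 times" figure, the rates can be recomputed from the byte counts and elapsed times that dd printed in the runs quoted further down. A minimal sketch; the helper names rate_mbs and speedup are made up for illustration:

```shell
#!/bin/sh
# rate_mbs BYTES SECONDS: throughput in MB/s (1 MB = 10^6 bytes, as dd reports it).
rate_mbs() {
    awk -v b="$1" -v s="$2" 'BEGIN { printf "%.1f\n", b / s / 1000000 }'
}

# speedup SLOW_SECONDS FAST_SECONDS: how many times faster the second run was
# for the same amount of data.
speedup() {
    awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.1f\n", slow / fast }'
}

rate_mbs 104857600 22.3427   # direct Lustre write
rate_mbs 104857600 1.05942   # write through the NFS re-export
speedup 22.3427 1.05942      # ratio of the two write times
```

This prints 4.7, 99.0 and 21.1, so for this 100 MB write the NFS path was about 21 times faster.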
aleft:~# free -m
             total       used       free     shared    buffers     cached
Mem:          7987       3861       4126          0        102       3475
-/+ buffers/cache:        282       7705
Swap:         1906          0       1906
aleft:~# logout
Connection to master closed.
b02:~# free -m
             total       used       free     shared    buffers     cached
Mem:          4054       3908        145          0         43       1813
-/+ buffers/cache:       2051       2002
Swap:         7812          0       7812
b02:~#
b02:~# ssh root@master
[..]
aleft:~# mount -t lustre
/dev/drbd0 on /mnt/mgs type lustre (rw,noauto)
/dev/drbd1 on /mnt/mdt type lustre (rw,noauto,_netdev)
/dev/drbd2 on /mnt/ost01 type lustre (rw,noauto,_netdev)
master@tcp0:/lfs00 on /mnt/lfs00 type lustre (rw,noauto,_netdev)
aleft:~# logout
b02:~# mount -t lustre
master@tcp0:/lfs00 on /mnt/lfs00 type lustre (rw,noauto,_netdev)
b02:~# mount -t nfs |grep master
master:/mnt/lfs00 on /mnt/nfs00 type nfs (rw,addr=192.168.0.100)
b02:~#
Connection to master closed.
b02:~# ./100mb.sh
lfs00-send
time dd if=/dev/zero of=/mnt/lfs00/testfile-b02 bs=1024 count=102400
102400+0 records in
102400+0 records out
104857600 bytes (105 MB) copied, 22.3427 s, 4.7 MB/s
real 0m22.345s
user 0m0.100s
sys 0m3.760s
lfs00-get
time dd of=testfile-b02 if=/mnt/lfs00/testfile-b02 bs=1024 count=102400
102400+0 records in
102400+0 records out
104857600 bytes (105 MB) copied, 0.987265 s, 106 MB/s
real 0m0.989s
user 0m0.040s
sys 0m0.880s
b02:~# ./100mb-nfs.sh
nfs00-send
time dd if=/dev/zero of=/mnt/nfs00/testfile-b02 bs=1024 count=102400
102400+0 records in
102400+0 records out
104857600 bytes (105 MB) copied, 1.05942 s, 99.0 MB/s
real 0m1.061s
user 0m0.028s
sys 0m0.252s
nfs00-get
time dd of=testfile-b02 if=/mnt/nfs00/testfile-b02 bs=1024 count=102400
102400+0 records in
102400+0 records out
104857600 bytes (105 MB) copied, 0.576351 s, 182 MB/s
real 0m0.578s
user 0m0.016s
sys 0m0.556s
b02:~#
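One caveat with these numbers: 1 Gbit/s is at most 125 MB/s on the wire, so the 182 MB/s NFS read cannot have crossed the network and must have come from the client's cache. The 99 MB/s NFS write with 1 KB blocks may likewise partly reflect client-side buffering. A possible follow-up run, sketched here under assumptions, forces the data out before dd reports its time and lets the block size be varied. write_test is a hypothetical helper, not the 100mb.sh script used above (which is not shown), and conv=fsync assumes GNU dd:

```shell
#!/bin/sh
# Hypothetical helper, not the original 100mb.sh.
# write_test DIR BS COUNT: write COUNT blocks of size BS to DIR/testfile.
write_test() {
    dir=$1     # target directory, e.g. /mnt/lfs00 or /mnt/nfs00
    bs=$2      # block size passed to dd, e.g. 1024 or 1M
    count=$3   # number of blocks to write
    # conv=fsync makes GNU dd call fsync() on the output file before it exits,
    # so the reported time includes flushing data buffered on the client.
    dd if=/dev/zero of="$dir/testfile" bs="$bs" count="$count" conv=fsync
}
```

For example, "write_test /mnt/lfs00 1M 100" and "write_test /mnt/nfs00 1M 100" could be compared against the bs=1024 runs above to see how much of the gap depends on block size and caching.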