[Lustre-discuss] 1GB throughput limit on OST (1.8.5)?
pillai at mathstat.dal.ca
Thu Jan 27 05:48:43 PST 2011
I guess you have two gigabit NICs bonded in mode 6, and not two 1 GB/s
NICs? (B = bytes, b = bits.) The maximum aggregate throughput out of the
two bonded NICs would be about 200 MB/s. I think mode 0 (balance-rr)
bonding only works with Cisco EtherChannel or something similar
configured on the switch side.
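
In case it helps for comparison, here is roughly what a mode 6
(balance-alb) bond looks like on RHEL 5. The interface names and the
address below are placeholders, not your actual values:

    # /etc/modprobe.conf
    alias bond0 bonding
    options bond0 mode=6 miimon=100

    # /etc/sysconfig/network-scripts/ifcfg-bond0
    DEVICE=bond0
    IPADDR=192.168.1.10      # placeholder address
    NETMASK=255.255.255.0
    ONBOOT=yes
    BOOTPROTO=none

    # /etc/sysconfig/network-scripts/ifcfg-eth0 (same pattern for eth1)
    DEVICE=eth0
    MASTER=bond0
    SLAVE=yes
    ONBOOT=yes
    BOOTPROTO=none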
The FC connection has the same caveat: it's 4 Gb/s (not 4 GB/s), or
about 400-500 MB/s of maximum throughput. It may also be worth measuring
the maximum read and write capability of the RAID controller itself,
not just the network.
When testing with dd, some of the data remains in the page cache as
dirty data until it is flushed to disk. I think the default
vm.dirty_background_ratio is 10% on RHEL 5, which is sizable if your
OSSes have lots of RAM. There is also a chance of the OSS locking up
once it hits the vm.dirty_ratio limit, which is 40% by default. So
flushing to disk a bit more aggressively by lowering
dirty_background_ratio, and keeping a bit more headroom below
dirty_ratio, is generally desirable if your RAID controller can keep up
with it (a sketch of both follows).
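
Something like this, where the values and the test path are
illustrative placeholders rather than recommendations:

    # flush dirty pages to disk earlier, and keep headroom below dirty_ratio
    # (illustrative values, tune against what the RAID controller sustains)
    sysctl -w vm.dirty_background_ratio=5
    sysctl -w vm.dirty_ratio=40

    # to persist across reboots, add to /etc/sysctl.conf:
    #   vm.dirty_background_ratio = 5
    #   vm.dirty_ratio = 40

    # when measuring with dd, bypass the page cache so the number
    # reflects the disk rather than RAM (path is a placeholder):
    dd if=/dev/zero of=/mnt/ost0/testfile bs=1M count=4096 oflag=direct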
So with your current setup, I'd guess you could get a maximum of about
400 MB/s combined out of both OSSes if they each have two 1 Gb/s NICs
(2 OSSes x 2 x 1 Gb/s = 4 Gb/s, or 500 MB/s theoretical). If you have
one of the Dell switches with four 10 Gb ports (their PowerConnect
6248), 10 Gb NICs for your OSSes might be a cheaper way to increase
aggregate performance. I think over 1 GB/s from a single client is
possible in cases where you use InfiniBand and RDMA to deliver the data.
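
One more sanity check before buying hardware: confirm which bonding
mode is actually active and that both slaves are up on each OSS (bond0
is whatever your bond device is named):

    cat /proc/net/bonding/bond0
    # expect "Bonding Mode: adaptive load balancing" for mode 6
    # and "MII Status: up" under each slave interface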
David Merhar wrote:
> Our OSS's with 2x1GB NICs (bonded) appear limited to 1GB worth of
> write throughput each.
> Our setup:
> 2 OSS serving 1 OST each
> Lustre 1.8.5
> RHEL 5.4
> New Dell M610 blade servers with plenty of CPU and RAM
> All SAN fibre connections are at least 4GB
> Some notes:
> - A direct write (dd) from a single OSS to the OST gets 4GB, the OSS's
> fibre wire speed.
> - A single client will get 2GB of Lustre write speed, the client's
> ethernet wire speed.
> - We've tried bond mode 6 and 0 on all systems. With mode 6 we will
> see both NICs on both OSSs receiving data.
> - We've tried multiple OSTs per OSS.
> But 2 clients writing a file will get 2GB of total bandwidth to the
> filesystems. We have been unable to isolate any particular resource
> bottleneck. None of the systems (MDS, OSS, or client) seem to be
> working very hard.
> The 1GB per OSS threshold is so consistent that it almost appears to
> be by design - and hopefully we're missing something obvious.
> Any advice?