[Lustre-discuss] 1GB throughput limit on OST (1.8.5)?

David Merhar merhar at arlut.utexas.edu
Thu Jan 27 06:12:53 PST 2011

Sorry - little b all the way around.

We're limited to 1Gb per OST.
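
For anyone following the bit/byte confusion in this thread, here is a minimal sketch of the arithmetic. The helper name is hypothetical, decimal units are assumed, and protocol overhead is ignored, so the results are upper bounds rather than what you would measure.

```python
def gbit_to_mbyte_per_s(gbit_per_s: float) -> float:
    """Convert a line rate in Gb/s to MB/s (decimal units, no protocol overhead)."""
    return gbit_per_s * 1000 / 8


# Raw ceilings for the links discussed in this thread.
one_nic = gbit_to_mbyte_per_s(1)      # a single gigabit NIC
bonded_pair = gbit_to_mbyte_per_s(2)  # two bonded gigabit NICs, ideal case
fc_link = gbit_to_mbyte_per_s(4)      # the 4 Gb/s figure quoted for the FC link

print(one_nic, bonded_pair, fc_link)
```

Real throughput lands somewhat below these numbers once Ethernet/TCP or FC overhead is taken into account.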


On Jan 27, 2011, at 7:48 AM, Balagopal Pillai wrote:

> I guess you have two gigabit NICs bonded in mode 6, and not two 1GB
> NICs? (B = bytes, b = bits.) The max aggregate throughput could be
> about 200MBps out of the 2 bonded NICs. I think mode 0 bonding works
> only with Cisco EtherChannel or something similar on the switch side.
> Same with the FC connection: it's 4Gbps (not 4GBps), or about
> 400-500 MBps max throughput. Maybe you could also check the max read
> and write capabilities of the RAID controller, not just the network.
>
> When testing with dd, some of the data remains as dirty data until
> it's flushed to disk. I think the default background ratio is 10% for
> RHEL 5, which would be sizable if your OSSs have lots of RAM. There is
> a chance of the OSS locking up once it hits the dirty_ratio limit,
> which is 40% by default. So a bit more aggressive flushing to disk, by
> lowering the background_ratio, and a bit more headroom before it hits
> the dirty_ratio are generally desirable if your RAID controller can
> keep up with it.
>
> So with your current setup, I guess you could get a max of 400MBps
> out of both OSSs if they both have two 1Gb NICs in them. Maybe if you
> have one of the switches from Dell that has 4 10Gb ports (their
> PowerConnect 6248), 10Gb NICs for your OSSs might be a cheaper way to
> increase the aggregate performance. I think over 1GBps from a client
> is possible in cases where you use InfiniBand and RDMA to deliver
> data.
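
To get a feel for how much dirty data those ratios allow on an OSS, a rough sketch follows. It assumes a Linux host exposing `/proc/sys/vm/dirty_background_ratio`, `/proc/sys/vm/dirty_ratio` and `/proc/meminfo`. The function names are hypothetical, and it uses total RAM as the base, whereas the kernel applies the ratios to dirtyable (free plus reclaimable) memory, so treat the output as an approximation only.

```python
from pathlib import Path


def dirty_thresholds(mem_bytes, background_ratio, dirty_ratio):
    """Return (background_bytes, hard_limit_bytes) for the given percent ratios."""
    return (mem_bytes * background_ratio // 100, mem_bytes * dirty_ratio // 100)


def read_int(path):
    """Read a single integer from a /proc file."""
    return int(Path(path).read_text().split()[0])


def mem_total_bytes():
    """Return MemTotal from /proc/meminfo in bytes (the file reports kB)."""
    for line in Path("/proc/meminfo").read_text().splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1]) * 1024
    raise RuntimeError("MemTotal not found in /proc/meminfo")


if __name__ == "__main__" and Path("/proc/sys/vm/dirty_ratio").exists():
    background, hard = dirty_thresholds(
        mem_total_bytes(),
        read_int("/proc/sys/vm/dirty_background_ratio"),
        read_int("/proc/sys/vm/dirty_ratio"),
    )
    print(f"background flush starts near {background / 2**20:.0f} MiB of dirty data")
    print(f"writers are throttled near {hard / 2**20:.0f} MiB of dirty data")
```

If the background figure is large compared with what the RAID controller can absorb in a few seconds, lowering `vm.dirty_background_ratio` as suggested above is the knob to experiment with.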
> David Merhar wrote:
>> Our OSS's with 2x1GB NICs (bonded) appear limited to 1GB worth of
>> write throughput each.
>> Our setup:
>> 2 OSS serving 1 OST each
>> Lustre 1.8.5
>> RHEL 5.4
>> New Dell M610's blade servers with plenty of CPU and RAM
>> All SAN fibre connections are at least 4GB
>> Some notes:
>> - A direct write (dd) from a single OSS to the OST gets 4GB, the
>> OSS's fibre wire speed.
>> - A single client will get 2GB of lustre write speed, the client's
>> ethernet wire speed.
>> - We've tried bond mode 6 and 0 on all systems.  With mode 6 we will
>> see both NICs on both OSSs receiving data.
>> - We've tried multiple OSTs per OSS.
>> But 2 clients writing a file will get 2GB of total bandwidth to the
>> filesystems.  We have been unable to isolate any particular resource
>> bottleneck.  None of the systems (MDS, OSS, or client) seem to be
>> working very hard.
>> The 1GB per OSS threshold is so consistent, that it almost appears by
>> design - and hopefully we're missing something obvious.
>> Any advice?
>> Thanks.
>> djm
>> _______________________________________________
>> Lustre-discuss mailing list
>> Lustre-discuss at lists.lustre.org
>> http://lists.lustre.org/mailman/listinfo/lustre-discuss
