[Lustre-discuss] 1GB throughput limit on OST (1.8.5)?

David Merhar merhar at arlut.utexas.edu
Thu Jan 27 05:17:29 PST 2011

Our OSSs, each with 2x1GB NICs (bonded), appear limited to 1GB of
write throughput each.

Our setup:
2 OSSs serving 1 OST each
Lustre 1.8.5
RHEL 5.4
New Dell M610 blade servers with plenty of CPU and RAM
All SAN fibre connections are at least 4GB

Some notes:
- A direct write (dd) from a single OSS to the OST gets 4GB, the OSS's  
fibre wire speed.
- A single client will get 2GB of Lustre write speed, the client's
ethernet wire speed.
- We've tried bond modes 6 and 0 on all systems.  With mode 6 we
see both NICs on both OSSs receiving data.
- We've tried multiple OSTs per OSS.

But 2 clients writing a file will get only 2GB of total bandwidth to
the filesystems.  We have been unable to isolate any particular
resource bottleneck.  None of the systems (MDS, OSS, or client) seem
to be working very hard.
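
To make the gap concrete, here is a rough sketch of the arithmetic,
assuming the topology described in the notes above. The function name
and the per-stage lists are only illustrative, and the numbers are in
the same units used throughout this post.

```python
# Rough sketch only: the topology is an assumption pieced together from
# the notes in this post, and expected_ceiling is a hypothetical helper.
# All figures use the same units as the post (1, 2, 4 per link).

def expected_ceiling(client_links, oss_net_links, oss_san_links):
    """Return the aggregate write ceiling: the narrowest of the three stages."""
    return min(sum(client_links), sum(oss_net_links), sum(oss_san_links))

clients = [2, 2]   # two clients, each reaching 2 over ethernet
oss_net = [2, 2]   # two OSSs, each with 2x1 bonded NICs
oss_san = [4, 4]   # each OSS reaches 4 on a direct dd to its OST

ceiling = expected_ceiling(clients, oss_net, oss_san)
observed = 2       # total seen with two clients writing

print("expected ceiling:", ceiling)
print("observed total:  ", observed)
print("observed per OSS:", observed / len(oss_net))
```

On these assumptions the narrowest stage should allow twice what we
observe, and the observed total works out to exactly one NIC's worth
per OSS.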

The 1GB per OSS threshold is so consistent that it almost appears to
be by design, and hopefully we're missing something obvious.

Any advice?



