[Lustre-discuss] How to achieve 20GB/s file system throughput?

Sat Jul 24 06:08:32 PDT 2010

Hate to reply to myself ... not an advertisement

On 07/23/2010 10:50 PM, Joe Landman wrote:
> On 07/23/2010 10:25 PM, Henry_Xu at Dell.com wrote:

[...]

> It is possible to achieve 20GB/s, and quite a bit more, using Lustre.
> As to whether or not that 20GB/s is meaningful to their code(s), that's a
> different question.  It would be 20GB/s in aggregate, over possibly many
> compute nodes doing IO.

I should point out that we have customers with 20GB/s maximum
theoretical configs (best-case scenarios) on our siCluster
(http://scalableinformatics.com/sicluster), with 8 IO units.  Their
write patterns and InfiniBand configurations don't seem to allow
achieving this in practice.  Simple benchmark tests (mixtures of LLNL
mpi-io, io-bm, iozone, ...) show sustained results north of 12 GB/s for
them.

Again, to set expectations: most users' codes never use storage
systems very effectively.  You might design a 20GB/s storage
system, and the IO actually being done might not hit much above
500 MB/s for a single thread.

>> My assumption is 100 or more IO nodes(rack servers) are needed.
> Hmmm ... If you can achieve 500+ MB/s per OST, then you would need about
> 40 OSTs.  You can have each OSS handle several OSTs.  There are
> efficiency losses you should be aware of, but 20GB/s using some
> mechanism to measure this, should be possible with a realistic number of
> units.  Don't forget to count efficiency losses in the design.

We do this in 8 machines (theoretical max performance), and could put 
this in a single rack.  We prefer to break it out among more IO nodes, 
say 16-24 smaller nodes, with 2-3 OSTs per OSS (i.e. IO node).
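
To make the sizing arithmetic concrete, here is a minimal Python sketch.
The function name and the efficiency knob are hypothetical, purely for
illustration.  The 0.6 value is just the ratio of the 12 GB/s sustained
to the 20 GB/s theoretical figures mentioned above, not a general Lustre
constant, and 1 GB/s is taken as 1000 MB/s.

```python
import math

def size_storage(target_mb_s, per_ost_mb_s, efficiency=1.0, osts_per_oss=2):
    """Return (osts, oss_nodes) needed for a target aggregate throughput.

    efficiency is an assumed fraction of per-OST throughput that survives
    real-world losses (1.0 means no losses at all).
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    effective_mb_s = per_ost_mb_s * efficiency
    osts = math.ceil(target_mb_s / effective_mb_s)
    oss_nodes = math.ceil(osts / osts_per_oss)
    return osts, oss_nodes

# Best case: 20 GB/s at 500 MB/s per OST, 2 OSTs per OSS.
ideal = size_storage(20_000, 500)

# Derated case: assume only 60% of per-OST throughput, 3 OSTs per OSS.
derated = size_storage(20_000, 500, efficiency=0.6, osts_per_oss=3)

print("ideal   (OSTs, OSSes):", ideal)
print("derated (OSTs, OSSes):", derated)
```

The best case reproduces the "about 40 OSTs" estimate quoted above.
Once you derate for efficiency the count grows noticeably, which is why
the losses need to be in the design from the start.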

My comments are to make sure your customer understands the efficiency
issues, and that simple Fortran writes from a single thread aren't going
to be done at 20GB/s.  That is, not unlike a compute cluster, a storage
cluster has an aggregate bandwidth that a single node or reader/writer
cannot achieve on its own.
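
A deliberately simplistic Python sketch of the aggregate-versus-single-
writer point.  It assumes every writer sustains the same rate and that
the storage system is a hard ceiling, and it ignores contention, network
limits and striping entirely.  The function names are made up for the
example; the 500 MB/s and 20 GB/s figures are the ones used above.

```python
def aggregate_mb_s(n_writers, per_writer_mb_s, storage_ceiling_mb_s):
    """Idealized aggregate throughput: writers add up until the
    storage system's ceiling is reached."""
    return min(n_writers * per_writer_mb_s, storage_ceiling_mb_s)

def writers_to_saturate(per_writer_mb_s, storage_ceiling_mb_s):
    """Smallest number of identical writers that reaches the ceiling
    (integer ceiling division)."""
    return -(-storage_ceiling_mb_s // per_writer_mb_s)

CEILING = 20_000      # 20 GB/s design target, in MB/s
PER_WRITER = 500      # roughly what a single thread might see, in MB/s

for n in (1, 8, 40, 64):
    print(n, "writers:", aggregate_mb_s(n, PER_WRITER, CEILING), "MB/s")

print("writers needed:", writers_to_saturate(PER_WRITER, CEILING))
```

Even in this idealized model, one writer sees 1/40th of the system.
The 20GB/s only shows up when many clients are doing IO at once.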

Regards,

Joe

-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics, Inc.
email: landman at scalableinformatics.com
web  : http://scalableinformatics.com
        http://scalableinformatics.com/jackrabbit
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615