[Lustre-discuss] Single client performance

Peter Grandi pg_lus at lus.for.sabi.co.UK
Fri Mar 12 01:25:35 PST 2010


> I've got a Lustre system where I'd like to improve single
> client write performance, [ ... ] indicates single client
> write performance of around 400 MB/s.

The 2 GB/s happens on larger clusters, and the 400 MB/s is a bit
conservative. But it depends on whether the single client is doing
streaming IO or random IO, with one or multiple processes, with
small or large files, with small or large writes, with frequent or
infrequent 'fsync' and 'flock', and so on. The performance profile
of most file systems, local or shared or distributed, is very
anisotropic, mostly because that of storage systems is also very
anisotropic.

> On my system, I can barely get 100 MB/s for writes (measured
> with iozone).

If a performance test is done with 'iozone', usually the person
doing it is not measuring the right things. Indeed, if that person
thinks that "single client write performance" is a concept with
general applicability, it seems to me very unlikely that they are.

It would be more interesting anyhow to start with something
really simple like:

  dd bs=1M count=1000 conv=fsync if=/dev/zero of=/lustre/1G 
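
If that streaming number looks fine, a couple of variants can
separate client page-cache effects from wire-and-disk rates. A
minimal sketch, assuming GNU 'dd' (for 'oflag=direct') and purely
illustrative file names:

  # bypass the client page cache entirely (O_DIRECT)
  dd bs=1M count=1000 oflag=direct if=/dev/zero of=/lustre/1G.direct

  # read back the earlier file to test the read path (drop caches
  # first, e.g. via /proc/sys/vm/drop_caches, to avoid re-reading
  # the client page cache)
  dd bs=1M if=/lustre/1G of=/dev/null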

> I'm using Lustre 1.8.2-ext4 on RHEL5 x86_64. I've got four
> OSSs each with one OST, in hardware RAID 6.

Even if often recommended by Lustre people, RAID6 has a rather
anisotropic performance profile (put another way: it is good only
in a few special cases). I'll repeat here the usual link to

  http://WWW.BAARF.com/
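
It is also worth measuring the RAID6 set itself on each OSS before
involving Lustre at all, for both streaming and small writes. A
minimal sketch, again with GNU 'dd', assuming a hypothetical ext4
OST with scratch space mounted at '/ost0':

  # streaming write to the OST's ext4, flushed at the end
  dd bs=1M count=4000 conv=fsync if=/dev/zero of=/ost0/scratch.4G

  # small direct writes, which expose the RAID6 read-modify-write
  # penalty that streaming writes mostly avoid
  dd bs=64k count=16000 oflag=direct if=/dev/zero of=/ost0/scratch.1G

If the second number is a small fraction of the first, that is the
RAID6 anisotropy at work, not Lustre.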

> Lustre runs on a 10 Gb network between the servers and
> clients. For the OSSs, iozone tells me that I can write into
> ext4 on the RAID arrays I configured at 750 MB/s.  Indeed, the
> write performance of ext4 vs. ext3 is one reason I'm using
> 1.8.2-ext4. I can also confirm (using ttcp) that the 10 Gb
> network between the clients and servers can do 750 MB/s.  [
> ... ]

> The goal here is to use a Lustre file system as a sort of
> buffer in a data acquisition system.  The file system needs to
> support 4 - 24 input streams writing at 250 MB/s.

Each? In aggregate? How large is the typical file written (that
is, how many metadata ops/s)? How large is the typical write (that
is, how many IOPS)? Do you have any idea how important these
details are?
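
A crude probe for the metadata side of that question, as a sketch
(assuming a POSIX shell and a hypothetical '/lustre' mount point;
each small-file create is mostly metadata work):

  # rough metadata ops/s: time the creation of 1000 small files
  time sh -c 'for i in $(seq 1000); do
    dd bs=4k count=1 if=/dev/zero of=/lustre/small.$i 2>/dev/null
  done'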

Single client single stream and single client multistream are
very, very different things. And small-write and large-write
profiles, like small-file and large-file profiles, are exceedingly
different too, never mind all the funny things that applications
can do to minimize performance.
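
To see the single client multistream case, a minimal sketch
(assuming 'bash', so that 'time wait' covers all the background
writers, and illustrative file names):

  # four concurrent streaming writers, one file each
  for i in 1 2 3 4; do
    dd bs=1M count=1000 conv=fsync \
       if=/dev/zero of=/lustre/stream.$i &
  done
  time wait

Note that on a 10Gb/s client link the wire itself caps the
aggregate at roughly 1.25 GB/s in any case.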

> I realize that Lustre wasn't designed for this sort of workload.

Depending on what the workload is, consider for example my list
of some of the possible dimensions of performance:

  http://www.sabi.co.uk/blog/0804apr.html#080415

But Lustre seems to have been designed for something similar
enough to DAQ systems; the issues are whether (primarily) your DAQ
code, then (secondarily) the storage subsystem, and then
(tertiarily) the Lustre config are designed to deliver good 4-24
stream write rates (whatever that means).
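
On the Lustre config side, striping is the obvious knob for
multistream writes. A minimal sketch, assuming the 1.8-era 'lfs
setstripe' syntax ('-s' for the stripe size, '-c -1' for all OSTs)
and a hypothetical '/lustre/daq' directory:

  # stripe new files under /lustre/daq across all OSTs, 1MiB stripes
  lfs setstripe -s 1M -c -1 /lustre/daq

  # verify what a newly created file actually got
  lfs getstripe /lustre/daq/somefile

Whether wide striping helps or hurts depends on the stream count:
with 4-24 concurrent writers there may be little point in striping
each file over every OST.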

In general this is yet another example of a common type of
posting to this and other storage discussions, where it is
assumed that "it should just work" (what I call the "syntactic
approach"), and that numbers are interchangeable. Consider for
example a previous not too dissimilar query:

  http://www.mail-archive.com/lustre-discuss%40lists.lustre.org/msg05882.html


