[Lustre-discuss] bad 1.6.3 striped write performance

Robin Humble rjh+lustre at cita.utoronto.ca
Mon Nov 26 15:15:22 PST 2007


On Mon, Nov 26, 2007 at 11:59:32AM -0700, Andreas Dilger wrote:
>On Nov 26, 2007  18:16 +0100, Andrei Maslennikov wrote:
>> Confirmed: 1.6.3 striped write performance sux.
>> 
>> With 1.6.2, I see this:
>> 
>> [root@srvandrei ~]$ lfs setstripe /lustre/162 0 0 3
>> [root@srvandrei ~]$ lmdd.linux of=/lustre/162 bs=1024k time=180 fsync=1
>> 157705.8304 MB in 180.0225 secs, 876.0341 MB/sec
>> 
>> I.e. 1.6.2 nicely aggregated the bandwidth of the three OSTs (300 MB/sec each)
>> into almost 900 MB/sec.
>
>Can you verify that you disabled data checksumming:
>	echo 0 > /proc/fs/lustre/llite/*/checksum_pages

Those checksums were off in my runs (they were off by default?), so I don't
think any of the checksums are making a difference.
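
For the record, this is roughly how I checked on the client (path as in your mail):

  # show the current client-side checksum setting (0 = off)
  for f in /proc/fs/lustre/llite/*/checksum_pages; do
      echo "$f: $(cat $f)"
  done

everything read 0 here, i.e. off.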

>Note that there are 2 kinds of checksumming that Lustre does.  The first one
>is checksumming of data in client memory, and the second one is checksumming
>of data over the network.  Setting $LPROC/llite/*/checksum_pages turns on/off
>both in-memory and wire checksums.  Setting $LPROC/osc/*/checksums turns on/off
>the network checksums only.

Good to know, thanks. Are all of those new in 1.6.3?
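
For anyone else chasing this, turning both kinds off on a client should just be
something like the below (proc names taken from your mail; they may well differ
between versions):

  # disable in-memory + wire checksums (llite) and the wire-only ones (osc)
  for f in /proc/fs/lustre/llite/*/checksum_pages /proc/fs/lustre/osc/*/checksums; do
      [ -e "$f" ] && echo 0 > "$f"
  done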

>If checksums are disabled, can you please report if the CPU usage on the
>client is consuming all of the CPU, or possibly all of a single CPU on 1.6.3
>and on 1.6.2?

With checksums disabled, a 1.6.3+ client looks like this:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 7437 root      15   0     0    0    0 R   57  0.0   7:31.77 ldlm_poold
18547 rjh900    15   0  5820  504  412 S    3  0.1   0:34.52 dd

Which is interesting: ldlm_poold is using an awful lot of CPU.

A 'top' on a 1.6.2 client shows only dd using significant CPU (plus the usual
small percentages for ptlrpcd, kswapd0, pdflush, kiblnd_sd_*).
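
If anyone wants to compare on their own clients, a batch-mode top snapshot taken
while the write is running is enough, e.g. something like:

  # one non-interactive snapshot of the busiest processes during the write
  top -b -n 1 | head -n 20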

cheers,
robin

>> On Nov 26, 2007 4:58 PM, Andrei Maslennikov
>> <andrei.maslennikov at gmail.com> wrote:
>> > On Nov 26, 2007 3:32 PM, Robin Humble <rjh+lustre at cita.utoronto.ca> wrote:
>> >
>> > > >> I'm seeing what can only be described as dismal striped write
>> > > >> performance from lustre 1.6.3 clients :-/
>> > > >> 1.6.2 and 1.6.1 clients are fine. 1.6.4rc3 clients (from cvs a couple
>> > > >> of days ago) are also terrible.
>> >
>> > I have 3 OSTs capable of delivering 300+ MB/sec each for large streaming writes
>> > with a 1M blocksize. On one client, with a single OST, I can see almost all of
>> > this bandwidth over InfiniBand. If I run three processes in parallel on that same
>> > client, each writing to a separate OST, I reach 520 MB/sec aggregate (3 streams
>> > at approx. 170+ MB/sec each).
>> >
>> > If I try to stripe over these three OSTs on this client, the performance of a
>> > single stream drops to 60+ MB/sec. Changing the stripe size to a smaller one
>> > (1/3 MB) makes things worse. Writing with larger block sizes (9M, 30M) does not
>> > improve things. Increasing the stripe size to 25 MB lets a single stream approach
>> > the speed of a single OST, as one would expect (blocks are round-robined over all
>> > three OSTs), but never more. Zeroing the checksums on the client does not help.
>> >
>> > Will now be downgrading the client to 1.6.2 to see if this helps.
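
For anyone who wants to reproduce Andrei's test, it boils down to roughly the
below (positional setstripe arguments as in the 1.6.2 example above; the file
names are just placeholders, and plain dd stands in for lmdd):

  # default stripe size and offset, striped over 3 OSTs
  lfs setstripe /lustre/test-default 0 0 3
  dd if=/dev/zero of=/lustre/test-default bs=1M count=30000 conv=fsync

  # the same write again with a 25MB stripe size (25MB = 26214400 bytes)
  lfs setstripe /lustre/test-25mb 26214400 0 3
  dd if=/dev/zero of=/lustre/test-25mb bs=1M count=30000 conv=fsync

With 1M application writes and 25MB stripes, 25 consecutive blocks land on the
same OST before moving to the next, so a single stream approaching single-OST
speed (but never more) is what you'd expect from the round-robin layout Andrei
describes.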
>
>Cheers, Andreas
>--
>Andreas Dilger
>Sr. Staff Engineer, Lustre Group
>Sun Microsystems of Canada, Inc.



