[Lustre-discuss] Lustre v1.8.0.1 slower than expected large-file, sequential-buffered-file-read speed

Andreas Dilger adilger at sun.com
Tue Aug 4 15:08:20 PDT 2009


On Aug 04, 2009  10:30 -0400, Rick Rothstein wrote:
> I'm new to Lustre (v1.8.0.1), and I've verified that
> I can get about 1000 MB/s aggregate throughput
> for large-file sequential reads using direct I/O
> (limited only by the speed of my 10 Gb NIC with a TCP offload engine).
> 
> The direct-I/O "dd" tests above achieve about 1000 MB/s
> aggregate throughput, but when I run the same tests with normal
> buffered I/O (just "dd" without "iflag=direct"), they reach
> only about 550 MB/s.
> 
> I suspect that this slowdown has something to do with
> client-side caching, but normal buffered reads have not sped up,
> even after I've tried such adjustments as:
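For reference, the comparison described above comes down to two "dd"
invocations like these (file name and sizes are illustrative; the sketch
below uses a scratch file so it runs anywhere, but on a Lustre client you
would point FILE at a large file on the Lustre mount):

```shell
# Scratch file standing in for a large Lustre file.
FILE=${FILE:-$(mktemp)}
dd if=/dev/zero of="$FILE" bs=1M count=16 2>/dev/null

# Buffered sequential read: data passes through the client page cache
# and is copied from kernel to userspace.
dd if="$FILE" of=/dev/null bs=1M

# Direct-I/O read: bypasses the client page cache entirely.
# O_DIRECT must be supported by the underlying filesystem.
dd if="$FILE" of=/dev/null bs=1M iflag=direct || echo "O_DIRECT not supported here"
```

The throughput figures "dd" prints on stderr are what the 1000 MB/s vs.
550 MB/s comparison above refers to.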

Note that buffered I/O carries a significant CPU overhead on the client,
simply from copying the data between the kernel and userspace.  Having
multiple cores on the client (one per dd process) allows distributing
this copy cost across cores.
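A minimal sketch of that suggestion, with scratch files standing in for
large Lustre files (on a real client, point DIR at the Lustre mount):
one buffered dd per core, run in parallel.

```shell
# Scratch files stand in for large files on a Lustre mount.
DIR=${DIR:-$(mktemp -d)}
for i in 0 1 2 3; do
  dd if=/dev/zero of="$DIR/bigfile.$i" bs=1M count=16 2>/dev/null
done

# One buffered dd per core: each process pays its own kernel-to-userspace
# copy cost, so aggregate throughput can scale with available cores.
for i in 0 1 2 3; do
  dd if="$DIR/bigfile.$i" of=/dev/null bs=1M 2>/dev/null &
done
wait   # block until every reader finishes
```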

You could also run "oprofile" to see whether anything else of interest
is consuming a lot of CPU.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
