[lustre-discuss] varying sequential read performance.

John Bauer bauerj at iodoctors.com
Mon Apr 2 23:23:30 PDT 2018


Colin,

Since I do not have root privileges on the system, I do not have access
to drop_caches, so, no, I do not flush the cache between the dd runs.  The
10 dd runs were done in a single job submission, and the scheduler does
drop caches between jobs, so the first dd pass does start with a virgin
cache.  What strikes me as odd is that the first dd run is the slowest and
obviously must read all of the data from the OSSs, which is confirmed by
the plot I have added to the top, showing the total amount of data moved
via LNet during the life of each dd process.  Notice that the second dd
run, which the LNet stats indicate also moves the entire 64 GB file from
the OSSs, is 3 times faster, and it has to work with a non-virgin cache.
Runs 4 through 10 move only 48 GB via LNet because one of the OSCs keeps
its entire 16 GB stripe cached across all of the runs.
Even with the significant advantage that runs 4-10 have, you could never
tell from the dd results: run 5 is only slightly faster than run 2, and
run 7 is as slow as run 0.
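
If root were available, here is a minimal sketch of how the client could be
reset between passes (assuming the standard Linux drop_caches interface and
the usual lctl parameter names; none of this is verified on this system):

    # flush dirty pages first, then drop the page cache (requires root)
    sync
    echo 3 > /proc/sys/vm/drop_caches

    # ask the Lustre client to cancel its unused DLM locks, which releases
    # the file data cached under them
    lctl set_param ldlm.namespaces.*.lru_size=clear

For reference, the LNet byte counters behind the per-pass traffic numbers
should also be readable directly (assuming lnetctl is installed) with:

    lnetctl stats show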

John




On 4/3/2018 12:20 AM, Colin Faber wrote:
> Are you flushing cache between test runs?
>
> On Mon, Apr 2, 2018, 6:06 PM John Bauer <bauerj at iodoctors.com 
> <mailto:bauerj at iodoctors.com>> wrote:
>
>     I am running dd 10 times consecutively to read a 64 GB file
>     ( stripeCount=4 stripeSize=4M ) on a Lustre client (version 2.10.3)
>     that has 64 GB of memory.  The client node was dedicated.
>
>     for pass in 1 2 3 4 5 6 7 8 9 10
>     do
>        dd of=/dev/null if=${file} count=128000 bs=512K
>     done
>     Instrumentation of the I/O from dd reveals varying performance.
>     In the plot below, the bottom frame has wall time on the X axis and
>     the file position of the dd reads on the Y axis, with a dot plotted
>     at the wall time and starting file position of every read.  The
>     slopes of the lines indicate the data transfer rate, which varies
>     from 475 MB/s to 1.5 GB/s.  The last 2 passes show sharp breaks in
>     performance, one increasing and one decreasing.
>
>     The top frame indicates the amount of memory used by each of the
>     file's 4 OSCs over the course of the 10 dd runs.  Nothing terribly
>     odd here, except that one of the OSCs eventually has its entire
>     stripe ( 16 GB ) cached and then never gives any of it up.
>
>     I should mention that the file system has 320 OSTs.  I found
>     LU-6370, which eventually turned into a discussion of LRU management
>     issues on systems with high numbers of OSTs leading to reduced RPC
>     sizes.
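>
>     As a rough sketch of the client-side parameters involved (assuming
>     the usual lctl names; I have not verified these on this system):
>
>        # lock LRU size setting for each LDLM namespace on the client
>        lctl get_param ldlm.namespaces.*.lru_size
>
>        # maximum RPC size each OSC will use (in pages per RPC)
>        lctl get_param osc.*.max_pages_per_rpc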
>
>     Any explanations for the varying performance?
>     Thanks,
>     John
>
>     -- 
>     I/O Doctors, LLC
>     507-766-0378
>     bauerj at iodoctors.com <mailto:bauerj at iodoctors.com>
>
>     _______________________________________________
>     lustre-discuss mailing list
>     lustre-discuss at lists.lustre.org
>     <mailto:lustre-discuss at lists.lustre.org>
>     http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>

-- 
I/O Doctors, LLC
507-766-0378
bauerj at iodoctors.com

-------------- next part --------------
A non-text attachment was scrubbed...
Name: dkknkjhpjkppjice.png
Type: image/png
Size: 37761 bytes
Desc: not available
URL: <http://lists.lustre.org/pipermail/lustre-discuss-lustre.org/attachments/20180403/f054f871/attachment-0001.png>

