<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
All,<br>
<br>
I have an application that writes a 100 GB file forwards and then
begins a sequence of reading a 70 GB section of the file forwards
and backwards. At some point in the run, not always at the same point,
the read performance degrades significantly.<br>
The initial forward reads run at about 1.3 GB/s and the backward reads
at about 300 MB/s. Then, in an instant, the forward read performance
drops to 2.8 MB/s.<br>
From about 250 seconds into the run onward, this is the only file being
read or written by the application, which runs on a dedicated client node.<br>
The file has a stripe count of 4 and a stripe size of 512 KB. If
the stripe count is changed to 1, this behavior does not occur.
CPU usage is minimal during the period of degraded
performance. <br>
The LNET traffic is also about 2.8 MB/s during the period of
degraded performance. The system has 64 GB of memory, so Lustre
cannot cache the entire 70 GB active section of the file that is being
read. <br>
The Lustre client version is 2.9.0.<br>
<br>
Any ideas what could be causing this? What should I be watching
under /proc/fs/lustre to find some clues?<br>
<br>
The behavior is depicted in the image below, which shows the file
position as a function of wall-clock time. All writes and reads are
512 KB in size.<br>
<br>
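For reference, the access pattern described above can be sketched roughly
as follows. This is a minimal Python sketch, not the application's actual
code: the function names, the starting offset of the section, and the
number of read passes are assumptions chosen for illustration.<br>
<pre>
```python
import os


def access_pattern(file_size, section_start, section_size, io_size, passes):
    """Yield (op, offset) pairs: one forward write pass over the whole file,
    then read passes over the section, alternating forward and backward.

    All parameters are in bytes except passes (hypothetical pass count).
    """
    # Forward write of the entire file in io_size pieces.
    for off in range(0, file_size, io_size):
        yield ("write", off)

    section_end = section_start + section_size
    for p in range(passes):
        offsets = range(section_start, section_end, io_size)
        # Even-numbered passes read forward, odd-numbered passes backward.
        if p % 2 == 1:
            offsets = reversed(offsets)
        for off in offsets:
            yield ("read", off)


def run(path, file_size, section_start, section_size, io_size, passes):
    """Replay the pattern against a real file using positional I/O."""
    buf = bytes(io_size)
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        for op, off in access_pattern(
            file_size, section_start, section_size, io_size, passes
        ):
            if op == "write":
                os.pwrite(fd, buf, off)
            else:
                os.pread(fd, io_size, off)
    finally:
        os.close(fd)
```
</pre>
For the run described above, the parameters would be a 100 GB file, a
70 GB section, and 512 KB I/O; the section's starting offset and the
number of passes are left open here.<br>
<br>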
Thanks,<br>
<br>
John<br>
<br>
<br>
<br>
<img src="cid:part1.7E2D78BA.3432AFCC@iodoctors.com" alt="">
<pre class="moz-signature" cols="72">--
I/O Doctors, LLC
507-766-0378
<a class="moz-txt-link-abbreviated" href="mailto:bauerj@iodoctors.com">bauerj@iodoctors.com</a></pre>
</body>
</html>