hi,<br><br>here is the requested information:<br><br>before test:<br><br>llite.fastfs-ffff810102a6a400.read_ahead_stats=<br>snapshot_time:         1251851453.382275 (secs.usecs)<br>pending issued pages:           0<br>hits                      7301235<br>


misses                    10546<br>readpage not consecutive  14369<br>miss inside window        1<br>failed grab_cache_page    6285314<br>failed lock match         0<br>read but discarded        98955<br>zero length file          0<br>


zero size window          3495<br>read-ahead to EOF         172<br>hit max r-a issue         783042<br>wrong page from grab_cache_page 0<br><br><br>after:<br><br>llite.fastfs-ffff810102a6a400.read_ahead_stats=<br>snapshot_time:         1251851620.183964 (secs.usecs)<br>


pending issued pages:           0<br>hits                      7506005<br>misses                    330064<br>readpage not consecutive  14432<br>miss inside window        319450<br>failed grab_cache_page    6322954<br>failed lock match         17294<br>


read but discarded        98955<br>zero length file          0<br>zero size window          3495<br>read-ahead to EOF         192<br>hit max r-a issue         837908<br>wrong page from grab_cache_page 0<br><br><br>There seem to be a lot of misses, as well as a locking problem, don't there? By the way, in the test, 4 processes read 512 MB each from a 2 GB file.<br>
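To put numbers on that impression, the two snapshots can be subtracted from each other. The following is only a rough sketch in C: the counter values are copied from the output above, while the struct, the function names, the 4 KiB page size and reading "512 MB" as 512 MiB are my own assumptions.

```c
#include <stdint.h>

/* Counter values copied from the two read_ahead_stats snapshots above.
 * The struct and all names below are illustrative, not part of Lustre. */
struct ra_counters {
    uint64_t hits;
    uint64_t misses;
    uint64_t miss_inside_window;
    uint64_t failed_lock_match;
};

static const struct ra_counters ra_before = { 7301235, 10546, 1, 0 };
static const struct ra_counters ra_after  = { 7506005, 330064, 319450, 17294 };

/* Assumptions: 4 KiB pages, and "512 MB" meaning 512 MiB per process. */
#define PAGE_BYTES        4096ULL
#define BYTES_PER_PROCESS (512ULL * 1024 * 1024)
#define NUM_PROCESSES     4ULL

/* Change of a single counter over the test run. */
uint64_t ra_delta(uint64_t before, uint64_t after)
{
    return after - before;
}

/* Number of pages the whole test asks for under the assumptions above. */
uint64_t pages_requested(void)
{
    return NUM_PROCESSES * BYTES_PER_PROCESS / PAGE_BYTES;
}

/* Fraction of page lookups during the test that were read-ahead misses. */
double miss_ratio(const struct ra_counters *b, const struct ra_counters *a)
{
    uint64_t h = a->hits - b->hits;
    uint64_t m = a->misses - b->misses;
    return (h + m) ? (double)m / (double)(h + m) : 0.0;
}
```

With these inputs, hits grow by 204770 and misses by 319518. Together that is 524288 pages, which is the same number pages_requested() gives for 4 x 512 MiB at 4 KiB per page, so roughly 61% of the pages read during the test appear to have missed the read-ahead window. Whether that match is more than a coincidence depends on the assumptions above.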


<br>Regards,<br>Alvaro.<br><br><div class="gmail_quote">On Fri, Aug 21, 2009 at 3:38 PM, di wang <span dir="ltr"><<a href="mailto:di.wang@sun.com">di.wang@sun.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">


hello,<div class="im"><br>

Alvaro Aguilera wrote:<br>

<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

they run on different physical nodes and access the ost via 4x infiniband.<br>

<br>

</blockquote></div>

I have never heard of such problems when the processes run on different nodes.  Client memory?<br>

Can you post the read-ahead stats (before and after the test) here, using<br>

<br>

lctl get_param llite.*.read_ahead_stats<br>

<br>

<br>

But there have indeed been a lot of fixes for strided reads since 1.6.5, which are included in the tarball I posted below.<br>

They can probably fix your problem.<br>

<br>

Thanks<br>

WangDi<br>

<br>

<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><div class="im">

On Fri, Aug 21, 2009 at 3:15 PM, di wang <<a href="mailto:di.wang@sun.com" target="_blank">di.wang@sun.com</a>> wrote:<br>

<br>

    Alvaro Aguilera wrote:<br>

<br>

        thanks for the hint, but unfortunately I can't make any<br>

        updates to the cluster...<br>

<br>

        Do you think both of the problems I experienced are bugs in<br>

        Lustre and are resolved in current versions?<br>

<br>

    These should be Lustre bugs. Do the 2 processes run on different nodes<br>

    or on the same node?<br>

<br>

    Thanks<br>

    WangDi<br>

<br>

<br>

        Thanks.<br>

        Alvaro.<br>

<br>

<br>

        On Fri, Aug 21, 2009 at 6:32 AM, di wang <<a href="mailto:di.wang@sun.com" target="_blank">di.wang@sun.com</a>> wrote:<br></div>

<div><div></div><div class="h5">

<br>

           Hello,<br>

<br>

           See bug 17197 and try applying this patch<br>

           <a href="https://bugzilla.lustre.org/attachment.cgi?id=25062" target="_blank">https://bugzilla.lustre.org/attachment.cgi?id=25062</a>  to your<br>

           Lustre source. Or you can wait for 1.8.2.<br>

<br>

           Thanks<br>

           Wangdi<br>

<br>

           Alvaro Aguilera wrote:<br>

<br>

               Hello,<br>

<br>

               as a project for college I'm doing a behavioral comparison<br>

               between Lustre and CXFS when dealing with simple strided files<br>

               using POSIX semantics. On one of the tests, each participating<br>

               process reads 16 chunks of data with a size of 32MB each, from<br>

               a common, strided file using the following code:<br>

<br>

                      ------------------------------------------------------------------------------------------<br>

               int myfile = open("thefile", O_RDONLY);<br>

<br>

               MPI_Barrier(MPI_COMM_WORLD); // the barriers are only there to help measure time<br>

<br>

               off_t distance = (numtasks-1)*p.buffersize;<br>

               off_t offset = rank*p.buffersize;<br>

<br>

               int j;<br>

               lseek(myfile, offset, SEEK_SET);<br>

               for (j = 0; j < p.buffercount; j++) {<br>

                     read(myfile, buffers[j], p.buffersize); // buffers are aligned to the page size<br>

                     lseek(myfile, distance, SEEK_CUR);<br>

               }<br>

<br>

               MPI_Barrier(MPI_COMM_WORLD);<br>

<br>

               close(myfile);<br>

                      ------------------------------------------------------------------------------------------<br>
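The access pattern of the loop above can be written down in closed form. This is only a sketch under the assumption that every read() returns the full buffersize bytes (the code above does not check the return value, so a short read would shift all later offsets); the function names are made up for illustration.

```c
#include <sys/types.h>

/* Byte offset of the j-th chunk read by process `rank`, assuming every
 * read() transfers exactly `buffersize` bytes. Hypothetical helper. */
off_t chunk_offset(int rank, int numtasks, int j, off_t buffersize)
{
    return ((off_t)j * numtasks + rank) * buffersize;
}

/* The same offset, obtained by replaying the seek/read sequence of the
 * loop: one SEEK_SET, then j times "read, then SEEK_CUR by distance". */
off_t chunk_offset_by_replay(int rank, int numtasks, int j, off_t buffersize)
{
    off_t pos = (off_t)rank * buffersize;                /* lseek(..., SEEK_SET) */
    off_t distance = (off_t)(numtasks - 1) * buffersize;

    for (int i = 0; i < j; i++) {
        pos += buffersize;  /* a full read() advances the file position */
        pos += distance;    /* lseek(..., SEEK_CUR) skips the other ranks' chunks */
    }
    return pos;
}
```

With 2 processes and 32 MiB chunks, process 1 starts at 32 MiB and its 16th chunk (j = 15) starts at 31 x 32 MiB, so the two processes interleave their chunks over a 1 GiB range of the file.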

<br>

               I'm facing the following problem: when this code is run in<br>

               parallel the read operations on certain processes start to<br>

               need more and more time to complete. I attached a graphical<br>

               trace of this, when using only 2 processes.<br>

               As you see, the read operations on process 0 stay more or less<br>

               constant, taking about 0.12 seconds to complete, while on<br>

               process 1 they increase up to 39 seconds!<br>

<br>

               If I run the program with only one process, then the time<br>

               stays at ~0.12 seconds per read operation. The problem doesn't<br>

               appear if the O_DIRECT flag is used.<br>

<br>

               Can somebody explain to me why this is happening? Since I'm<br>

               very new to Lustre, I may be making some silly mistakes, so be<br>

               nice to me ;)<br>

<br>

               I'm using Lustre SLES 10 Patchlevel 1, Kernel<br>

               2.6.16.54-0.2.5_lustre.1.6.5.1.<br>

<br>

<br>

               Thanks!<br>

<br>

               Alvaro Aguilera.<br>

<br>

<br>

                      ------------------------------------------------------------------------<br>

<br>


<br>

<br>

<br>

               _______________________________________________<br>

               Lustre-discuss mailing list<br>

               <a href="mailto:Lustre-discuss@lists.lustre.org" target="_blank">Lustre-discuss@lists.lustre.org</a><br>

<br>

               <a href="http://lists.lustre.org/mailman/listinfo/lustre-discuss" target="_blank">http://lists.lustre.org/mailman/listinfo/lustre-discuss</a><br>

               <br>

<br>

<br>

<br>

</div></div></blockquote>

<br>

</blockquote></div><br>