They run on different physical nodes and access the OST via 4x InfiniBand.<br><br><div class="gmail_quote">On Fri, Aug 21, 2009 at 3:15 PM, di wang <span dir="ltr"><<a href="mailto:di.wang@sun.com">di.wang@sun.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><div class="im">Alvaro Aguilera wrote:<br>
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
thanks for the hint, but unfortunately I can't make any updates to the cluster...<br>
<br>
Do you think both of the problems I experienced are bugs in Lustre and are resolved in current versions?<br>
</blockquote></div>
These should be Lustre bugs. Do the 2 processes run on different nodes or on the same node?<br>
<br>
Thanks<br>
WangDi<br>
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<br>
Thanks.<br>
Alvaro.<div><div></div><div class="h5"><br>
<br>
On Fri, Aug 21, 2009 at 6:32 AM, di wang <<a href="mailto:di.wang@sun.com" target="_blank">di.wang@sun.com</a>> wrote:<br>
<br>
Hello,<br>
<br>
You can look at bug 17197 and try applying this patch<br>
<a href="https://bugzilla.lustre.org/attachment.cgi?id=25062" target="_blank">https://bugzilla.lustre.org/attachment.cgi?id=25062</a> to your<br>
Lustre source. Or you can wait for 1.8.2.<br>
<br>
Thanks<br>
Wangdi<br>
<br>
Alvaro Aguilera wrote:<br>
<br>
Hello,<br>
<br>
As a project for college, I'm doing a behavioral comparison<br>
between Lustre and CXFS when dealing with simple strided files<br>
using POSIX semantics. In one of the tests, each participating<br>
process reads 16 chunks of data, 32 MB each, from<br>
a common, strided file using the following code:<br>
<br>
------------------------------------------------------------------------------------------<br>
int myfile = open("thefile", O_RDONLY);<br>
<br>
MPI_Barrier(MPI_COMM_WORLD); // the barriers are only there to help measure time<br>
<br>
off_t distance = (numtasks-1)*p.buffersize;<br>
off_t offset = rank*p.buffersize;<br>
<br>
int j;<br>
lseek(myfile, offset, SEEK_SET);<br>
for (j = 0; j < p.buffercount; j++) {<br>
read(myfile, buffers[j], p.buffersize); // buffers are aligned to the page size<br>
lseek(myfile, distance, SEEK_CUR);<br>
}<br>
<br>
MPI_Barrier(MPI_COMM_WORLD);<br>
<br>
close(myfile);<br>
------------------------------------------------------------------------------------------<br>
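<br>
A minimal sketch of the same access pattern written with explicit offsets and pread(), assuming the chunk layout described above: chunk j of a given rank starts at (j * numtasks + rank) * buffersize. The helper names strided_offset, read_full_at and read_strided are hypothetical and not part of the original test program; unlike the loop above, the sketch checks return values and retries short reads.<br>

```c
#define _POSIX_C_SOURCE 200809L /* for pread() under strict -std= modes */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Byte offset of the j-th chunk read by `rank` when `numtasks` processes
 * interleave chunks of `chunk` bytes in one shared file. */
static off_t strided_offset(int rank, int numtasks, size_t chunk, int j)
{
    assert(numtasks > 0 && rank >= 0 && rank < numtasks && j >= 0);
    return ((off_t)j * numtasks + rank) * (off_t)chunk;
}

/* Read exactly `len` bytes at offset `off`, retrying on short reads and
 * EINTR. Returns 0 on success, -1 on error or premature end of file. */
static int read_full_at(int fd, void *buf, size_t len, off_t off)
{
    char *p = buf;
    while (len > 0) {
        ssize_t n = pread(fd, p, len, off);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            return -1; /* end of file before the chunk was complete */
        p += n;
        off += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Strided read loop that visits the same offsets as the lseek/read version,
 * without relying on the file descriptor's current position. */
static int read_strided(int fd, char **buffers, int count, size_t chunk,
                        int rank, int numtasks)
{
    for (int j = 0; j < count; j++) {
        off_t off = strided_offset(rank, numtasks, chunk, j);
        if (read_full_at(fd, buffers[j], chunk, off) != 0)
            return -1;
    }
    return 0;
}
```

With 2 processes and 32 MB chunks, rank 1 would read at offsets 32 MB, 96 MB, 160 MB and so on.<br>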
<br>
I'm facing the following problem: when this code is run in<br>
parallel, the read operations on certain processes start to<br>
need more and more time to complete. I attached a graphical<br>
trace of this, taken with only 2 processes.<br>
As you can see, the read operations on process 0 stay more or less<br>
constant, taking about 0.12 seconds to complete, while on<br>
process 1 they increase up to 39 seconds!<br>
<br>
If I run the program with only one process, then the time<br>
stays at ~0.12 seconds per read operation. The problem doesn't<br>
appear if the O_DIRECT flag is used.<br>
<br>
Can somebody explain to me why this is happening? Since I'm<br>
very new to Lustre, I may be making some silly mistakes, so be<br>
nice to me ;)<br>
<br>
I'm using Lustre on SLES 10 Patchlevel 1, kernel<br>
2.6.16.54-0.2.5_lustre.1.6.5.1.<br>
<br>
<br>
Thanks!<br>
<br>
Alvaro Aguilera.<br>
<br>
<br>
------------------------------------------------------------------------<br>
<br>
<br>
<br>
_______________________________________________<br>
Lustre-discuss mailing list<br>
<a href="mailto:Lustre-discuss@lists.lustre.org" target="_blank">Lustre-discuss@lists.lustre.org</a><br></div></div>
<div class="im"><br>
<a href="http://lists.lustre.org/mailman/listinfo/lustre-discuss" target="_blank">http://lists.lustre.org/mailman/listinfo/lustre-discuss</a><br>
<br>
<br>
<br>
</div></blockquote>
<br>
</blockquote></div><br>