<html>
<head>
<meta content="text/html; charset=windows-1252"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<p>Andreas</p>
<p>Thanks for the reply. I should have clarified that setting <b>stripe_count=rank+1</b>
was purely for debugging, so I could tell which of the 4
ranks actually set the striping when the test case failed.
Normally, the stripe_count is user selectable and would be the
same for all ranks. I was hoping that the first rank to reach
the open/set stripe would do what's needed and the later-arriving
ranks would just open the already existing striped file. It
doesn't matter which rank gets there first, as they would all be
requesting the same striping.<br>
</p>
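<p>A minimal sketch of the "first rank to arrive creates, the rest just open" idea, using only POSIX calls. The names open_first_creates() and count_creators() are hypothetical, fork() stands in for the ranks of an MPI job, and the striping request itself is left out (it would go where the comment indicates). Note that this only decides who creates the file; it does not by itself make later ranks wait until the creator has finished setting the layout, so some further synchronization would presumably still be needed.</p>

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Open `path` so that exactly one caller creates it.
 * Sets *created to 1 for the creator and 0 for everyone else.
 * Returns a file descriptor, or -1 with errno set.
 */
int open_first_creates(const char *path, int *created)
{
    /* O_CREAT|O_EXCL is atomic: only one process can succeed. */
    int fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0640);
    if (fd >= 0) {
        *created = 1;   /* the creator would request the striping here */
        return fd;
    }
    if (errno != EEXIST)
        return -1;
    *created = 0;
    /* Another process created it first: open the existing file as-is. */
    return open(path, O_RDWR);
}

/*
 * Assumption: fork() replaces the MPI launcher for this demonstration.
 * Returns how many of `nranks` processes reported being the creator,
 * or -1 if any process failed.
 */
int count_creators(const char *path, int nranks)
{
    int started = 0, creators = 0, failed = 0;

    unlink(path);   /* start without a leftover file */
    for (int r = 0; r < nranks; r++) {
        pid_t pid = fork();
        if (pid < 0) {
            failed = 1;
            break;
        }
        if (pid == 0) {
            int created = 0;
            int fd = open_first_creates(path, &created);
            if (fd < 0)
                _exit(2);
            close(fd);
            _exit(created ? 1 : 0);
        }
        started++;
    }
    for (int i = 0; i < started; i++) {
        int status;
        if (wait(&status) < 0 || !WIFEXITED(status) || WEXITSTATUS(status) > 1)
            failed = 1;
        else
            creators += WEXITSTATUS(status);
    }
    unlink(path);
    return failed ? -1 : creators;
}
```

<p>Calling count_creators() with 4 ranks should report a single creator regardless of which process gets there first, which matches the intent that it doesn't matter which rank arrives first.</p>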
<p>There are several reasons that llapi_file_open() does not satisfy
my needs. Most notably, when my I/O library intercepts (using
LD_PRELOAD) functions such as the mkstemps() family and some of
the stdio opens, I can't necessarily replicate the open that would
have occurred. This has been discussed at length already on
lustre-discuss. <br>
</p>
<p>Thanks again,<br>
</p>
<p>John<br>
</p>
<br>
<div class="moz-cite-prefix">On 10/26/2016 5:54 PM, Dilger, Andreas
wrote:<br>
</div>
<blockquote
cite="mid:76F4A650-B6EB-4BED-9922-F92650F01C19@intel.com"
type="cite">
<style><!--
/* Font Definitions */
@font-face
{font-family:"Courier New";
panose-1:2 7 3 9 2 2 5 2 4 4;}
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:Consolas;
panose-1:2 11 6 9 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Times New Roman";}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
p
{mso-style-priority:99;
mso-margin-top-alt:auto;
margin-right:0in;
mso-margin-bottom-alt:auto;
margin-left:0in;
font-size:12.0pt;
font-family:"Times New Roman";}
pre
{mso-style-priority:99;
mso-style-link:"HTML Preformatted Char";
margin:0in;
margin-bottom:.0001pt;
font-size:10.0pt;
font-family:"Courier New";}
span.apple-style-span
{mso-style-name:apple-style-span;}
span.HTMLPreformattedChar
{mso-style-name:"HTML Preformatted Char";
mso-style-priority:99;
mso-style-link:"HTML Preformatted";
font-family:Courier;}
span.EmailStyle21
{mso-style-type:personal-reply;
font-family:Calibri;
color:windowtext;}
span.msoIns
{mso-style-type:export-only;
mso-style-name:"";
text-decoration:underline;
color:teal;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
--></style>
<div class="WordSection1">
<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:Calibri">Doing all of
the ioctl() handling directly in your application is not a
great idea, as that will not allow using a bunch of new
features that are in the pipeline (e.g. progressive file
layouts, file level redundancy, etc). It would be a lot
better to use the provided llapi_file_create() or
llapi_layout_*() to isolate your application from the
underlying implementation of how the file layout is set.<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:Calibri"><o:p> </o:p></span></p>
<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:Calibri">Specifics about
your implementation:<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:Calibri">- it is only
possible to set the layout on a file once, when it is first
created, so doing this from multiple threads for a single
shared file is broken. You should do that only from rank 0.<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:Calibri">- it is
possible to create a separate file for each thread/rank, but
you probably don't want to set the stripe *count* == rank
for each file. It doesn't make sense to create a bunch of
different files for the same application, each one with a
different stripe count. You probably meant to set the
stripe_offset == rank so that the load is spread evenly
across all OSTs?<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:Calibri">- as a caveat
for the above, specifying the OST index directly == rank can
cause problems, compared to just allowing the MDT to select
the OST indices for each file itself. If num_ranks &lt;
ost_count then only the first num_ranks OSTs would ever be
used, and space usage on the OSTs would be imbalanced.
Also, if some OST is offline or overloaded your application
would not be able to create new files, while this can be
avoided by allowing the MDT to select the OST index for each
file. With one file per rank it is best to use stripe_count
= 1 for all files, since you already have parallelism at the
application level.<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:Calibri"><o:p> </o:p></span></p>
<div>
<div>
<div>
<p class="MsoNormal"><span
style="font-size:10.5pt;font-family:Calibri;color:black">Cheers,
Andreas<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span
style="font-size:10.5pt;font-family:Calibri;color:black">-- <o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span
style="font-size:10.5pt;font-family:Calibri;color:black">Andreas
Dilger<o:p></o:p></span></p>
</div>
</div>
<div>
<p class="MsoNormal"><span
style="font-size:10.5pt;font-family:Calibri;color:black">Lustre
Principal Architect<o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal"><span
style="font-size:10.5pt;font-family:Calibri;color:black">Intel
High Performance Data Division</span><span
style="font-size:11.0pt;font-family:Calibri"><o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:Calibri"><o:p> </o:p></span></p>
<div>
<div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><span
style="font-size:11.0pt;font-family:Consolas;color:black">On
2016/10/26, 06:51, "John Bauer" &lt;<a
moz-do-not-send="true"
href="mailto:bauerj@iodoctors.com">bauerj@iodoctors.com</a>&gt;
wrote:<o:p></o:p></span></p>
</div>
</div>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><o:p> </o:p></p>
</div>
<p style="margin-left:.5in">All<o:p></o:p></p>
<p style="margin-left:.5in">I am running a<span
style="font-family:&quot;Courier New&quot;"> 4 rank</span>
MPI job where all the ranks do an open of the file, attempt to
set the striping with ioctl() and then do a small write.
Intermittently, I get errors on the write() and ioctl(). This
is a synthetic test case, boiled down from a much larger real
world job. Note that I set the stripe_count to rank+1 so I
can tell which of the ranks actually set the striping.<o:p></o:p></p>
<p style="margin-left:.5in">I have determined that I only get
the write failure when the ioctl also failed with "No data
available". It also strikes me that, at most, only one rank
reports "File exists". With a 4 rank job, I would expect
one rank to work as intended (no error) and the other
three to report "File exists".<o:p></o:p></p>
<p style="margin-left:.5in">Is this expected behavior?<o:p></o:p></p>
<p
style="mso-margin-top-alt:5.0pt;margin-right:0in;margin-bottom:12.0pt;margin-left:.5in"><span
style="font-family:&quot;Courier New&quot;;color:red">rank=1
doIO() -1=ioctl(fd=9) No data available<br>
rank=1 doIO() -1=write(fd=9) Bad file descriptor<br>
rank=3 doIO() -1=ioctl(fd=9) File exists</span><o:p></o:p></p>
<p style="margin-left:.5in">oflags = O_CREAT|O_TRUNC|O_RDWR<o:p></o:p></p>
<p
style="mso-margin-top-alt:5.0pt;margin-right:0in;margin-bottom:12.0pt;margin-left:.5in"><span
style="font-family:&quot;Courier New&quot;">void<br>
doIO(const char *fileName, int rank){<br>
int status ;<br>
int fd=open(fileName,
O_RDWR|O_TRUNC|O_CREAT|O_LOV_DELAY_CREATE, 0640 ) ;<br>
if( fd &lt; 0 ) return ;<br>
<br>
struct lov_user_md opts = {0};<br>
opts.lmm_magic = LOV_USER_MAGIC;<br>
opts.lmm_stripe_size = 1048576;<br>
opts.lmm_stripe_offset = -1 ;<br>
opts.lmm_stripe_count = rank+1 ;<br>
opts.lmm_pattern = 0 ;<br>
<br>
status = ioctl ( fd , LL_IOC_LOV_SETSTRIPE, &amp;opts);<br>
if(status&lt;0)fprintf(stderr,"rank=%d %s()
%d=ioctl(fd=%d)
%s\n",rank,__func__,status,fd,strerror(errno));<br>
<br>
char *string = "this is it\n" ;<br>
int nc = strlen(string) ;<br>
status = write( fd, string, nc ) ;<br>
if( status != nc ) fprintf(stderr,"rank=%d %s()
%d=write(fd=%d)
%s\n",rank,__func__,status,fd,status&lt;0?strerror(errno):"");<br>
status = close(fd) ;<br>
if(status&lt;0)fprintf(stderr,"rank=%d %s()
%d=close(fd=%d)
%s\n",rank,__func__,status,fd,strerror(errno));<br>
}</span><o:p></o:p></p>
<pre style="margin-left:.5in">-- <o:p></o:p></pre>
<pre style="margin-left:.5in">I/O Doctors, LLC<o:p></o:p></pre>
<pre style="margin-left:.5in">507-766-0378<o:p></o:p></pre>
<pre style="margin-left:.5in"><a moz-do-not-send="true" href="mailto:bauerj@iodoctors.com">bauerj@iodoctors.com</a><o:p></o:p></pre>
</div>
</blockquote>
<br>
<pre class="moz-signature" cols="72">--
I/O Doctors, LLC
507-766-0378
<a class="moz-txt-link-abbreviated" href="mailto:bauerj@iodoctors.com">bauerj@iodoctors.com</a></pre>
</body>
</html>