<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<style type="text/css" style="display:none;"><!-- P {margin-top:0;margin-bottom:0;} --></style>
</head>
<body dir="ltr">
<div id="divtagdefaultwrapper" style="font-size:12pt;color:#000000;font-family:Calibri,Helvetica,sans-serif;" dir="ltr">
<p style="margin-top:0;margin-bottom:0">I'm puzzled, James - Why is "cache_jobid" in there?  Isn't that from Ben Evans' work?  This patch landed before all of that...</p>
</div>
<hr style="display:inline-block;width:98%" tabindex="-1">
<div id="divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" style="font-size:11pt" color="#000000"><b>From:</b> James Simmons <jsimmons@infradead.org><br>
<b>Sent:</b> Monday, July 30, 2018 9:25:58 PM<br>
<b>To:</b> Andreas Dilger; Oleg Drokin; NeilBrown<br>
<b>Cc:</b> Lustre Development List; Patrick Farrell; James Simmons<br>
<b>Subject:</b> [PATCH 06/31] lustre: llite: reduce jobstats race window</font>
<div> </div>
</div>
<div class="BodyFragment"><font size="2"><span style="font-size:11pt;">
<div class="PlainText">From: Patrick Farrell <paf@cray.com><br>
<br>
In the current code, lli_jobid is set to zero on every call<br>
to lustre_get_jobid.  This causes problems, because it's<br>
used asynchronously to set the job id in RPCs, and some<br>
RPCs will falsely get no jobid set.  (For small IO sizes,<br>
this can be up to 60% of RPCs.)<br>
<br>
It would be very expensive to put hard synchronization<br>
between this and every outbound RPC, and it's OK to very<br>
rarely get an RPC without correct job stats info.<br>
<br>
This patch only updates the lli_jobid when the job id has<br>
changed, which leaves only a very small window for reading<br>
an inconsistent job id.<br>
<br>
Signed-off-by: Patrick Farrell <paf@cray.com><br>
WC-id: <a href="https://jira.whamcloud.com/browse/LU-8926">https://jira.whamcloud.com/browse/LU-8926</a><br>
Reviewed-on: <a href="https://review.whamcloud.com/24253">https://review.whamcloud.com/24253</a><br>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com><br>
Reviewed-by: Chris Horn <hornc@cray.com><br>
Signed-off-by: James Simmons <jsimmons@infradead.org><br>
---<br>
 drivers/staging/lustre/lustre/llite/llite_lib.c    |  1 +<br>
 drivers/staging/lustre/lustre/obdclass/class_obd.c | 20 ++++++++++++++------<br>
 2 files changed, 15 insertions(+), 6 deletions(-)<br>
<br>
diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c b/drivers/staging/lustre/lustre/llite/llite_lib.c<br>
index c0861b9..72b118a 100644<br>
--- a/drivers/staging/lustre/lustre/llite/llite_lib.c<br>
+++ b/drivers/staging/lustre/lustre/llite/llite_lib.c<br>
@@ -894,6 +894,7 @@ void ll_lli_init(struct ll_inode_info *lli)<br>
                 lli->lli_async_rc = 0;<br>
         }<br>
         mutex_init(&lli->lli_layout_mutex);<br>
+       memset(lli->lli_jobid, 0, LUSTRE_JOBID_SIZE);<br>
 }<br>
 <br>
 int ll_fill_super(struct super_block *sb)<br>
diff --git a/drivers/staging/lustre/lustre/obdclass/class_obd.c b/drivers/staging/lustre/lustre/obdclass/class_obd.c<br>
index cdaf729..87327ef 100644<br>
--- a/drivers/staging/lustre/lustre/obdclass/class_obd.c<br>
+++ b/drivers/staging/lustre/lustre/obdclass/class_obd.c<br>
@@ -95,26 +95,34 @@<br>
  */<br>
 int lustre_get_jobid(char *jobid)<br>
 {<br>
-       memset(jobid, 0, LUSTRE_JOBID_SIZE);<br>
+       char tmp_jobid[LUSTRE_JOBID_SIZE] = { 0 };<br>
+<br>
         /* Jobstats isn't enabled */<br>
         if (strcmp(obd_jobid_var, JOBSTATS_DISABLE) == 0)<br>
-               return 0;<br>
+               goto out_cache_jobid;<br>
 <br>
         /* Use process name + fsuid as jobid */<br>
         if (strcmp(obd_jobid_var, JOBSTATS_PROCNAME_UID) == 0) {<br>
-               snprintf(jobid, LUSTRE_JOBID_SIZE, "%s.%u",<br>
+               snprintf(tmp_jobid, LUSTRE_JOBID_SIZE, "%s.%u",<br>
                          current->comm,<br>
                          from_kuid(&init_user_ns, current_fsuid()));<br>
-               return 0;<br>
+               goto out_cache_jobid;<br>
         }<br>
 <br>
         /* Whole node dedicated to single job */<br>
         if (strcmp(obd_jobid_var, JOBSTATS_NODELOCAL) == 0) {<br>
-               strcpy(jobid, obd_jobid_node);<br>
-               return 0;<br>
+               strcpy(tmp_jobid, obd_jobid_node);<br>
+               goto out_cache_jobid;<br>
         }<br>
 <br>
         return -ENOENT;<br>
+<br>
+out_cache_jobid:<br>
+       /* Only replace the job ID if it changed. */<br>
+       if (strcmp(jobid, tmp_jobid) != 0)<br>
+               strcpy(jobid, tmp_jobid);<br>
+<br>
+       return 0;<br>
 }<br>
 EXPORT_SYMBOL(lustre_get_jobid);<br>
 <br>
-- <br>
1.8.3.1<br>
<br>
</div>
</span></font></div>
</body>
</html>