<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=Windows-1252">
</head>
<body>
Fair enough.  It doesn’t really make sense in context, but that’s fine since it’s temporary.<br>
<br>
<br>
<hr style="display:inline-block;width:98%" tabindex="-1">
<div id="divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" style="font-size:11pt" color="#000000"><b>From:</b> James Simmons <jsimmons@infradead.org><br>
<b>Sent:</b> Wednesday, August 1, 2018 10:52:21 PM<br>
<b>To:</b> Patrick Farrell<br>
<b>Cc:</b> Andreas Dilger; Oleg Drokin; NeilBrown; Lustre Development List<br>
<b>Subject:</b> Re: [PATCH 06/31] lustre: llite: reduce jobstats race window</font>
<div> </div>
</div>
<div class="BodyFragment"><font size="2"><span style="font-size:11pt;">
<div class="PlainText"><br>
> I'm puzzled, James - Why is "cache_jobid" in there?  Isn't that from Ben Evans' work?  This patch landed before all of that...<br>
<br>
All back ported patches have the potential to be modified so it can pass <br>
checkpatch as well as perfered standards. One of the common complaints <br>
was that lustre tends to use generic goto lables which can make grepping <br>
of the code more challenging. So I often change generic got lables to <br>
something with more meat. In this case I picked a nice name that came from <br>
a later patch :-)<br>
 <br>
> ______________________________________________________________________________________________________________________________<br>
> From: James Simmons <jsimmons@infradead.org><br>
> Sent: Monday, July 30, 2018 9:25:58 PM<br>
> To: Andreas Dilger; Oleg Drokin; NeilBrown<br>
> Cc: Lustre Development List; Patrick Farrell; James Simmons<br>
> Subject: [PATCH 06/31] lustre: llite: reduce jobstats race window  <br>
> From: Patrick Farrell <paf@cray.com><br>
> <br>
> In the current code, lli_jobid is set to zero on every call<br>
> to lustre_get_jobid.  This causes problems, because it's<br>
> used asynchronously to set the job id in RPCs, and some<br>
> RPCs will falsely get no jobid set.  (For small IO sizes,<br>
> this can be up to 60% of RPCs.)<br>
> <br>
> It would be very expensive to put hard synchronization<br>
> between this and every outbound RPC, and it's OK to very<br>
> rarely get an RPC without correct job stats info.<br>
> <br>
> This patch only updates the lli_jobid when the job id has<br>
> changed, which leaves only a very small window for reading<br>
> an inconsistent job id.<br>
> <br>
> Signed-off-by: Patrick Farrell <paf@cray.com><br>
> WC-id: <a href="https://jira.whamcloud.com/browse/LU-8926">https://jira.whamcloud.com/browse/LU-8926</a><br>
> Reviewed-on: <a href="https://review.whamcloud.com/24253">https://review.whamcloud.com/24253</a><br>
> Reviewed-by: Andreas Dilger <adilger@whamcloud.com><br>
> Reviewed-by: Chris Horn <hornc@cray.com><br>
> Signed-off-by: James Simmons <jsimmons@infradead.org><br>
> ---<br>
>  drivers/staging/lustre/lustre/llite/llite_lib.c    |  1 +<br>
>  drivers/staging/lustre/lustre/obdclass/class_obd.c | 20 ++++++++++++++------<br>
>  2 files changed, 15 insertions(+), 6 deletions(-)<br>
> <br>
> diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c b/drivers/staging/lustre/lustre/llite/llite_lib.c<br>
> index c0861b9..72b118a 100644<br>
> --- a/drivers/staging/lustre/lustre/llite/llite_lib.c<br>
> +++ b/drivers/staging/lustre/lustre/llite/llite_lib.c<br>
> @@ -894,6 +894,7 @@ void ll_lli_init(struct ll_inode_info *lli)<br>
>                  lli->lli_async_rc = 0;<br>
>          }<br>
>          mutex_init(&lli->lli_layout_mutex);<br>
> +       memset(lli->lli_jobid, 0, LUSTRE_JOBID_SIZE);<br>
>  }<br>
>  <br>
>  int ll_fill_super(struct super_block *sb)<br>
> diff --git a/drivers/staging/lustre/lustre/obdclass/class_obd.c b/drivers/staging/lustre/lustre/obdclass/class_obd.c<br>
> index cdaf729..87327ef 100644<br>
> --- a/drivers/staging/lustre/lustre/obdclass/class_obd.c<br>
> +++ b/drivers/staging/lustre/lustre/obdclass/class_obd.c<br>
> @@ -95,26 +95,34 @@<br>
>   */<br>
>  int lustre_get_jobid(char *jobid)<br>
>  {<br>
> -       memset(jobid, 0, LUSTRE_JOBID_SIZE);<br>
> +       char tmp_jobid[LUSTRE_JOBID_SIZE] = { 0 };<br>
> +<br>
>          /* Jobstats isn't enabled */<br>
>          if (strcmp(obd_jobid_var, JOBSTATS_DISABLE) == 0)<br>
> -               return 0;<br>
> +               goto out_cache_jobid;<br>
>  <br>
>          /* Use process name + fsuid as jobid */<br>
>          if (strcmp(obd_jobid_var, JOBSTATS_PROCNAME_UID) == 0) {<br>
> -               snprintf(jobid, LUSTRE_JOBID_SIZE, "%s.%u",<br>
> +               snprintf(tmp_jobid, LUSTRE_JOBID_SIZE, "%s.%u",<br>
>                           current->comm,<br>
>                           from_kuid(&init_user_ns, current_fsuid()));<br>
> -               return 0;<br>
> +               goto out_cache_jobid;<br>
>          }<br>
>  <br>
>          /* Whole node dedicated to single job */<br>
>          if (strcmp(obd_jobid_var, JOBSTATS_NODELOCAL) == 0) {<br>
> -               strcpy(jobid, obd_jobid_node);<br>
> -               return 0;<br>
> +               strcpy(tmp_jobid, obd_jobid_node);<br>
> +               goto out_cache_jobid;<br>
>          }<br>
>  <br>
>          return -ENOENT;<br>
> +<br>
> +out_cache_jobid:<br>
> +       /* Only replace the job ID if it changed. */<br>
> +       if (strcmp(jobid, tmp_jobid) != 0)<br>
> +               strcpy(jobid, tmp_jobid);<br>
> +<br>
> +       return 0;<br>
>  }<br>
>  EXPORT_SYMBOL(lustre_get_jobid);<br>
>  <br>
> --<br>
> 1.8.3.1<br>
> <br>
> <br>
> </div>
</span></font></div>
</body>
</html>