[lustre-devel] [PATCH 06/31] lustre: llite: reduce jobstats race window

Patrick Farrell paf at cray.com
Wed Aug 1 20:58:50 PDT 2018


Fair enough.  It doesn’t really make sense in context, but that’s fine since it’s temporary.


________________________________
From: James Simmons <jsimmons at infradead.org>
Sent: Wednesday, August 1, 2018 10:52:21 PM
To: Patrick Farrell
Cc: Andreas Dilger; Oleg Drokin; NeilBrown; Lustre Development List
Subject: Re: [PATCH 06/31] lustre: llite: reduce jobstats race window


> I'm puzzled, James - Why is "cache_jobid" in there?  Isn't that from Ben Evans' work?  This patch landed before all of that...

All back ported patches have the potential to be modified so it can pass
checkpatch as well as perfered standards. One of the common complaints
was that lustre tends to use generic goto lables which can make grepping
of the code more challenging. So I often change generic got lables to
something with more meat. In this case I picked a nice name that came from
a later patch :-)

> ______________________________________________________________________________________________________________________________
> From: James Simmons <jsimmons at infradead.org>
> Sent: Monday, July 30, 2018 9:25:58 PM
> To: Andreas Dilger; Oleg Drokin; NeilBrown
> Cc: Lustre Development List; Patrick Farrell; James Simmons
> Subject: [PATCH 06/31] lustre: llite: reduce jobstats race window
> From: Patrick Farrell <paf at cray.com>
>
> In the current code, lli_jobid is set to zero on every call
> to lustre_get_jobid.  This causes problems, because it's
> used asynchronously to set the job id in RPCs, and some
> RPCs will falsely get no jobid set.  (For small IO sizes,
> this can be up to 60% of RPCs.)
>
> It would be very expensive to put hard synchronization
> between this and every outbound RPC, and it's OK to very
> rarely get an RPC without correct job stats info.
>
> This patch only updates the lli_jobid when the job id has
> changed, which leaves only a very small window for reading
> an inconsistent job id.
>
> Signed-off-by: Patrick Farrell <paf at cray.com>
> WC-id: https://jira.whamcloud.com/browse/LU-8926
> Reviewed-on: https://review.whamcloud.com/24253
> Reviewed-by: Andreas Dilger <adilger at whamcloud.com>
> Reviewed-by: Chris Horn <hornc at cray.com>
> Signed-off-by: James Simmons <jsimmons at infradead.org>
> ---
>  drivers/staging/lustre/lustre/llite/llite_lib.c    |  1 +
>  drivers/staging/lustre/lustre/obdclass/class_obd.c | 20 ++++++++++++++------
>  2 files changed, 15 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c b/drivers/staging/lustre/lustre/llite/llite_lib.c
> index c0861b9..72b118a 100644
> --- a/drivers/staging/lustre/lustre/llite/llite_lib.c
> +++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
> @@ -894,6 +894,7 @@ void ll_lli_init(struct ll_inode_info *lli)
>                  lli->lli_async_rc = 0;
>          }
>          mutex_init(&lli->lli_layout_mutex);
> +       memset(lli->lli_jobid, 0, LUSTRE_JOBID_SIZE);
>  }
>
>  int ll_fill_super(struct super_block *sb)
> diff --git a/drivers/staging/lustre/lustre/obdclass/class_obd.c b/drivers/staging/lustre/lustre/obdclass/class_obd.c
> index cdaf729..87327ef 100644
> --- a/drivers/staging/lustre/lustre/obdclass/class_obd.c
> +++ b/drivers/staging/lustre/lustre/obdclass/class_obd.c
> @@ -95,26 +95,34 @@
>   */
>  int lustre_get_jobid(char *jobid)
>  {
> -       memset(jobid, 0, LUSTRE_JOBID_SIZE);
> +       char tmp_jobid[LUSTRE_JOBID_SIZE] = { 0 };
> +
>          /* Jobstats isn't enabled */
>          if (strcmp(obd_jobid_var, JOBSTATS_DISABLE) == 0)
> -               return 0;
> +               goto out_cache_jobid;
>
>          /* Use process name + fsuid as jobid */
>          if (strcmp(obd_jobid_var, JOBSTATS_PROCNAME_UID) == 0) {
> -               snprintf(jobid, LUSTRE_JOBID_SIZE, "%s.%u",
> +               snprintf(tmp_jobid, LUSTRE_JOBID_SIZE, "%s.%u",
>                           current->comm,
>                           from_kuid(&init_user_ns, current_fsuid()));
> -               return 0;
> +               goto out_cache_jobid;
>          }
>
>          /* Whole node dedicated to single job */
>          if (strcmp(obd_jobid_var, JOBSTATS_NODELOCAL) == 0) {
> -               strcpy(jobid, obd_jobid_node);
> -               return 0;
> +               strcpy(tmp_jobid, obd_jobid_node);
> +               goto out_cache_jobid;
>          }
>
>          return -ENOENT;
> +
> +out_cache_jobid:
> +       /* Only replace the job ID if it changed. */
> +       if (strcmp(jobid, tmp_jobid) != 0)
> +               strcpy(jobid, tmp_jobid);
> +
> +       return 0;
>  }
>  EXPORT_SYMBOL(lustre_get_jobid);
>
> --
> 1.8.3.1
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.lustre.org/pipermail/lustre-devel-lustre.org/attachments/20180802/0e31096d/attachment.html>


More information about the lustre-devel mailing list