[lustre-devel] lprocfs Helper Issues

Mon Oct 5 14:52:06 PDT 2015

I don't recall any proc files that accepted pages.

I certainly remember callers sending a multiplier that resulted in the 
string parsing function doing the conversion to pages.  But those very 
callers are the ones that break when the string specifies a unit.  The 
code authors failed to realize that, so this would let us avoid that 
mistake in the future.

I don't think it is terribly onerous to do the bit shifting in the 
caller instead of passing the multiplier as parameter to the string 
conversion function.  I think it might even make the code a little more 
readable.

Chris

On 10/03/2015 02:54 AM, Dilger, Andreas wrote:
> What if the default unit is in pages?  I don't think there is a common suffix for pages.
>
> Cheers, Andreas
>
>> On Oct 2, 2015, at 18:29, Christopher J. Morrone <morrone2 at llnl.gov> wrote:
>>
>> It also occurs to me that we shouldn't really have a an integer multiplier parameter in the function definition.  There should just be a simple "char default_units" parameter.  The caller can specify which character is the default, and then there is no possibility of the multiplier being misused the way it currently is.
>>
>> Chris
>>
>>> On 09/30/2015 05:32 PM, Christopher J. Morrone wrote:
>>>> On 09/30/2015 12:46 PM, Di Natale, Giuseppe wrote:
>>>> I looked around to see where the helpers are used. It looks to me that
>>>> they are always used in proc related functions. I agree with the
>>>> issues you mentioned at the top of the email as well.
>>>
>>> Yes, but I meant to say we need to consider future use.  Largely
>>> motivated by the effort to upstream the Lustre client into Linux, the
>>> /proc interfaces are slowly going away.  So I was just suggesting that
>>> we should check that these functions will still be used by the new
>>> debugfs/sysfs/whatever interfaces in the future.  Nothing really needed
>>> to consider though; they are generic enough to still be used well into
>>> the future.
>>>
>>> With that in mind, we should probably change the function names even
>>> further.  Instead of naming the functions after the current primary
>>> callers, we should name them according to what they do.  Perhaps
>>> something along the lines of: str_to_u64().
>>>
>>> That is just good naming practice anyway.  When you are reading code and
>>> you hit a call to a function named "helper", that doesn't give you much
>>> of a clue as to what it does.
>>>
>>>> It also appears that the multiplier is never negative, so making it an
>>>> unsigned int seems like the right way to go. I also noticed that
>>>> lprocfs_write_frac_helper calls are always followed by a check if the
>>>> value produced is greater than 0 and if not an error is thrown. This
>>>> implies the signed version may not be necessary (and that maybe the
>>>> unsigned helper should error with strings representing negative
>>>> numbers?).
>>>>
>>>> Also, I think both the unsigned and signed methods parse the numeric
>>>> portion of the string in a very similar fashion. At the very least,
>>>> they appear to attempt to do the same thing. That is the portion I was
>>>> going to consolidate down.
>>>
>>> Yes, the _u64_ functions almost certainly started out as a cut-and-paste
>>> of the other functions.
>>>
>>>> You didn't comment on detecting overflow/underflow/wrapping. I still
>>>> think those are a valid concern as well.
>>>
>>> I agree.  It is completely reasonable to add those checks during the
>>> refactoring.
>>>
>>>> Giuseppe
>>>> ________________________________________
>>>> From: lustre-devel [lustre-devel-bounces at lists.lustre.org] on behalf
>>>> of Christopher J. Morrone [morrone2 at llnl.gov]
>>>> Sent: Tuesday, September 29, 2015 7:25 PM
>>>> To: lustre-devel at lists.lustre.org
>>>> Subject: Re: [lustre-devel] lprocfs Helper Issues
>>>>
>>>> I looked through the code a bit, and I think that the even bigger issues
>>>> are the lack of reasonable naming, lack of comments, and a puzzling
>>>> semantic inconsistency.
>>>>
>>>> Before working on any of the issues I mention below though, we should
>>>> probably make sure that these functions still have a purpose once /proc
>>>> goes away.  They seem generic enough helpers that they will still be
>>>> used with the non-/proc methods, but it is worth checking.
>>>>
>>>> First, consider this pair of names:
>>>>
>>>> lprocfs_write_frac_helper
>>>> lprocfs_write_frac_u64_helper
>>>>
>>>> One might reasonably suspect that the major difference between these two
>>>> functions is that the latter deails with a u64, and the former does not.
>>>>    But that is already pretty darn clear from the full function
>>>> prototype, and really the main difference is that
>>>> lprocfs_write_frac_u64_helper can parse a units character in the string.
>>>>    Maybe the name should be lprocfs_write_frac_units_helper.
>>>>
>>>> Also, the semantics surrounding the multiplier are field are different.
>>>>    For lprocfs_write_frac_helper the mult parameter is always enforced,
>>>> and there is code that replies on that.  With
>>>> lprocfs_write_frac_u64_helper mult is merely a default, and the caller
>>>> can't rely on it being used.  Two functions with similar names but
>>>> difference semantics (and not in the way implied by the name
>>>> difference), and no function comments...not a good idea.
>>>>
>>>> Next there is the naming difference between these:
>>>>
>>>> lprocfs_write_frac_u64_helper
>>>> lprocfs_write_u64_helper
>>>>
>>>> One might reasonably expect that when using the latter function one
>>>> loses the ability to handle fractional numbers.  But no, actually it
>>>> just sets the multiplier to a default of 1.  How does that naming make
>>>> any sense?
>>>>
>>>> I suppose that with lprocfs_write_helper that naming style almost makes
>>>> sense, because a multiplier of 1 will result in anything after the
>>>> decimal point being calculated out to 0.  So  the fractional part is, in
>>>> effect, ignored.  But, strangely enough, fractions are still _accepted_
>>>> by the function.
>>>>
>>>> This semantic distinction is important to consider.  It means that you
>>>> can't just do a naive combination of the two functions into your
>>>> proposed lprocfs_write_frac_helper_internal() function.  There are
>>>> callers of lprocfs_write_frac_helper that assume that the multiplier can
>>>> be only the one specified and would result in incorrect number if there
>>>> were user-specified units in the string.
>>>>
>>>> By the was, I'm not really in favor of duplicating the existing
>>>> functions with a hope to remove the old ones at some time in the future.
>>>>    I think (despite current evidence to the contrary) that these
>>>> functions are not so difficult to review that we would need a transition
>>>> period.  We would just need to audit every caller to make sure that any
>>>> semantic changes are handled.
>>>>
>>>> And frankly, there would appear to be code that _already_ gets this
>>>> wrong, so an audit is really needed already.  For instance,
>>>> ll_max_readahead_mb_seq_write() tries to be too clever by assuming that
>>>> users can only provide an integer that represents number of MiB, and
>>>> then passes in a multiplier that will have it convert into number of
>>>> pages.  But the since they used lprocfs_write_frac_u64_helper(), the
>>>> user can specify their own units, and then the number returned number
>>>> will be bytes instead of pages.  Here are the callers that I found that
>>>> are doing it wrong:
>>>>
>>>>     ll_max_readahead_mb_seq_write
>>>>     ll_max_cached_mb_seq_write
>>>>     proc_max_dirty_pages_in_mb
>>>>     osc_cached_mb_seq_write
>>>>
>>>> 4 out of 5 are doing it wrong.  Not a good track record.
>>>>
>>>> And getting back to the point, franctions and units are both accepted
>>>> and handled by lprocfs_write_u64_helper, so the lack of "_frac_" in the
>>>> name is misleading at best.
>>>>
>>>> Why does lprocfs_write_frac_helper do its own handling of negatives and
>>>> then call the unsigned simple_strtoul function?  Why not just use the
>>>> signed simple_strtol function?  As far as I can tell the signed version
>>>> of the function has been in Linux as long as the unsigned version.
>>>>
>>>> Why is mult declared as a signed int?  I think it should almost
>>>> certainly be unsigned.  I think it might only be signed because the
>>>> function author reuses mult as a local variable to stored the negative
>>>> sign when parsed from the string.  If so, that is a poor choice.  The
>>>> function declaration is a contract with the caller.  If it makes no
>>>> sense to pass in a negative multiplier, then the declaration should make
>>>> that clear by declaring it unsigned.
>>>>
>>>> Why do the unsigned versions of the helper functions allow and parse
>>>> negative numbers?  I think that this gets to the heart of your
>>>> suggestion about special handling for -1.  I think that knowing that -1
>>>> has special meaning for something things is too specialized for the
>>>> helper function.  I think we are better off letting the caller decided
>>>> what special handling to do and when.
>>>>
>>>> I would suggest that the main helper that does handling of
>>>> user-specified units should not be casting the number to an unsigned
>>>> value.  Leave the casting to the caller, or perhaps provide a simple
>>>> wrapper to cast away the sign.  I don't think we are going to miss that
>>>> one extra bit for positive numbers.
>>>>
>>>> So maybe we need the most generic function prototype be be something
>>>> like:
>>>>
>>>> int lprocfs_write_helper(char *buffer, unsigned long count, __s64 *val,
>>>> unsigned int mult, bool units_allowed);
>>>>
>>>> The function comment would explain that if units are allowed, then the
>>>> multiplier is only a default and will be overridden by the
>>>> user-specificed unit.
>>>>
>>>> Chris
>>>>
>>>>> On 09/28/2015 01:08 PM, Di Natale, Giuseppe wrote:
>>>>> Greetings,
>>>>>
>>>>> Recently, I've noticed that the lprocfs write frac int and u64 helpers
>>>>> have a few issues. The biggest issue is that neither function handles
>>>>> overflow/wrap. I also noticed very similar code that should be
>>>>> consolidated down and leveraged by both helpers.
>>>>>
>>>>> I was thinking of refactoring the functions in the fashion described
>>>>> below.
>>>>>
>>>>> int lprocfs_write_frac_helper_internal(char *buffer, unsigned long
>>>>> count, __u64 *val, int mult);
>>>>> int lprocfs_write_frac_u64_helper_safe(const char __user *buffer,
>>>>> unsigned long count, __u64 *val, int mult);
>>>>> int lprocfs_write_frac_helper_safe(const char __user *buffer, unsigned
>>>>> long count, int *val, int mult);
>>>>>
>>>>> lprocfs_write_frac_helper_internal would handle parsing an unsigned long
>>>>> long from the kernel char buffer passed in. It will be responsible for
>>>>> detecting if a uint wrap occurs.
>>>>>
>>>>> For lprocfs_write_frac_u64_helper_safe, the string "-1" will
>>>>> automatically return ULLONG_MAX. If any other string representing a
>>>>> negative number is passed in, an invalid value error code should be
>>>>> returned. If the multiplier is negative, that would also be treated as
>>>>> invalid. The units and multiplier logic can also be consolidated. It
>>>>> will use lprocfs_write_frac_helper_internal to handle the parsing.
>>>>>
>>>>> lprocfs_write_frac_helper_safe will leverage
>>>>> lprocfs_write_frac_helper_internal. If
>>>>> lprocfs_write_frac_helper_internal indicates a wrap occurred, then we
>>>>> also have an invalid integer. Checks for integer overflow happen after a
>>>>> successful call to the internal helper. This is similar to how the
>>>>> current lprocfs_write_frac_helper functions.
>>>>>
>>>>> It is also worth nothing, I plan to maintain the old helpers and their
>>>>> use can be gradually phased out once we are confident the refactored
>>>>> version is doing what it is supposed to.
>>>>>
>>>>> Also, unrelated to the above, quick question about lctl. Is there a
>>>>> particular reason why the setting to be changed when using lctl
>>>>> set_param is echoed back to the user? I think it can be misleading in
>>>>> cases where the value set is not necessarily what is being reflected to
>>>>> the user (i.e. -1 for max value). That could be confusing to a user and
>>>>> they should be using lctl get_param to confirm their value was set
>>>>> anyways. Also, it would follow convention that unless an error happens,
>>>>> nothing is printed to console. Any disagreements on silencing lctl
>>>>> set_param?
>>>>>
>>>>> Thanks,
>>>>> Giuseppe Di Natale
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> lustre-devel mailing list
>>>>> lustre-devel at lists.lustre.org
>>>>> http://lists.lustre.org/listinfo.cgi/lustre-devel-lustre.org
>>>>>
>>>>
>>>> _______________________________________________
>>>> lustre-devel mailing list
>>>> lustre-devel at lists.lustre.org
>>>> http://lists.lustre.org/listinfo.cgi/lustre-devel-lustre.org
>>>> _______________________________________________
>>>> lustre-devel mailing list
>>>> lustre-devel at lists.lustre.org
>>>> http://lists.lustre.org/listinfo.cgi/lustre-devel-lustre.org
>>>> .
>>>>
>>>
>>> .
>>>
>>
>> _______________________________________________
>> lustre-devel mailing list
>> lustre-devel at lists.lustre.org
>> http://lists.lustre.org/listinfo.cgi/lustre-devel-lustre.org
> .
>