[lustre-devel] Error checking for llapi_hsm_action_progress().

NeilBrown neilb at suse.de
Mon Aug 31 17:58:02 PDT 2020


On Mon, Aug 31 2020, quentin.bouget at cea.fr wrote:

> Hi Neil,
>
>
> On 31/08/2020 06:03, NeilBrown wrote:
>> I have a question about llapi_hsm_action_progress().  The documentation
>> says that every interval sent "must" be unique, and must not overlap
>> (which not exactly the same as 'unique').  The code (on server side)
>> only partially enforces this.  It causes any request for an empty
>> interval (start>end) to fail, but otherwise accepts any interval.  If it
>> gets two identical intervals (not just overlapping, but identical), it
>> ignores the second.  This seems weird.
>>
>> It would make some sense to just accept any interval - all it does is
>> sum the lengths, and use this to report status, so no corruption would
>> result.  It would also make sense to return an error if an interval
>> overlaps any previous interval, as this violates the spec.  It might
>> make sense to accept any interval, but only count the overlapped length
>> once.  But the current behaviour of only ignoring exact duplicates is
>> weird.  I tried removing that check, but there is a test (hsm_test 108)
>> which checks for repeating identical intervals.
>
> test108 handles a "growing" extent (offset=0, length=1, 2, 3, ...).

test108_progress() in llapi_hsm_test.c sets:
		he.offset = 0;
		he.length = length;
where 'length' is 1000.  It reports progress for this extent 1000 times.
Then complains if the total progress isn't 1000.

> test112 handles acknowledging non-overlapping extents twice each.
Yep.  10 non-overlapping extents, then sends the same 10 again.

>
> I wrote a test to check what happens if you acknowledge overlapping extents:
>
>   * (offset=0, length=256)
>   * (offset=128, length=256)
>
> And surely enough mdt_cdt_get_work_done() returns "512" rather than the 
> expected "384" (ie. 128 + 256).
>
> Even worse, when acknowledging a "shrinking" extent (offset=0, length=N, 
> N - 1, N - 2, ...), only the last value is kept in store.

Does that even mean anything?  It seems to be saying "I've done N amount
of work" and then "Sorry, I lied, I've only done N-1"...

But I'm surprised at your result that only the last is kept.  Reading
the code strongly suggests that it would report the sum of all these
regions.  N^2/2 if you continued to  N-N.

How do you perform these tests?  Is there a command-line tool, or would
I need to write some code?

>
>  From this, I think that exact duplicates are not really ignored, 
> rather, intervals that share the same starting point overwrite one 
> another, until only the last one remains.
> Bug or feature? I don't really know.
>
>
>> I want to clean up this code as I'm converting all users of the lustre
>> interval-tree to use the upstream-linux interval tree code.  What should
>> I do?
>>
>> Should I remove test 108 because it is only testing one particular
>> corner case, or should I improve the code to handle all overlaps
>> consistently?  Would it be OK to fail an overlap (I'd need to change
>> test 108), it must they be quietly accepted?
>
> How does the upstream-linux interval tree compares to Lustre's?

The interesting difference is that the lustre interval-tree refuses to
store exact duplicates (same start, same end), while the Linux
interval-tree will accept any new interval.
I think this made some sense for the first user for interval-trees in
lustre, which was LDLM extent locking, but some other users need to
jump through hoops to handle duplicates correctly.

For hsm_update_work(), it just tries to store the interval it was given
and if that fails, it say "oh well, too bad".

>
> If their behaviours match, there should not be any issue (so far as the 
> current behaviour can be considered issue-free).

Certainly I could preserve exactly the same behaviour, but I find it
hard to write code that doesn't make any sense.
If I *were* to keep exactly the same behaviour as current, I wouldn't use
an interval tree, as (if I understand it correctly), the intervals
aren't really relevant.  It just stores discrete "start+length"
pairs and rejects duplicates.  I'd probably use an rhashtable for that.

>
> Otherwise, I think it would be OK to just assume sending overlapping 
> extents is a programming error and the server does not need to protect 
> itself against it.
> In terms of security, this isn't very good, but the problem already 
> exists, and copytools are supposedly trusted binaries run as root.
> You could then remove test108 as it is itself a programming error, 
> test112 as well, and maybe others.
>
> Just to be sure, you could open a new issue on Jira, and let others rule 
> how much of a bug/feature the whole thing is.

I guess the key question here is: who are those "others" ??
Jira has a "Component/s" field, but it seems to be fixed at "None".
How would I ensure that the Jira issue got to the right people?

Thanks for your response,

NeilBrown


>
> Cheers,
> Quentin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 832 bytes
Desc: not available
URL: <http://lists.lustre.org/pipermail/lustre-devel-lustre.org/attachments/20200901/bfbd5ddf/attachment.sig>


More information about the lustre-devel mailing list