<html>

<head>

<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">

</head>

<body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class="">

On Dec 10, 2020, at 21:04, NeilBrown <<a href="mailto:neilb@suse.de" class="">neilb@suse.de</a>> wrote:

<div>

<blockquote type="cite" class="">

<div class="">

<div class="">I've been exploring an alternate approach to preparing lustre code for<br class="">

upstream submission.<br class="">

</div>

</div>

</blockquote>

<div><br class="">

</div>

Neil,</div>

<div>sorry for not responding to this email in a timely manner.  I'd read it when</div>

<div>you first posted it, but then thought "I should take time to read it closely</div>

<div>and reply in a thoughtful manner" but it got lost in my inbox.</div>

<div><br class="">

<blockquote type="cite" class="">

<div class="">

<div class="">As you probably know, the approach we have been following is based on<br class="">

the code that was removed from 'drivers/staging' a couple of years ago.<br class="">

I reverted that removal patch and have ported all the relevant patches<br class="">

from lustre "master" across to that tree, as well as other cleanups and<br class="">

fixes.<br class="">

<br class="">

A problem with this approach is that I find that I don't trust it.<br class="">

I've found multiple examples of patches being ported across<br class="">

incorrectly, of patches being missed, and of bugs introduced while the<br class="">

code was in drivers/staging.<br class="">

I've found and fixed a lot of issues.  I doubt that I've found them<br class="">

all.<br class="">

While I do some testing of this code, as does James, it is minimal<br class="">

compared to the testing done on lustre/master.  And as we should all<br class="">

remember: untested code is buggy code.<br class="">

</div>

</div>

</blockquote>

<div><br class="">

</div>

That was my concern with this approach also.  When Greg KH was keeping</div>

<div>it in the staging tree, it ended up getting drive-by patches from all kinds of</div>

<div>developers that ended up adding bugs in patches that weren't reviewed or</div>

<div>tested.</div>

<div><br class="">

</div>

<div>Some time ago, we'd offered to run automated testing for the fs/linux-staging</div>

<div>repo in our autotest, but that essentially requires running everything through</div>

<div>Gerrit, which is not the preferred workflow for upstream kernels.</div>

<div><br class="">

</div>

<div>

<blockquote type="cite" class="">

<div class="">

<div class="">People who use lustre don't only want correctness and performance, they<br class="">

want confidence, and I'm not confident of this code.<br class="">

<br class="">

My alternate approach is to use the code straight out of lustre/master,<br class="">

transformed to match upstream requirements by a relatively simple<br class="">

script.  There is clearly still room for the script to introduce<br class="">

problems, but it is much less likely and any problems introduced are<br class="">

less likely to go unnoticed.<br class="">

</div>

</div>

</blockquote>

<div><br class="">

</div>

I think you & James have been doing a great job cleaning up the code in</div>

<div>master to be more in line with the upstream client, and I agree continuing</div>

<div>in that direction (bringing the master into acceptable shape) is the only</div>

<div>feasible way to move forward.</div>

<div><br class="">

</div>

<div>

<blockquote type="cite" class="">

<div class="">

<div class="">I have written such a script, based largely on sed, unifdef, and spatch<br class="">

(the latter from the coccinelle project).<br class="">

<br class="">

Not all changes that I want can be achieved with these tools so I also<br class="">

have a collection of patches to lustre which hopefully will be<br class="">

submitted in due course.  Many are the same sorts of cleanups that I've<br class="">

been submitting for a while.  Others are a bit different.  For example<br class="">

I add a lot more "#ifdef HAVE_SERVER_SUPPORT" preprocessor directives<br class="">

so that "unifdef" can strip out more server-only code.  I've also added<br class="">

some "#ifndef UPSTREAM_LINUX" directives to remove code that cannot go<br class="">

upstream, like the code for extracting a jobid from the environment.<br class="">

</div>

</div>

</blockquote>

<div><br class="">

</div>

I know Peng Tao had originally written such a script when he first pushed</div>

<div>the Lustre code into the upstream kernel, but I don't think it was ever</div>

<div>maintained after the initial push.  That would definitely make it more clear</div>

<div>what deltas exist between master and the upstream kernel versions.<br class="">

<div class="">

<div class=""><br class="">

</div>

</div>

<blockquote type="cite" class="">

<div class="">

<div class="">You can see my current lustre tree at<br class="">

  <a href="https://github.com/neilbrown/lustre/tree/lustre-next" class="">https://github.com/neilbrown/lustre/tree/lustre-next</a><br class="">

This contains 119 patches on top of master, some of which have already<br class="">

been submitted to gerrit, others could be easily, others need a lot of<br class="">

work first. The last patch adds the "copy-client" script.<br class="">

</div>

</div>

</blockquote>

<div><br class="">

</div>

We are (finally) getting close to releasing 2.14.0 (definitely later than we had</div>

<div>hoped), and that will open up the gates again for the code cleanup patches</div>

<div>to land.  While there have definitely been some bugs introduced from those</div>

<div>patches, like you wrote above they eventually are found with proper testing,</div>

<div>and we benefit from removing many layers of cruft from the old code.</div>

<div>

<div class="">

<div class=""><br class="">

</div>

</div>

<blockquote type="cite" class="">

<div class="">

<div class="">In<br class="">

  <a href="https://github.com/neilbrown/linux/tree/lustre/lustre-next" class="">https://github.com/neilbrown/linux/tree/lustre/lustre-next</a><br class="">

you can find a very recent linux/master tree with two patches added.<br class="">

The first was auto-generated by the script from lustre/lustre-next.<br class="">

The second fixes some compile errors with fscrypt() (though it may<br class="">

actually break the code, I don't know).<br class="">

<br class="">

Note that the copied LNET code is TCP-only.  I left out the o2ib code<br class="">

as I don't think we need that for the first code drop.<br class="">

<br class="">

Apart from providing the code mentioned above, this little project has<br class="">

also helped me form a clearer picture of what needs to be done before<br class="">

upstream submission is worth attempting.  My list includes:<br class="">

<br class="">

- IPv6 support.  I have a lot of patches which add support for extended<br class="">

  NIDs which can store IPv6 addresses.  There is more to be done here<br class="">

  (and thanks to Andreas for his review which I haven't responded to<br class="">

  yet).<br class="">

  One awkwardness in the conversion that I haven't addressed at all yet<br class="">

  is ptlrpc/nodemap_range.c.  This allows a range of IP address to have<br class="">

  a particular nodemap.  This might make sense for IPv4, but makes<br class="">

  somewhat less sense for IPv6.<br class="">

  I'm storing IPv6 addresses in network-byte-order (which is the normal<br class="">

  way to store addresses), and that make ranges particularly cumbersome.<br class="">

  One idea is to use netmasks rather than ranges to define subsets, for<br class="">

  IPv6 at least.<br class="">

</div>

</div>

</blockquote>

<div><br class="">

</div>

Could you please file a ticket in Jira to track that issue.  Definitely nodemaps</div>

<div>have become important at some sites for management of multiple clients,</div>

<div>so we don't want to lose that functionality.</div>

<div><br class="">

<blockquote type="cite" class="">

<div class="">

<div class="">- fscrypt.  There is a comment in lustre-core.m4 which suggests that<br class="">

   lustre does not support file name encryption, and so cannot use<br class="">

   the fscrypt support in Linux.  If I understand that correctly, then<br class="">

   this needs to be rectified.<br class="">

</div>

</div>

</blockquote>

<div><br class="">

</div>

This is planned to be fixed in 2.15, it just wasn't ready in time for the feature</div>

<div>cutoff for 2.14.  The 2.14 release only has file data encryption.</div>

<div><br class="">

<blockquote type="cite" class="">

<div class="">

<div class="">- I want to deprecate libcfs as there isn't anywhere convenient to put<br class="">

  it - as a whole - in the linux kernel. It currently contains a<br class="">

  mixture of things:<br class="">

  = back-compat code, which Linux obviously doesn't needed<br class="">

<br class="">

  - workitem.c and hash.c which are being replaced by<br class="">

    workqueue and rhashables<br class="">

<br class="">

  - cpt code, which is largely used by lnet, so can go in lnet/lnet<br class="">

<br class="">

  - heap tree, which can go in lustre/ptlrpc as it is only used by nrs.<br class="">

<br class="">

  - fail.c error injection.  It would be nice to convert lustre to<br class="">

    use the error-injection in linux/lib.  This is a completely<br class="">

    different user-space interface management.<br class="">

    Is the current interface seen as a "stable api", or is it only<br class="">

    used by lustre/tests (which can obviously be changed).<br class="">

</div>

</div>

</blockquote>

<div><br class="">

</div>

There aren't (or shouldn't be) any users of fail_loc outside of the test</div>

<div>framework.  That said, we do need some kind of interoperability for</div>

<div>testing new/old clients/servers together, so there can't be a wholesale</div>

<div>removal of this functionality.<br class="">

<div class="">

<div class=""><br class="">

</div>

</div>

<blockquote type="cite" class="">

<div class="">

<div class="">  - tracefile/debug.  I think this should be changed to use<br class="">

    tracepoints. I believe the capabilities are similar, though the<br class="">

    details are different.  I think I've raised this before and had an<br class="">

    explanation of why people aren't keen.  Maybe it is time to revisit<br class="">

    that.<br class="">

    LU-8980 was closed "Won't Fix" with no explanatory comment...<br class="">

</div>

</div>

</blockquote>

<div><br class="">

</div>

I think the issue here was that the patches to wrap CDEBUG() with</div>

<div>tracepoints was really unwieldy.  On the flip side, there are a *lot* of</div>

<div>CDEBUG() messages, and converting them all to tracepoints would</div>

<div>be lots of work, since each tracepoint is essentially "hand rolled".</div>

<div>Also, enabling/disabling CDEBUG is easy for users to do for collecting</div>

<div>debug logs to analyze a problem, while the same can't be said for</div>

<div>tracepoints.</div>

<div><br class="">

</div>

<div>IMHO, I don't think there is a requirement to remove CDEBUG() to go</div>

<div>upstream.  Lots of other kernel components have similar debug macros</div>

<div>to enable/disable messages in the kernel.  The big difference is the ring</div>

<div>buffer for tracefile.  If we could use some other pre-existing ring buffer</div>

<div>for that (which was under development at some point, but I don't know</div>

<div>if it ever made it into the kernel) then we could drop the parts that I</div>

<div>think raise objections.</div>

<div><br class="">

<blockquote type="cite" class="">

<div class="">

<div class="">  - libcfs_string - I've converted nearly all of this to use<br class="">

    functionality already in Linux.  Only cfs_str2mask() remains.<br class="">

</div>

</div>

</blockquote>

<div><br class="">

</div>

That is mostly part of CDEBUG/tracefile and could be moved there.</div>

<div><br class="">

<blockquote type="cite" class="">

<div class="">

<div class="">  - adler32 - hopefully this can be copied to crypto/ in linux.<br class="">

</div>

</div>

</blockquote>

<div><br class="">

</div>

Yes, the code is already in the kernel, it just isn't available via CryptoAPI.</div>

<div><br class="">

<blockquote type="cite" class="">

<div class="">

<div class="">- move stuff out of /proc<br class="">

   I understand that part of this is statistics, which don't fit in<br class="">

   debugfs as that is root-only, and don't fit in /sys at that is<br class="">

   one-value-per-file.<br class="">

   Firstly I'd like to resolve everything that is not stats.<br class="">

   James has a netlink proposal for stats.<br class="">

   NFS uses /proc/self/mountstats to present stats info.<br class="">

</div>

</div>

</blockquote>

<div><br class="">

</div>

IMHO, each filesystem/subsystem having its own proc-like virtual</div>

<div>filesystem to export stats/tunables from the kernel is not a win over just</div>

<div>using /proc (probably a net loss because each one implements the same</div>

<div>boilerplate code to "avoid using /proc"), but that seems to have been what</div>

<div>happened in the end.</div>

<div><br class="">

</div>

<div>

<blockquote type="cite" class="">

<div class="">

<div class="">I plan to create some jira issues to cover:<br class="">

 - moving code out of libcfs<br class="">

 - fault injection<br class="">

 - adding more HAVE_SERVER_SUPPORT and UPSTREAM_LINUX directives<br class="">

<br class="">

and start submitting some of the patches I have, and work on some more.<br class="">

<br class="">

I'll probably keep my existing linux-lustre tree (based on<br class="">

drivers/staging) up to date, as there is still valuable stuff in there<br class="">

to be ported across, and being able to 'diff' the two (with filtering)<br class="">

helps.<br class="">

</div>

</div>

</blockquote>

<br class="">

</div>

<div>Yes, keeping track of the deltas is definitely interesting for both tracking what may</div>

<div>still be needed to port to master, as well as the off chance that there may be bugs</div>

<div>fixed in the staging tree that weren't in master.</div>

<br class="">

<div class="">

<div dir="auto" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class="">

<div dir="auto" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class="">

<div dir="auto" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class="">

<div dir="auto" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class="">

<div dir="auto" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class="">

<div>Cheers, Andreas</div>

<div>--</div>

<div>Andreas Dilger</div>

<div>Principal Lustre Architect</div>

<div>Whamcloud</div>

<div><br class="">

</div>

<div><br class="">

</div>

<div><br class="">

</div>

</div>

</div>

</div>

</div>

</div>

<br class="Apple-interchange-newline">

<br class="Apple-interchange-newline">

</div>

<br class="">

</body>

</html>