[lustre-devel] Lustre upstream client TODO list

James Simmons jsimmons at infradead.org
Sun Feb 11 15:17:03 PST 2018


So I sent a patch upstream that laid out what most needs to be done for
the linux lustre client to leave staging. I placed the new text here for
ease of read so you don't have to go searching for it. Feed back is
welcomed. Hoepfully posting it will make it clear what needs to be done.

Currently all the work directed toward the lustre upstream client is tracked
at the following link:

https://jira.hpdd.intel.com/browse/LU-9679

Under this ticket you will see the following work items that need to be
addressed:

******************************************************************************
* libcfs cleanup
*
* https://jira.hpdd.intel.com/browse/LU-9859
*
* Track all the cleanups and simplification of the libcfs module. Remove
* functions the kernel provides. Possible intergrate some of the functionality
* into the kernel proper.
*
******************************************************************************

https://jira.hpdd.intel.com/browse/LU-100086

LNET_MINOR conflicts with USERIO_MINOR

------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-8130

Fix and simplify libcfs hash handling

------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-8703

The current way we handle SMP is wrong. Platforms like ARM and KNL can have
core and NUMA setups with things like NUMA nodes with no cores. We need to
handle such cases. This work also greatly simplified the lustre SMP code.

------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-9019

Replace libcfs time API with standard kernel APIs. Also migrate away from
jiffies. We found jiffies can vary on nodes which can lead to corner cases
that can break the file system due to nodes having inconsistent behavior.
So move to time64_t and ktime_t as much as possible.

******************************************************************************
* Proper IB support for ko2iblnd
******************************************************************************
https://jira.hpdd.intel.com/browse/LU-9179

Poor performance for the ko2iblnd driver. This is related to many of the
patches below that are missing from the linux client.
------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-9886

Crash in upstream kiblnd_handle_early_rxs()
------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-10394 / LU-10526 / LU-10089

Default to default to using MEM_REG
------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-10459

throttle tx based on queue depth
------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-9943

correct WR fast reg accounting
------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-10291

remove concurrent_sends tunable
------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-10213

calculate qp max_send_wrs properly
------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-9810

use less CQ entries for each connection
------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-10129 / LU-9180

rework map_on_demand behavior
------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-10129

query device capabilities
------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-10015

fix race at kiblnd_connect_peer
------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-9983

allow for discontiguous fragments
------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-9500

Don't Page Align remote_addr with FastReg
------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-9448

handle empty CPTs
------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-9507

Don't Assert On Reconnect with MultiQP
------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-9472

Fix FastReg map/unmap for MLX5
------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-9425

Turn on 2 sges by default
------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-8943

Enable Multiple OPA Endpoints between Nodes
------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-5718

multiple sges for work request
------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-9094

kill timedout txs from ibp_tx_queue
------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-9094

reconnect peer for REJ_INVALID_SERVICE_ID
------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-8752

Stop MLX5 triggering a dump_cqe
------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-8874

Move ko2iblnd to latest RDMA changes
------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-8875 / LU-8874

Change to new RDMA done callback mechanism

------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-9164 / LU-8874

Incorporate RDMA map/unamp API's into ko2iblnd

******************************************************************************
* sysfs/debugfs fixes
*
* https://jira.hpdd.intel.com/browse/LU-8066
*
* The original migration to sysfs was done in haste without properly working
* utilities to test the changes. This covers the work to restore the proper
* behavior. Huge project to make this right.
*
******************************************************************************

https://jira.hpdd.intel.com/browse/LU-9431

The function class_process_proc_param was used for our mass updates of proc
tunables. It didn't work with sysfs and it was just ugly so it was removed.
In the process the ability to mass update thousands of clients was lost. This
work restores this in a sane way.

------------------------------------------------------------------------------
https://jira.hpdd.intel.com/browse/LU-9091

One the major request of users is the ability to pass in parameters into a
sysfs file in various different units. For example we can set max_pages_per_rpc
but this can vary on platforms due to different platform sizes. So you can
set this like max_pages_per_rpc=16MiB. The original code to handle this written
before the string helpers were created so the code doesn't follow that format
but it would be easy to move to. Currently the string helpers does the reverse
of what we need, changing bytes to string. We need to change a string to bytes.

******************************************************************************
* Proper user land to kernel space interface for Lustre
*
* https://jira.hpdd.intel.com/browse/LU-9680
*
******************************************************************************

https://jira.hpdd.intel.com/browse/LU-8915

Don't use linux list structure as user land arguments for lnet selftest.
This code is pretty poor quality and really needs to be reworked.

------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-8834

The lustre ioctl LL_IOC_FUTIMES_3 is very generic. Need to either work with
other file systems with similar functionality and make a common syscall
interface or rework our server code to automagically do it for us.

------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-6202

Cleanup up ioctl handling. We have many obsolete ioctls. Also the way we do
ioctls can be changed over to netlink. This also has the benefit of working
better with HPC systems that do IO forwarding. Such systems don't like ioctls
very well.

------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-9667

More cleanups by making our utilities use sysfs instead of ioctls for LNet.
Also it has been requested to move the remaining ioctls to the netlink API.

******************************************************************************
* Misc
******************************************************************************

------------------------------------------------------------------------------
https://jira.hpdd.intel.com/browse/LU-9855

Clean up obdclass preprocessor code. One of the major eye sores is the various
pointer redirections and macros used by the obdclass. This makes the code very
difficult to understand. It was requested by the Al Viro to clean this up before
we leave staging.

------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-9633

Migrate to sphinx kernel-doc style comments. Add documents in Documentation.

------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-6142

Possible remaining coding style fix. Remove deadcode. Enforce kernel code
style. Other minor misc cleanups...

------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-8837

Separate client/server functionality. Functions only used by server can be
removed from client. Most of this has been done but we need a inspect of the
code to make sure.

------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-8964

Lustre client readahead/writeback control needs to better suit kernel providings.
Currently its being explored. We could end up replacing the CLIO read ahead
abstract with the kernel proper version.

------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-9862

Patch that landed for LU-7890 leads to static checker errors
------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-9868

dcache/namei fixes for lustre
------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-10467

use standard linux wait_events macros work by Neil Brown

------------------------------------------------------------------------------


More information about the lustre-devel mailing list