[lustre-devel] lctl debug_kernel: usability improvements?

Bertschinger, Thomas Andrew Hjorth bertschinger at lanl.gov
Mon Apr 10 07:19:36 PDT 2023


Hello,

Recently, when using "lctl dk", I have found myself wanting some of the "quality of life" features that exist in the similar Linux tool dmesg. In particular, the ability to "follow" the debug log, like "dmesg -w", would be very handy IMO.

I've attempted to implement this in userspace with the existing tooling (using "lctl debug_daemon" to write the encoded log to a file, and "lctl debug_file" to decode it) but have run into challenges. I first tried creating a FIFO, with debug_daemon writing to it and debug_file reading from it. Unfortunately this fails: the kernel thread that writes to this file (tracefiled in libcfs/libcfs/tracefile.c) repeatedly opens and closes it, and once it has closed the file for the first time, reading from the FIFO fails.
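To illustrate the FIFO behaviour, here is a minimal Python sketch. It assumes the failure comes from ordinary POSIX FIFO semantics, where a reader sees end-of-file as soon as the last writer closes; I have not confirmed that this is exactly how debug_file fails. The function name run_demo and the chunk contents are made up for the example. A writer thread stands in for tracefiled by opening, writing and closing once per chunk, and the reader has to reopen the FIFO after every end-of-file to see the next chunk.

```python
import os
import tempfile
import threading

def run_demo(chunks):
    """Return what the reader sees in each open/read-to-EOF session."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "demo.fifo")
        os.mkfifo(path)
        reader_done = threading.Semaphore(0)

        def writer():
            # Stand-in for a writer that opens and closes the file per write.
            for chunk in chunks:
                with open(path, "wb") as f:  # blocks until a reader opens
                    f.write(chunk)
                # Closing above delivers EOF to the reader. Wait until the
                # reader has finished its session before opening again.
                reader_done.acquire()

        t = threading.Thread(target=writer)
        t.start()
        sessions = []
        for _ in chunks:
            with open(path, "rb") as f:
                # read() returns once the writer has closed its end.
                sessions.append(f.read())
            reader_done.release()
        t.join()
        return sessions

if __name__ == "__main__":
    print(run_demo([b"first write\n", b"second write\n"]))
```

A reader that stops at the first end-of-file only ever gets the first chunk, which would match what I saw if debug_file reads to EOF once and exits.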

My next idea was to have debug_daemon write to a regular file and debug_file read it like "tail -f". This should work in theory but has disadvantages: the user must remember to delete the file when done (the tool could do this but not if it exits uncleanly), and also entries could be missed if the file is deleted while the tool is still running.
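A rough sketch of the "tail -f" style reader, in Python for brevity. The function name follow and its parameters are hypothetical, and it only yields the raw bytes appended to the file; decoding the records is left to the existing debug_file logic and is not shown.

```python
import time

def follow(path, poll_interval=0.5, max_idle_polls=None):
    """Yield chunks of bytes as they are appended to path, like tail -f.

    max_idle_polls bounds how many consecutive empty polls are tolerated
    before giving up; None means follow forever.
    """
    with open(path, "rb") as f:
        idle = 0
        while max_idle_polls is None or idle < max_idle_polls:
            chunk = f.read(65536)
            if chunk:
                idle = 0
                yield chunk
            else:
                # At the current end of file: wait for the writer to append.
                idle += 1
                time.sleep(poll_interval)
```

This does nothing about the cleanup problem: the file still has to be removed by someone, and a try/finally around the loop would not help if the tool is killed.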

I think the cleanest solution is to rework the debug_kernel interface to be like Linux's /dev/kmsg. A character device, perhaps named /dev/lmsg, could be created that outputs the buffer contents when read. Implementing "follow" would be trivial with this interface. The existing userspace tools could also easily be updated to use this interface, and it would bring other benefits, for example "lctl dk" would no longer need to copy the message buffer to a temporary file. The disadvantage is that this could be a significant kernel-side refactor.
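For a sense of how small the userspace side could be, here is a sketch of a follower for such a device. /dev/lmsg does not exist; the sketch assumes it would behave like /dev/kmsg, where each read() returns one record, blocks when there is nothing new, and fails with EPIPE if records were overwritten before being read. The names follow_device and handle_record are made up.

```python
import os

def follow_device(path, handle_record, max_records=None):
    """Read records from a kmsg-like device and pass each to handle_record.

    Returns the number of records handled. Stops at end-of-file, which a
    real kmsg-like device would not normally report.
    """
    fd = os.open(path, os.O_RDONLY)
    count = 0
    try:
        while max_records is None or count < max_records:
            try:
                record = os.read(fd, 8192)  # blocks until a record is ready
            except BrokenPipeError:
                # EPIPE: assumed to mean records were overwritten before we
                # read them, as with /dev/kmsg; the next read continues.
                continue
            if not record:
                break
            handle_record(record)
            count += 1
    finally:
        os.close(fd)
    return count
```

With this, "follow" is just follow_device("/dev/lmsg", print_decoded), where print_decoded is whatever the tool uses to format a record.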

I feel the ability to follow Lustre's debug log would be useful to both sysadmins and developers, but I want to get some other input first. Would this be valuable to anyone? If it would be useful, and feasible, I would be happy to submit a JIRA ticket and work on a patch. I'm not very familiar with the kernel-side code yet, so I'm not sure how complicated this would be.

- Thomas Bertschinger