[lustre-devel] lctl debug_kernel: usability improvements?

Bertschinger, Thomas Andrew Hjorth bertschinger at lanl.gov
Mon Apr 17 07:55:51 PDT 2023


After looking into this more, I see there is a fair challenge associated with this idea. Because each CPU (and execution context) has its own distinct buffer for messages (struct cfs_trace_cpu_data), the messages are not chronologically sorted in kernel memory. Instead, they are written to a regular file in CPU order and then sorted chronologically in userspace prior to printing.
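To make the sorting step concrete, here is a minimal sketch of merging per-CPU message streams into one chronological stream. It is not how lctl implements the sort; the Record type and its fields are hypothetical, and it assumes each per-CPU stream is already in timestamp order on its own.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple


class Record(NamedTuple):
    timestamp_us: int  # hypothetical field: time the message was logged
    cpu: int           # hypothetical field: CPU whose buffer held the message
    text: str


def merge_per_cpu(buffers: Iterable[Iterable[Record]]) -> Iterator[Record]:
    """Lazily merge per-CPU record streams into one chronological stream.

    Assumes every individual stream is already sorted by timestamp.
    """
    return heapq.merge(*buffers, key=lambda rec: rec.timestamp_us)
```

A k-way merge like this only needs the head of each stream at any moment, which is the kind of work a device would have to do in kernel space on every read if the per-CPU buffers were kept.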

Implementing a "/dev/lmsg" device would be challenging with the existing data structures because sorting would have to happen in kernel space as the device responds to reads.

I found this issue: LU-14428 "Convert tracefile to use ring_buffer from linux". It does not look to be completed, seeing as a ring_buffer is not currently in use here -- but once it is completed, implementing "/dev/lmsg" with the same interface as "/dev/kmsg" would be much simpler. (This assumes I understand correctly that the proposal is a single global ring buffer. Let me know if I am mistaken and the proposal is a set of per-CPU ring buffers, because then the sorting problem is not avoided.)

I reported a new issue, LU-16746 "Convert tracefile to export debug logs via character device", for this idea. This can be worked on after LU-14428 is completed. If I can be of assistance on LU-14428 by helping with any sub-tasks, let me know. I am interested in helping with this area of Lustre.

Thanks,

Thomas Bertschinger

________________________________
From: lustre-devel <lustre-devel-bounces at lists.lustre.org> on behalf of Bertschinger, Thomas Andrew Hjorth via lustre-devel <lustre-devel at lists.lustre.org>
Sent: Monday, April 10, 2023 8:19 AM
To: lustre-devel at lists.lustre.org
Subject: [lustre-devel] lctl debug_kernel: usability improvements?


Hello,

Recently when using "lctl dk" I have found myself wanting some of the "quality of life" features that exist in the similar Linux tool dmesg. In particular, having the ability to "follow" the debug log like "dmesg -w" would be very handy IMO.

I've attempted to implement this in userspace with the existing tooling (using "lctl debug_daemon" to write the encoded log to a file, and "lctl debug_file" to decode it) but have run into challenges. I first tried creating a FIFO and having debug_daemon write to it and debug_file read from it. Unfortunately this fails because the kernel thread that writes to this file (tracefiled in libcfs/libcfs/tracefile.c) repeatedly opens and closes the file, and after the first close, reading the FIFO fails.
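The exact failure is not diagnosed above. One FIFO property that may be relevant (an assumption, not a confirmed cause) is that a reader sees end-of-file as soon as the last writer closes, so a reader that stops at EOF never sees what a later open/write/close cycle delivers. A small Linux-only sketch of that behavior, with a hypothetical function name and no Lustre involved:

```python
import os
import tempfile


def fifo_eof_demo():
    """Return the three reads a FIFO reader sees across a writer close/reopen."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.fifo")
        os.mkfifo(path)
        # Non-blocking so that opening the read end does not wait for a writer.
        rfd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
        try:
            wfd = os.open(path, os.O_WRONLY)
            os.write(wfd, b"one\n")
            os.close(wfd)                # the only writer goes away
            first = os.read(rfd, 100)    # data written before the close
            second = os.read(rfd, 100)   # empty: EOF, no writers remain
            wfd = os.open(path, os.O_WRONLY)
            os.write(wfd, b"two\n")
            os.close(wfd)
            third = os.read(rfd, 100)    # data flows again after the reopen
        finally:
            os.close(rfd)
        return first, second, third
```

If this is what is happening, the reading side would need to tolerate intermediate EOFs (or keep a write end open itself) instead of treating the first EOF as the end of the log.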

My next idea was to have debug_daemon write to a regular file and debug_file read it like "tail -f". This should work in theory but has disadvantages: the user must remember to delete the file when done (the tool could do this but not if it exits uncleanly), and also entries could be missed if the file is deleted while the tool is still running.
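A minimal sketch of the "tail -f" style reader described here, assuming the file only ever grows. The function name and parameters are hypothetical; it yields raw bytes because the debug_daemon output is encoded, and decoding would still be left to something like debug_file.

```python
import time
from typing import Iterator, Optional


def follow(path: str, poll_interval: float = 0.5,
           max_idle_polls: Optional[int] = None) -> Iterator[bytes]:
    """Yield chunks appended to `path`, polling for growth like "tail -f".

    Stops after `max_idle_polls` consecutive polls that find no new data;
    None means keep polling forever.
    """
    idle = 0
    with open(path, "rb") as f:
        while max_idle_polls is None or idle < max_idle_polls:
            chunk = f.read(65536)
            if chunk:
                idle = 0
                yield chunk
            else:
                # At end of file: wait, then try again for appended data.
                idle += 1
                time.sleep(poll_interval)
```

This sketch does not address the cleanup problem: if the process is killed, nothing removes the file, which is the disadvantage noted above.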

I think the cleanest solution is to rework the debug_kernel interface to be like Linux's /dev/kmsg. A character device, perhaps named /dev/lmsg, could be created that outputs the buffer contents when read. Implementing "follow" would be trivial with this interface. The existing userspace tools could also easily be updated to use this interface, and it would bring other benefits, for example "lctl dk" not needing to copy the message buffer to a tmp file. The disadvantage here is that this could be a significant kernel-side refactor.
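For reference, this is roughly what a userspace consumer of /dev/kmsg has to do with each record, whose header has the form "priority,sequence,timestamp_usec,flags" followed by ";" and the message text. /dev/lmsg does not exist; whether it would use this exact record format is an assumption, and the type and function names below are hypothetical.

```python
from typing import NamedTuple


class KmsgRecord(NamedTuple):
    facility: int
    level: int
    seq: int
    timestamp_us: int
    flags: str
    message: str


def parse_kmsg_record(record: str) -> KmsgRecord:
    """Parse one /dev/kmsg record of the form "pri,seq,usec,flags[,...];text"."""
    header, _, body = record.partition(";")
    fields = header.split(",")
    priority = int(fields[0])
    # Only the first line is the message; any following lines that start
    # with a space carry KEY=value metadata and are ignored here.
    message = body.split("\n", 1)[0]
    return KmsgRecord(
        facility=priority >> 3,   # syslog facility is the upper bits
        level=priority & 7,       # syslog level is the low three bits
        seq=int(fields[1]),
        timestamp_us=int(fields[2]),
        flags=fields[3],
        message=message,
    )
```

On Linux each read() of /dev/kmsg returns a single record, so a "follow" mode is essentially a loop that reads and parses one record at a time.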

I feel the ability to follow Lustre's debug log would be useful to both sysadmins and developers but want to get some other input. Would this be valuable to anyone? If this would be useful -- and feasible -- I would be happy to submit a JIRA ticket and work on a patch but wanted to get some more opinions. I'm not very familiar with the kernel side code yet so I'm not sure how complicated this would be.

- Thomas Bertschinger