[lustre-discuss] jobstats

Andrew Elwell andrew.elwell at gmail.com
Fri May 27 02:11:51 PDT 2022


Hi folks,

I've finally started to re-investigate pushing jobstats to our central
dashboards and realised there's a dearth of scripts / tooling to
actually gather the job_stats files and push them to $whatever. I have
seen the telegraf one, and the DDN fork of collectd seems to be
somewhat abandonware. Hence at this stage I'm back to rolling another
Python script to feed InfluxDB. Yes, I know all the cool kids are using
Prometheus, but I'm not one of them.
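
For the sake of discussion, here is a rough sketch of the sort of thing
I mean. It is not tested against a live system. It assumes the usual
YAML-ish job_stats layout (a "- job_id:" line, a "snapshot_time:" line,
then one "{ samples: ..., unit: ... }" line per operation) and turns it
into InfluxDB line protocol. The measurement name "lustre_jobstats", the
tag names and both function names are made up for illustration, and the
text is assumed to have been read already (e.g. from lctl get_param).

```python
import re

# Patterns for the assumed job_stats layout; adjust if your version differs.
JOB_RE = re.compile(r"^- job_id:\s+(\S+)")
SNAP_RE = re.compile(r"^\s+snapshot_time:\s+(\d+)")
OP_RE = re.compile(r"^\s+(\w+):\s+\{(.*)\}")


def parse_job_stats(text):
    """Parse job_stats text into a list of per-job dicts."""
    jobs = []
    for line in text.splitlines():
        m = JOB_RE.match(line)
        if m:
            jobs.append({"job_id": m.group(1), "snapshot_time": None, "ops": {}})
            continue
        if not jobs:
            continue  # skip the "job_stats:" header and anything before a job
        m = SNAP_RE.match(line)
        if m:
            # Only the whole-seconds part is kept.
            jobs[-1]["snapshot_time"] = int(m.group(1))
            continue
        m = OP_RE.match(line)
        if m:
            fields = {}
            for pair in m.group(2).split(","):
                key, _, value = pair.partition(":")
                value = value.strip()
                if value.isdigit():  # drops non-numeric entries such as "unit"
                    fields[key.strip()] = int(value)
            jobs[-1]["ops"][m.group(1)] = fields
    return jobs


def escape_tag(value):
    """Escape the characters that line protocol treats specially in tags."""
    return re.sub(r"([,= ])", r"\\\1", value)


def to_line_protocol(jobs, measurement="lustre_jobstats"):
    """Emit one line per (job, operation), timestamped in nanoseconds."""
    lines = []
    for job in jobs:
        for op, fields in job["ops"].items():
            if not fields:
                continue
            tags = f"job_id={escape_tag(job['job_id'])},op={op}"
            body = ",".join(f"{k}={v}i" for k, v in fields.items())
            line = f"{measurement},{tags} {body}"
            if job["snapshot_time"] is not None:
                line += f" {job['snapshot_time'] * 10**9}"
            lines.append(line)
    return lines
```

The resulting lines could then be POSTed to InfluxDB's write endpoint or
handed to whichever client library you prefer; I've left that part out
as it depends on the local setup.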

However, while rummaging I came across LU-11407 (Improve stats data).
Andreas commented[1] that he was hoping to add start_time and
elapsed_time fields. Are these targeted at an upcoming release? The
ticket still shows 'open', and it's also referred to in LU-15826. Is
that likely to make a point release of 2.15, or will it be targeted at
the next major release? It might be handy, as it would save me
correlating with Slurm job start times, especially if the user job does
$other_stuff before actually hitting the disks.


Many thanks

Andrew




[1] https://jira.whamcloud.com/browse/LU-11407?focusedCommentId=234830&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-234830

