[lustre-discuss] IOPS monitoring
Dr. Detlev Conrad Mielczarek
detlev.conrad.mielczarek at hu-berlin.de
Mon Apr 13 10:40:46 PDT 2026
Hello Robert, All,
I don't know if this is the best way, but this is what we currently use at the Humboldt University:
- Prometheus
- The lustrefs_exporter from Whamcloud ( https://github.com/whamcloud/lustrefs-exporter )
- node exporter.
- Grafana:
There are a number of Grafana dashboards that can visualise the data; however, as they are old, quite a few fields need updating and/or adjusting to display the desired data.
I don't have a way of definitively verifying the numbers or reports, but the tools seem to "make sense" and work for our use case.
I'm sure a "grafana wizard" could use them for inspiration to create much better dashboards than I managed to get working.
On the plus side, I must say that the Grafana and Prometheus combination is fairly self-explanatory compared to some other monitoring setups.
It may, however, be that you need more granularity than this solution can offer. For example, I don't think the setup can attribute IOPS to specific jobs.
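For anyone who wants to cross-check what a dashboard panel shows, the same per-second rate can be pulled directly from the Prometheus HTTP API. The following is only a minimal sketch under assumptions: the server address (http://localhost:9090) and the counter name (example_lustre_ops_total) are placeholders I made up, not names taken from the lustrefs-exporter. The real counter names would have to be looked up on the exporter's /metrics page.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_query_url(base_url, promql):
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def parse_instant_vector(payload):
    """Turn a decoded /api/v1/query response into (labels, value) pairs."""
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"expected an instant vector, got {data['resultType']}")
    # Each sample is {"metric": {...labels...}, "value": [timestamp, "number"]}.
    return [(s["metric"], float(s["value"][1])) for s in data["result"]]


def query_ops_per_second(base_url, counter, window="5m"):
    """Per-instance operations per second, averaged over `window`.

    `counter` is assumed to be a monotonically increasing operation counter;
    its name is hypothetical and depends on the exporter in use.
    """
    promql = f"sum by (instance) (rate({counter}[{window}]))"
    with urlopen(build_query_url(base_url, promql), timeout=10) as resp:
        return parse_instant_vector(json.load(resp))


# Example call (needs a reachable Prometheus server; both names are placeholders):
#   for labels, ops in query_ops_per_second("http://localhost:9090",
#                                           "example_lustre_ops_total"):
#       print(labels.get("instance"), round(ops, 1))
```

This only reproduces what a Grafana panel built on rate() would plot, so it helps to confirm that a dashboard is wired to the intended counter. It does not independently verify the exporter's numbers.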
Kind regards
Detlev
(Sorry for not sending to the list the first time round - and I fixed a few typos along the way.)
--
Detlev Conrad Mielczarek - HPC Team
(deutsch, english, français)
Humboldt-Universität zu Berlin
Computer- und Medienservice
10099 Berlin, Germany
email: detlev.conrad.mielczarek at hu-berlin.de