[lustre-discuss] NRS TBF by UID and congestion
Moreno Diego (ID SIS)
diego.moreno at id.ethz.ch
Thu Oct 14 12:33:08 PDT 2021
Hi Lustre friends,
I'm wondering if anyone has experience setting NRS TBF (by UID) on the OSTs (ost_io and ost services) in order to avoid congestion of the filesystem's IOPS or bandwidth. All my attempts over the last few months have failed miserably: under high load, the result doesn't look like QoS at all. Once the system is heavily loaded, not even the TBF UID policy saves us from slow response times for every user. So far I have only tried setting it by UID, so that every user gets a fair share of the bandwidth. I tried different rate values for the default rule (5'000, 1'000 or 500). We run Lustre 2.12 on our cluster.
Maybe there is some other setting that needs throttling? I see a parameter, /sys/module/ptlrpc/parameters/tbf_rate, set to 10'000, which I could not find documented. Is there anything I'm missing about this feature?
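
To make the per-UID rate values above concrete, here is a minimal Python sketch of a generic token bucket. It is not Lustre's actual TBF code: the class `TokenBucket`, the helpers `admitted` and `aggregate`, the bucket depth of 3 and the offered loads are all hypothetical, chosen only for illustration. In this simplified model each UID is capped individually, so the total admitted load still scales with the number of active UIDs.

```python
class TokenBucket:
    """Generic token bucket (illustrative model, not Lustre's implementation)."""

    def __init__(self, rate, depth=3.0):
        self.rate = rate        # tokens (requests) added per second
        self.depth = depth      # hypothetical burst capacity
        self.tokens = depth     # start with a full bucket
        self.last = 0.0         # time of the previous request, in seconds

    def allow(self, now):
        # Refill for the elapsed time, never exceeding the bucket depth.
        elapsed = now - self.last
        self.last = now
        self.tokens = min(self.depth, self.tokens + elapsed * self.rate)
        # One token pays for one request.
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def admitted(rate, offered_per_sec, seconds=1):
    """Requests one UID gets through when sending evenly spaced requests."""
    bucket = TokenBucket(rate)
    total = offered_per_sec * seconds
    return sum(bucket.allow(i / offered_per_sec) for i in range(total))


def aggregate(active_uids, rate, offered_per_sec):
    """Total admitted requests when every UID has its own independent bucket."""
    return active_uids * admitted(rate, offered_per_sec)
```

For example, `admitted(500, 2000)` stays close to 500 for a single UID, while `aggregate(40, 500, 2000)` is 40 times that, whether or not the servers can sustain the total.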