[lustre-discuss] Per-client I/O Operation Counters

Dilger, Andreas andreas.dilger at intel.com
Fri Jun 2 07:01:26 PDT 2017


On Jun 1, 2017, at 19:34, Russell Dekema <dekemar at umich.edu> wrote:
> 
> Greetings,
> 
> Is there a way, either on the Lustre clients or (preferably) OSSes, to
> determine how many I/O operations each Lustre client is performing
> against the filesystem?
> 
> I know several ways of finding the number of *bytes* read or written
> by a client (or even on a per-job basis with job_stats), but we
> suspect we have some clients overwhelming our filesystem with large
> numbers of small I/O requests, and I don't know how to find per-client
> (or per-job) I/O operation counters.

On each OSS, per-client statistics are available via "lctl get_param obdfilter.*.exports.*.stats"
(/proc/fs/lustre/obdfilter/$fsname-OST*/exports/{client NID}/stats). Of particular interest are the "read_bytes" and "write_bytes" stats:

$ lctl get_param obdfilter.*.exports.*.stats | egrep "=|_bytes"
obdfilter.testfs-OST0000.exports.0@lo.stats=
write_bytes               6 samples [bytes] 4096 1048576 1069056
obdfilter.testfs-OST0001.exports.0@lo.stats=
read_bytes                1 samples [bytes] 4096 4096 4096
write_bytes               122 samples [bytes] 6 202 928
obdfilter.testfs-OST0002.exports.0@lo.stats=
obdfilter.testfs-OST0003.exports.0@lo.stats=
write_bytes               122 samples [bytes] 6 6 732

This shows 1 read RPC and 250 write RPCs (6 + 122 + 122 across the OSTs), along with the min, max, and sum of bytes for each stat on a per-client, per-OST basis. You would be most interested in the number of RPCs (the first numeric column, before "samples"), but you can also work out the mean RPC size for each client by dividing the last column (sum) by the RPC count.
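
If you want to total these across OSTs rather than adding them up by hand, something like the following Python sketch could do it. It is only an illustration: the function name summarize() and the SAMPLE text are my own, and it assumes the stats lines have exactly the layout shown above (name, count, "samples", "[bytes]", min, max, sum) and that each header line ends in ".stats=" with the client NID following ".exports.". The sample uses the NID 0@lo.

```python
from collections import defaultdict

# Sample text in the same layout as the lctl output shown above.
SAMPLE = """\
obdfilter.testfs-OST0000.exports.0@lo.stats=
write_bytes               6 samples [bytes] 4096 1048576 1069056
obdfilter.testfs-OST0001.exports.0@lo.stats=
read_bytes                1 samples [bytes] 4096 4096 4096
write_bytes               122 samples [bytes] 6 202 928
obdfilter.testfs-OST0002.exports.0@lo.stats=
obdfilter.testfs-OST0003.exports.0@lo.stats=
write_bytes               122 samples [bytes] 6 6 732
"""

def summarize(text):
    """Sum read/write RPC counts and bytes per client NID across all OSTs.

    Returns {nid: {"read_bytes": [rpcs, bytes], "write_bytes": [rpcs, bytes]}}.
    """
    totals = defaultdict(lambda: {"read_bytes": [0, 0], "write_bytes": [0, 0]})
    nid = None
    for line in text.splitlines():
        line = line.strip()
        if line.endswith(".stats=") and ".exports." in line:
            # Header line: everything between ".exports." and ".stats=" is the NID.
            nid = line[:-len(".stats=")].split(".exports.", 1)[1]
            continue
        fields = line.split()
        if nid is None or len(fields) < 7:
            continue
        if fields[0] not in ("read_bytes", "write_bytes"):
            continue
        # fields: name, count, "samples", "[bytes]", min, max, sum
        entry = totals[nid][fields[0]]
        entry[0] += int(fields[1])   # number of RPCs
        entry[1] += int(fields[6])   # total bytes
    return dict(totals)

def report(totals):
    """Print one line per client with RPC counts and mean RPC sizes."""
    for nid, stats in sorted(totals.items()):
        parts = [nid]
        for name in ("read_bytes", "write_bytes"):
            rpcs, nbytes = stats[name]
            mean = nbytes / rpcs if rpcs else 0.0
            parts.append(f"{name}: {rpcs} RPCs, mean {mean:.1f} bytes")
        print("  ".join(parts))

report(summarize(SAMPLE))
```

To run it against a live OSS you would replace SAMPLE with the captured output of the lctl command above (for example, read from a file or from sys.stdin). Clients with a high RPC count and a small mean size are the likely sources of small I/O.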

Cheers, Andreas
--
Andreas Dilger
Lustre Principal Architect
Intel Corporation