[lustre-discuss] Understanding MDT getxattr stats

Kirk, Benjamin (JSC-EG311) benjamin.kirk at nasa.gov
Tue Sep 25 14:01:50 PDT 2018


Hi all,

We’re using jobstats under SLURM and have pulled together a tool that integrates SLURM job info with Lustre OST/MDT jobstats.  The idea is to correlate filesystem use patterns with particular applications, as targets for refactoring.
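
For context, the collection side of that tool boils down to something like the sketch below.  This is only a minimal illustration (not our actual code); it assumes jobid_var is set to SLURM_JOB_ID, a single MDT, and that it runs on the MDS with PyYAML available to parse the job_stats output:

    #!/usr/bin/env python
    # Minimal sketch (not our exact tool): dump per-job MDT getxattr counts
    # from Lustre jobstats.  Assumes this runs on the MDS, a single MDT, and
    # jobid_var=SLURM_JOB_ID so job_id lines up with SLURM job numbers.
    import subprocess
    import yaml

    def mdt_job_stats():
        # "lctl get_param -n mdt.*.job_stats" emits the jobstats as YAML.
        out = subprocess.check_output(["lctl", "get_param", "-n", "mdt.*.job_stats"])
        return yaml.safe_load(out) or {}

    def getxattr_by_job():
        # Map job_id -> cumulative getxattr sample count.
        counts = {}
        for entry in mdt_job_stats().get("job_stats") or []:
            job = str(entry.get("job_id"))
            counts[job] = counts.get(job, 0) + entry.get("getxattr", {}).get("samples", 0)
        return counts

    if __name__ == "__main__":
        for job, n in sorted(getxattr_by_job().items(), key=lambda kv: -kv[1]):
            print("%-12s getxattr=%d" % (job, n))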

In doing so, I’m seeing that some applications really trigger getxattr on the MDT, and others do not.  A particularly egregious example is below:  360 cores, ~10s of GB of output, ~6,500 files, but 16,608,476 calls to getxattr over a 4-hour runtime.  And this is a nominally compute-bound problem, so that I/O is likely compressed into small windows of time.

The system is CentOS 7.5 / Lustre 2.10.5 / zfs-0.7.9, with a single MDT and 12 OSSes with 2 OSTs each.  The default stripe count is 4.

A couple of questions:

1) Should I care about this?  We do see sporadic MDT slowness under ZFS, but that doesn’t seem rare.  I’m looking for a good way to trace it to specific jobs / use cases (e.g. something along the lines of the sketch after these questions).
2) What types of operations might be triggering this much getxattr usage on a moderate number of files (e.g. what to watch for in the refactoring process…)?
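
For question 1, what I have in mind is something along the lines of the sketch below: periodically sample the MDT jobstats and report the per-interval getxattr delta for each job, so bursts can be tied back to a job and a time window.  Again, this is just an illustration under the same assumptions as above, and the sampling interval is arbitrary:

    #!/usr/bin/env python
    # Sketch only: sample the MDT jobstats every INTERVAL seconds and print the
    # per-job getxattr delta, to tie bursts of getxattr traffic back to jobs.
    # Same assumptions as above (runs on the MDS, single MDT, SLURM_JOB_ID).
    import subprocess
    import time
    import yaml

    INTERVAL = 60  # seconds between samples; an arbitrary choice for illustration

    def getxattr_counts():
        out = subprocess.check_output(["lctl", "get_param", "-n", "mdt.*.job_stats"])
        doc = yaml.safe_load(out) or {}
        return {str(e.get("job_id")): e.get("getxattr", {}).get("samples", 0)
                for e in doc.get("job_stats") or []}

    prev = getxattr_counts()
    while True:
        time.sleep(INTERVAL)
        cur = getxattr_counts()
        for job, n in cur.items():
            delta = n - prev.get(job, 0)
            if delta > 0:
                print("%s  job=%s  +%d getxattr in the last %ds"
                      % (time.strftime("%H:%M:%S"), job, delta, INTERVAL))
        prev = cur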

Thanks,

-Ben

--------------------------
….
TRES                   : cpu=360,node=30,billing=360
RunTime                : 04:59:14
GroupId                : eg3(3000)
ExitCode               : 0:0
MDT:rename             : 373
MDT:snapshot_time      : 2018-09-21 08:36:29
MDT:setattr            : 444
MDT:mkdir              : 361
MDT:getattr            : 1570
MDT:getxattr           : 16608476
MDT:mknod              : 265
MDT:rmdir              : 1
MDT:samedir_rename     : 373
MDT:close              : 6331
MDT:unlink             : 113
MDT:open               : 6345
OST0009:write_bytes    : 3.46 GB
OST0008:write_bytes    : 3.11 GB
OST0001:write_bytes    : 1.01 GB
OST0000:write_bytes    : 396.19 MB
OST0005:read_bytes     : 8.19 KB
OST0005:write_bytes    : 2.38 GB
OST0005:setattr        : 1
OST0004:write_bytes    : 790.65 MB
OST0007:write_bytes    : 3.02 GB
OST0006:write_bytes    : 817.14 MB
OST0016:write_bytes    : 4.57 GB
OST0017:write_bytes    : 5.15 GB
OST0017:setattr        : 1
OST0014:write_bytes    : 8.8 GB
OST0015:write_bytes    : 1.37 GB
OST0012:write_bytes    : 7 GB
OST0012:setattr        : 1
OST0013:read_bytes     : 8.39 MB
OST0013:write_bytes    : 8.4 GB
OST0013:setattr        : 1
OST0010:write_bytes    : 1.98 GB
OST0011:read_bytes     : 27.28 MB
OST0011:write_bytes    : 9.42 GB
OST000c:read_bytes     : 131.07 KB
OST000c:write_bytes    : 5.83 GB
OST000c:setattr        : 2
OST000b:read_bytes     : 28.12 MB
OST000b:write_bytes    : 4.23 GB
OST000e:read_bytes     : 8.02 MB
OST000e:write_bytes    : 7.48 GB
OST000e:setattr        : 1
OST000d:write_bytes    : 1.21 GB
OST000f:write_bytes    : 2.88 GB

