[lustre-discuss] Contents of a specific directory cannot be seen, but the directory can be written into.

Kurt Strosahl strosahl at jlab.org
Fri Jan 31 11:17:10 PST 2020


Good afternoon,

I've come across a rather vexing problem in one of my Lustre file systems: a directory whose contents can't be listed, but into which writes still succeed.  Attempting to ls that directory hangs, but lctl getstripe still works.

After an attempt to list the directory, the node logs the following, and the messages still appear even if the ls is cancelled.
[4498716.485619] Lustre: 18859:0:(client.c:2116:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1580497199/real 1580497199]  req@ffff9129f7c78c00 x1652557100931648/t0(0) o101->lustre19-MDT0000-mdc-ffff91091289f000@172.17.0.36@o2ib:12/10 lens 696/33584 e 24 to 1 dl 1580497800 ref 1 fl Rpc:X/2/ffffffff rc -11/-1
[4498716.485642] Lustre: lustre19-MDT0000-mdc-ffff91091289f000: Connection to lustre19-MDT0000 (at 172.17.0.36@o2ib) was lost; in progress operations using this service will wait for recovery to complete
[4498716.486114] Lustre: lustre19-MDT0000-mdc-ffff91091289f000: Connection restored to 172.17.0.36@o2ib (at 172.17.0.36@o2ib)
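
For anyone reading along, the timing fields in the first message can be pulled out with a short script. This is only a minimal sketch: the function name parse_timeout and the regular expression are illustrative assumptions based on the field layout visible in that one line, not a Lustre tool or API. It accepts both "@" and the " at " form that mail archives substitute for it. For the line above, the gap between "sent" and "dl" (the deadline) works out to 601 seconds, and the request is opcode 101, which as far as I can tell is LDLM_ENQUEUE (a lock request to the MDT).

```python
import re

# Sample copied from the client log line above.
LINE = (
    "Lustre: 18859:0:(client.c:2116:ptlrpc_expire_one_request()) @@@ Request sent "
    "has timed out for slow reply: [sent 1580497199/real 1580497199]  "
    "req@ffff9129f7c78c00 x1652557100931648/t0(0) "
    "o101->lustre19-MDT0000-mdc-ffff91091289f000@172.17.0.36@o2ib:12/10 "
    "lens 696/33584 e 24 to 1 dl 1580497800 ref 1 fl Rpc:X/2/ffffffff rc -11/-1"
)

# Assumed layout (inferred from the sample line only):
#   [sent <epoch>/real <epoch>] ... o<opcode>-><target>@<nid> ... dl <epoch>
PATTERN = re.compile(
    r"\[sent (?P<sent>\d+)/real \d+\].*?"
    r"\bo(?P<opcode>\d+)->(?P<target>\S+?)(?:@| at )"
    r".*?\bdl (?P<deadline>\d+)"
)


def parse_timeout(line):
    """Return the timing fields of a 'timed out for slow reply' line, or None."""
    match = PATTERN.search(line)
    if match is None:
        return None
    sent = int(match.group("sent"))
    deadline = int(match.group("deadline"))
    return {
        "opcode": int(match.group("opcode")),
        "target": match.group("target"),
        "sent": sent,
        "deadline": deadline,
        # Seconds between sending the request and its deadline.
        "waited_seconds": deadline - sent,
    }


if __name__ == "__main__":
    print(parse_timeout(LINE))
```

Running it over all such lines in dmesg would show whether every stalled request targets the same MDT and uses the same opcode.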

Since the issue started, more files have been written into the directory, but none of them can be read.

Further, since the issue began, the metadata server has been generating lustre-logs a few times a day.

I'm running Lustre 2.12.1 with ZFS on the metadata server (and the OSTs) on CentOS 7.6.

w/r,

Kurt J. Strosahl
System Administrator: Lustre, HPC
Scientific Computing Group, Thomas Jefferson National Accelerator Facility