[lustre-discuss] Files hanging on lustre clients

Kurt Strosahl strosahl at jlab.org
Tue Mar 31 12:43:52 PDT 2020


I can't tell; any command I run against the files in question hangs indefinitely.  It does seem very suspicious, though.

________________________________
From: Mohr Jr, Richard Frank <rmohr at utk.edu>
Sent: Tuesday, March 31, 2020 3:41 PM
To: Kurt Strosahl <strosahl at jlab.org>
Cc: lustre-discuss at lists.lustre.org <lustre-discuss at lists.lustre.org>; sciadm at jlab.org <sciadm at jlab.org>
Subject: [EXTERNAL] Re: [lustre-discuss] Files hanging on lustre clients



> On Mar 31, 2020, at 2:36 PM, Kurt Strosahl <strosahl at jlab.org> wrote:
>
> an strace on an ls command run against some of these files produced the following:
> getxattr("/volatile/halld/home/haoli/RunPeriod-2017-01/analysis/ver36_Mar27/log/030408/stdout.030408_124.out", "system.posix_acl_default", NULL, 0) = -1 ENODATA (No data available)
> lstat("/volatile/halld/home/haoli/RunPeriod-2017-01/analysis/ver36_Mar27/log/030408/stderr.030408_118.err", {st_mode=S_IFREG|0644, st_size=16979, ...}) = 0
> getxattr("/volatile/halld/home/haoli/RunPeriod-2017-01/analysis/ver36_Mar27/log/030408/stderr.030408_118.err", "system.posix_acl_access", NULL, 0) = -1 ENODATA (No data available)
> getxattr("/volatile/halld/home/haoli/RunPeriod-2017-01/analysis/ver36_Mar27/log/030408/stderr.030408_118.err", "system.posix_acl_default", NULL, 0) = -1 ENODATA (No data available)
> lstat("/volatile/halld/home/haoli/RunPeriod-2017-01/analysis/ver36_Mar27/log/030408/stdout.030408_000.out",

<snip>

> Lustre: lustre19-OST0028-osc-ffff88105fecd000: Connection to lustre19-OST0028 (at 172.17.0.99@o2ib) was lost; in progress operations using this service will wait for recovery to complete
> Lustre: lustre19-OST0028-osc-ffff88105fecd000: Connection restored to lustre19-OST0028 (at 172.17.0.99@o2ib)

Of the files listed in the strace above that gave errors, are all those files striped across OST0028?
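
One way to check this, as a rough sketch rather than a tested procedure: `lfs getstripe` prints an `obdidx` column giving the OST index of each stripe object. The digits in a target name are hexadecimal, so OST0028 corresponds to obdidx 40. The Python below parses that output. The function names and the script name are illustrative, and it assumes the classic tabular `lfs getstripe` output format. Layout queries are normally answered by the MDS, so they may still return when an OST is unresponsive, but that is not guaranteed.

```python
import re
import subprocess
import sys


def ost_name_to_index(ost_name: str) -> int:
    """Convert a target name such as 'OST0028' or 'lustre19-OST0028' to its index.

    The four digits in a Lustre target name are hexadecimal.
    """
    match = re.fullmatch(r"(?:.*-)?OST([0-9a-fA-F]{4})(?:_UUID)?", ost_name)
    if match is None:
        raise ValueError(f"not an OST name: {ost_name!r}")
    return int(match.group(1), 16)


def parse_obdidx(getstripe_output: str) -> set[int]:
    """Collect the decimal obdidx values from tabular `lfs getstripe` output."""
    indices = set()
    in_table = False
    for line in getstripe_output.splitlines():
        fields = line.split()
        if fields and fields[0] == "obdidx":
            # Header row of the per-stripe table; data rows follow.
            in_table = True
            continue
        if in_table:
            if fields and fields[0].isdigit():
                indices.add(int(fields[0]))
            else:
                # A blank or non-numeric line ends the table.
                in_table = False
    return indices


def main(argv: list[str]) -> None:
    # Usage (hypothetical script name): python check_ost.py OST0028 FILE [FILE ...]
    target = ost_name_to_index(argv[1])
    for path in argv[2:]:
        try:
            # The timeout is a precaution; a process stuck in uninterruptible
            # I/O may not exit promptly even after it is killed.
            result = subprocess.run(
                ["lfs", "getstripe", path],
                capture_output=True, text=True, timeout=30,
            )
        except subprocess.TimeoutExpired:
            print(f"{path}: lfs getstripe timed out")
            continue
        verdict = "yes" if target in parse_obdidx(result.stdout) else "no"
        print(f"{path}: striped on {argv[1]}: {verdict}")


if __name__ == "__main__" and len(sys.argv) > 2:
    main(sys.argv)
```

If every hanging file reports "yes" and the files that list normally report "no", that would point at OST0028 rather than at the client.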

—
Rick Mohr
Senior HPC System Administrator
Joint Institute for Computational Sciences
University of Tennessee
