[lustre-discuss] Permanently disabled OST causes clients to hang on df (statfs)

Michael Watters wattersm@watters.ws
Tue Nov 29 12:07:45 PST 2016


Thanks for the info.  I've been able to duplicate this issue and was 
actually about to post a message about the same thing.  :D

I have a small cluster set up with 4 active OSTs and 1 inactive due to 
the volume being replaced.  df takes about 1-2 minutes to run on my 
client node, as shown in strace.

14:40:09 statfs("/var/mnt/lustre", {f_type=0xbd00bd0, f_bsize=4096, 
f_blocks=521922232, f_bfree=486861158, f_bavail=459730446, 
f_files=30737207, f_ffree=30551196, f_fsid={743766374, 0}, 
f_namelen=255, f_frsize=4096}) = 0

14:41:49 --- SIGWINCH {si_signo=SIGWINCH, si_code=SI_KERNEL, 
si_value={int=2447996504, ptr=0x7f5d91e97658}} ---
14:41:49 stat("/var/mnt/lustre", {st_mode=S_IFDIR|0755, st_size=4096, 
...}) = 0
14:41:49 open("/usr/lib64/gconv/gconv-modules.cache", O_RDONLY) = 3

Setting lazystatfs to 1 does fix the issue.
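For anyone else hitting this, these are the two ways I know of to enable it; the filesystem name "data" below matches Ron's example, so substitute your own:

```shell
# Persistent: run once on the MGS; pushed to all clients of the "data"
# filesystem and survives remounts.
lctl conf_param data.llite.lazystatfs=1

# Temporary: run directly on an affected client; takes effect
# immediately but is lost when the filesystem is remounted.
lctl set_param llite.*.lazystatfs=1
```

With lazystatfs enabled, statfs returns using only the OSTs that respond, instead of blocking until the request to the inactive OST times out.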


On 11/24/2016 04:34 PM, Jerome, Ronald wrote:
>
> Hi,
>
> I’m seeing what appears to be LU-4397 
> <https://jira.hpdd.intel.com/browse/LU-4397> on a newly installed 
> Lustre 2.8.0 (CentOS 7) system.  I disabled an OST on the MGS using 
> lctl conf_param data-OST0001.osc.active=0, and the clients can mount 
> and use the file system, however df on the clients hangs for about 
> 100s before returning the results.  Setting lazystatfs using lctl 
> conf_param data.llite.lazystatfs=1 resolves the issue.
>
> Interestingly, CentOS 6 clients don’t have this issue, only CentOS 7 
> clients.
>
> Also of note, CentOS 6 clients show the OST as Inactive, but the 
> CentOS 7 clients don’t (note data-OST0001 in these two listings).
>
> [root@x92 ~]# lctl dl
>
>   0 UP mgc MGC10.1.0.200@o2ib 1a4b524d-6171-72a0-670b-58524f5135c9 5
>
>   1 UP lov data-clilov-ffff8804bcaafc00 
> a4e753f1-6495-dfe9-d854-8d3e7305b6e8 4
>
>   2 UP lmv data-clilmv-ffff8804bcaafc00 
> a4e753f1-6495-dfe9-d854-8d3e7305b6e8 4
>
>   3 UP mdc data-MDT0000-mdc-ffff8804bcaafc00 
> a4e753f1-6495-dfe9-d854-8d3e7305b6e8 5
>
>   4 UP osc data-OST0005-osc-ffff8804bcaafc00 
> a4e753f1-6495-dfe9-d854-8d3e7305b6e8 5
>
>   5 UP osc data-OST0002-osc-ffff8804bcaafc00 
> a4e753f1-6495-dfe9-d854-8d3e7305b6e8 5
>
>   6 UP osc data-OST0004-osc-ffff8804bcaafc00 
> a4e753f1-6495-dfe9-d854-8d3e7305b6e8 5
>
>   7 UP osc data-OST0003-osc-ffff8804bcaafc00 
> a4e753f1-6495-dfe9-d854-8d3e7305b6e8 5
>
>   8 IN osc data-OST0001-osc-ffff8804bcaafc00 
> a4e753f1-6495-dfe9-d854-8d3e7305b6e8 5
>
> [root@x87 ~]# lctl dl
>
>   0 UP mgc MGC10.1.0.200@o2ib e5ab9926-ffa3-2c47-3851-e480303a42cc 5
>
>   1 UP lov data-clilov-ffff88031ef85000 
> 213b8ef0-18f9-b7d7-7c43-c8dfb333a11f 4
>
>   2 UP lmv data-clilmv-ffff88031ef85000 
> 213b8ef0-18f9-b7d7-7c43-c8dfb333a11f 4
>
>   3 UP mdc data-MDT0000-mdc-ffff88031ef85000 
> 213b8ef0-18f9-b7d7-7c43-c8dfb333a11f 5
>
>   4 UP osc data-OST0005-osc-ffff88031ef85000 
> 213b8ef0-18f9-b7d7-7c43-c8dfb333a11f 5
>
>   5 UP osc data-OST0002-osc-ffff88031ef85000 
> 213b8ef0-18f9-b7d7-7c43-c8dfb333a11f 5
>
>   6 UP osc data-OST0004-osc-ffff88031ef85000 
> 213b8ef0-18f9-b7d7-7c43-c8dfb333a11f 5
>
>   7 UP osc data-OST0003-osc-ffff88031ef85000 
> 213b8ef0-18f9-b7d7-7c43-c8dfb333a11f 5
>
>   8 UP osc data-OST0001-osc-ffff88031ef85000 
> 213b8ef0-18f9-b7d7-7c43-c8dfb333a11f 4
>
> Ron Jerome
>
> Team Leader, Infrastructure/Operations
>
> Supercomputing Directorate
>
> Shared Services Canada / Government of Canada
>
>
>
>
>
> _______________________________________________
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
