[Lustre-discuss] Broken client
Wang Yibin
wang.yibin at oracle.com
Thu Nov 18 06:46:36 PST 2010
Hello,
On 2010-11-18, at 10:03 PM, Herbert Fruchtl wrote:
> I was wrong about only one client having problems. It seems to
> be all of them, except the mds server (see below), so it is a
> problem of the filesystem (not the client) after all.
>
>> Could you elaborate about how "broken" the files are?
>
> When I do an 'ls', the filenames are flashing in red (this is
> for example the case for broken symbolic links). Permissions, date
> and owner are missing, like in the middle of the next three
> lines:
> -rw------- 1 root root 18308319 Jul 16 2009 stat_1247756353.gz
> ?--------- ? ? ? ? ? stat_1248125742.gz
> drwxr-xr-x 2 stephane ukmhd 4096 Jul 8 2009 stephane
>
> Attempting to access the file more closely results in an I/O error:
> [root@mhdc ~]# ls -l /workspace/ls-lR_2009-01-20
> ls: /workspace/ls-lR_2009-01-20: Input/output error
> [root@mhdc ~]# cp /workspace/ls-lR_2009-01-20 /tmp
> cp: cannot stat `/workspace/ls-lR_2009-01-20': Input/output error
This looks very much like some OSTs are failing.
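If you want a quick list of which entries are affected, something along these lines might help. It is only a rough sketch: the function name find_unstattable is made up for illustration, and it assumes that the broken entries fail lstat() with an error (as your 'ls' output suggests) rather than hanging.

```python
import errno
import os


def find_unstattable(directory):
    """Return (name, errno_name) pairs for entries whose lstat() fails.

    Hypothetical helper: these are the entries 'ls -l' would show
    as '?---------'.
    """
    broken = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        try:
            # lstat() does not follow symlinks, same as 'ls -l'.
            os.lstat(path)
        except OSError as exc:
            # Map the numeric errno (e.g. 5) to its symbolic name (e.g. 'EIO').
            broken.append((name, errno.errorcode.get(exc.errno, str(exc.errno))))
    return broken
```

Run against /workspace, this should report an entry such as ('ls-lR_2009-01-20', 'EIO') for as long as that file keeps failing the way it does in your 'ls' output.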
>
>>
>> From your description and the error message you provided, I suspect that one (or some) of the OSTs went down. What does `lctl dl` show?
>>
> The files are accessible from the mds server, and the OSTs seem
> visible from the "broken" clients:
> [root@mhdc ~]# lctl dl
> 0 UP mgc MGC192.168.101.214@tcp 63568484-f714-da05-c5c2-b96db1b22962 5
> 1 UP lov home-clilov-ffff8100d7ecf000 651d7044-988f-f324-6896-3e09edf8a90b 4
> 2 UP mdc home-MDT0000-mdc-ffff8100d7ecf000 651d7044-988f-f324-6896-3e09edf8a90b 5
> 3 UP osc home-OST0001-osc-ffff8100d7ecf000 651d7044-988f-f324-6896-3e09edf8a90b 5
> 4 UP osc home-OST0003-osc-ffff8100d7ecf000 651d7044-988f-f324-6896-3e09edf8a90b 5
> 5 UP osc home-OST0002-osc-ffff8100d7ecf000 651d7044-988f-f324-6896-3e09edf8a90b 5
> 6 UP osc home-OST0005-osc-ffff8100d7ecf000 651d7044-988f-f324-6896-3e09edf8a90b 5
> 7 UP osc home-OST0004-osc-ffff8100d7ecf000 651d7044-988f-f324-6896-3e09edf8a90b 5
> 8 UP osc home-OST0000-osc-ffff8100d7ecf000 651d7044-988f-f324-6896-3e09edf8a90b 5
>
> Does this help?
I meant the 'lctl dl' output on the OSS servers. Please make sure that all of your OSTs are mounted and running well.
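To compare the device lists from the different nodes, a small parser can flag anything that is not UP and show which OST indices appear. This is a sketch only: the helper names are hypothetical, the field layout (index, state, type, name, ...) is assumed from the output you posted, and the sample is three lines taken from your client.

```python
import re

# Three lines from the client output posted in this thread.
SAMPLE = """
0 UP mgc MGC192.168.101.214@tcp 63568484-f714-da05-c5c2-b96db1b22962 5
3 UP osc home-OST0001-osc-ffff8100d7ecf000 651d7044-988f-f324-6896-3e09edf8a90b 5
4 UP osc home-OST0003-osc-ffff8100d7ecf000 651d7044-988f-f324-6896-3e09edf8a90b 5
"""


def parse_lctl_dl(text):
    """Parse 'lctl dl' text into dicts; assumes index, state, type, name come first."""
    devices = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and anything that does not start with a device index.
        if len(fields) < 4 or not fields[0].isdigit():
            continue
        devices.append({
            "index": int(fields[0]),
            "state": fields[1],
            "type": fields[2],
            "name": fields[3],
        })
    return devices


def not_up(devices):
    """Names of devices whose state is anything other than UP."""
    return [d["name"] for d in devices if d["state"] != "UP"]


def ost_indices(devices):
    """Sorted OST indices found in device names such as home-OST0003-..."""
    found = set()
    for d in devices:
        match = re.search(r"-OST([0-9a-fA-F]{4})", d["name"])
        if match:
            found.add(int(match.group(1), 16))
    return sorted(found)


devices = parse_lctl_dl(SAMPLE)
print("not UP:", not_up(devices))
print("OST indices seen:", ost_indices(devices))
```

Feeding it the saved output from each OSS makes it easy to spot an OST index that no server reports, or a device that is listed but not UP.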
>
> Herbert
>
>> On 2010-11-18, at 8:18 PM, Herbert Fruchtl wrote:
>>
>>> I have a Lustre (1.6.7) system that looks OKish (as far as I can see) from the
>>> mds and most of the clients. From one client however (the users' login machine)
>>> it looks broken. Some files are missing, some seem broken, and the df command
>>> hangs.
>>>
>>> Rebooting the client doesn't change anything. Is it broken, or is there some
>>> persistent information that I need to flush? When I do an ls on a partially
>>> broken directory, I get the following two lines in /var/log/messages:
>>>
>>> Nov 18 12:13:53 mhdc kernel: [ 7093.751196] LustreError:
>>> 10919:0:(file.c:999:ll_glimpse_size()) obd_enqueue returned rc -5, returning -EIO
>>> Nov 18 12:13:53 mhdc kernel: [ 7093.761098] LustreError:
>>> 10919:0:(file.c:999:ll_glimpse_size()) Skipped 9 previous similar messages
>>>
>>> Any ideas how to proceed with the least disruption?
>>>
>>> Thanks in advance,
>>>
>>> Herbert
>>> --
>>> Herbert Fruchtl
>>> Senior Scientific Computing Officer
>>> School of Chemistry, School of Mathematics and Statistics
>>> University of St Andrews
>>> --
>>> The University of St Andrews is a charity registered in Scotland:
>>> No SC013532
>>> _______________________________________________
>>> Lustre-discuss mailing list
>>> Lustre-discuss at lists.lustre.org
>>> http://lists.lustre.org/mailman/listinfo/lustre-discuss
>>
>