[lustre-discuss] Error destroying object

Sidiney Crescencio sidiney.crescencio at clustervision.com
Fri Jun 1 08:42:35 PDT 2018


Hello Andreas,

I haven't heard back from you; could you please advise on this case? I'm
not able to find any file, so I don't know how to proceed.

Thanks.


On Mon, 7 May 2018 at 15:21, Sidiney Crescencio <
sidiney.crescencio at clustervision.com> wrote:

> Hello Andreas,
>
> Have you had a chance to look at the logs I sent you?
>
> Thanks in advance.
>
> Best Regards,
>
> Sidiney
>
> On Thu, 3 May 2018 at 12:34, Sidiney Crescencio <
> sidiney.crescencio at clustervision.com> wrote:
>
>> Hello Andreas,
>>
>> Thanks for your answer.
>>
>> [root at storage06 ~]# debugfs -c -R "stat O/0/d$((0x1bfc24c % 32))/$((0x1bfc24c))" /dev/mapper/ost001c | grep -i fid
>> debugfs 1.42.13.wc6 (05-Feb-2017)
>> /dev/mapper/ost001c: catastrophic mode - not reading inode or group bitmaps
>>   lma: fid=[0x100000000:0x1bfc2c7:0x0] compat=8 incompat=0
>>   fid = "18 93 02 00 0b 00 00 00 3c c2 01 00 00 00 00 00 " (16)
>>   fid: parent=[0xb00029318:0x1c23c:0x0] stripe=0
>>
>>
>> [root at node024 ~]# lfs fid2path /lustre/ 0x100000000:0x1bfc2c7:0x0
>> ioctl err -22: Invalid argument (22)
>> fid2path: error on FID 0x100000000:0x1bfc2c7:0x0: Invalid argument
>>
>> [root at node024 ~]# lfs fid2path /lustre/ 0xb00029318:0x1c23c:0x0
>> fid2path: error on FID 0xb00029318:0x1c23c:0x0: No such file or directory
>>
>> Am I doing this right? I think so; it actually looks like the file is
>> already gone, as I thought at first.
>>
>> About the hung thread: I've filtered the logs as shown below and
>> couldn't find anything that might indicate the issue. What else can we
>> check to solve this error?
>>
>>
>>
>> [root at storage06 ~]# cat /var/log/messages* | grep -i OST001c | grep -v destroying | grep -v scrub
>> Apr 30 11:01:13 storage06 kernel: Lustre: wurfs-OST001c: Connection
>> restored to e9153718-f82d-d90b-268a-e8c9a5e3af1c (at 192.168.2.19 at o2ib)
>> May  2 15:54:17 storage06 kernel: Lustre: wurfs-OST001c: haven't heard
>> from client 9c4b82f6-a2a7-3488-c2b3-cabb9cf333e5 (at 192.168.2.25 at o2ib)
>> in 1352 seconds. I think it's dead, and I am evicting it. exp
>> ffff8804c451e000, cur 1525269257 expire 1525268357 last 1525267905
>> Apr  5 10:11:43 storage06 kernel: Lustre: wurfs-OST001c: haven't heard
>> from client c1966b99-1299-9da0-3280-bd6ad84f8f27 (at 192.168.2.51 at o2ib)
>> in 1352 seconds. I think it's dead, and I am evicting it. exp
>> ffff8804c4519800, cur 1522915903 expire 1522915003 last 1522914551
>> Apr  5 10:44:20 storage06 kernel: Lustre: wurfs-OST001c: Connection
>> restored to 7fbdaa81-10cb-2464-f981-883bee1f6fdf (at 192.168.2.21 at o2ib)
>> Apr  5 10:59:52 storage06 kernel: Lustre: wurfs-OST001c: Connection
>> restored to aef29b00-0042-9f5e-da17-3bd3b655e13d (at 192.168.2.2 at o2ib)
>> Apr  5 11:09:59 storage06 kernel: Lustre: wurfs-OST001c: haven't heard
>> from client c4cec4f1-b994-2ad2-be36-196b9f5c1b76 (at 192.168.2.161 at o2ib)
>> in 1352 seconds. I think it's dead, and I am evicting it. exp
>> ffff88059a0a2400, cur 1522919399 expire 1522918499 last 1522918047
>> Apr 14 14:58:02 storage06 kernel: LustreError:
>> 0:0:(ldlm_lockd.c:342:waiting_locks_callback()) ### lock callback timer
>> expired after 377s: evicting client at 192.168.2.33 at o2ib  ns:
>> filter-wurfs-OST001c_UUID lock: ffff880bbf72dc00/0xb64a498f40bc086 lrc:
>> 4/0,0 mode: PR/PR res: [0x38ee37e:0x0:0x0].0x0 rrc: 2 type: EXT
>> [0->18446744073709551615] (req 0->18446744073709551615) flags:
>> 0x60000400010020 nid: 192.168.2.33 at o2ib remote: 0x73aa9e5b8c684dc5
>> expref: 328 pid: 39172 timeout: 16574376013 lvb_type: 1
>> Apr 14 15:05:56 storage06 kernel: Lustre: wurfs-OST001c: Client
>> wurfs-MDT0000-mdtlov_UUID (at 192.168.2.182 at o2ib) reconnecting
>> Apr 14 15:05:56 storage06 kernel: Lustre: wurfs-OST001c: Connection
>> restored to 192.168.2.182 at o2ib (at 192.168.2.182 at o2ib)
>> Apr 14 15:05:56 storage06 kernel: Lustre: wurfs-OST001c: deleting orphan
>> objects from 0x0:59696086 to 0x0:59705564
>> Apr 15 15:38:28 storage06 kernel: Lustre: wurfs-OST001c: haven't heard
>> from client a21e3dcc-af43-1dc2-b552-ca341a6b5e77 (at 192.168.2.5 at o2ib)
>> in 1352 seconds. I think it's dead, and I am evicting it. exp
>> ffff880629717000, cur 1523799508 expire 1523798608 last 1523798156
>> Apr 15 16:01:07 storage06 kernel: Lustre: wurfs-OST001c: haven't heard
>> from client c931b18c-e0cf-4a0c-d95f-9a8cf60f3b3f (at 192.168.2.36 at o2ib)
>> in 1352 seconds. I think it's dead, and I am evicting it. exp
>> ffff880d3fcfdc00, cur 1523800867 expire 1523799967 last 1523799515
>> Apr 15 18:45:35 storage06 kernel: Lustre: wurfs-OST001c: haven't heard
>> from client af5f8ac5-fb5d-cd1c-cf97-b755700778bc (at 192.168.2.9 at o2ib)
>> in 1352 seconds. I think it's dead, and I am evicting it. exp
>> ffff8807fed8d000, cur 1523810735 expire 1523809835 last 1523809383
>> Apr 16 09:04:27 storage06 kernel: Lustre:
>> 39169:0:(client.c:2063:ptlrpc_expire_one_request()) @@@ Request sent has
>> failed due to network error: [sent 1523862169/real 1523862267]
>> req at ffff8809e5746300 x1584854319120496/t0(0)
>> o104->wurfs-OST001c at 192.168.2.38@o2ib:15/16 lens 296/224 e 0 to 1 dl
>> 1523862736 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
>> Apr 16 09:44:18 storage06 kernel: Lustre: wurfs-OST001c: Connection
>> restored to 2e95bceb-837d-5518-9198-48dd0b2b9a83 (at 192.168.2.40 at o2ib)
>> Apr 16 09:53:26 storage06 kernel: Lustre: wurfs-OST001c: Connection
>> restored to 07eac249-8012-fe49-1037-3920d06e1403 (at 192.168.2.38 at o2ib)
>> Apr 16 09:55:48 storage06 kernel: Lustre: wurfs-OST001c: Connection
>> restored to 3df9306f-8024-c85f-8d42-3ad863a3f4c0 (at 192.168.2.171 at o2ib)
>> Apr 16 10:11:25 storage06 kernel: Lustre: wurfs-OST001c: Connection
>> restored to d9a56a18-c51e-2b0c-561d-3b0fa31ca8f7 (at 192.168.2.12 at o2ib)
>> Apr 16 10:12:06 storage06 kernel: Lustre: wurfs-OST001c: Connection
>> restored to c00bd597-31b4-ded9-fd06-d02500010dad (at 192.168.2.172 at o2ib)
>> Apr 16 13:50:44 storage06 kernel: Lustre: wurfs-OST001c: haven't heard
>> from client 4d69154c-ca88-ce45-23f7-ff76f1a6423f (at 192.168.2.14 at o2ib)
>> in 1352 seconds. I think it's dead, and I am evicting it. exp
>> ffff8804c4678800, cur 1523879444 expire 1523878544 last 1523878092
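>>
>> (For reference, the same filter as a single pipeline — a sketch
>> assuming GNU grep; -h drops the per-file name prefixes:
>>
>>     grep -ih 'OST001c' /var/log/messages* | grep -viE 'destroying|scrub'
>> )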
>>
>> Many thanks.
>>
>>
>> On 2 May 2018 at 20:16, Dilger, Andreas <andreas.dilger at intel.com> wrote:
>>
>>> This is an OST FID, so you would need to get the parent MDT FID to be
>>> able to resolve the pathname.
>>>
>>> Assuming an ldiskfs OST, you can use:
>>>
>>>     debugfs -c -R "stat O/0/d$((0x1bfc24c % 32))/$((0x1bfc24c))" LABEL=wurfs-OST001c
>>>
>>> to get the parent FID, then run "lfs fid2path /mnt/wurfs <FID>" on a
>>> client to find the path.
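>>>
>>> For example, the two steps chained together (a minimal, untested
>>> sketch — adjust the OST device and the client mount point for your
>>> setup):
>>>
>>>     oid=0x1bfc24c
>>>     # objects live under O/0/d(oid mod 32)/oid on an ldiskfs OST
>>>     parent=$(debugfs -c -R "stat O/0/d$((oid % 32))/$((oid))" \
>>>         LABEL=wurfs-OST001c 2>/dev/null |
>>>         awk -F'[][]' '/parent=/ {print $2}')
>>>     # resolve the parent MDT FID to a pathname from any client
>>>     lfs fid2path /mnt/wurfs "$parent"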
>>>
>>> That said, the -115 error is "-EINPROGRESS", which means the OST thinks
>>> it is already trying to do this. Maybe a hung OST thread?
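>>>
>>> One generic way to check (nothing Lustre-specific; a sketch — the
>>> sysrq dump needs kernel.sysrq enabled) is to look on the OSS for
>>> service threads stuck in uninterruptible sleep and see where they
>>> are blocked:
>>>
>>>     # list D-state (uninterruptible) threads and their wait channel
>>>     ps -eLo pid,tid,stat,wchan:32,comm | awk '$3 ~ /^D/'
>>>     # kernel stack of a suspect thread
>>>     cat /proc/<pid>/stack
>>>     # or dump all blocked tasks to the kernel log
>>>     echo w > /proc/sysrq-trigger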
>>>
>>> Cheers, Andreas
>>>
>>> On May 2, 2018, at 06:53, Sidiney Crescencio <
>>> sidiney.crescencio at clustervision.com> wrote:
>>>
>>> Hi All,
>>>
>>> I need help discovering which file this error refers to, or how to solve it.
>>>
>>> Apr 30 13:48:02 storage06 kernel: LustreError:
>>> 44779:0:(ofd_dev.c:1884:ofd_destroy_hdl()) wurfs-OST001c: error destroying
>>> object [0x1001c0000:0x1bfc24c:0x0]: -115
>>>
>>> I've been trying to map this to a file, but I can't since I don't have
>>> the FID.
>>>
>>> Does anyone know how to sort it out?
>>>
>>> Thanks in advance.
>>>
>>> --
>>> Best Regards,
>>>
>>>
>>>
>>> Sidiney
>>>
>>>
>>
>>
>> --
>> Best Regards,
>>
>> Sidiney Crescencio
>
> --
> Best Regards,
>
> Sidiney Crescencio

-- 
Best Regards,

Sidiney Crescencio
Technical Support Engineer


Direct: +31 20 407 7550
Skype: sidiney.crescencio_1
sidiney.crescencio at clustervision.com

ClusterVision BV
Gyroscoopweg 56
1042 AC Amsterdam
The Netherlands
Tel: +31 20 407 7550
Fax: +31 84 759 8389
www.clustervision.com