[lustre-discuss] Error destroying object

Sidiney Crescencio sidiney.crescencio at clustervision.com
Mon May 7 06:21:36 PDT 2018


Hello Andreas,

Have you had a chance to look at the logs I sent you?

Thanks in advance.

Best Regards,

Sidiney

On Thu, 3 May 2018 at 12:34, Sidiney Crescencio <
sidiney.crescencio at clustervision.com> wrote:

> Hello Andreas,
>
> Thanks for your answer.
>
> [root at storage06 ~]# debugfs -c -R "stat O/0/d$((0x1bfc24c %
> 32))/$((0x1bfc24c))" /dev/mapper/ost001c | grep -i fid
> debugfs 1.42.13.wc6 (05-Feb-2017)
> /dev/mapper/ost001c: catastrophic mode - not reading inode or group bitmaps
>   lma: fid=[0x100000000:0x1bfc2c7:0x0] compat=8 incompat=0
>   fid = "18 93 02 00 0b 00 00 00 3c c2 01 00 00 00 00 00 " (16)
>   fid: parent=[0xb00029318:0x1c23c:0x0] stripe=0
>
>
> [root at node024 ~]# lfs fid2path /lustre/ 0x100000000:0x1bfc2c7:0x0
> ioctl err -22: Invalid argument (22)
> fid2path: error on FID 0x100000000:0x1bfc2c7:0x0: Invalid argument
>
> [root at node024 ~]# lfs fid2path /lustre/ 0xb00029318:0x1c23c:0x0
> fid2path: error on FID 0xb00029318:0x1c23c:0x0: No such file or directory
>
> Am I doing this right? I think so; it actually looks like the file is
> already gone, as I thought at first.
>
> About the hung thread, I've filtered the logs like this and couldn't find
> anything that points to the issue. What else can we check to resolve this
> error? (One more idea of mine follows the log excerpt below.)
>
>
>
> [root at storage06 ~]# cat /var/log/messages* | grep -i OST001c | grep -v
> destroying | grep -v scrub
> Apr 30 11:01:13 storage06 kernel: Lustre: wurfs-OST001c: Connection
> restored to e9153718-f82d-d90b-268a-e8c9a5e3af1c (at 192.168.2.19 at o2ib)
> May  2 15:54:17 storage06 kernel: Lustre: wurfs-OST001c: haven't heard
> from client 9c4b82f6-a2a7-3488-c2b3-cabb9cf333e5 (at 192.168.2.25 at o2ib)
> in 1352 seconds. I think it's dead, and I am evicting it. exp
> ffff8804c451e000, cur 1525269257 expire 1525268357 last 1525267905
> Apr  5 10:11:43 storage06 kernel: Lustre: wurfs-OST001c: haven't heard
> from client c1966b99-1299-9da0-3280-bd6ad84f8f27 (at 192.168.2.51 at o2ib)
> in 1352 seconds. I think it's dead, and I am evicting it. exp
> ffff8804c4519800, cur 1522915903 expire 1522915003 last 1522914551
> Apr  5 10:44:20 storage06 kernel: Lustre: wurfs-OST001c: Connection
> restored to 7fbdaa81-10cb-2464-f981-883bee1f6fdf (at 192.168.2.21 at o2ib)
> Apr  5 10:59:52 storage06 kernel: Lustre: wurfs-OST001c: Connection
> restored to aef29b00-0042-9f5e-da17-3bd3b655e13d (at 192.168.2.2 at o2ib)
> Apr  5 11:09:59 storage06 kernel: Lustre: wurfs-OST001c: haven't heard
> from client c4cec4f1-b994-2ad2-be36-196b9f5c1b76 (at 192.168.2.161 at o2ib)
> in 1352 seconds. I think it's dead, and I am evicting it. exp
> ffff88059a0a2400, cur 1522919399 expire 1522918499 last 1522918047
> Apr 14 14:58:02 storage06 kernel: LustreError:
> 0:0:(ldlm_lockd.c:342:waiting_locks_callback()) ### lock callback timer
> expired after 377s: evicting client at 192.168.2.33 at o2ib  ns:
> filter-wurfs-OST001c_UUID lock: ffff880bbf72dc00/0xb64a498f40bc086 lrc:
> 4/0,0 mode: PR/PR res: [0x38ee37e:0x0:0x0].0x0 rrc: 2 type: EXT
> [0->18446744073709551615] (req 0->18446744073709551615) flags:
> 0x60000400010020 nid: 192.168.2.33 at o2ib remote: 0x73aa9e5b8c684dc5
> expref: 328 pid: 39172 timeout: 16574376013 lvb_type: 1
> Apr 14 15:05:56 storage06 kernel: Lustre: wurfs-OST001c: Client
> wurfs-MDT0000-mdtlov_UUID (at 192.168.2.182 at o2ib) reconnecting
> Apr 14 15:05:56 storage06 kernel: Lustre: wurfs-OST001c: Connection
> restored to 192.168.2.182 at o2ib (at 192.168.2.182 at o2ib)
> Apr 14 15:05:56 storage06 kernel: Lustre: wurfs-OST001c: deleting orphan
> objects from 0x0:59696086 to 0x0:59705564
> Apr 15 15:38:28 storage06 kernel: Lustre: wurfs-OST001c: haven't heard
> from client a21e3dcc-af43-1dc2-b552-ca341a6b5e77 (at 192.168.2.5 at o2ib) in
> 1352 seconds. I think it's dead, and I am evicting it. exp
> ffff880629717000, cur 1523799508 expire 1523798608 last 1523798156
> Apr 15 16:01:07 storage06 kernel: Lustre: wurfs-OST001c: haven't heard
> from client c931b18c-e0cf-4a0c-d95f-9a8cf60f3b3f (at 192.168.2.36 at o2ib)
> in 1352 seconds. I think it's dead, and I am evicting it. exp
> ffff880d3fcfdc00, cur 1523800867 expire 1523799967 last 1523799515
> Apr 15 18:45:35 storage06 kernel: Lustre: wurfs-OST001c: haven't heard
> from client af5f8ac5-fb5d-cd1c-cf97-b755700778bc (at 192.168.2.9 at o2ib) in
> 1352 seconds. I think it's dead, and I am evicting it. exp
> ffff8807fed8d000, cur 1523810735 expire 1523809835 last 1523809383
> Apr 16 09:04:27 storage06 kernel: Lustre:
> 39169:0:(client.c:2063:ptlrpc_expire_one_request()) @@@ Request sent has
> failed due to network error: [sent 1523862169/real 1523862267]
> req at ffff8809e5746300 x1584854319120496/t0(0)
> o104->wurfs-OST001c at 192.168.2.38@o2ib:15/16 lens 296/224 e 0 to 1 dl
> 1523862736 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
> Apr 16 09:44:18 storage06 kernel: Lustre: wurfs-OST001c: Connection
> restored to 2e95bceb-837d-5518-9198-48dd0b2b9a83 (at 192.168.2.40 at o2ib)
> Apr 16 09:53:26 storage06 kernel: Lustre: wurfs-OST001c: Connection
> restored to 07eac249-8012-fe49-1037-3920d06e1403 (at 192.168.2.38 at o2ib)
> Apr 16 09:55:48 storage06 kernel: Lustre: wurfs-OST001c: Connection
> restored to 3df9306f-8024-c85f-8d42-3ad863a3f4c0 (at 192.168.2.171 at o2ib)
> Apr 16 10:11:25 storage06 kernel: Lustre: wurfs-OST001c: Connection
> restored to d9a56a18-c51e-2b0c-561d-3b0fa31ca8f7 (at 192.168.2.12 at o2ib)
> Apr 16 10:12:06 storage06 kernel: Lustre: wurfs-OST001c: Connection
> restored to c00bd597-31b4-ded9-fd06-d02500010dad (at 192.168.2.172 at o2ib)
> Apr 16 13:50:44 storage06 kernel: Lustre: wurfs-OST001c: haven't heard
> from client 4d69154c-ca88-ce45-23f7-ff76f1a6423f (at 192.168.2.14 at o2ib)
> in 1352 seconds. I think it's dead, and I am evicting it. exp
> ffff8804c4678800, cur 1523879444 expire 1523878544 last 1523878092
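>
> One other thing I could check (just a rough idea, using the error string
> and object ID from the original message) is whether the destroy error is
> still recurring for that same object, e.g.:
>
>     grep "error destroying object" /var/log/messages* | grep 0x1bfc24c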
>
> Many thanks.
>
>
> On 2 May 2018 at 20:16, Dilger, Andreas <andreas.dilger at intel.com> wrote:
>
>> This is an OST FID, so you would need to get the parent MDT FID to be
>> able to resolve the pathname.
>>
>> Assuming an ldiskfs OST, you can use:
>>
>>     'debugfs -c -R "stat O/0/d$((0x1bfc24c % 32))/$((0x1bfc24c))"
>> LABEL=wurfs-OST001c'
>>
>> to get the parent FID, then run "lfs fid2path /mnt/wurfs <FID>" on a client
>> to find the path.
>>
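>> As a rough sketch (untested; the OBJ variable and the awk extraction are
>> only for illustration), the two steps can be chained like this:
>>
>>     OBJ=0x1bfc24c
>>     # print the parent MDT FID recorded in the OST object's "fid" xattr
>>     debugfs -c -R "stat O/0/d$((OBJ % 32))/$((OBJ))" LABEL=wurfs-OST001c \
>>         2>/dev/null | awk -F'[][]' '/fid: parent=/ {print $2}'
>>     # then, on a client, resolve that parent FID to a pathname
>>     lfs fid2path /mnt/wurfs <parent FID printed above>
>>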
>> That said, the -115 error is "-EINPROGRESS", which means the OST thinks
>> it is already trying to do this. Maybe a hung OST thread?
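>>
>> A quick way to check for that (the exact watchdog wording and the ll_ost*
>> thread names below are from memory, so adjust the patterns if they don't
>> match what's in your logs):
>>
>>     # watchdog complaints about stalled OSS service threads, if any
>>     grep -E "Service thread pid .* (was inactive|completed after)" /var/log/messages*
>>     # OSS service threads stuck in uninterruptible sleep (D state)
>>     ps -eo pid,stat,comm | awk '$2 ~ /^D/ && $3 ~ /^ll_ost/'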
>>
>> Cheers, Andreas
>>
>> On May 2, 2018, at 06:53, Sidiney Crescencio <
>> sidiney.crescencio at clustervision.com> wrote:
>>
>> Hi All,
>>
>> I need help figuring out which file this error refers to, or how to resolve it.
>>
>> Apr 30 13:48:02 storage06 kernel: LustreError:
>> 44779:0:(ofd_dev.c:1884:ofd_destroy_hdl()) wurfs-OST001c: error destroying
>> object [0x1001c0000:0x1bfc24c:0x0]: -115
>>
>> I've been trying to map this to a file, but I can't since I don't have the
>> FID.
>>
>> Does anyone know how to sort it out?
>>
>> Thanks in advance
>>
>> --
>> Best Regards,
>>
>>
>>
>> Sidiney
>>
>>
>>
>
>
> --
> Best Regards,
>
> Sidiney Crescencio
> Technical Support Engineer
> ClusterVision BV
>

-- 
Best Regards,

Sidiney Crescencio
Technical Support Engineer


Direct: +31 20 407 7550
Skype: sidiney.crescencio_1
sidiney.crescencio at clustervision.com

ClusterVision BV
Gyroscoopweg 56
1042 AC Amsterdam
The Netherlands
Tel: +31 20 407 7550
Fax: +31 84 759 8389
www.clustervision.com