[lustre-discuss] FID used by two objects

Dilger, Andreas andreas.dilger at intel.com
Sat Jul 22 12:08:16 PDT 2017


On Jul 17, 2017, at 22:48, wanglu <wanglu at ihep.ac.cn> wrote:
> 
> Hello, 
> 
> One OST of our system cannot be mounted in Lustre mode after a severe disk error and a five-day e2fsck run.  Here are the errors we got during the mount operation.
> #grep FID /var/log/messages
> Jul 17 20:15:21 oss04 kernel: LustreError: 13089:0:(osd_oi.c:653:osd_oi_insert()) lustre-OST0036: the FID [0x200000005:0x1:0x0] is used by two objects: 86/3303188178 48085/1708371613
> Jul 17 20:38:41 oss04 kernel: LustreError: 13988:0:(osd_oi.c:653:osd_oi_insert()) lustre-OST0036: the FID [0x200000005:0x1:0x0] is used by two objects: 86/3303188178 48086/3830163079
> Jul 17 20:49:55 oss04 kernel: LustreError: 14221:0:(osd_oi.c:653:osd_oi_insert()) lustre-OST0036: the FID [0x200000005:0x1:0x0] is used by two objects: 86/3303188178 48087/538285899
> Jul 18 11:39:25 oss04 kernel: LustreError: 31071:0:(osd_oi.c:653:osd_oi_insert()) lustre-OST0036: the FID [0x200000005:0x1:0x0] is used by two objects: 86/3303188178 48088/2468309129
> Jul 18 11:39:56 oss04 kernel: LustreError: 31170:0:(osd_oi.c:653:osd_oi_insert()) lustre-OST0036: the FID [0x200000005:0x1:0x0] is used by two objects: 86/3303188178 48089/2021195118
> Jul 18 12:04:31 oss04 kernel: LustreError: 32127:0:(osd_oi.c:653:osd_oi_insert()) lustre-OST0036: the FID [0x200000005:0x1:0x0] is used by two objects: 86/3303188178 48090/956682248

Each pair printed here is an ldiskfs inode number followed by its generation, so the two inodes involved are 86 and 48090.  The FID [0x200000005:0x1:0x0] is the user quota file, so these files may be in the quota_slave directory.
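
To pull the FID and the two inode numbers (the value before each slash) out of messages like these, a minimal sketch along the following lines could be used.  The function name dup_fid_inodes is made up for illustration, and the sed pattern assumes the exact message format shown in the log above.  The debugfs command in the final comment is only a suggestion for mapping the inodes to pathnames, using the device name from the original report.

```shell
#!/bin/sh
# Hypothetical helper: read kernel log lines on stdin and print
# "<FID> <inode1> <inode2>" for every "used by two objects" message.
# The value after each "/" in the message is skipped.
dup_fid_inodes() {
    sed -nE 's#.*the FID (\[[^]]*]) is used by two objects: ([0-9]+)/[0-9]+ ([0-9]+)/[0-9]+.*#\1 \2 \3#p'
}

# Example with one line from the log quoted above.
# prints: [0x200000005:0x1:0x0] 86 48090
dup_fid_inodes <<'EOF'
Jul 18 12:04:31 oss04 kernel: LustreError: 32127:0:(osd_oi.c:653:osd_oi_insert()) lustre-OST0036: the FID [0x200000005:0x1:0x0] is used by two objects: 86/3303188178 48090/956682248
EOF

# With the OST unmounted, the inodes could then be mapped to paths, e.g.:
#   debugfs -c -R 'ncheck 86 48090' /dev/sdn
```

Lines that do not contain the message are ignored, so the whole of /var/log/messages can be piped through the function.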

> and the mount operation fails with error -17
> Jul 18 12:04:31 oss04 kernel: LustreError: 32127:0:(osd_oi.c:653:osd_oi_insert()) lustre-OST0036: the FID [0x200000005:0x1:0x0] is used by two objects: 86/3303188178 48090/956682248
> Jul 18 12:04:31 oss04 kernel: LustreError: 32127:0:(qsd_lib.c:418:qsd_qtype_init()) lustre-OST0036: can't open slave index copy [0x200000006:0x20000:0x0] -17
> Jul 18 12:04:31 oss04 kernel: LustreError: 32127:0:(obd_mount_server.c:1723:server_fill_super()) Unable to start targets: -17
> Jul 18 12:04:31 oss04 kernel: Lustre: Failing over lustre-OST0036
> Jul 18 12:04:32 oss04 kernel: Lustre: server umount lustre-OST0036 complete
> 
> If you run e2fsck again, it reports that inode 480xx has two references and moves 480xx to lost+found.
> # e2fsck -f /dev/sdn 
> e2fsck 1.42.12.wc1 (15-Sep-2014)
> Pass 1: Checking inodes, blocks, and sizes
> Pass 2: Checking directory structure
> Pass 3: Checking directory connectivity
> Pass 4: Checking reference counts
> Unattached inode 48090
> Connect to /lost+found<y>? yes
> Inode 48090 ref count is 2, should be 1.  Fix<y>? yes
> Pass 5: Checking group summary information
> 
> lustre-OST0036: ***** FILE SYSTEM WAS MODIFIED *****
> lustre-OST0036: 238443/549322752 files (4.4% non-contiguous), 1737885841/2197287936 blocks
> 
> Is it possible to find the file corresponding to 86/3303188178 and delete it?

You could just delete the 48090 file from lost+found (or move it out of the Lustre filesystem for backup) and it should solve the problem.
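
The cleanup could look roughly like the sketch below, which only prints the commands unless DRY_RUN=0 is set.  The function names, the backup directory, and the use of /lustre/ostc as the ldiskfs mount point are assumptions for illustration.  The "#48090" name relies on e2fsck naming reconnected files in lost+found after their inode number, which is worth confirming with the ls step before moving anything.

```shell
#!/bin/sh
# Print each command by default; execute it only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: move the orphan that e2fsck reconnected under
# lost+found to a backup directory outside the OST.
# Arguments: device, ldiskfs mount point, inode number, backup directory.
rescue_orphan() {
    dev=$1 mnt=$2 ino=$3 backup=$4
    run mount -t ldiskfs "$dev" "$mnt"        # mount the OST as plain ldiskfs
    run ls -li "$mnt/lost+found"              # check the entry and its inode number
    run mv "$mnt/lost+found/#$ino" "$backup/" # keep a copy instead of deleting
    run umount "$mnt"
}

# Dry run with the values from this thread (backup path is made up).
rescue_orphan /dev/sdn /lustre/ostc 48090 /root/ost0036-backup
```

Once the orphan is out of the way and the device is unmounted, the OST can be mounted as type lustre again to see whether the -17 error is gone.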

> P.S.  1. In ldiskfs mode, most of the files on disk can be read, while some of them are red.
>        2. There are about 240,000 objects in the OST.
>   [root at oss04 d0]# df -i /lustre/ostc
> Filesystem        Inodes  IUsed     IFree IUse% Mounted on
> /dev/sdn       549322752 238443 549084309    1% /lustre/ostc
>        3.  Lustre Version 2.5.3,  e2fsprog version 

These are old versions of Lustre and e2fsprogs; you would be much better off upgrading.

Cheers, Andreas
--
Andreas Dilger
Lustre Principal Architect
Intel Corporation








