[Lustre-discuss] lvbo_init failed after e2fsck

WANG Lu wanglu at ihep.ac.cn
Tue Aug 2 01:38:47 PDT 2011


Some updated information:

1. After running "ll_recover_lost_found_objs", we still get the "lvbo_init failed" error.
2. There are no files under "lost+found" on some of the affected OSTs. Here is the output of debugfs:

# debugfs -c /dev/sdb1
debugfs 1.41.10.sun2 (24-Feb-2010)
/dev/sdb1: catastrophic mode - not reading inode or group bitmaps
debugfs:  ls
 2  (12) .    2  (12) ..    11  (20) lost+found    103784449  (16) CONFIGS   
 12  (20) last_rcvd    13  (20) health_check    222863361  (3996) O   
debugfs:  cd lost+found     
debugfs:  ls
 11  (12) .    2  (4084) ..    0  (4096)     0  (4096)     0  (4096)    

3. We are currently running Lustre 1.8.5. 
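
To check the other OSTs the same way without reading each listing by eye, the debugfs "ls" output can be filtered for entries that are actually in use. This is only a rough sketch: the function name live_entries is made up for illustration, and it assumes the "inode  (rec_len) name" layout shown above and file names without spaces.

```python
import re

# One directory entry in `debugfs ls` output: "<inode>  (<rec_len>) <name>".
# Unused slots show inode 0 and an empty name, so the negative lookahead
# keeps the name from swallowing the inode number of the next entry.
ENTRY = re.compile(r"(\d+)\s+\((\d+)\)\s*((?!\d+\s+\(\d+\))\S*)")


def live_entries(ls_output):
    """Return (inode, name) pairs for in-use entries, skipping '.' and '..'.

    Hypothetical helper: assumes names contain no whitespace.
    """
    entries = []
    for match in ENTRY.finditer(ls_output):
        inode, name = int(match.group(1)), match.group(3)
        if inode != 0 and name not in (".", ".."):
            entries.append((inode, name))
    return entries


if __name__ == "__main__":
    # The lost+found listing pasted above: only '.', '..' and empty slots.
    lost_found = (
        " 11  (12) .    2  (4084) ..    0  (4096)     0  (4096)     0  (4096)    "
    )
    print(live_entries(lost_found))
```

An empty result for lost+found means e2fsck left nothing there for ll_recover_lost_found_objs to move back.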

Thank you in advance for your help!

Lu Wang
CC-IHEP






> -----Original Message-----
> From: "WANG Lu" <wanglu at ihep.ac.cn>
> Sent: Monday, August 1, 2011
> To: "lustre-discuss at lists.lustre.org" <lustre-discuss at lists.lustre.org>
> Cc: 
> Subject: [Lustre-discuss] lvbo_init  failed  after e2fsck
> 
> Dear all, 
>    After an annual e2fsck of all OSTs, two of our OSTs became read-only with the following error:
> Jul 25 08:37:34 com04 kernel: LDISKFS-fs error (device sdb1): ldiskfs_dx_find_entry: bad entry in directory #222863370: inode out of bounds - offset=3280896, inode=656179638, rec_len=4096, name_len=0
> Jul 25 08:37:34 com04 kernel: Aborting journal on device sdb1-8.
> Jul 25 08:37:34 com04 kernel: LDISKFS-fs (sdb1): Remounting filesystem read-only
>    tune2fs showed the two OSTs in the state "clean with error". After unmounting them and running e2fsck again, both could be mounted normally (and the state changed to "clean"). 
> 
>    However, we then began to see hundreds of "lvbo_init failed" errors on several OSTs, not limited to the two OSTs that had gone read-only. 
> 
>    Three of our OSTs have logged hundreds of "lvbo_init failed" errors since the annual e2fsck: 
> 
> Aug  1 17:48:26 com04 kernel: LustreError: 5493:0:(ldlm_resource.c:862:ldlm_resource_add()) Skipped 1 previous similar message
> Aug  1 17:59:02 com04 kernel: LustreError: 5632:0:(ldlm_resource.c:862:ldlm_resource_add()) filter-publicfs-OST001d_UUID: lvbo_init failed for resource 2997406: rc -2
> Aug  1 17:59:02 com04 kernel: LustreError: 5632:0:(ldlm_resource.c:862:ldlm_resource_add()) Skipped 1 previous similar message
> Aug  1 18:10:51 com04 kernel: LustreError: 5602:0:(ldlm_resource.c:862:ldlm_resource_add()) filter-publicfs-OST001d_UUID: lvbo_init failed for resource 3240254: rc -2
> Aug  1 18:10:51 com04 kernel: LustreError: 5602:0:(ldlm_resource.c:862:ldlm_resource_add()) Skipped 2 previous similar messages
> Aug  1 18:21:49 com04 kernel: LustreError: 5642:0:(ldlm_resource.c:862:ldlm_resource_add()) filter-publicfs-OST001f_UUID: lvbo_init failed for resource 3204200: rc -2
> Aug  1 18:21:49 com04 kernel: LustreError: 5642:0:(ldlm_resource.c:862:ldlm_resource_add()) Skipped 6 previous similar messages
> Aug  1 18:53:18 com04 kernel: LustreError: 5324:0:(ldlm_resource.c:862:ldlm_resource_add()) filter-publicfs-OST001f_UUID: lvbo_init failed for resource 12856264: rc -2
> Aug  1 18:53:18 com04 kernel: LustreError: 5324:0:(ldlm_resource.c:862:ldlm_resource_add()) Skipped 1 previous similar message
> 
>    According to previous discussions, it seems that the affected objects have been deleted or moved to lost+found. I am not sure about the following: 
> 1.  Can the command "ll_recover_lost_found_objs" get back all the lost objects?
> 2.  If not, how can I get a list of the damaged files?
> 3.  As users continue to write new data to the OSTs, will the number of damaged objects increase?
> 
> Do you have any suggestions? Thank you very much!
> 
> 
> Lu Wang
> Computing Center
> IHEP,China
> _______________________________________________
> Lustre-discuss mailing list
> Lustre-discuss at lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-discuss
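
As a starting point for question 2 in the quoted message, the "lvbo_init failed" lines can be grouped by OST to see which resource IDs are affected. This is a minimal sketch, not a recovery tool: the function name failed_resources is hypothetical, the pattern is taken only from the log lines quoted above, and it assumes that the resource number is the object ID on that OST. rc -2 is -ENOENT, meaning the object was not found.

```python
import re
from collections import defaultdict

# Matches e.g.
# "filter-publicfs-OST001d_UUID: lvbo_init failed for resource 2997406: rc -2"
PATTERN = re.compile(
    r"filter-(\S+)_UUID: lvbo_init failed for resource (\d+): rc (-?\d+)"
)


def failed_resources(log_lines):
    """Map each OST name to the sorted resource IDs that failed lvbo_init.

    Hypothetical helper: lines that do not match (including the
    "Skipped N previous similar messages" lines) are ignored, so the
    result only lists IDs that were actually printed in the log.
    """
    by_ost = defaultdict(set)
    for line in log_lines:
        match = PATTERN.search(line)
        if match:
            by_ost[match.group(1)].add(int(match.group(2)))
    return {ost: sorted(ids) for ost, ids in by_ost.items()}


if __name__ == "__main__":
    import sys

    # Usage (assumed log location): python this_script.py < /var/log/messages
    for ost, ids in sorted(failed_resources(sys.stdin).items()):
        print(ost, len(ids), ids)
```

Because the kernel suppresses repeated messages, the list is a lower bound. The IDs could then be compared with the obdidx/objid pairs that "lfs getstripe" reports for files on a client to find the affected file names.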