[Lustre-discuss] lvbo_init failed after e2fsck

WANG Lu wanglu at ihep.ac.cn
Mon Aug 1 06:13:34 PDT 2011


Dear all, 
   After an annual e2fsck of all OSTs, two of our OSTs became read-only with the following error:
Jul 25 08:37:34 com04 kernel: LDISKFS-fs error (device sdb1): ldiskfs_dx_find_entry: bad entry in directory #222863370: inode out of bounds - offset=3280896, inode=656179638, rec_len=4096, name_len=0
Jul 25 08:37:34 com04 kernel: Aborting journal on device sdb1-8.
Jul 25 08:37:34 com04 kernel: LDISKFS-fs (sdb1): Remounting filesystem read-only
   tune2fs showed that these OSTs were in the state "clean with error". After unmounting them and running e2fsck again, the two OSTs could be mounted normally (and the state changed to "clean").

   However, we then began to see hundreds of "lvbo_init failed" errors on several OSTs, not only on the two OSTs that had gone read-only. In total, three of our OSTs have logged hundreds of lvbo_init failures since the annual e2fsck:

Aug  1 17:48:26 com04 kernel: LustreError: 5493:0:(ldlm_resource.c:862:ldlm_resource_add()) Skipped 1 previous similar message
Aug  1 17:59:02 com04 kernel: LustreError: 5632:0:(ldlm_resource.c:862:ldlm_resource_add()) filter-publicfs-OST001d_UUID: lvbo_init failed for resource 2997406: rc -2
Aug  1 17:59:02 com04 kernel: LustreError: 5632:0:(ldlm_resource.c:862:ldlm_resource_add()) Skipped 1 previous similar message
Aug  1 18:10:51 com04 kernel: LustreError: 5602:0:(ldlm_resource.c:862:ldlm_resource_add()) filter-publicfs-OST001d_UUID: lvbo_init failed for resource 3240254: rc -2
Aug  1 18:10:51 com04 kernel: LustreError: 5602:0:(ldlm_resource.c:862:ldlm_resource_add()) Skipped 2 previous similar messages
Aug  1 18:21:49 com04 kernel: LustreError: 5642:0:(ldlm_resource.c:862:ldlm_resource_add()) filter-publicfs-OST001f_UUID: lvbo_init failed for resource 3204200: rc -2
Aug  1 18:21:49 com04 kernel: LustreError: 5642:0:(ldlm_resource.c:862:ldlm_resource_add()) Skipped 6 previous similar messages
Aug  1 18:53:18 com04 kernel: LustreError: 5324:0:(ldlm_resource.c:862:ldlm_resource_add()) filter-publicfs-OST001f_UUID: lvbo_init failed for resource 12856264: rc -2
Aug  1 18:53:18 com04 kernel: LustreError: 5324:0:(ldlm_resource.c:862:ldlm_resource_add()) Skipped 1 previous similar message
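
The OSTs and resource IDs named in messages like those above can be tallied with a short script. This is only a minimal sketch, assuming the syslog format shown in this message; the function name collect_lvbo_failures and the SAMPLE list are illustrative and not part of any Lustre tool. It only lists what the log reports (rc -2 is -ENOENT, "no such file or directory"); it does not map resource IDs back to file names.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   ... filter-publicfs-OST001d_UUID: lvbo_init failed for resource 2997406: rc -2
# The pattern is an assumption based on the log excerpt in this message.
LVBO_RE = re.compile(
    r"filter-(?P<ost>\S+?)_UUID: lvbo_init failed "
    r"for resource (?P<res>\d+): rc (?P<rc>-?\d+)"
)


def collect_lvbo_failures(lines):
    """Return {ost_name: set of (resource_id, rc)} for lvbo_init failures."""
    failures = defaultdict(set)
    for line in lines:
        match = LVBO_RE.search(line)
        if match:
            entry = (int(match.group("res")), int(match.group("rc")))
            failures[match.group("ost")].add(entry)
    return failures


# Sample lines copied from the log excerpt above.
SAMPLE = [
    "Aug  1 17:59:02 com04 kernel: LustreError: 5632:0:(ldlm_resource.c:862:ldlm_resource_add()) filter-publicfs-OST001d_UUID: lvbo_init failed for resource 2997406: rc -2",
    "Aug  1 17:59:02 com04 kernel: LustreError: 5632:0:(ldlm_resource.c:862:ldlm_resource_add()) Skipped 1 previous similar message",
    "Aug  1 18:10:51 com04 kernel: LustreError: 5602:0:(ldlm_resource.c:862:ldlm_resource_add()) filter-publicfs-OST001d_UUID: lvbo_init failed for resource 3240254: rc -2",
    "Aug  1 18:21:49 com04 kernel: LustreError: 5642:0:(ldlm_resource.c:862:ldlm_resource_add()) filter-publicfs-OST001f_UUID: lvbo_init failed for resource 3204200: rc -2",
]

if __name__ == "__main__":
    for ost, entries in sorted(collect_lvbo_failures(SAMPLE).items()):
        for resource_id, rc in sorted(entries):
            print(ost, resource_id, rc)
```

In practice the lines would be read from the system log file rather than from SAMPLE. Because the kernel suppresses repeats ("Skipped N previous similar messages"), the resulting list is a lower bound on the affected resources, not a complete inventory.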

   According to previous discussions, it seems that the related objects have been deleted or moved to lost+found. I am not sure about the following:
1.  Can the command "ll_recover_lost_found_objs" get back all the lost objects?
2.  If not, how can I get a list of the damaged files?
3.  As users continue to write new data to the OSTs, will the number of damaged objects increase?

Do you have any suggestions? Thank you very much!


Lu Wang
Computing Center
IHEP, China


