[Lustre-discuss] On-disk bitmap corrupted

Lu Wang wanglu at ihep.ac.cn
Mon Apr 19 03:00:37 PDT 2010


Dear all,
	We have run into the same situation as the problem discussed here:
http://lists.lustre.org/pipermail/lustre-discuss/2009-January/009512.html

One OST was set read-only the first time it was remounted after a server crash.



Apr 16 17:40:31 boss27 kernel: LDISKFS-fs error (device sdd1): ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 32486 corrupted: 8664 blocks free in bitmap, 13310 - in gd
Apr 16 17:40:31 boss27 kernel:
Apr 16 17:40:31 boss27 kernel: Remounting filesystem read-only
Apr 16 17:40:31 boss27 kernel: LDISKFS-fs error (device sdd1): ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 32486 corrupted: 8664 blocks free in bitmap, 13310 - in gd
Apr 16 17:40:31 boss27 kernel:
Apr 16 17:40:31 boss27 kernel: LDISKFS-fs error (device sdd1): ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 32486 corrupted: 8664 blocks free in bitmap, 13310 - in gd
Apr 16 17:40:31 boss27 kernel:
Apr 16 17:40:31 boss27 kernel: LDISKFS-fs error (device sdd1): ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 32486 corrupted: 8664 blocks free in bitmap, 13310 - in gd
Apr 16 17:40:31 boss27 kernel:
Apr 16 17:40:31 boss27 kernel: LDISKFS-fs error (device sdd1): ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 32486 corrupted: 8664 blocks free in bitmap, 13310 - in gd
Apr 16 17:40:31 boss27 kernel:
Apr 16 17:40:31 boss27 kernel: LustreError: 6158:0:(fsfilt-ldiskfs.c:1288:fsfilt_ldiskfs_write_record()) can't start transaction for 37 blocks (128 bytes)
Apr 16 17:40:31 boss27 kernel: LustreError: 6240:0:(fsfilt-ldiskfs.c:1288:fsfilt_ldiskfs_write_record()) can't start transaction for 37 blocks (128 bytes)
Apr 16 17:40:31 boss27 kernel: LustreError: 6166:0:(fsfilt-ldiskfs.c:470:fsfilt_ldiskfs_brw_start()) can't get handle for 555 credits: rc = -30
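For reference, the mismatch the kernel reports (free blocks in the bitmap vs. in the group descriptor) can presumably be inspected offline with standard e2fsprogs; a sketch, with the group number taken from the log above:

    # On the unmounted device, print the group descriptor entry for the
    # block group the kernel complained about (32486):
    dumpe2fs /dev/sdd1 | grep -A 6 'Group 32486:'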


This OST has been unwritable since last Friday, and its disk usage is quite different from that of the neighbouring OSTs.

[root@boss27 ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1             4.1T  3.8T  105G  98% /lustre/ost1
/dev/sda2             4.1T  3.7T  158G  96% /lustre/ost2
/dev/sdb1             4.1T  3.8T  124G  97% /lustre/ost3
/dev/sdb2             4.1T  3.8T   84G  98% /lustre/ost4
/dev/sdc1             4.1T  3.7T  128G  97% /lustre/ost5
/dev/sdc2             4.1T  3.7T  131G  97% /lustre/ost6
/dev/sdd1             4.1T  3.3T  591G  85% /lustre/ost7
/dev/sdd2             4.1T  3.8T   52G  99% /lustre/ost8
Is it possible to fix this problem without running lfsck on the whole file system? Our system is about 500 TB (93% full). We are running Lustre 1.8.1.1, with file stripe count = 1.
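What we are hoping is that, since the corruption is in the ldiskfs block bitmap rather than in Lustre-level (MDS/OST) metadata, a local e2fsck of the single affected device would be enough. A sketch of what we have in mind, assuming the damage is confined to /dev/sdd1 and using e2fsck from the Lustre-patched e2fsprogs:

    # Take only the affected OST offline; the rest of the file system
    # stays up.
    umount /lustre/ost7

    # Read-only pass first, to see what e2fsck would change:
    e2fsck -fn /dev/sdd1

    # Then the actual repair (-f: force a full check, -p: fix safe
    # problems automatically):
    e2fsck -fp /dev/sdd1

    mount -t lustre /dev/sdd1 /lustre/ost7

Would that be sufficient, or is a full lfsck still required afterwards?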



Best Regards
Lu Wang
--------------------------------------------------------------
Computing Center, IHEP         Office: Computing Center, 123
19B Yuquan Road                Tel: (+86) 10 88236012-607
P.O. Box 918-7                 Fax: (+86) 10 8823 6839
Beijing 100049, China          Email: Lu.Wang@ihep.ac.cn
--------------------------------------------------------------
                          




