[Lustre-discuss] OST error

Bob Ball ball at umich.edu
Thu Dec 2 13:00:38 PST 2010


We were getting errors thrown by an OST.  /var/log/messages contained a 
lot of these:
2010-11-28T17:05:34-05:00 umfs06.aglt2.org kernel: [2102640.735927] LDISKFS-fs error (device sdk): ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 639corrupted: 440 blocks free in bitmap, 439 - in gd

So I turned off (most) access to the disk via lctl (we have a LOT of 
client machines, and some were missed) and ran into problems.  I had to 
use the alternate superblock to e2fsck the disk.  Once it was back 
online, I still saw similar messages.  I updated to e2fsprogs 1.41.12, 
as suggested elsewhere, and repeated the e2fsck.

I am still seeing these errors.  Users report that some files are 
corrupted, coming up with bad md5sums.  Any other thoughts on what to do 
about this problem?

[2440763.879143] LDISKFS-fs error (device sdk): ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 35406corrupted: 1318 blocks free in bitmap, 1317 - in gd
[2440763.879796]
[2440763.882724] LustreError: 1651027:0:(fsfilt-ldiskfs.c:1333:fsfilt_ldiskfs_write_record()) can't read/create block: -28
[2440763.882736] LustreError: 1651027:0:(llog_lvfs.c:116:llog_lvfs_write_blob()) error writing log record: rc -28
[2440763.882789] LustreError: 1651002:0:(mgc_request.c:1089:mgc_copy_llog()) Failed to copy remote log umt3-OST0019 (-28)
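
For what it's worth, the -28 in the LustreError lines looks like a negated Linux errno, and errno 28 is ENOSPC ("No space left on device").  A minimal Python sketch for decoding such return codes follows.  The helper name decode_rc is made up for illustration, and it assumes the codes are standard errno values.

```python
import errno
import os


def decode_rc(rc):
    """Map a negative kernel-style return code (e.g. -28) to (name, message).

    Assumption: the code is a negated standard errno value.
    """
    code = abs(rc)
    # errno.errorcode maps numeric errno values to their symbolic names.
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


if __name__ == "__main__":
    name, message = decode_rc(-28)
    print(name, "-", message)
```

If that reading is right, the llog write failures are the OST reporting that it could not allocate a block, which may be a consequence of the bitmap errors rather than a separate problem.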

I rebooted to bring the system up clean as a whole, and found the same 
kind of thing repeating:
[  285.834864] LDISKFS-fs (sdk): warning: mounting fs with errors, running e2fsck is recommended
[  285.852559] LDISKFS-fs (sdk): mounted filesystem with ordered data mode
[  286.079065] LDISKFS-fs (sdk): warning: mounting fs with errors, running e2fsck is recommended
[  286.096316] LDISKFS-fs (sdk): mounted filesystem with ordered data mode
[  286.940872] LDISKFS-fs error (device sdk): ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 35406corrupted: 1318 blocks free in bitmap, 1317 - in gd
[  286.941693]
[  286.945224] LustreError: 5790:0:(fsfilt-ldiskfs.c:1333:fsfilt_ldiskfs_write_record()) can't read/create block: -28
[  286.945233] LustreError: 5790:0:(llog_lvfs.c:116:llog_lvfs_write_blob()) error writing log record: rc -28
[  286.945448] LustreError: 5763:0:(mgc_request.c:1089:mgc_copy_llog()) Failed to copy remote log umt3-OST0019 (-28)
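
To see how many distinct block groups are affected, the bitmap errors can be tallied from the log.  Here is a rough Python sketch, assuming the message format shown in this post (including the missing space before "corrupted").  The function name summarize_bitmap_errors is hypothetical, and the sample text is copied from the log lines quoted here.

```python
import re

# Matches the LDISKFS bitmap error as it appears in this post;
# \s* tolerates the missing space before "corrupted".
BITMAP_RE = re.compile(
    r"LDISKFS-fs error \(device (?P<dev>\w+)\):\s*"
    r"ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group "
    r"(?P<group>\d+)\s*corrupted:\s*(?P<bitmap>\d+) blocks free in bitmap, "
    r"(?P<gd>\d+) - in gd"
)


def summarize_bitmap_errors(log_text):
    """Return {(device, group): (free_in_bitmap, free_in_gd)} per distinct group."""
    # Collapse all whitespace so mail-wrapped log lines still match.
    flat = " ".join(log_text.split())
    result = {}
    for m in BITMAP_RE.finditer(flat):
        key = (m.group("dev"), int(m.group("group")))
        result[key] = (int(m.group("bitmap")), int(m.group("gd")))
    return result


if __name__ == "__main__":
    sample = (
        "[  286.940872] LDISKFS-fs error (device sdk): \n"
        "ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 35406corrupted: \n"
        "1318 blocks free in bitmap, 1317 - in gd\n"
    )
    for (dev, group), (bitmap, gd) in summarize_bitmap_errors(sample).items():
        print(dev, group, bitmap, gd)
```

Running this over the whole of /var/log/messages would show whether the same few groups keep coming back after each e2fsck, or whether new groups keep turning up.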

All help appreciated.

bob


