[Lustre-discuss] OSS Crash
Franck Martinaux
fmartinaux83 at gmail.com
Tue Dec 18 05:44:36 PST 2007
Hi all
I got this after a crash on an OSS (Lustre 1.6.3):
[root@oss01 ~]# cat /proc/fs/lustre/health_check
device lustre-OST0012 reported unhealthy
device lustre-OST0014 reported unhealthy
device lustre-OST0016 reported unhealthy
NOT HEALTHY
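For anyone who wants to watch for this state automatically, here is a minimal sketch that pulls the unhealthy device names out of health_check output. It assumes the output always has the "device NAME reported unhealthy" shape shown above; the function name unhealthy_devices is made up for illustration, and the sample text is simply the output pasted above rather than a live read of /proc.

```python
import re

def unhealthy_devices(text):
    """Return device names from lines of the form
    'device <name> reported unhealthy' (format assumed from the output above)."""
    return re.findall(r"^device (\S+) reported unhealthy$", text, flags=re.MULTILINE)

# Sample copied from the health_check output above. On a real OSS one would
# read /proc/fs/lustre/health_check instead of using a hard-coded string.
sample = """device lustre-OST0012 reported unhealthy
device lustre-OST0014 reported unhealthy
device lustre-OST0016 reported unhealthy
NOT HEALTHY
"""

for name in unhealthy_devices(sample):
    print(name)
```
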
In /var/log/messages we have:
Dec 17 14:40:56 oss01 kernel: LDISKFS-fs error (device dm-15):
read_block_bitmap: Invalid block bitmap - block_group = 10648, block =
348913664
Dec 17 14:40:56 oss01 kernel: Remounting filesystem read-only
Dec 17 14:40:56 oss01 kernel: LDISKFS-fs error (device dm-17):
mb_free_blocks: double-free of inode 232644664's block 930695936(bit
19200 in group 28402)
Dec 17 14:40:56 oss01 kernel:
Dec 17 14:40:56 oss01 kernel: Remounting filesystem read-only
Dec 17 14:40:56 oss01 kernel: LDISKFS-fs error (device dm-17):
mb_free_blocks: double-free of inode 232644664's block 930695937(bit
19201 in group 28402)
Dec 17 14:40:56 oss01 kernel:
Dec 17 14:40:56 oss01 kernel: LDISKFS-fs error (device dm-17):
mb_free_blocks: double-free of inode 232644664's block 930695938(bit
19202 in group 28402)
Dec 17 14:40:56 oss01 kernel:
Dec 17 14:40:56 oss01 kernel: LDISKFS-fs error (device dm-17):
mb_free_blocks: double-free of inode 232644664's block 930695939(bit
19203 in group 28402)
Dec 17 14:40:56 oss01 kernel:
Dec 17 14:40:56 oss01 kernel: LDISKFS-fs error (device dm-17):
mb_free_blocks: double-free of inode 232644664's block 930695940(bit
19204 in group 28402)
Dec 17 14:40:56 oss01 kernel:
Dec 17 14:40:56 oss01 kernel: LDISKFS-fs error (device dm-17):
mb_free_blocks: double-free of inode 232644664's block 930695941(bit
19205 in group 28402)
Dec 17 14:40:56 oss01 kernel:
Dec 17 14:40:56 oss01 kernel: LDISKFS-fs error (device dm-17):
mb_free_blocks: double-free of inode 232644664's block 930695942(bit
19206 in group 28402)
Dec 17 14:40:56 oss01 kernel:
Dec 17 14:40:56 oss01 kernel: LDISKFS-fs error (device dm-17):
mb_free_blocks: double-free of inode 232644664's block 930695943(bit
19207 in group 28402)
Dec 17 14:40:56 oss01 kernel:
Dec 17 14:40:56 oss01 kernel: LDISKFS-fs error (device dm-17):
mb_free_blocks: double-free of inode 232644664's block 930695944(bit
19208 in group 28402)
Dec 17 14:40:56 oss01 kernel:
Dec 17 14:40:56 oss01 kernel: LDISKFS-fs error (device dm-17):
mb_free_blocks: double-free of inode 232644664's block 930695945(bit
19209 in group 28402)
Dec 17 14:40:56 oss01 kernel:
Dec 17 14:40:56 oss01 kernel: LDISKFS-fs error (device dm-17):
mb_free_blocks: double-free of inode 232644664's block 930695946(bit
19210 in group 28402)
Dec 17 14:40:56 oss01 kernel:
Dec 17 14:40:56 oss01 kernel: LDISKFS-fs error (device dm-17):
mb_free_blocks: double-free of inode 232644664's block 930695947(bit
19211 in group 28402)
Dec 17 14:40:56 oss01 kernel:
Dec 17 14:40:56 oss01 kernel: LDISKFS-fs error (device dm-17):
mb_free_blocks: double-free of inode 232644664's block 930695948(bit
19212 in group 28402)
Dec 17 14:40:56 oss01 kernel:
Dec 17 14:40:56 oss01 kernel: LDISKFS-fs error (device dm-17):
mb_free_blocks: double-free of inode 232644664's block 930695949(bit
19213 in group 28402)
Dec 17 14:40:56 oss01 kernel:
Dec 17 14:40:56 oss01 kernel: LDISKFS-fs error (device dm-17):
mb_free_blocks: double-free of inode 232644664's block 930695950(bit
19214 in group 28402)
Dec 17 14:40:56 oss01 kernel:
Dec 17 14:40:56 oss01 kernel: LDISKFS-fs error (device dm-17):
mb_free_blocks: double-free of inode 232644664's block 930695951(bit
19215 in group 28402)
(....)
Dec 17 14:41:17 oss01 kernel:
Dec 17 14:41:17 oss01 kernel: LDISKFS-fs error (device dm-16):
mb_free_blocks: double-free of inode 214925368's block 859725308(bit
24060 in group 26236)
Dec 17 14:41:17 oss01 kernel:
Dec 17 14:41:17 oss01 kernel: LDISKFS-fs error (device dm-16):
mb_free_blocks: double-free of inode 214925368's block 859725309(bit
24061 in group 26236)
Dec 17 14:41:17 oss01 kernel:
Dec 17 14:41:17 oss01 kernel: LDISKFS-fs error (device dm-16):
mb_free_blocks: double-free of inode 214925368's block 859725310(bit
24062 in group 26236)
Dec 17 14:41:17 oss01 kernel:
Dec 17 14:41:17 oss01 kernel: LDISKFS-fs error (device dm-16):
mb_free_blocks: double-free of inode 214925368's block 859725311(bit
24063 in group 26236)
Dec 17 14:41:17 oss01 kernel:
Dec 17 14:41:17 oss01 kernel: LustreError: 759:0:(fsfilt-ldiskfs.c:
281:fsfilt_ldiskfs_start()) error starting handle for op 1 (120
credits): rc -30
Dec 17 15:22:13 oss01 heartbeat: [26083]: info: Checking status of
STONITH device [external/ipmi ]
Dec 17 15:22:13 oss01 heartbeat: [32011]: info: Exiting STONITH-stat
process 26083 returned rc 0.
Dec 17 15:35:05 oss01 kernel: LustreError: 675:0:(ldlm_resource.c:
651:ldlm_resource_add()) lvbo_init failed for resource 94: rc -2
Dec 17 15:35:05 oss01 kernel: LustreError: 726:0:(ldlm_resource.c:
651:ldlm_resource_add()) lvbo_init failed for resource 95: rc -2
Dec 17 15:35:05 oss01 kernel: LustreError: 726:0:(ldlm_resource.c:
651:ldlm_resource_add()) Skipped 1 previous similar message
Dec 17 15:35:05 oss01 kernel: LustreError: 698:0:(ldlm_resource.c:
651:ldlm_resource_add()) lvbo_init failed for resource 94: rc -2
Dec 17 15:35:05 oss01 kernel: LustreError: 698:0:(ldlm_resource.c:
651:ldlm_resource_add()) Skipped 1 previous similar message
Dec 17 15:35:05 oss01 kernel: LustreError: 739:0:(ldlm_resource.c:
651:ldlm_resource_add()) lvbo_init failed for resource 97: rc -2
Dec 17 15:35:05 oss01 kernel: LustreError: 739:0:(ldlm_resource.c:
651:ldlm_resource_add()) Skipped 4 previous similar messages
Dec 17 15:35:05 oss01 kernel: LustreError: 712:0:(ldlm_resource.c:
651:ldlm_resource_add()) lvbo_init failed for resource 95: rc -2
Dec 17 15:35:05 oss01 kernel: LustreError: 712:0:(ldlm_resource.c:
651:ldlm_resource_add()) Skipped 4 previous similar messages
Dec 17 15:35:05 oss01 kernel: LustreError: 670:0:(ldlm_resource.c:
651:ldlm_resource_add()) lvbo_init failed for resource 96: rc -2
Dec 17 15:35:05 oss01 kernel: LustreError: 670:0:(ldlm_resource.c:
651:ldlm_resource_add()) Skipped 14 previous similar messages
Dec 17 15:35:16 oss01 kernel: LustreError: 639:0:(ldlm_resource.c:
651:ldlm_resource_add()) lvbo_init failed for resource 98: rc -2
Dec 17 15:35:16 oss01 kernel: LustreError: 639:0:(ldlm_resource.c:
651:ldlm_resource_add()) Skipped 6 previous similar messages
Dec 17 15:54:53 oss01 kernel: LustreError: 777:0:(fsfilt-ldiskfs.c:
281:fsfilt_ldiskfs_start()) error starting handle for op 8 (49
credits): rc -30
Dec 17 15:54:53 oss01 kernel: LustreError: 799:0:(fsfilt-ldiskfs.c:
281:fsfilt_ldiskfs_start()) error starting handle for op 8 (49
credits): rc -30
Dec 17 15:54:53 oss01 kernel: LustreError: 799:0:(fsfilt-ldiskfs.c:
281:fsfilt_ldiskfs_start()) Skipped 2 previous similar messages
Dec 17 15:54:53 oss01 kernel: LustreError: 830:0:(fsfilt-ldiskfs.c:
281:fsfilt_ldiskfs_start()) error starting handle for op 8 (49
credits): rc -30
Dec 17 15:54:53 oss01 kernel: LustreError: 830:0:(fsfilt-ldiskfs.c:
281:fsfilt_ldiskfs_start()) Skipped 2 previous similar messages
Dec 17 15:54:53 oss01 kernel: LustreError: 860:0:(fsfilt-ldiskfs.c:
281:fsfilt_ldiskfs_start()) error starting handle for op 8 (49
credits): rc -30
Dec 17 15:54:53 oss01 kernel: LustreError: 860:0:(fsfilt-ldiskfs.c:
281:fsfilt_ldiskfs_start()) Skipped 2 previous similar messages
Dec 17 15:54:54 oss01 kernel: LustreError: 809:0:(fsfilt-ldiskfs.c:
281:fsfilt_ldiskfs_start()) error starting handle for op 8 (49
credits): rc -30
Dec 17 15:54:54 oss01 kernel: LustreError: 809:0:(fsfilt-ldiskfs.c:
281:fsfilt_ldiskfs_start()) Skipped 2 previous similar messages
Dec 17 15:54:54 oss01 kernel: LustreError: 859:0:(fsfilt-ldiskfs.c:
281:fsfilt_ldiskfs_start()) error starting handle for op 8 (49
credits): rc -30
Dec 17 15:54:54 oss01 kernel: LustreError: 859:0:(fsfilt-ldiskfs.c:
281:fsfilt_ldiskfs_start()) Skipped 5 previous similar messages
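As a reading aid for the "rc" values in the messages above: they look like negated Linux errno codes, which is the usual kernel convention, though that is an assumption here and not something the log itself states. Under that assumption, a short sketch to decode them (decode_rc is a hypothetical helper name):

```python
import errno
import os

def decode_rc(rc):
    """Map a negative return code to (errno name, description),
    assuming rc is a negated Linux errno value."""
    code = abs(rc)
    return errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)

# The two return codes that appear in the log excerpts above.
for rc in (-30, -2):
    name, text = decode_rc(rc)
    print(f"rc {rc}: {name} ({text})")
```

On Linux, -30 decodes to EROFS, which matches the "Remounting filesystem read-only" lines, and -2 decodes to ENOENT.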
I solved the problem by unmounting and remounting the 3 OSTs.
Is this a bug specific to 1.6.3? Or something related to ext4?
What is the status in 1.6.4.1?
Best Regards,
Franck