[Lustre-discuss] OSS crash after LDISKFS-fs error

Jim Garlick garlick at llnl.gov
Sun Nov 11 11:19:56 PST 2007


I thought I'd check if we had that fix but got 'You are not authorized
to access bug #13620'.  Any chance of having that fixed?
Jim

On Sun, 11 Nov 2007, Alex Tomas wrote:

> I think this is bz13620. The RHEL4 kernel has a bug where two instances
> of the same inode can co-exist in the cache. You can find the fix at
> https://bugzilla.lustre.org/show_bug.cgi?id=13620
>
> thanks, Alex
>
> Wojciech Turek wrote:
>> Hi,
>>
>> My lustre environment is: 2.6.9-55.0.9.EL_lustre.1.6.3smp
>>
>> One of my OSSs crashed today. Below are the messages it (storage09)
>> sent to syslog (the first three messages). Then it died (my guess is
>> a kernel panic) and the heartbeat software STONITHed that OSS.
>>
>> Nov  9 19:08:44 storage09.beowulf.cluster kernel: LDISKFS-fs error
>> (device dm-5): mb_free_blocks: double-free of inode 38887437's block
>> 155560192(bit 10496 in group 4747)
>> Nov  9 19:08:44 storage09.beowulf.cluster kernel:
>> Nov  9 19:08:44 storage09.beowulf.cluster kernel: Remounting filesystem
>> read-only
>> Nov  9 19:08:44 storage09.beowulf.cluster kernel: LDISKFS-fs error
>> (device dm-5): mb_free_blocks: double-free of inode 38887437's block
>> 155560193(bit 10497 in group 4747)
>> Nov  9 19:09:13 storage10.beowulf.cluster heartbeat: [21231]: WARN: node
>> storage09: is dead
>> Nov  9 19:09:13 storage10.beowulf.cluster heartbeat: [21231]: info: Link
>> storage09:eth0 dead.
>> Nov  9 19:09:13 storage10.beowulf.cluster heartbeat: [21231]: info: Link
>> storage09:eth2 dead.
>> Nov  9 19:09:13 storage10.beowulf.cluster heartbeat: [32414]: info:
>> Resetting node storage09 with [external/ipmi ]
>>
>> Do you know how serious LDISKFS-fs errors are? Do they indicate data
>> corruption on that block device? Device dm-5 is a DDN LUN. The DDN
>> S2A9500 controller reports that everything is healthy there.
>>
>> Cheers
>>
>> Wojciech Turek
>>
>>
>> Mr Wojciech Turek
>> Assistant System Manager
>> University of Cambridge
>> High Performance Computing service
>> email: wjt27 at cam.ac.uk
>> tel. +441223763517
>>
>> ------------------------------------------------------------------------
>>
>> _______________________________________________
>> Lustre-discuss mailing list
>> Lustre-discuss at clusterfs.com
>> https://mail.clusterfs.com/mailman/listinfo/lustre-discuss
>
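
A side note on reading the two double-free messages quoted above: the
block number, group, and bit in each message are related by simple
arithmetic. The sketch below parses one message and checks that
relation. It assumes 32768 blocks per group (what a 4 KiB block size
would give) and a first data block of 0; neither value is stated in
this thread, so treat both as assumptions. The names locate, check, and
DOUBLE_FREE are made up for illustration.

```python
import re

# Assumptions (not stated in the thread): 4 KiB blocks, hence
# 32768 blocks per group, and a first data block of 0.
BLOCKS_PER_GROUP = 32768
FIRST_DATA_BLOCK = 0

# Matches the mb_free_blocks double-free message as it appears in the log.
DOUBLE_FREE = re.compile(
    r"double-free of inode (?P<inode>\d+)'s block\s+(?P<block>\d+)"
    r"\s*\(bit (?P<bit>\d+) in group (?P<group>\d+)\)"
)


def locate(block):
    """Return (group, bit) for an absolute block number."""
    return divmod(block - FIRST_DATA_BLOCK, BLOCKS_PER_GROUP)


def check(message):
    """Return True if the message's group/bit match its block number,
    False if they do not, or None if the message does not parse."""
    m = DOUBLE_FREE.search(message)
    if m is None:
        return None
    block, bit, group = (int(m[k]) for k in ("block", "bit", "group"))
    return locate(block) == (group, bit)


msg = ("LDISKFS-fs error (device dm-5): mb_free_blocks: double-free of "
       "inode 38887437's block 155560192(bit 10496 in group 4747)")

print(locate(155560192))  # (4747, 10496)
print(check(msg))         # True
```

Under those assumptions, block 155560192 works out to group 4747, bit
10496, which matches the log. This only shows that the message is
internally consistent; it says nothing about whether the data on dm-5
is damaged.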



