[Lustre-discuss] odd kernel crash after a heartbeat failover
Andreas Dilger
andreas.dilger at oracle.com
Fri Apr 16 16:44:11 PDT 2010
On 2010-04-16, at 11:29, John White wrote:
> Just to follow up: after enabling netconsole to get some meaningful
> logging out of these OSSs, it is clear that there's a problem with
> the backend storage communication and that this certainly isn't a
> Lustre issue. Thanks, folks.
>
> On Apr 15, 2010, at 9:45 PM, Cliff White wrote:
>> John White wrote:
>>> This is actually happening repeatedly. Any idea if this is a
>>> Lustre-side error?
>>> kernel: Unable to handle NULL pointer dereference at 0000000000000000
>>> kernel: LDISKFS-fs error (device dm-7) in ldiskfs_reserve_inode_write: Journal has aborted
>>> kernel: Oops: 0002 [1] SMP
>>> kernel: RIP jbd:journal_commit_transaction+0xc33/0x132e
Could you please decode journal_commit_transaction+0xc33 to see which
source line it corresponds to? This Oops shouldn't be happening, even
if the journal has aborted.
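
One possible way to do that decoding, sketched under assumptions: the
small Python helper below pulls the module, symbol and offset out of the
RIP line quoted above and prints the expression to give to gdb's
"list *(symbol+offset)" command. It assumes the abbreviated RIP format
shown in this thread and that a jbd module built with debug info for the
running kernel is available to load into gdb; the helper names parse_rip
and gdb_list_expr are made up for illustration.

```python
import re

# Assumed format, taken from the Oops quoted in this thread:
#   RIP jbd:journal_commit_transaction+0xc33/0x132e
# The "module:" prefix is treated as optional.
RIP_RE = re.compile(
    r"RIP\s+(?:(?P<module>\w+):)?(?P<symbol>\w+)"
    r"\+(?P<offset>0x[0-9a-fA-F]+)/(?P<size>0x[0-9a-fA-F]+)"
)


def parse_rip(line):
    """Return module, symbol, offset and function size, or None if no match."""
    m = RIP_RE.search(line)
    if m is None:
        return None
    return {
        "module": m.group("module"),
        "symbol": m.group("symbol"),
        "offset": int(m.group("offset"), 16),
        "size": int(m.group("size"), 16),
    }


def gdb_list_expr(rip):
    """Build the gdb command that maps symbol+offset to a source line."""
    return "list *({}+{:#x})".format(rip["symbol"], rip["offset"])


rip = parse_rip("kernel: RIP jbd:journal_commit_transaction+0xc33/0x132e")
print(rip["module"], gdb_list_expr(rip))
```

The printed expression is meant to be typed at the gdb prompt after
loading the jbd module object that matches the crashed kernel; where
that object lives depends on the distribution and is not shown here.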
Cheers, Andreas
--
Andreas Dilger
Principal Engineer, Lustre Group
Oracle Corporation Canada Inc.