[lustre-discuss] kernel panic's with Lustre 2.7

H. Meijering h.meijering at rug.nl
Thu Jul 26 07:17:27 PDT 2018


Hi,

Since last Monday we have had a strange problem on one of our Lustre
file systems: the MDT started to crash, which resulted in kernel panics.

The configuration is as follows:
2 metadata servers, one providing the MGS and the other the MDT; the
metadata servers are configured in an HA setup with Pacemaker.
18 data servers (OSSes), each providing 6 OSTs; the OSSes are paired, and
each pair is configured in an HA setup with Pacemaker.

We have connected 59 clients to this setup: 50 'normal' compute nodes, 2
head nodes, 4 GPU nodes, 2 nodes for ingest (data transport to other
sites), and 1 node for Robinhood.
The OSSes and metadata servers are running Lustre 2.7; all are based on
ext4.

We ran e2fsck on all the volumes, and after that an lfsck, using the
following command:
lctl lfsck_start -M cep4-fs-MDT0000 -A -t all -r
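
For reference, a minimal sketch of how the state of such an lfsck run can be
inspected afterwards. It assumes the standard mdd.<target>.lfsck_namespace
and mdd.<target>.lfsck_layout parameters are available on the MDS; the
helper name lfsck_status is hypothetical and not part of Lustre.

```shell
# Hypothetical helper (not part of Lustre): print the LFSCK status pages
# for one MDT. Assumes it is run on the MDS where that MDT is mounted.
lfsck_status() {
    mdt="$1"    # target name, e.g. cep4-fs-MDT0000
    for component in lfsck_namespace lfsck_layout; do
        echo "== ${component} on ${mdt} =="
        # Each parameter reports the state of one LFSCK component.
        lctl get_param -n "mdd.${mdt}.${component}"
    done
}
```

It would be called as "lfsck_status cep4-fs-MDT0000" to see whether each
component has finished and what it reports.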

None of this brought us back to business; after mounting the clients, the
MDT crashed again.
We removed the changelog from the MDT, and since then we have been able to
use the filesystem.
When we enabled the changelog again on the MDT, it crashed almost
instantly.
At the moment, the process for which this storage cluster is used depends
on Robinhood, and without the changelog Robinhood doesn't work.
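
For reference, a minimal sketch of how the changelog state on the MDT can be
inspected before it is enabled again. It assumes the standard
mdd.<target>.changelog_users and mdd.<target>.changelog_mask parameters; the
helper name changelog_state is hypothetical, and this only reads state, it
does not change anything.

```shell
# Hypothetical helper (not part of Lustre): show the changelog state of
# one MDT. Assumes it is run on the MDS where that MDT is mounted.
changelog_state() {
    mdt="$1"    # target name, e.g. cep4-fs-MDT0000
    echo "== changelog users on ${mdt} =="
    # Registered consumers (such as Robinhood) and their record indexes.
    lctl get_param -n "mdd.${mdt}.changelog_users"
    echo "== changelog mask on ${mdt} =="
    # Record types that are currently written to the changelog.
    lctl get_param -n "mdd.${mdt}.changelog_mask"
}
```

It would be called as "changelog_state cep4-fs-MDT0000"; the users list shows
how far each consumer lags behind the current index.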

The console log provided us with the following call traces:
[-- MARK -- Mon Jul 23 10:00:00 2018]
Lustre: 12777:0:(osd_internal.h:1014:osd_trans_exec_op())
cep4-fs-MDT0000-osd: Overflow in tracking declares for index, rb = 4
Pid: 12777, comm: mdt00_010

Call Trace:
 [<ffffffffa0502895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
 [<ffffffffa0e13d88>] osd_trans_exec_op+0x1f8/0x2e0 [osd_ldiskfs]
 [<ffffffffa0e28b08>] osd_object_ea_create+0x198/0x8c0 [osd_ldiskfs]
 [<ffffffffa063a2eb>] local_object_create+0xdb/0x430 [obdclass]
 [<ffffffffa061dac2>] llog_osd_create+0x3d2/0x800 [obdclass]
 [<ffffffffa060c821>] llog_create+0x81/0x1e0 [obdclass]
 [<ffffffffa0613dd2>] llog_cat_new_log+0xe2/0x710 [obdclass]
 [<ffffffffa061450a>] llog_cat_add_rec+0x10a/0x450 [obdclass]
 [<ffffffffa060c1e9>] llog_add+0x89/0x1c0 [obdclass]
 [<ffffffffa10d94e2>] mdd_changelog_store+0x122/0x290 [mdd]
 [<ffffffffa10ecd0c>] mdd_changelog_data_store+0x16c/0x320 [mdd]
 [<ffffffffa10f0a86>] mdd_xattr_del+0x386/0x3d0 [mdd]
 [<ffffffffa10f3c40>] mdd_xattr_set+0x3c0/0xe40 [mdd]
 [<ffffffffa0870a34>] ? lustre_msg_get_versions+0xa4/0x120 [ptlrpc]
 [<ffffffffa0fa804c>] ? mdt_version_save+0x8c/0x1a0 [mdt]
 [<ffffffffa0fb2615>] mdt_reint_setxattr+0x975/0x1810 [mdt]
 [<ffffffffa051f83c>] ? upcall_cache_get_entry+0x29c/0x880 [libcfs]
 [<ffffffffa0fa70cd>] mdt_reint_rec+0x5d/0x200 [mdt]
 [<ffffffffa0f8b23b>] mdt_reint_internal+0x4cb/0x7a0 [mdt]
 [<ffffffffa0f8b9ab>] mdt_reint+0x6b/0x120 [mdt]
 [<ffffffffa08d056e>] tgt_request_handle+0x8be/0x1000 [ptlrpc]
 [<ffffffffa08805a1>] ptlrpc_main+0xe41/0x1960 [ptlrpc]
 [<ffffffffa087f760>] ? ptlrpc_main+0x0/0x1960 [ptlrpc]
 [<ffffffff8109e66e>] kthread+0x9e/0xc0
 [<ffffffff8100c20a>] child_rip+0xa/0x20
 [<ffffffff8109e5d0>] ? kthread+0x0/0xc0
 [<ffffffff8100c200>] ? child_rip+0x0/0x20

Lustre: 154617:0:(osd_internal.h:1014:osd_trans_exec_op())
cep4-fs-MDT0000-osd: Overflow in tracking declares for index, rb = 4
Pid: 154617, comm: mdt00_035

Call Trace:
 [<ffffffffa0502895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
 [<ffffffffa0e13d88>] osd_trans_exec_op+0x1f8/0x2e0 [osd_ldiskfs]
 [<ffffffffa0e28b08>] osd_object_ea_create+0x198/0x8c0 [osd_ldiskfs]
 [<ffffffffa063a2eb>] local_object_create+0xdb/0x430 [obdclass]
 [<ffffffffa061dac2>] llog_osd_create+0x3d2/0x800 [obdclass]
 [<ffffffffa060c821>] llog_create+0x81/0x1e0 [obdclass]
 [<ffffffffa0613dd2>] llog_cat_new_log+0xe2/0x710 [obdclass]
 [<ffffffffa061450a>] llog_cat_add_rec+0x10a/0x450 [obdclass]
 [<ffffffffa060c1e9>] llog_add+0x89/0x1c0 [obdclass]
 [<ffffffffa10d94e2>] mdd_changelog_store+0x122/0x290 [mdd]
 [<ffffffffa10ecd0c>] mdd_changelog_data_store+0x16c/0x320 [mdd]
 [<ffffffffa10f0a86>] mdd_xattr_del+0x386/0x3d0 [mdd]
 [<ffffffffa10f3c40>] mdd_xattr_set+0x3c0/0xe40 [mdd]
 [<ffffffffa0870a34>] ? lustre_msg_get_versions+0xa4/0x120 [ptlrpc]
 [<ffffffffa0fa804c>] ? mdt_version_save+0x8c/0x1a0 [mdt]
 [<ffffffffa0fb2615>] mdt_reint_setxattr+0x975/0x1810 [mdt]
 [<ffffffffa051f83c>] ? upcall_cache_get_entry+0x29c/0x880 [libcfs]
 [<ffffffffa0fa70cd>] mdt_reint_rec+0x5d/0x200 [mdt]
 [<ffffffffa0f8b23b>] mdt_reint_internal+0x4cb/0x7a0 [mdt]
 [<ffffffffa0f8b9ab>] mdt_reint+0x6b/0x120 [mdt]
 [<ffffffffa08d056e>] tgt_request_handle+0x8be/0x1000 [ptlrpc]
 [<ffffffffa08805a1>] ptlrpc_main+0xe41/0x1960 [ptlrpc]
 [<ffffffffa087f760>] ? ptlrpc_main+0x0/0x1960 [ptlrpc]
 [<ffffffff8109e66e>] kthread+0x9e/0xc0
 [<ffffffff8100c20a>] child_rip+0xa/0x20
 [<ffffffff8109e5d0>] ? kthread+0x0/0xc0
 [<ffffffff8100c200>] ? child_rip+0x0/0x20

Lustre: 12813:0:(osd_internal.h:1014:osd_trans_exec_op())
cep4-fs-MDT0000-osd: Overflow in tracking declares for index, rb = 4
Pid: 12813, comm: mdt00_020

Call Trace:
 [<ffffffffa0502895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
 [<ffffffffa0e13d88>] osd_trans_exec_op+0x1f8/0x2e0 [osd_ldiskfs]
 [<ffffffffa0e28b08>] osd_object_ea_create+0x198/0x8c0 [osd_ldiskfs]
 [<ffffffffa063a2eb>] local_object_create+0xdb/0x430 [obdclass]
 [<ffffffffa061dac2>] llog_osd_create+0x3d2/0x800 [obdclass]
 [<ffffffffa060c821>] llog_create+0x81/0x1e0 [obdclass]
 [<ffffffffa0613dd2>] llog_cat_new_log+0xe2/0x710 [obdclass]
 [<ffffffffa061450a>] llog_cat_add_rec+0x10a/0x450 [obdclass]
 [<ffffffffa060c1e9>] llog_add+0x89/0x1c0 [obdclass]
 [<ffffffffa10d94e2>] mdd_changelog_store+0x122/0x290 [mdd]
 [<ffffffffa10ecd0c>] mdd_changelog_data_store+0x16c/0x320 [mdd]
 [<ffffffffa10f0a86>] mdd_xattr_del+0x386/0x3d0 [mdd]
 [<ffffffffa10f3c40>] mdd_xattr_set+0x3c0/0xe40 [mdd]
 [<ffffffffa0870a34>] ? lustre_msg_get_versions+0xa4/0x120 [ptlrpc]
 [<ffffffffa0fa804c>] ? mdt_version_save+0x8c/0x1a0 [mdt]
 [<ffffffffa0fb2615>] mdt_reint_setxattr+0x975/0x1810 [mdt]
 [<ffffffffa051f83c>] ? upcall_cache_get_entry+0x29c/0x880 [libcfs]
 [<ffffffffa0fa70cd>] mdt_reint_rec+0x5d/0x200 [mdt]
 [<ffffffffa0f8b23b>] mdt_reint_internal+0x4cb/0x7a0 [mdt]
 [<ffffffffa0f8b9ab>] mdt_reint+0x6b/0x120 [mdt]
 [<ffffffffa08d056e>] tgt_request_handle+0x8be/0x1000 [ptlrpc]
 [<ffffffffa08805a1>] ptlrpc_main+0xe41/0x1960 [ptlrpc]
 [<ffffffffa087f760>] ? ptlrpc_main+0x0/0x1960 [ptlrpc]
 [<ffffffff8109e66e>] kthread+0x9e/0xc0
 [<ffffffff8100c20a>] child_rip+0xa/0x20
 [<ffffffff8109e5d0>] ? kthread+0x0/0xc0
 [<ffffffff8100c200>] ? child_rip+0x0/0x20

LustreError: 15586:0:(osd_handler.c:1017:osd_trans_start()) ASSERTION(
get_current()->journal_info == ((void *)0) ) failed:
LustreError: 15586:0:(osd_handler.c:1017:osd_trans_start()) LBUG
Pid: 15586, comm: mdt_rdpg00_004

Call Trace:
 [<ffffffffa0502895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
 [<ffffffffa0502e97>] lbug_with_loc+0x47/0xb0 [libcfs]
 [<ffffffffa0e1524d>] osd_trans_start+0x25d/0x660 [osd_ldiskfs]
 [<ffffffffa061ab4a>] llog_osd_destroy+0x42a/0xd40 [obdclass]
 [<ffffffffa0613edc>] llog_cat_new_log+0x1ec/0x710 [obdclass]
 [<ffffffffa061450a>] llog_cat_add_rec+0x10a/0x450 [obdclass]
 [<ffffffffa060c1e9>] llog_add+0x89/0x1c0 [obdclass]
 [<ffffffffa0654fdf>] ? keys_fill+0x6f/0x190 [obdclass]
 [<ffffffffa10d94e2>] mdd_changelog_store+0x122/0x290 [mdd]
 [<ffffffffa10ecd0c>] mdd_changelog_data_store+0x16c/0x320 [mdd]
 [<ffffffffa10f18ee>] mdd_close+0x34e/0xc50 [mdd]
 [<ffffffffa0fba801>] mdt_mfd_close+0x3f1/0xac0 [mdt]
 [<ffffffffa0636905>] ? class_handle2object+0x95/0x190 [obdclass]
 [<ffffffffa0fbc313>] mdt_close+0x6f3/0xaa0 [mdt]
 [<ffffffffa08d056e>] tgt_request_handle+0x8be/0x1000 [ptlrpc]
 [<ffffffffa08805a1>] ptlrpc_main+0xe41/0x1960 [ptlrpc]
 [<ffffffffa087f760>] ? ptlrpc_main+0x0/0x1960 [ptlrpc]
 [<ffffffff8109e66e>] kthread+0x9e/0xc0
 [<ffffffff8100c20a>] child_rip+0xa/0x20
 [<ffffffff8109e5d0>] ? kthread+0x0/0xc0
 [<ffffffff8100c200>] ? child_rip+0x0/0x20

Kernel panic - not syncing: LBUG
Pid: 15586, comm: mdt_rdpg00_004 Not tainted
2.6.32-504.8.1.el6_lustre.x86_64 #1
Call Trace:
 [<ffffffff81529b76>] ? panic+0xa7/0x16f
 [<ffffffffa0502eeb>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
 [<ffffffffa0e1524d>] ? osd_trans_start+0x25d/0x660 [osd_ldiskfs]
 [<ffffffffa061ab4a>] ? llog_osd_destroy+0x42a/0xd40 [obdclass]
 [<ffffffffa0613edc>] ? llog_cat_new_log+0x1ec/0x710 [obdclass]
 [<ffffffffa061450a>] ? llog_cat_add_rec+0x10a/0x450 [obdclass]
 [<ffffffffa060c1e9>] ? llog_add+0x89/0x1c0 [obdclass]
 [<ffffffffa0654fdf>] ? keys_fill+0x6f/0x190 [obdclass]
 [<ffffffffa10d94e2>] ? mdd_changelog_store+0x122/0x290 [mdd]
 [<ffffffffa10ecd0c>] ? mdd_changelog_data_store+0x16c/0x320 [mdd]
 [<ffffffffa10f18ee>] ? mdd_close+0x34e/0xc50 [mdd]
 [<ffffffffa0fba801>] ? mdt_mfd_close+0x3f1/0xac0 [mdt]
 [<ffffffffa0636905>] ? class_handle2object+0x95/0x190 [obdclass]
 [<ffffffffa0fbc313>] ? mdt_close+0x6f3/0xaa0 [mdt]
 [<ffffffffa08d056e>] ? tgt_request_handle+0x8be/0x1000 [ptlrpc]
 [<ffffffffa08805a1>] ? ptlrpc_main+0xe41/0x1960 [ptlrpc]
 [<ffffffffa087f760>] ? ptlrpc_main+0x0/0x1960 [ptlrpc]
 [<ffffffff8109e66e>] ? kthread+0x9e/0xc0
 [<ffffffff8100c20a>] ? child_rip+0xa/0x20
 [<ffffffff8109e5d0>] ? kthread+0x0/0xc0
 [<ffffffff8100c200>] ? child_rip+0x0/0x20

We hope there is a solution for this problem and that we can go back to
production.

Best regards,
-- 


*Hopko Meijering*

University of Groningen
Center for Information Technology (CIT)

www.rug.nl/cit