[lustre-discuss] Errors when starting Lustre on CentOS 6.5

Guillaume Postic guillaume.postic at univ-paris-diderot.fr
Wed Nov 28 06:37:02 PST 2018


Hello,

When running 'mount.lustre /dev/sdb /mdt', I got the following errors:

--------------------------------------------------------------------------------
Nov 28 10:52:27 localhost kernel: LNet: HW CPU cores: 32, npartitions: 4
Nov 28 10:52:27 localhost kernel: alg: No test for adler32 (adler32-zlib)
Nov 28 10:52:27 localhost kernel: alg: No test for crc32 (crc32-table)
Nov 28 10:52:27 localhost kernel: alg: No test for crc32 (crc32-pclmul)
Nov 28 10:52:35 localhost kernel: Lustre: Lustre: Build Version: 2.7.0-RC4--PRISTINE-2.6.32-504.8.1.el6_lustre.x86_64
Nov 28 10:52:35 localhost kernel: LNet: Added LNI 10.0.1.60@tcp [8/256/0/180]
Nov 28 10:52:35 localhost kernel: LNet: Added LNI 172.27.7.38@tcp1 [8/256/0/180]
Nov 28 10:52:35 localhost kernel: LNet: Accept secure, port 988
Nov 28 10:52:37 localhost kernel: LDISKFS-fs (sdb): recovery complete
Nov 28 10:52:37 localhost kernel: LDISKFS-fs (sdb): mounted filesystem with ordered data mode. quota=on. Opts:
Nov 28 10:52:47 localhost kernel: Lustre: lustre-MDD0000: changelog on
Nov 28 10:52:47 localhost kernel: Lustre: lustre-MDT0000: Will be in recovery for at least 5:00, or until 112 clients reconnect
Nov 28 10:52:49 localhost kernel: Lustre: lustre-MDT0000: Client 5800a16f-8e18-e4f3-32a0-041e00a27e97 (at 10.0.1.102@tcp) reconnecting, waiting for 112 clients in recovery for 4:57
Nov 28 10:52:49 localhost kernel: Lustre: 8210:0:(client.c:1939:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1543398764/real 1543398764] req@ffff88081ef2a080 x1618370892922924/t0(0) o8->lustre-OST0002-osc-MDT0000@10.0.1.63@tcp:28/4 lens 400/544 e 0 to 1 dl 1543398769 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
Nov 28 10:52:49 localhost kernel: LustreError: 8355:0:(osd_handler.c:1017:osd_trans_start()) ASSERTION( get_current()->journal_info == ((void *)0) ) failed:
Nov 28 10:52:49 localhost kernel: LustreError: 8355:0:(osd_handler.c:1017:osd_trans_start()) LBUG
Nov 28 10:52:49 localhost kernel: Pid: 8355, comm: mdt03_003
Nov 28 10:52:49 localhost kernel:
Nov 28 10:52:49 localhost kernel: Call Trace:
Nov 28 10:52:49 localhost kernel: [<ffffffffa031b895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
Nov 28 10:52:49 localhost kernel: [<ffffffffa031be97>] lbug_with_loc+0x47/0xb0 [libcfs]
Nov 28 10:52:49 localhost kernel: [<ffffffffa0be424d>] osd_trans_start+0x25d/0x660 [osd_ldiskfs]
Nov 28 10:52:49 localhost kernel: [<ffffffffa0434b4a>] llog_osd_destroy+0x42a/0xd40 [obdclass]
Nov 28 10:52:49 localhost kernel: [<ffffffffa042dedc>] llog_cat_new_log+0x1ec/0x710 [obdclass]

Message from syslogd@localhost at Nov 28 10:52:49 ...
   kernel:LustreError: 8355:0:(osd_handler.c:1017:osd_trans_start()) ASSERTION( get_current()->journal_info == ((void *)0) ) failed:

Message from syslogd@localhost at Nov 28 10:52:49 ...
   kernel:LustreError: 8355:0:(osd_handler.c:1017:osd_trans_start()) LBUG
Nov 28 10:52:49 localhost kernel: [<ffffffffa0eab54d>] ? lod_xattr_set_internal+0x1bd/0x420 [lod]
Nov 28 10:52:49 localhost kernel: [<ffffffffa042e50a>] llog_cat_add_rec+0x10a/0x450 [obdclass]
Nov 28 10:52:49 localhost kernel: [<ffffffffa04261e9>] llog_add+0x89/0x1c0 [obdclass]
Nov 28 10:52:49 localhost kernel: [<ffffffffa0f084e2>] mdd_changelog_store+0x122/0x290 [mdd]
Nov 28 10:52:49 localhost kernel: [<ffffffffa0f08825>] mdd_changelog_ns_store+0x1d5/0x610 [mdd]
Nov 28 10:52:49 localhost kernel: [<ffffffffa0f0c2c2>] ? mdd_links_rename+0x2f2/0x530 [mdd]
Nov 28 10:52:49 localhost kernel: [<ffffffffa0f0d76a>] ? __mdd_index_insert+0x5a/0x160 [mdd]
Nov 28 10:52:49 localhost kernel: [<ffffffffa0f173c8>] mdd_create+0x12b8/0x1730 [mdd]
Nov 28 10:52:49 localhost kernel: [<ffffffffa0de1cb8>] mdo_create+0x18/0x50 [mdt]
Nov 28 10:52:49 localhost kernel: [<ffffffffa0debe6f>] mdt_reint_open+0x1f8f/0x2c70 [mdt]
Nov 28 10:52:49 localhost kernel: [<ffffffff8109eefc>] ? remove_wait_queue+0x3c/0x50
Nov 28 10:52:49 localhost kernel: [<ffffffffa033883c>] ? upcall_cache_get_entry+0x29c/0x880 [libcfs]
Nov 28 10:52:49 localhost kernel: [<ffffffffa0dd30cd>] mdt_reint_rec+0x5d/0x200 [mdt]
Nov 28 10:52:49 localhost kernel: [<ffffffffa0db723b>] mdt_reint_internal+0x4cb/0x7a0 [mdt]
Nov 28 10:52:49 localhost kernel: [<ffffffffa0db7706>] mdt_intent_reint+0x1f6/0x430 [mdt]
Nov 28 10:52:49 localhost kernel: [<ffffffffa0db5cf4>] mdt_intent_policy+0x494/0xce0 [mdt]
Nov 28 10:52:49 localhost kernel: [<ffffffffa063f4f9>] ldlm_lock_enqueue+0x129/0x9d0 [ptlrpc]
Nov 28 10:52:49 localhost kernel: [<ffffffffa066b46b>] ldlm_handle_enqueue0+0x51b/0x13f0 [ptlrpc]
--------------------------------------------------------------------------------

Does anyone know how to solve this problem?

Build version: 2.7.0-RC4--PRISTINE-2.6.32-504.8.1.el6_lustre.x86_64

Thanks a lot,
Guillaume Postic

