[lustre-discuss] CPU soft lockup on mkfs.lustre

Degremont, Aurelien degremoa at amazon.com
Tue Sep 10 10:22:15 PDT 2019


I saw the same issue and downgraded to the RHEL 7.6 kernel, 3.10.0-957.

But you probably want to keep the 7.7 kernel? :)

From: lustre-discuss <lustre-discuss-bounces at lists.lustre.org> on behalf of Tamas Kazinczy <tamas.kazinczy at kifu.gov.hu>
Organization: Governmental Agency for IT Development
Date: Tuesday, 10 September 2019 at 14:39
To: "lustre-discuss at lists.lustre.org" <lustre-discuss at lists.lustre.org>
Subject: [lustre-discuss] CPU soft lockup on mkfs.lustre


Hi,

I've successfully compiled Lustre after the
'LU-12457 kernel: RHEL 7.7 server support' change,
but when I try to create an MGS with the LDISKFS backend
on RHEL 7.7 I get a CPU soft lockup:

 kernel:NMI watchdog: BUG: soft lockup - CPU#1 stuck for 23s! [mkfs.lustre:31220]



I looked into the logs for more details:
Sep  6 10:41:00 mgs1 kernel: Call Trace:
Sep  6 10:41:00 mgs1 kernel: [<ffffffff9bd73365>] queued_spin_lock_slowpath+0xb/0xf
Sep  6 10:41:00 mgs1 kernel: [<ffffffff9bd81ad0>] _raw_spin_lock+0x20/0x30
Sep  6 10:41:00 mgs1 kernel: [<ffffffff9b865e2e>] igrab+0x1e/0x60
Sep  6 10:41:00 mgs1 kernel: [<ffffffffc06bd88b>] ldiskfs_quota_off+0x3b/0x130 [ldiskfs]
Sep  6 10:41:00 mgs1 kernel: [<ffffffffc06c091d>] ldiskfs_put_super+0x4d/0x400 [ldiskfs]
Sep  6 10:41:00 mgs1 kernel: [<ffffffff9b84b13d>] generic_shutdown_super+0x6d/0x100
Sep  6 10:41:00 mgs1 kernel: [<ffffffff9b84b5b7>] kill_block_super+0x27/0x70
Sep  6 10:41:00 mgs1 kernel: [<ffffffff9b84b91e>] deactivate_locked_super+0x4e/0x70
Sep  6 10:41:00 mgs1 kernel: [<ffffffff9b84c0a6>] deactivate_super+0x46/0x60
Sep  6 10:41:00 mgs1 kernel: [<ffffffff9b86abff>] cleanup_mnt+0x3f/0x80
Sep  6 10:41:00 mgs1 kernel: [<ffffffff9b86ac92>] __cleanup_mnt+0x12/0x20
Sep  6 10:41:00 mgs1 kernel: [<ffffffff9b6c1c0b>] task_work_run+0xbb/0xe0
Sep  6 10:41:00 mgs1 kernel: [<ffffffff9b62cc65>] do_notify_resume+0xa5/0xc0
Sep  6 10:41:00 mgs1 kernel: [<ffffffff9bd8c23b>] int_signal+0x12/0x17
Sep  6 10:41:00 mgs1 kernel: Code: 47 fe ff ff 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 66 90 b9 01 00 00 00 8b 17 85 d2 74 0d 83 fa 03 74 08 f3 90 <8b> 17 85 d2 75 f3 89 d0 f0 0f b1 0f 39 c2 75 e3 5d 66 90 c3 0f
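
To make a trace like the one above easier to compare against other reports, here is a minimal Python sketch that pulls the symbol and module out of each frame. The regular expression and the names `FRAME_RE`, `parse_frames` and `SAMPLE` are illustrative assumptions based only on the log format shown here; they are not part of any Lustre or kernel tooling.

```python
import re

# Assumed frame format, taken from the log lines in this message:
#   "... kernel: [<ffffffff9b865e2e>] igrab+0x1e/0x60"
#   "... kernel: [<ffffffffc06bd88b>] ldiskfs_quota_off+0x3b/0x130 [ldiskfs]"
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+"                  # return address
    r"(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"   # symbol+offset/size
    r"(?:\s+\[(?P<module>[\w-]+)\])?"                # optional [module]
)


def parse_frames(log_text):
    """Return (symbol, module) tuples in log order, innermost frame first.

    module is None when the frame carries no [module] suffix,
    i.e. the symbol belongs to the core kernel.
    """
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            frames.append((match.group("symbol"), match.group("module")))
    return frames


# A few frames copied from the trace above.
SAMPLE = """\
Sep  6 10:41:00 mgs1 kernel: Call Trace:
Sep  6 10:41:00 mgs1 kernel: [<ffffffff9bd81ad0>] _raw_spin_lock+0x20/0x30
Sep  6 10:41:00 mgs1 kernel: [<ffffffff9b865e2e>] igrab+0x1e/0x60
Sep  6 10:41:00 mgs1 kernel: [<ffffffffc06bd88b>] ldiskfs_quota_off+0x3b/0x130 [ldiskfs]
Sep  6 10:41:00 mgs1 kernel: [<ffffffffc06c091d>] ldiskfs_put_super+0x4d/0x400 [ldiskfs]
"""

for symbol, module in parse_frames(SAMPLE):
    print(symbol, module or "kernel")
```

Read this way, the trace shows the innermost frames (`queued_spin_lock_slowpath`, `_raw_spin_lock`, `igrab`) in the core kernel, reached from `ldiskfs_quota_off` and `ldiskfs_put_super` in the `ldiskfs` module: the task is spinning on a lock while the file system is being unmounted.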



Has anyone else experienced this issue?

Is there a way to make this work?



Thanks,

--

Tamás Kazinczy