[lustre-discuss] Kernel Panic on Snapshot Mount

Robert Redl robert.redl at lmu.de
Thu Apr 19 07:26:12 PDT 2018


Dear All,

today I updated from Lustre 2.10.3 to 2.11.0 (on CentOS 7.4). The
update is now finished on all servers and everything seems to work fine.
However, when I try to mount a snapshot (we use the ZFS backend), all
servers crash immediately:

Apr 19 16:02:45 server1 kernel: Lustre: 58ffd1e-MDT0000: set dev_rdonly
on this device
Apr 19 16:02:45 server1 kernel: LustreError:
14660:0:(lu_object.c:1178:lu_device_fini()) ASSERTION(
atomic_read(&d->ld_ref) == 0 ) failed: Refcount is 1
Apr 19 16:02:45 server1 kernel: LustreError:
14660:0:(lu_object.c:1178:lu_device_fini()) LBUG
Apr 19 16:02:45 server1 kernel: Pid: 14660, comm: mount.lustre
Apr 19 16:02:45 server1 kernel:
Call Trace:
Apr 19 16:02:45 server1 kernel:  [<ffffffffc06557ae>]
libcfs_call_trace+0x4e/0x60 [libcfs]
Apr 19 16:02:45 server1 kernel:  [<ffffffffc065583c>]
lbug_with_loc+0x4c/0xb0 [libcfs]
Apr 19 16:02:45 server1 kernel:  [<ffffffffc0b5502b>]
lu_device_fini+0xbb/0xc0 [obdclass]

Message from syslogd at met-ha-filer05a at Apr 19 16:02:45 ...
 kernel:LustreError: 14660:0:(lu_object.c:1178:lu_device_fini())
ASSERTION( atomic_read(&d->ld_ref) == 0 ) failed: Refcount is 1
Apr 19 16:02:45 server1 kernel:  [<ffffffffc0b59fae>]
dt_device_fini+0xe/0x10 [obdclass]
Apr 19 16:02:45 server1 kernel:  [<ffffffffc0da2ea8>]
osd_device_alloc+0x278/0x3b0 [osd_zfs]
Apr 19 16:02:45 server1 kernel:  [<ffffffffc0b43f7a>]
obd_setup+0x11a/0x2b0 [obdclass]

Message from syslogd at met-ha-filer05a at Apr 19 16:02:45 ...
 kernel:LustreError: 14660:0:(lu_object.c:1178:lu_device_fini()) LBUG
Apr 19 16:02:45 server1 kernel:  [<ffffffffc0b443b8>]
class_setup+0x2a8/0x840 [obdclass]
Apr 19 16:02:45 server1 kernel:  [<ffffffffc0b4882c>]
class_process_config+0x1b5c/0x2810 [obdclass]
Apr 19 16:02:45 server1 kernel:  [<ffffffff81333563>] ?
number.isra.2+0x323/0x360
Apr 19 16:02:45 server1 kernel:  [<ffffffffc0b4c738>]
do_lcfg+0x258/0x500 [obdclass]
Apr 19 16:02:45 server1 kernel:  [<ffffffffc0b50f88>]
lustre_start_simple+0x88/0x210 [obdclass]
Apr 19 16:02:45 server1 kernel:  [<ffffffffc0b7dfba>]
server_fill_super+0xf3a/0x1860 [obdclass]
Apr 19 16:02:45 server1 kernel:  [<ffffffffc0660e27>] ?
libcfs_debug_msg+0x57/0x80 [libcfs]
Apr 19 16:02:45 server1 kernel:  [<ffffffffc0b54228>]
lustre_fill_super+0x328/0x950 [obdclass]
Apr 19 16:02:45 server1 kernel:  [<ffffffffc0b53f00>] ?
lustre_fill_super+0x0/0x950 [obdclass]
Apr 19 16:02:45 server1 kernel:  [<ffffffff8120948f>] mount_nodev+0x4f/0xb0
Apr 19 16:02:45 server1 kernel:  [<ffffffffc0b4c148>]
lustre_mount+0x38/0x60 [obdclass]
Apr 19 16:02:45 server1 kernel:  [<ffffffff81209f1e>] mount_fs+0x3e/0x1b0
Apr 19 16:02:45 server1 kernel:  [<ffffffff81226d57>]
vfs_kern_mount+0x67/0x110
Apr 19 16:02:45 server1 kernel:  [<ffffffff81229263>] do_mount+0x233/0xaf0
Apr 19 16:02:45 server1 kernel:  [<ffffffff8118bb0e>] ?
__get_free_pages+0xe/0x40
Apr 19 16:02:45 server1 kernel:  [<ffffffff81229ea6>] SyS_mount+0x96/0xf0
Apr 19 16:02:45 server1 kernel:  [<ffffffff816c0715>]
system_call_fastpath+0x1c/0x21
Apr 19 16:02:45 server1 kernel:
Apr 19 16:02:45 server1 kernel: Kernel panic - not syncing: LBUG



I'm posting this here because I don't have an account for the actual bug
tracker. Has anyone experienced a similar issue?

Best regards
Robert
