[lustre-discuss] Lustre/ZFS snapshots mount error

Andreas Dilger adilger at whamcloud.com
Mon Aug 27 14:56:57 PDT 2018


It's probably best to file an LU ticket for this issue.

It looks like the log processing at mount time is trying to modify the configuration files.  I'm not sure whether that should be allowed or not.
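
For reference, the "rc = -30" values in the quoted dmesg output are -EROFS, which matches the "Read-only file system" message printed by mount.lustre, and the "rc = -107" on the ost_disconnect line is -ENOTCONN.  Below is a minimal Python sketch for decoding such return codes from console lines.  It assumes it is run on a Linux host (errno numbering is platform-specific), and decode_rc is a hypothetical helper name used only for illustration, not part of any Lustre tool.

```python
import errno
import os
import re

# Matches a trailing negative return code in the three forms seen in the
# quoted log: "rc = -30", "...targets: -30" and "Unable to mount  (-30)".
RC_PATTERN = re.compile(r"(?:rc\s*=\s*|:\s*|\()-(\d+)\)?\s*$")


def decode_rc(line):
    """Return (rc, errno name, message) for a trailing negative rc, else None.

    Hypothetical helper: relies on the local platform's errno table, so the
    names are only meaningful when run on Linux, like the Lustre servers.
    """
    match = RC_PATTERN.search(line)
    if match is None:
        return None
    code = int(match.group(1))
    name = errno.errorcode.get(code, "UNKNOWN")
    return -code, name, os.strerror(code)


if __name__ == "__main__":
    samples = [
        "36ca26b-MDD0000: fail to destroy empty log: rc = -30",
        "Unable to start targets: -30",
        "Unable to mount  (-30)",
        "Lustre: server umount 3d40bbc-MDT0000 complete",
    ]
    for sample in samples:
        print(sample, "->", decode_rc(sample))
```

Lines without a trailing return code, such as the "server umount ... complete" message, yield None, so the helper can be run over a whole dmesg capture.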

Does fsB have the same MGS as fsA?  Does it have the same MDS node as fsA?
If it has a different MDS, you might consider giving it its own MGS as well.
That doesn't have to be a separate MGS node, just a separate filesystem (ZFS fileset in the same zpool) on the MDS node.

Cheers, Andreas

> On Aug 27, 2018, at 10:18, Kirk, Benjamin (JSC-EG311) <benjamin.kirk at nasa.gov> wrote:
> 
> Hi all,
> 
> We have two filesystems, fsA & fsB (eadc below).  Both get snapshots taken daily, rotated over a week.  It’s a beautiful feature we’ve been using in production ever since it was introduced with 2.10.
> 
> -) We’ve got Lustre/ZFS 2.10.4 on CentOS 7.5.
> -) Both fsA & fsB have changelogs active.
> -) fsA has combined mgt/mdt on a single ZFS filesystem.
> -) fsB has a single mdt on a single ZFS filesystem.
> -) for fsA, I have no issues mounting any of the snapshots via lctl.
> -) for fsB, I can mount the three most recent snapshots, then encounter errors:
> 
> [root@hpfs-fsl-mds0 ~]# lctl snapshot_mount -F eadc -n eadc_AutoSS-Mon
> mounted the snapshot eadc_AutoSS-Mon with fsname 3d40bbc
> [root@hpfs-fsl-mds0 ~]# lctl snapshot_umount -F eadc -n eadc_AutoSS-Mon
> [root@hpfs-fsl-mds0 ~]# lctl snapshot_mount -F eadc -n eadc_AutoSS-Sun
> mounted the snapshot eadc_AutoSS-Sun with fsname 584c07a
> [root@hpfs-fsl-mds0 ~]# lctl snapshot_umount -F eadc -n eadc_AutoSS-Sun
> [root@hpfs-fsl-mds0 ~]# lctl snapshot_mount -F eadc -n eadc_AutoSS-Sat
> mounted the snapshot eadc_AutoSS-Sat with fsname 4e646fe
> [root@hpfs-fsl-mds0 ~]# lctl snapshot_umount -F eadc -n eadc_AutoSS-Sat
> [root@hpfs-fsl-mds0 ~]# lctl snapshot_mount -F eadc -n eadc_AutoSS-Fri
> mount.lustre: mount metadata/meta-eadc@eadc_AutoSS-Fri at /mnt/eadc_AutoSS-Fri_MDT0000 failed: Read-only file system
> Can't mount the snapshot eadc_AutoSS-Fri: Read-only file system
> 
> The relevant bits from dmesg are
> [1353434.417762] Lustre: 3d40bbc-MDT0000: set dev_rdonly on this device
> [1353434.417765] Lustre: Skipped 3 previous similar messages
> [1353434.649480] Lustre: 3d40bbc-MDT0000: Imperative Recovery enabled, recovery window shrunk from 300-900 down to 150-900
> [1353434.649484] Lustre: Skipped 3 previous similar messages
> [1353434.866228] Lustre: 3d40bbc-MDD0000: changelog on
> [1353434.866233] Lustre: Skipped 1 previous similar message
> [1353435.427744] Lustre: 3d40bbc-MDT0000: Connection restored to ...@tcp (at ...@tcp)
> [1353435.427747] Lustre: Skipped 23 previous similar messages
> [1353445.255899] Lustre: Failing over 3d40bbc-MDT0000
> [1353445.255903] Lustre: Skipped 3 previous similar messages
> [1353445.256150] LustreError: 11-0: 3d40bbc-OST0000-osc-MDT0000: operation ost_disconnect to node ...@tcp failed: rc = -107
> [1353445.257896] LustreError: Skipped 23 previous similar messages
> [1353445.353874] Lustre: server umount 3d40bbc-MDT0000 complete
> [1353445.353877] Lustre: Skipped 3 previous similar messages
> [1353475.302224] Lustre: 4e646fe-MDD0000: changelog on
> [1353475.302228] Lustre: Skipped 1 previous similar message
> [1353498.964016] LustreError: 25582:0:(osd_handler.c:341:osd_trans_create()) 36ca26b-MDT0000-osd: someone try to start transaction under readonly mode, should be disabled.
> [1353498.967260] LustreError: 25582:0:(osd_handler.c:341:osd_trans_create()) Skipped 1 previous similar message
> [1353498.968829] CPU: 6 PID: 25582 Comm: mount.lustre Kdump: loaded Tainted: P           OE  ------------   3.10.0-862.6.3.el7.x86_64 #1
> [1353498.968830] Hardware name: Supermicro SYS-6027TR-D71FRF/X9DRT, BIOS 3.2a 08/04/2015
> [1353498.968832] Call Trace:
> [1353498.968841]  [<ffffffffb5b0e80e>] dump_stack+0x19/0x1b
> [1353498.968851]  [<ffffffffc0cbe5db>] osd_trans_create+0x38b/0x3d0 [osd_zfs]
> [1353498.968876]  [<ffffffffc1116044>] llog_destroy+0x1f4/0x3f0 [obdclass]
> [1353498.968887]  [<ffffffffc111f0f6>] llog_cat_reverse_process_cb+0x246/0x3f0 [obdclass]
> [1353498.968897]  [<ffffffffc111a32c>] llog_reverse_process+0x38c/0xaa0 [obdclass]
> [1353498.968910]  [<ffffffffc111eeb0>] ? llog_cat_process_cb+0x4e0/0x4e0 [obdclass]
> [1353498.968922]  [<ffffffffc111af69>] llog_cat_reverse_process+0x179/0x270 [obdclass]
> [1353498.968932]  [<ffffffffc1115585>] ? llog_init_handle+0xd5/0x9a0 [obdclass]
> [1353498.968943]  [<ffffffffc1116e78>] ? llog_open_create+0x78/0x320 [obdclass]
> [1353498.968949]  [<ffffffffc12e55f0>] ? mdd_root_get+0xf0/0xf0 [mdd]
> [1353498.968954]  [<ffffffffc12ec7af>] mdd_prepare+0x13ff/0x1c70 [mdd]
> [1353498.968966]  [<ffffffffc166b037>] mdt_prepare+0x57/0x3b0 [mdt]
> [1353498.968983]  [<ffffffffc1183afd>] server_start_targets+0x234d/0x2bd0 [obdclass]
> [1353498.968999]  [<ffffffffc1153500>] ? class_config_dump_handler+0x7e0/0x7e0 [obdclass]
> [1353498.969012]  [<ffffffffc118541d>] server_fill_super+0x109d/0x185a [obdclass]
> [1353498.969025]  [<ffffffffc115cef8>] lustre_fill_super+0x328/0x950 [obdclass]
> [1353498.969038]  [<ffffffffc115cbd0>] ? lustre_common_put_super+0x270/0x270 [obdclass]
> [1353498.969041]  [<ffffffffb561f3bf>] mount_nodev+0x4f/0xb0
> [1353498.969053]  [<ffffffffc1154f18>] lustre_mount+0x38/0x60 [obdclass]
> [1353498.969055]  [<ffffffffb561ff3e>] mount_fs+0x3e/0x1b0
> [1353498.969060]  [<ffffffffb563d4b7>] vfs_kern_mount+0x67/0x110
> [1353498.969062]  [<ffffffffb563fadf>] do_mount+0x1ef/0xce0
> [1353498.969066]  [<ffffffffb55f7c2c>] ? kmem_cache_alloc_trace+0x3c/0x200
> [1353498.969069]  [<ffffffffb5640913>] SyS_mount+0x83/0xd0
> [1353498.969074]  [<ffffffffb5b20795>] system_call_fastpath+0x1c/0x21
> [1353498.969079] LustreError: 25582:0:(llog_cat.c:1027:llog_cat_reverse_process_cb()) 36ca26b-MDD0000: fail to destroy empty log: rc = -30
> [1353498.970785] CPU: 6 PID: 25582 Comm: mount.lustre Kdump: loaded Tainted: P           OE  ------------   3.10.0-862.6.3.el7.x86_64 #1
> [1353498.970786] Hardware name: Supermicro SYS-6027TR-D71FRF/X9DRT, BIOS 3.2a 08/04/2015
> [1353498.970787] Call Trace:
> [1353498.970790]  [<ffffffffb5b0e80e>] dump_stack+0x19/0x1b
> [1353498.970795]  [<ffffffffc0cbe5db>] osd_trans_create+0x38b/0x3d0 [osd_zfs]
> [1353498.970807]  [<ffffffffc1117921>] llog_cancel_rec+0xc1/0x880 [obdclass]
> [1353498.970817]  [<ffffffffc111e13b>] llog_cat_cleanup+0xdb/0x380 [obdclass]
> [1353498.970827]  [<ffffffffc111f14d>] llog_cat_reverse_process_cb+0x29d/0x3f0 [obdclass]
> [1353498.970838]  [<ffffffffc111a32c>] llog_reverse_process+0x38c/0xaa0 [obdclass]
> [1353498.970848]  [<ffffffffc111eeb0>] ? llog_cat_process_cb+0x4e0/0x4e0 [obdclass]
> [1353498.970858]  [<ffffffffc111af69>] llog_cat_reverse_process+0x179/0x270 [obdclass]
> [1353498.970868]  [<ffffffffc1115585>] ? llog_init_handle+0xd5/0x9a0 [obdclass]
> [1353498.970878]  [<ffffffffc1116e78>] ? llog_open_create+0x78/0x320 [obdclass]
> [1353498.970883]  [<ffffffffc12e55f0>] ? mdd_root_get+0xf0/0xf0 [mdd]
> [1353498.970887]  [<ffffffffc12ec7af>] mdd_prepare+0x13ff/0x1c70 [mdd]
> [1353498.970894]  [<ffffffffc166b037>] mdt_prepare+0x57/0x3b0 [mdt]
> [1353498.970908]  [<ffffffffc1183afd>] server_start_targets+0x234d/0x2bd0 [obdclass]
> [1353498.970924]  [<ffffffffc1153500>] ? class_config_dump_handler+0x7e0/0x7e0 [obdclass]
> [1353498.970938]  [<ffffffffc118541d>] server_fill_super+0x109d/0x185a [obdclass]
> [1353498.970950]  [<ffffffffc115cef8>] lustre_fill_super+0x328/0x950 [obdclass]
> [1353498.970962]  [<ffffffffc115cbd0>] ? lustre_common_put_super+0x270/0x270 [obdclass]
> [1353498.970964]  [<ffffffffb561f3bf>] mount_nodev+0x4f/0xb0
> [1353498.970976]  [<ffffffffc1154f18>] lustre_mount+0x38/0x60 [obdclass]
> [1353498.970978]  [<ffffffffb561ff3e>] mount_fs+0x3e/0x1b0
> [1353498.970980]  [<ffffffffb563d4b7>] vfs_kern_mount+0x67/0x110
> [1353498.970982]  [<ffffffffb563fadf>] do_mount+0x1ef/0xce0
> [1353498.970984]  [<ffffffffb55f7c2c>] ? kmem_cache_alloc_trace+0x3c/0x200
> [1353498.970986]  [<ffffffffb5640913>] SyS_mount+0x83/0xd0
> [1353498.970989]  [<ffffffffb5b20795>] system_call_fastpath+0x1c/0x21
> [1353498.970996] LustreError: 25582:0:(mdd_device.c:354:mdd_changelog_llog_init()) 36ca26b-MDD0000: changelog init failed: rc = -30
> [1353498.972790] LustreError: 25582:0:(mdd_device.c:427:mdd_changelog_init()) 36ca26b-MDD0000: changelog setup during init failed: rc = -30
> [1353498.974525] LustreError: 25582:0:(mdd_device.c:1061:mdd_prepare()) 36ca26b-MDD0000: failed to initialize changelog: rc = -30
> [1353498.976229] LustreError: 25582:0:(obd_mount_server.c:1879:server_fill_super()) Unable to start targets: -30
> [1353499.072002] LustreError: 25582:0:(obd_mount.c:1582:lustre_fill_super()) Unable to mount  (-30)
> 
> 
> I’m hoping those traces mean something to someone - any ideas?
> 
> Thanks!
> 
> --
> Benjamin S. Kirk

---
Andreas Dilger
CTO Whamcloud