[lustre-discuss] Cannot mount MDT after upgrading from Lustre 2.12.6 to 2.15.3

Audet, Martin Martin.Audet at cnrc-nrc.gc.ca
Tue Sep 26 12:44:02 PDT 2023


Hello all,


I would appreciate it if the community gave more attention to this issue, because upgrading from 2.12.x to 2.15.x, the two LTS versions, is something we can expect many cluster admins to attempt in the next few months...


We ourselves plan to upgrade a small Lustre (production) system from 2.12.9 to 2.15.3 in the next couple of weeks...

After seeing problem reports like this, we are starting to feel a bit nervous...


The documentation for this major update appears to me to be not very specific...


In this document, for example, https://doc.lustre.org/lustre_manual.xhtml#upgradinglustre , the update process appears not so difficult, and there is no mention of using "tunefs.lustre --writeconf" for this kind of update.
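As I read it, the procedure in that section of the manual boils down to roughly the following. This is only a sketch; the mount points, device names (/dev/mgs_dev, /dev/mdt_dev, /dev/ost_dev) and the package list are assumptions to be adapted to each site:

```shell
# Sketch of the manual's upgrade path (all servers down at once);
# device names, mount points and packages are assumptions.

# 1. Unmount clients first, then OSTs, then the MDT, then the MGS.
umount /mnt/lustre          # on every client
umount /mnt/ost0            # on each OSS, for every OST
umount /mnt/mdt             # on the MDS
umount /mnt/mgs             # on the MGS, if it is a separate target

# 2. Install the new server packages everywhere (RPM example).
yum install e2fsprogs lustre kmod-lustre kmod-lustre-osd-ldiskfs

# 3. Remount in order: MGS first, then MDT, then OSTs, then clients.
mount -t lustre /dev/mgs_dev /mnt/mgs
mount -t lustre /dev/mdt_dev /mnt/mdt
mount -t lustre /dev/ost_dev /mnt/ost0
```

Notably, nothing in that sequence erases the configuration logs, which is why the absence of "--writeconf" in the manual surprised me.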


Or am I missing something?


Thanks in advance for providing more tips for this kind of update.


Martin Audet

________________________________
From: lustre-discuss <lustre-discuss-bounces at lists.lustre.org> on behalf of Tung-Han Hsieh via lustre-discuss <lustre-discuss at lists.lustre.org>
Sent: September 23, 2023 2:20 PM
To: lustre-discuss at lists.lustre.org
Subject: [lustre-discuss] Cannot mount MDT after upgrading from Lustre 2.12.6 to 2.15.3


***Attention*** This email originated from outside of the NRC. ***Attention*** Ce courriel provient de l'extérieur du CNRC.

Dear All,

Today we tried to upgrade our Lustre file system from version 2.12.6 to 2.15.3, but afterwards we could not mount the MDT. Our MDT uses the ldiskfs backend. The upgrade procedure was:

1. Install the new version of e2fsprogs-1.47.0
2. Install Lustre-2.15.3
3. After reboot, run: tunefs.lustre --writeconf /dev/md0
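For context, the manual's procedure for regenerating configuration logs expects "--writeconf" to be run on every target, not just the MDT, with everything unmounted, and the targets then remounted MGS/MDT first. A sketch, where /dev/md0 is the combined MGS/MDT from this report and the OST device names and mount points are assumptions:

```shell
# Sketch of the writeconf sequence from the Lustre manual;
# OST device names and mount points are assumptions.

# With all targets unmounted, erase the configuration logs everywhere:
tunefs.lustre --writeconf /dev/md0        # on the MDS (MGS+MDT here)
tunefs.lustre --writeconf /dev/ost_dev    # on each OSS, for every OST

# Then remount in order so the logs are regenerated:
mount -t lustre /dev/md0 /mnt/mdt         # MGS/MDT first
mount -t lustre /dev/ost_dev /mnt/ost0    # then each OST
```

Whether running writeconf on the MDT alone contributed to the mount failure below, I cannot say.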

Then, when mounting the MDT, we got the following error messages in dmesg:

===========================================================
[11662.434724] LDISKFS-fs (md0): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc
[11662.584593] Lustre: 3440:0:(scrub.c:189:scrub_file_load()) chome-MDT0000: reset scrub OI count for format change (LU-16655)
[11666.036253] Lustre: MGS: Logs for fs chome were removed by user request.  All servers must be restarted in order to regenerate the logs: rc = 0
[11666.523144] Lustre: chome-MDT0000: Imperative Recovery not enabled, recovery window 300-900
[11666.594098] LustreError: 3440:0:(mdd_device.c:1355:mdd_prepare()) chome-MDD0000: get default LMV of root failed: rc = -2
[11666.594291] LustreError: 3440:0:(obd_mount_server.c:2027:server_fill_super()) Unable to start targets: -2
[11666.594951] Lustre: Failing over chome-MDT0000
[11672.868438] Lustre: 3440:0:(client.c:2295:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1695492248/real 1695492248]  req at 000000005dfd9b53 x1777852464760768/t0(0) o251->MGC192.168.32.240 at o2ib@0 at lo:26/25 lens 224/224 e 0 to 1 dl 1695492254 ref 2 fl Rpc:XNQr/0/ffffffff rc 0/-1 job:''
[11672.925905] Lustre: server umount chome-MDT0000 complete
[11672.926036] LustreError: 3440:0:(super25.c:183:lustre_fill_super()) llite: Unable to mount <unknown>: rc = -2
[11872.893970] LDISKFS-fs (md0): mounted filesystem with ordered data mode. Opts: (null)
============================================================

Could anyone help solve this problem? Sorry, but it is really urgent.

Thank you very much.

T.H.Hsieh
