[lustre-discuss] Lustre and server upgrade

Colin Faber cfaber at gmail.com
Thu Nov 18 14:34:03 PST 2021


The VM will need a full install of all server packages, as well as the
tests package to allow for this test.
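
For an ldiskfs-based test setup, a rough sketch of that install (assuming the
lustre-2.12 el7 server repo is already configured on the VM and the usual 2.12
package names) would be something like:

    yum install lustre kmod-lustre kmod-lustre-osd-ldiskfs \
        lustre-osd-ldiskfs-mount lustre-tests kmod-lustre-tests

The matching lustre-patched kernel also needs to be installed and booted
before the ldiskfs module will load against it.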

On Thu, Nov 18, 2021 at 2:26 PM STEPHENS, DEAN - US <dean.stephens at caci.com>
wrote:

> I have not tried that, but I can do that on a new VM that I can create. I
> assume all that I need is the lustre-tests RPM and its associated
> dependencies, and not the full-blown Lustre install?
>
>
>
> Dean
>
>
>
> *From:* Colin Faber <cfaber at gmail.com>
> *Sent:* Thursday, November 18, 2021 2:22 PM
> *To:* STEPHENS, DEAN - US <dean.stephens at caci.com>
> *Cc:* lustre-discuss at lists.lustre.org
> *Subject:* Re: [lustre-discuss] Lustre and server upgrade
>
>
>
> So that indicates that your installation is incomplete or that something
> else is preventing lustre, ldiskfs, and possibly other modules from loading.
> Have you been able to reproduce this behavior on a fresh RHEL install with
> Lustre 2.12.7 (i.e. llmount.sh failing)?
>
>
>
> -cf
>
>
>
>
>
> On Thu, Nov 18, 2021 at 2:20 PM STEPHENS, DEAN - US <
> dean.stephens at caci.com> wrote:
>
> Thanks for the direction. I found it and installed lustre-tests.x86_64, and
> now I have llmount.sh; it was installed to
> /usr/lib64/lustre/tests/llmount.sh, and when I ran it, it failed with:
>
>
>
> Stopping clients: <hostname> /mnt/lustre (opts: -f)
>
> Stopping clients: <hostname> /mnt/lustre2 (opts: -f)
>
> Loading modules from /usr/lib64/lustre/tests/..
>
> Detected 2 online CPUs by sysfs
>
> Force libcfs to create 2 CPU partitions
>
> Formatting mgs, mds, osts
>
> Format mds1: /tmp/lustre-mdt1
>
> mkfs.lustre: Unable to mount /dev/loop0: No such device (even though
> /dev/loop0 exists)
> Is the ldiskfs module loaded?
>
>
>
> mkfs.lustre FATAL: failed to write local files
>
> mkfs.lustre: exiting with 19 (No such device)
>
>
>
> *From:* Colin Faber <cfaber at gmail.com>
> *Sent:* Thursday, November 18, 2021 2:03 PM
> *To:* STEPHENS, DEAN - US <dean.stephens at caci.com>
> *Cc:* lustre-discuss at lists.lustre.org
> *Subject:* Re: [lustre-discuss] Lustre and server upgrade
>
>
>
> This would be part of the lustre-tests RPM package and will install
> llmount.sh to /usr/lib/lustre/tests/llmount.sh I believe.
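>
> If the package is already on the box, rpm can confirm the exact path (just
> an illustration; the directory may be lib64 rather than lib):
>
> rpm -ql lustre-tests | grep llmount.sh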
>
>
>
> On Thu, Nov 18, 2021 at 1:45 PM STEPHENS, DEAN - US <
> dean.stephens at caci.com> wrote:
>
> Not sure what you mean by “If you install the test suite”. I am not seeing
> an llmount.sh file on the server using “locate llmount.sh” at this point.
> What are the steps to install the test suite?
>
>
>
> Dean
>
>
>
> *From:* Colin Faber <cfaber at gmail.com>
> *Sent:* Thursday, November 18, 2021 1:34 PM
> *To:* STEPHENS, DEAN - US <dean.stephens at caci.com>
> *Cc:* lustre-discuss at lists.lustre.org
> *Subject:* Re: [lustre-discuss] Lustre and server upgrade
>
>
>
> Hm.. If you install the test suite, does llmount.sh succeed? It should set
> up a single-node cluster on whatever node you're running Lustre on, and I
> believe it will load modules as needed (IIRC). If this test succeeds, then
> you know that Lustre is installed correctly (or correctly enough); if not,
> I'd focus on the installation, as the target issue may be a red herring.
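>
> For illustration (assuming the test scripts land under
> /usr/lib64/lustre/tests), a run and teardown would look roughly like:
>
> cd /usr/lib64/lustre/tests
> sh llmount.sh          # format loopback targets and mount a single-node filesystem
> sh llmountcleanup.sh   # unmount and clean up the test setup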
>
>
>
> -cf
>
>
>
>
>
> On Thu, Nov 18, 2021 at 1:01 PM STEPHENS, DEAN - US <
> dean.stephens at caci.com> wrote:
>
> Thanks for the fast reply.
>
> When I do the tunefs.lustre /dev/sdX command I get:
>
> Target: <name>-OST0009
>
> Index: 9
>
>
>
> Target: <name>-OST0008
>
> Index: 8
>
> I spot-checked some others and they seem to be good, with the exception of
> one. It shows:
>
>
>
> Target: <name>-OST000a
>
> Index: 10
>
>
>
> But since there are 11 LUNs attached, that makes sense to me (OST indices in
> the target name are hexadecimal, so OST000a is index 10).
>
>
>
> As far as the upgrade goes, it was a fresh install using the legacy
> targets; the OSS and MDS nodes are virtual machines with the LUN disks
> attached to them, so Red Hat sees them as /dev/sdX devices.
>
>
>
> When I loaded Lustre on the server I did a “yum install lustre”, and since
> we were pointed at the lustre-2.12 repo in our environment it picked up the
> following RPMs to install:
>
> lustre-resource-agents-2.12.6-1.el7.x86_64
>
> kmod-lustre-2.12.6-1.el7.x86_64
>
> kmod-zfs-3.10.0-1160.2.1.el7_lustre.x86_64-0.7.13-1.el7.x86_64
>
> kmod-lustre-osd-zfs-2.12.6-1.el7.x86_64
>
> lustre-2.12.6-1.el7.x86_64
>
> kmod-spl-3.10.0-1160.2.1.el7_lustre.x86_64-0.7.13-1.el7.x86_64
>
> lustre-osd-zfs-mount-2.12.6-1.el7.x86_64
>
> lustre-osd-ldiskfs-mount-2.12.6-1.el7.x86_64
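>
> A query like the following (just how I would double-check it) should show
> whether an ldiskfs kernel module package, kmod-lustre-osd-ldiskfs if I have
> the name right, is installed alongside these; I do not see one in the list
> above:
>
> rpm -qa | egrep 'kmod-lustre|lustre-osd' | sort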
>
>
>
> Dean
>
>
>
> *From:* Colin Faber <cfaber at gmail.com>
> *Sent:* Thursday, November 18, 2021 12:35 PM
> *To:* STEPHENS, DEAN - US <dean.stephens at caci.com>
> *Cc:* lustre-discuss at lists.lustre.org
> *Subject:* Re: [lustre-discuss] Lustre and server upgrade
>
>
>
>
>
>
>
>
> Hi,
>
>
>
> I believe that sometime around 2.10 (someone correct me if I'm wrong) the
> index parameter became required and needs to be specified explicitly. On an
> existing system this should already be set, but can you check the parameters
> line with tunefs.lustre for correct index=N values across your storage nodes?
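>
> Something along these lines (a dry run, so nothing on disk is changed)
> should print the target name, index, and parameters for each device:
>
> tunefs.lustre --dryrun /dev/sdX | egrep 'Target:|Index:|Parameters:'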
>
>
>
> Also, with your "upgrade", was this a fresh install utilizing legacy
> targets?
>
>
>
> The last thing I can think of: IIRC, there were on-disk format changes
> between 2.5 and 2.12. These should be transparent to you, but it may be that
> some other issue is preventing a successful upgrade, though the missing
> module error really points to possible issues around how Lustre was
> installed and loaded on the system.
>
>
>
> Cheers!
>
>
>
> -cf
>
>
>
>
>
> On Thu, Nov 18, 2021 at 12:24 PM STEPHENS, DEAN - US via lustre-discuss <
> lustre-discuss at lists.lustre.org> wrote:
>
> I am by no means a Lustre expert and am seeking some help with our system.
> I am not able to get log files to post, as the servers are in a closed area
> with no access to the Internet.
>
>
>
> Here is a bit of history of our system:
>
> The OSS and MDS nodes were RHEL6, running a Lustre server with kernel
> 2.6.32-431.23.3.el6_lustre.x86_64 and Lustre version 2.5.3; the client
> version was 2.10. That was in a working state.
>
> We upgraded the OSS and MDS nodes to RHEL7 and installed the Lustre server
> 2.12 software and kernel.
>
> The attached 11 LUNs are showing up as /dev/sdb - /dev/sdl
>
> Right now, on the OSS nodes, if I use the command tunefs.lustre /dev/sdb I
> get some data back saying that Lustre data has been found, but at the bottom
> of the output it shows “tunefs.lustre: Unable to mount /dev/sdb: No such
> device” and “Is the ldiskfs module available”.
>
> When I do a “modprobe -v lustre” I do not see ldiskfs.ko being loaded, even
> though there is an ldiskfs.ko file in the
> /lib/modules/3.10.0-1160.2.1.el7_lustre.x86_64/extra/lustre/fs directory. I
> am not sure how to get it to load via modprobe.
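>
> Would something along these lines be the right way to load it directly? This
> is just my guess at the sequence:
>
> depmod -a
> modprobe -v ldiskfs
> lsmod | grep ldiskfs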
>
> I used “insmod
> /lib/modules/3.10.0-1160.2.1.el7_lustre.x86_64/extra/lustre/fs/ldiskfs.ko”
> and re-ran the “tunefs.lustre /dev/sdb” command with the same result.
>
> If I use the same command on the MDS nodes I get “no Lustre data found and
> /dev/sdb has not been formatted with mkfs.lustre”. I am not sure that is
> what is needed here, as the MDS nodes do not really hold the Lustre file
> data, since they are the metadata servers.
>
> I tried to use the command “tunefs.lustre --mgs --erase_params
> --mgsnode=<IP address>@tcp --writeconf --dryrun /dev/sdb” and got the error
> “/dev/sdb has not been formatted with mkfs.lustre”.
>
>
>
> I need some help and guidance, and I can provide whatever may be needed,
> though it will need to be typed out as I am not able to get actual log files
> from the system.
>
>
>
> Dean Stephens
>
> CACI
>
> Linux System Admin
>
>
>
>