[Lustre-discuss] [wc-discuss] can't mount our lustre filesystem after tunefs.lustre --writeconf

Kit Westneat kit.westneat at nyu.edu
Sun Mar 18 00:04:44 PDT 2012


Oh right, that makes sense. If I were you, I would back up the MDT first,
and then try one of two things (rough sketches below):
1) format a small loopback device with the parameters you want the MDT to
have, then replace the CONFIGS directory on your MDT with the CONFIGS
directory on the loopback device
- OR -
2) use a hex editor to modify the UUID
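
For 1), something along these lines should work (an untested sketch; the
image size and mount points are just examples, and /dev/md2 stands in for
your real, unmounted MDT device):

 mkfs.lustre --mgs --mdt --fsname=p1 --index=0 --device-size=200000 /tmp/newmdt.img
 mkdir -p /mnt/loop /mnt/mdt
 mount -t ldiskfs -o loop /tmp/newmdt.img /mnt/loop
 mount -t ldiskfs /dev/md2 /mnt/mdt
 cp -a /mnt/mdt/CONFIGS /mnt/mdt/CONFIGS.bak   # keep the old logs around
 cp -a /mnt/loop/CONFIGS/* /mnt/mdt/CONFIGS/
 umount /mnt/loop /mnt/mdt

For 2), keep in mind the config llog records are length-sensitive, so the
edit has to leave the file size and offsets exactly as they were:
overwrite the UUID string in place and pad the difference with 00 bytes,
never insert or delete. E.g. with xxd, working on a copy of the log, not
the live MDT:

 xxd p1-MDT0000 > log.hex
 # edit the UUID bytes in the hex column, NUL-padding to the same length
 xxd -r log.hex > p1-MDT0000.new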

Then use tunefs.lustre --print to make sure it all looks good before
mounting it.

Though one thing I wonder about: are the OSTs on the same page with the
fsname? That is, are they expecting to be part of the p1 filesystem?
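A quick way to check without mounting anything (the device name here is
just a placeholder for whatever backs each OST):

 tunefs.lustre --print /dev/sdb
 # the "Lustre FS:" line should say p1, and "Target:" p1-OSTxxxx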

HTH,
Kit

--
Kit Westneat
System Administrator, eSys
kit.westneat at nyu.edu
212-992-7647


On Sun, Mar 18, 2012 at 2:40 AM, Dr Stuart Midgley <sdm900 at gmail.com> wrote:

> No, we have tried that.
>
> This file system started life about 6 years ago as lustre 1.4 and has
> continually been upgraded… hence the whacky UUID.  Trying to rename the FS
> doesn't work.  It doesn't change the UUID that the mgs tells clients to
> mount.
>
>
> --
> Dr Stuart Midgley
> sdm900 at gmail.com
>
>
>
> On 18/03/2012, at 2:24 PM, Kit Westneat wrote:
>
> > You should be able to reset the UUID by doing another writeconf with
> > the --fsname flag. After the writeconf, you'll have to writeconf all
> > the OSTs too.
> >
> > It worked on my very simple test at least:
> > [root at mds1 tmp]# tunefs.lustre --writeconf --fsname=test1 /dev/loop0
> > checking for existing Lustre data: found CONFIGS/mountdata
> > Reading CONFIGS/mountdata
> >
> >    Read previous values:
> > Target:     t1-MDT0000
> > Index:      0
> > Lustre FS:  t1
> > Mount type: ldiskfs
> > Flags:      0x5
> >               (MDT MGS )
> > Persistent mount opts: iopen_nopriv,user_xattr,errors=remount-ro
> > Parameters: mdt.group_upcall=/usr/sbin/l_getgroups
> >
> >
> >    Permanent disk data:
> > Target:     test1-MDT0000
> > Index:      0
> > Lustre FS:  test1
> > Mount type: ldiskfs
> > Flags:      0x105
> >               (MDT MGS writeconf )
> > Persistent mount opts: iopen_nopriv,user_xattr,errors=remount-ro
> > Parameters: mdt.group_upcall=/usr/sbin/l_getgroups
> >
> > Writing CONFIGS/mountdata
> >
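> > The matching step on each OSS would be along the same lines, e.g. (a
> > sketch, with whatever device backs each OST in place of /dev/loop1):
> > [root at mds1 tmp]# tunefs.lustre --writeconf --fsname=test1 /dev/loop1
> >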
> >
> > HTH,
> > Kit
> > --
> > Kit Westneat
> > System Administrator, eSys
> > kit.westneat at nyu.edu
> > 212-992-7647
> >
> >
> > On Sun, Mar 18, 2012 at 1:20 AM, Stu Midgley <sdm900 at gmail.com> wrote:
> > ok, from what I can tell, the root of the problem is
> >
> >
> > [root at mds001 CONFIGS]# hexdump -C p1-MDT0000  | grep -C 2 mds
> > 00002450  0b 00 00 00 04 00 00 00  12 00 00 00 00 00 00 00  |................|
> > 00002460  70 31 2d 4d 44 54 30 30  30 30 00 00 00 00 00 00  |p1-MDT0000......|
> > 00002470  6d 64 73 00 00 00 00 00  70 72 6f 64 5f 6d 64 73  |mds.....prod_mds|
> > 00002480  5f 30 30 31 5f 55 55 49  44 00 00 00 00 00 00 00  |_001_UUID.......|
> > 00002490  78 00 00 00 07 00 00 00  88 00 00 00 08 00 00 00  |x...............|
> > --
> > 000024c0  00 00 00 00 04 00 00 00  0b 00 00 00 12 00 00 00  |................|
> > 000024d0  02 00 00 00 0b 00 00 00  70 31 2d 4d 44 54 30 30  |........p1-MDT00|
> > 000024e0  30 30 00 00 00 00 00 00  70 72 6f 64 5f 6d 64 73  |00......prod_mds|
> > 000024f0  5f 30 30 31 5f 55 55 49  44 00 00 00 00 00 00 00  |_001_UUID.......|
> > 00002500  30 00 00 00 00 00 00 00  70 31 2d 4d 44 54 30 30  |0.......p1-MDT00|
> >
> > [root at mds001 CONFIGS]#
> > [root at mds001 CONFIGS]# hexdump -C /mnt/md2/CONFIGS/p1-MDT0000 | grep -C 2 mds
> > 00002450  0b 00 00 00 04 00 00 00  10 00 00 00 00 00 00 00  |................|
> > 00002460  70 31 2d 4d 44 54 30 30  30 30 00 00 00 00 00 00  |p1-MDT0000......|
> > 00002470  6d 64 73 00 00 00 00 00  70 31 2d 4d 44 54 30 30  |mds.....p1-MDT00|
> > 00002480  30 30 5f 55 55 49 44 00  70 00 00 00 07 00 00 00  |00_UUID.p.......|
> > 00002490  80 00 00 00 08 00 00 00  00 00 62 10 ff ff ff ff  |..........b.....|
> >
> >
> > Now, if only I could get the UUID removed or reset...
> >
> >
> > On Sun, Mar 18, 2012 at 1:05 PM, Dr Stuart Midgley <sdm900 at gmail.com> wrote:
> > > hmmm… that didn't work
> > >
> > > # tunefs.lustre --force --fsname=p1 /dev/md2
> > > checking for existing Lustre data: found CONFIGS/mountdata
> > > Reading CONFIGS/mountdata
> > >
> > >   Read previous values:
> > > Target:     p1-MDT0000
> > > Index:      0
> > > UUID:       prod_mds_001_UUID
> > > Lustre FS:  p1
> > > Mount type: ldiskfs
> > > Flags:      0x405
> > >              (MDT MGS )
> > > Persistent mount opts: errors=remount-ro,iopen_nopriv,user_xattr
> > > Parameters:
> > >
> > > tunefs.lustre: unrecognized option `--force'
> > > tunefs.lustre: exiting with 22 (Invalid argument)
> > >
> > >
> > >
> > >
> > > --
> > > Dr Stuart Midgley
> > > sdm900 at gmail.com
> > >
> > >
> > >
> > > On 18/03/2012, at 12:17 AM, Nathan Rutman wrote:
> > >
> > >> Take them all down again, use tunefs.lustre --force --fsname.
> > >>
> > >>
> > >> On Mar 17, 2012, at 2:10 AM, "Stu Midgley" <sdm900 at gmail.com> wrote:
> > >>
> > >>> Afternoon
> > >>>
> > >>> We have a rather severe problem with our lustre file system.  We had
> > >>> a full config log and the advice was to rewrite it with a new one.  So,
> > >>> we unmounted our lustre file system from all clients, unmounted all the
> > >>> OSTs, and then unmounted the MDS.  I then did
> > >>>
> > >>> mds:
> > >>>  tunefs.lustre --writeconf --erase-params /dev/md2
> > >>>
> > >>> oss:
> > >>>  tunefs.lustre --writeconf --erase-params --mgsnode=mds001 /dev/md2
> > >>>
> > >>>
> > >>>
> > >>> After the tunefs.lustre on the MDS I saw
> > >>>
> > >>> Mar 17 14:33:02 mds001 kernel: Lustre: MGS MGS started
> > >>> Mar 17 14:33:02 mds001 kernel: Lustre: MGC172.16.0.251 at tcp: Reactivating import
> > >>> Mar 17 14:33:02 mds001 kernel: Lustre: MGS: Logs for fs p1 were
> > >>> removed by user request.  All servers must be restarted in order to
> > >>> regenerate the logs.
> > >>> Mar 17 14:33:02 mds001 kernel: Lustre: Enabling user_xattr
> > >>> Mar 17 14:33:02 mds001 kernel: Lustre: p1-MDT0000: new disk, initializing
> > >>> Mar 17 14:33:02 mds001 kernel: Lustre: p1-MDT0000: Now serving
> > >>> p1-MDT0000 on /dev/md2 with recovery enabled
> > >>>
> > >>> which scared me a little...
> > >>>
> > >>>
> > >>>
> > >>> The MDS and the OSSs mount happily BUT I can't mount the file system
> > >>> on my clients... on the MDS I see
> > >>>
> > >>>
> > >>> Mar 17 16:42:11 mds001 kernel: LustreError: 137-5: UUID
> > >>> 'prod_mds_001_UUID' is not available  for connect (no target)
> > >>>
> > >>>
> > >>> On the client I see
> > >>>
> > >>>
> > >>> Mar 17 16:00:06 host kernel: LustreError: 11-0: an error occurred
> > >>> while communicating with 172.16.0.251 at tcp. The mds_connect operation
> > >>> failed with -19
> > >>>
> > >>>
> > >>> Now, it appears the writeconf renamed the UUID of the MDS from
> > >>> prod_mds_001_UUID to p1-MDT0000_UUID, but I can't work out how to
> > >>> get it back...
> > >>>
> > >>>
> > >>> for example I tried
> > >>>
> > >>>
> > >>> # tunefs.lustre --mgs --mdt --fsname=p1 /dev/md2
> > >>> checking for existing Lustre data: found CONFIGS/mountdata
> > >>> Reading CONFIGS/mountdata
> > >>>
> > >>> Read previous values:
> > >>> Target:     p1-MDT0000
> > >>> Index:      0
> > >>> UUID:       prod_mds_001_UUID
> > >>> Lustre FS:  p1
> > >>> Mount type: ldiskfs
> > >>> Flags:      0x405
> > >>>            (MDT MGS )
> > >>> Persistent mount opts: errors=remount-ro,iopen_nopriv,user_xattr
> > >>> Parameters:
> > >>>
> > >>> tunefs.lustre: cannot change the name of a registered target
> > >>> tunefs.lustre: exiting with 1 (Operation not permitted)
> > >>>
> > >>>
> > >>>
> > >>> I'm now stuck not being able to mount a 1PB file system... which
> > >>> isn't good :(
> > >>>
> > >>> --
> > >>> Dr Stuart Midgley
> > >>> sdm900 at gmail.com
> > >
> >
> >
> >
> > --
> > Dr Stuart Midgley
> > sdm900 at gmail.com
> >
>
>