[Lustre-discuss] [wc-discuss] can't mount our lustre filesystem after tunefs.lustre --writeconf

Stu Midgley sdm900 at gmail.com
Sun Mar 18 00:36:21 PDT 2012


I'm well down this path... I replaced the mountdata with the one from my
small temporary MDT (same name) and that didn't help.

Now I will do a few tests on the p1-client.  Perhaps after a
writeconf it is basically clean... and I can replace it... but currently
it contains lots of info about each of the OSTs.
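
One quick way to check which UUIDs a config log such as p1-client still
references is to scan it for strings ending in _UUID. Below is a minimal
Python sketch, assuming the names are stored as plain NUL-padded ASCII, as
the hexdumps quoted further down suggest. The function names
find_uuid_strings and summarize are made up for illustration and are not
part of any Lustre tool.

```python
import re

# Printable ASCII runs (no spaces) that end in "_UUID". Assumption: the
# config log stores these names as NUL-padded ASCII, as in the hexdumps
# quoted in this thread.
UUID_RE = re.compile(rb"[\x21-\x7e]+_UUID")


def find_uuid_strings(data):
    """Return a list of (offset, text) for every *_UUID string in data."""
    return [(m.start(), m.group().decode("ascii"))
            for m in UUID_RE.finditer(data)]


def summarize(path):
    """Count how often each *_UUID string occurs in the file at path."""
    with open(path, "rb") as f:
        data = f.read()
    counts = {}
    for _offset, text in find_uuid_strings(data):
        counts[text] = counts.get(text, 0) + 1
    return counts
```

Calling summarize() on a copy of CONFIGS/p1-client (taken from a backup,
not the live MDT) should show at a glance whether the old prod_mds_001_UUID
name is still recorded anywhere in it.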

All the OSTs are happy mounting to the MDT and all think that they
are part of our p1 file system.

Thanks.


On Sun, Mar 18, 2012 at 3:04 PM, Kit Westneat <kit.westneat at nyu.edu> wrote:
> Oh right, that makes sense. I guess if I were you I would try one of two
> things. First, back up the MDT, and then try:
> 1) format a small loopback device with the parameters you want the MDT to
> have, then replace the CONFIGS directory on your MDT with the CONFIGS
> directory on the loopback device
> - OR -
> 2) use a hex editor to modify the UUID
>
> Then use tunefs.lustre --print to make sure it all looks good before
> mounting it.
>
> Though one thing I wonder about is, are the OSTs on the same page with the
> fsname? Like are they expecting to be part of the p1 filesystem?
>
> HTH,
> Kit
>
> --
> Kit Westneat
> System Administrator, eSys
> kit.westneat at nyu.edu
> 212-992-7647
>
>
> On Sun, Mar 18, 2012 at 2:40 AM, Dr Stuart Midgley <sdm900 at gmail.com> wrote:
>>
>> No, we have tried that.
>>
>> This file system started life about 6 years ago as lustre 1.4 and has
>> continually been upgraded… hence the whacky UUID.  Trying to rename the FS
>> doesn't work.  It doesn't change the UUID that the mgs tells clients to
>> mount.
>>
>>
>> --
>> Dr Stuart Midgley
>> sdm900 at gmail.com
>>
>>
>>
>> On 18/03/2012, at 2:24 PM, Kit Westneat wrote:
>>
>> > You should be able to reset the UUID by doing another writeconf with the
>> > --fsname flag. After the writeconf, you'll have to writeconf all the OSTs
>> > too.
>> >
>> > It worked on my very simple test at least:
>> > [root at mds1 tmp]# tunefs.lustre --writeconf --fsname=test1 /dev/loop0
>> > checking for existing Lustre data: found CONFIGS/mountdata
>> > Reading CONFIGS/mountdata
>> >
>> >    Read previous values:
>> > Target:     t1-MDT0000
>> > Index:      0
>> > Lustre FS:  t1
>> > Mount type: ldiskfs
>> > Flags:      0x5
>> >               (MDT MGS )
>> > Persistent mount opts: iopen_nopriv,user_xattr,errors=remount-ro
>> > Parameters: mdt.group_upcall=/usr/sbin/l_getgroups
>> >
>> >
>> >    Permanent disk data:
>> > Target:     test1-MDT0000
>> > Index:      0
>> > Lustre FS:  test1
>> > Mount type: ldiskfs
>> > Flags:      0x105
>> >               (MDT MGS writeconf )
>> > Persistent mount opts: iopen_nopriv,user_xattr,errors=remount-ro
>> > Parameters: mdt.group_upcall=/usr/sbin/l_getgroups
>> >
>> > Writing CONFIGS/mountdata
>> >
>> >
>> > HTH,
>> > Kit
>> > --
>> > Kit Westneat
>> > System Administrator, eSys
>> > kit.westneat at nyu.edu
>> > 212-992-7647
>> >
>> >
>> > On Sun, Mar 18, 2012 at 1:20 AM, Stu Midgley <sdm900 at gmail.com> wrote:
>> > ok, from what I can tell, the root of the problem is
>> >
>> >
>> > [root at mds001 CONFIGS]# hexdump -C p1-MDT0000  | grep -C 2 mds
>> > 00002450  0b 00 00 00 04 00 00 00  12 00 00 00 00 00 00 00  |................|
>> > 00002460  70 31 2d 4d 44 54 30 30  30 30 00 00 00 00 00 00  |p1-MDT0000......|
>> > 00002470  6d 64 73 00 00 00 00 00  70 72 6f 64 5f 6d 64 73  |mds.....prod_mds|
>> > 00002480  5f 30 30 31 5f 55 55 49  44 00 00 00 00 00 00 00  |_001_UUID.......|
>> > 00002490  78 00 00 00 07 00 00 00  88 00 00 00 08 00 00 00  |x...............|
>> > --
>> > 000024c0  00 00 00 00 04 00 00 00  0b 00 00 00 12 00 00 00  |................|
>> > 000024d0  02 00 00 00 0b 00 00 00  70 31 2d 4d 44 54 30 30  |........p1-MDT00|
>> > 000024e0  30 30 00 00 00 00 00 00  70 72 6f 64 5f 6d 64 73  |00......prod_mds|
>> > 000024f0  5f 30 30 31 5f 55 55 49  44 00 00 00 00 00 00 00  |_001_UUID.......|
>> > 00002500  30 00 00 00 00 00 00 00  70 31 2d 4d 44 54 30 30  |0.......p1-MDT00|
>> >
>> > [root at mds001 CONFIGS]#
>> > [root at mds001 CONFIGS]# hexdump -C /mnt/md2/CONFIGS/p1-MDT0000 | grep -C 2 mds
>> > 00002450  0b 00 00 00 04 00 00 00  10 00 00 00 00 00 00 00  |................|
>> > 00002460  70 31 2d 4d 44 54 30 30  30 30 00 00 00 00 00 00  |p1-MDT0000......|
>> > 00002470  6d 64 73 00 00 00 00 00  70 31 2d 4d 44 54 30 30  |mds.....p1-MDT00|
>> > 00002480  30 30 5f 55 55 49 44 00  70 00 00 00 07 00 00 00  |00_UUID.p.......|
>> > 00002490  80 00 00 00 08 00 00 00  00 00 62 10 ff ff ff ff  |..........b.....|
>> >
>> >
>> > now if only I can get the UUID to be removed or reset...
>> >
>> >
>> > On Sun, Mar 18, 2012 at 1:05 PM, Dr Stuart Midgley <sdm900 at gmail.com> wrote:
>> > > hmmm… that didn't work
>> > >
>> > > # tunefs.lustre --force --fsname=p1 /dev/md2
>> > > checking for existing Lustre data: found CONFIGS/mountdata
>> > > Reading CONFIGS/mountdata
>> > >
>> > >   Read previous values:
>> > > Target:     p1-MDT0000
>> > > Index:      0
>> > > UUID:       prod_mds_001_UUID
>> > > Lustre FS:  p1
>> > > Mount type: ldiskfs
>> > > Flags:      0x405
>> > >              (MDT MGS )
>> > > Persistent mount opts: errors=remount-ro,iopen_nopriv,user_xattr
>> > > Parameters:
>> > >
>> > > tunefs.lustre: unrecognized option `--force'
>> > > tunefs.lustre: exiting with 22 (Invalid argument)
>> > >
>> > >
>> > >
>> > >
>> > > --
>> > > Dr Stuart Midgley
>> > > sdm900 at gmail.com
>> > >
>> > >
>> > >
>> > > On 18/03/2012, at 12:17 AM, Nathan Rutman wrote:
>> > >
>> > >> Take them all down again, use tunefs.lustre --force --fsname.
>> > >>
>> > >>
>> > >> On Mar 17, 2012, at 2:10 AM, "Stu Midgley" <sdm900 at gmail.com> wrote:
>> > >>
>> > >>> Afternoon
>> > >>>
>> > >>> We have a rather severe problem with our lustre file system.  We had a
>> > >>> full config log and the advice was to rewrite it with a new one.  So,
>> > >>> we unmounted our lustre file system off all clients, unmounted all the
>> > >>> OSTs and then unmounted the mds.  I then did
>> > >>>
>> > >>> mds:
>> > >>>  tunefs.lustre --writeconf --erase-params /dev/md2
>> > >>>
>> > >>> oss:
>> > >>>  tunefs.lustre --writeconf --erase-params --mgsnode=mds001 /dev/md2
>> > >>>
>> > >>>
>> > >>>
>> > >>> After the tunefs.lustre on the mds I saw
>> > >>>
>> > >>> Mar 17 14:33:02 mds001 kernel: Lustre: MGS MGS started
>> > >>> Mar 17 14:33:02 mds001 kernel: Lustre: MGC172.16.0.251 at tcp: Reactivating import
>> > >>> Mar 17 14:33:02 mds001 kernel: Lustre: MGS: Logs for fs p1 were removed by user request.  All servers must be restarted in order to regenerate the logs.
>> > >>> Mar 17 14:33:02 mds001 kernel: Lustre: Enabling user_xattr
>> > >>> Mar 17 14:33:02 mds001 kernel: Lustre: p1-MDT0000: new disk, initializing
>> > >>> Mar 17 14:33:02 mds001 kernel: Lustre: p1-MDT0000: Now serving p1-MDT0000 on /dev/md2 with recovery enabled
>> > >>>
>> > >>> which scared me a little...
>> > >>>
>> > >>>
>> > >>>
>> > >>> The mds and the OSSes mount happily BUT I can't mount the file system
>> > >>> on my clients... on the mds I see
>> > >>>
>> > >>>
>> > >>> Mar 17 16:42:11 mds001 kernel: LustreError: 137-5: UUID 'prod_mds_001_UUID' is not available  for connect (no target)
>> > >>>
>> > >>>
>> > >>> On the client I see
>> > >>>
>> > >>>
>> > >>> Mar 17 16:00:06 host kernel: LustreError: 11-0: an error occurred while communicating with 172.16.0.251 at tcp. The mds_connect operation failed with -19
>> > >>>
>> > >>>
>> > >>> now, it appears the writeconf renamed the UUID of the mds from
>> > >>> prod_mds_001_UUID to p1-MDT0000_UUID but I can't work out how to get
>> > >>> it back...
>> > >>>
>> > >>>
>> > >>> for example I tried
>> > >>>
>> > >>>
>> > >>> # tunefs.lustre --mgs --mdt --fsname=p1 /dev/md2
>> > >>> checking for existing Lustre data: found CONFIGS/mountdata
>> > >>> Reading CONFIGS/mountdata
>> > >>>
>> > >>> Read previous values:
>> > >>> Target:     p1-MDT0000
>> > >>> Index:      0
>> > >>> UUID:       prod_mds_001_UUID
>> > >>> Lustre FS:  p1
>> > >>> Mount type: ldiskfs
>> > >>> Flags:      0x405
>> > >>>            (MDT MGS )
>> > >>> Persistent mount opts: errors=remount-ro,iopen_nopriv,user_xattr
>> > >>> Parameters:
>> > >>>
>> > >>> tunefs.lustre: cannot change the name of a registered target
>> > >>> tunefs.lustre: exiting with 1 (Operation not permitted)
>> > >>>
>> > >>>
>> > >>>
>> > >>> I'm now stuck not being able to mount a 1PB file system... which
>> > >>> isn't good :(
>> > >>>
>> > >>> --
>> > >>> Dr Stuart Midgley
>> > >>> sdm900 at gmail.com
>> > >>
>> > >
>> >
>> >
>> >
>> > --
>> > Dr Stuart Midgley
>> > sdm900 at gmail.com
>> >
>>
>



-- 
Dr Stuart Midgley
sdm900 at gmail.com


