[Lustre-discuss] Problem with write_conf

Roger Spellman Roger.Spellman at terascala.com
Tue Aug 3 12:49:24 PDT 2010


If I change the NIDs, and if I don't remove /mnt/mdt/CONFIGS/*-client,
then I get the following when I try mounting a client (note that
10.2.9.1 is the OLD address):

 

mount.lustre: mount 10.2.9.1@o2ib:/hss2 at /mnt/lustre-hss2 failed:
Cannot send after transport endpoint shutdown

 

dmesg shows:

 

Lustre: Request x1 sent from MGC10.2.9.1@o2ib to NID 10.2.9.1@o2ib 5s
ago has timed out (limit 5s).

LustreError: 15c-8: MGC10.2.9.1@o2ib: The configuration from log
'hss2-client' failed (-108). This may be the result of communication
errors between this node and the MGS, a bad configuration, or other
errors. See the syslog for more information.

LustreError: 6285:0:(llite_lib.c:1065:ll_fill_super()) Unable to process
log: -108

Lustre: client ffff81007e98e800 umount complete

LustreError: 6285:0:(obd_mount.c:1991:lustre_fill_super()) Unable to
mount  (-108)

 

Am I missing a step?

 

-Roger

 

________________________________

From: Nathan Rutman [mailto:nathan.rutman at oracle.com] 
Sent: Tuesday, August 03, 2010 2:34 PM
To: Roger Spellman
Cc: lustre-discuss at lists.lustre.org
Subject: Re: [Lustre-discuss] Problem with write_conf

 

 

On Aug 3, 2010, at 11:25 AM, Roger Spellman wrote:





Nathan,

 

Thank you.   That works!

 

I found that if I change IP address, I also need to remove the file
/mnt/mdt/CONFIGS/*-client.

 

This is what tunefs.lustre --writeconf on the MDT does, when you first
mount it after the writeconf.

--writeconf on the MDT and all OSTs is the preferred way of changing a
server nid.
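
A dry-run sketch of that writeconf pass, for the case where the MGS NID itself changes. The function name plan_nid_change, the new NID, and the device paths are hypothetical; the function only prints commands and runs nothing. It assumes every target is unmounted first. Note that --erase-params drops all stored parameters (for example the mdt.group_upcall setting shown later in this thread), so those would have to be set again.

```shell
#!/bin/sh
# Dry run: print (never execute) one tunefs.lustre command per server target.
# Arguments: new MGS NID, then one or more target devices (MDT first, then OSTs).
plan_nid_change() {
    new_mgs_nid=$1
    shift
    echo "# unmount clients, then the MDT, then all OSTs before running these"
    for dev in "$@"; do
        # --erase-params clears the old mgsnode (and every other stored parameter)
        echo "tunefs.lustre --erase-params --mgsnode=$new_mgs_nid --writeconf $dev"
    done
    echo "# remount the MGS/MDT first, then the OSTs, then the clients"
}
```

For example, `plan_nid_change 10.2.9.2@o2ib /dev/mapper/map0 /dev/mapper/ost0` prints two tunefs.lustre lines that can be reviewed before anything is run by hand.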

 



 

The reason is that the OST mounts failed - the OST was still looking for
the old IP Address.  I grepped for files with the old IP Address, and I
found those client files.


Is that a safe thing to do?  Please note that my mdt and mgs are on the
same LUN.
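
The grep step described above can be sketched as a small helper. The name find_stale_nid_files is hypothetical, and the sketch assumes the MDT has been stopped and its device mounted as plain ldiskfs so that the CONFIGS directory is visible. It does a fixed-string match, so searching for 10.2.9.1 would also report files containing 10.2.9.10.

```shell
#!/bin/sh
# List files under a directory that still contain the old address.
# -r recurse, -l print file names only, -F treat the address as a literal string.
find_stale_nid_files() {
    config_dir=$1
    old_addr=$2
    grep -rlF -- "$old_addr" "$config_dir" | sort
}
```

With the device mounted at /mnt/mdt, `find_stale_nid_files /mnt/mdt/CONFIGS 10.2.9.1` would list the configuration logs that still reference the old address.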

 

Thanks.

 

-Roger

 

 

________________________________

From: Nathan Rutman [mailto:nathan.rutman at oracle.com] 
Sent: Tuesday, August 03, 2010 2:03 PM
To: Roger Spellman
Cc: lustre-discuss at lists.lustre.org
Subject: Re: [Lustre-discuss] Problem with write_conf

 

There's a 'failsafe' feature that prevents filesystem name changes:

	LustreError: 157-3: Trying to start OBD AFTER-MDT0000_UUID using
the wrong disk BEFORE-MDT0000_UUID. Were the /dev/ assignments
rearranged?

You'll have to delete the last_rcvd file from the disk on every server in
the filesystem, and also run tunefs.lustre --writeconf on all of them with
the new (AFTER) name.
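
A dry-run sketch of those two steps, under stated assumptions: each target is unmounted from Lustre, last_rcvd sits at the top level of the backing ldiskfs filesystem, and the scratch mount point is free. The function name plan_fsname_change and the mount point are hypothetical; the function only prints commands.

```shell
#!/bin/sh
# Dry run: print the per-target steps for a filesystem rename.
# Arguments: new fsname, scratch mount point, then one or more target devices.
plan_fsname_change() {
    new_fsname=$1
    scratch_mnt=$2
    shift 2
    for dev in "$@"; do
        echo "mount -t ldiskfs $dev $scratch_mnt"    # expose the backing filesystem
        echo "rm $scratch_mnt/last_rcvd"             # drop the record of the old name
        echo "umount $scratch_mnt"
        echo "tunefs.lustre --writeconf --fsname=$new_fsname $dev"
    done
}
```

For the experiment in this thread that would be `plan_fsname_change AFTER /mnt/tmp /dev/mapper/map0`. Deleting last_rcvd discards client recovery state, so this is only reasonable with all clients unmounted.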

 

On Aug 2, 2010, at 6:08 PM, Roger Spellman wrote:






 

Hi,
I would like to be able to change a file system name.  Towards that end,
I have run the following commands as an experiment:

  mkfs.lustre --reformat --fsname BEFORE  --device-size=10000 --mgs
--mdt  --mgsnode=10.2.9.1@o2ib0 /dev/mapper/map0
  dmesg -c
  mount -t lustre /dev/mapper/map0 /mnt/mdt
  dmesg -c
  umount /mnt/mdt
  dmesg -c
  tunefs.lustre --writeconf --fsname=AFTER --mgs --mdt /dev/mapper/map0
  dmesg -c
  mount -t lustre /dev/mapper/map0 /mnt/mdt
  dmesg -c

Unfortunately, this does not work.  Can someone please explain the
correct sequence of commands to use?  The output of each command is as
follows.

Thanks.

[root@ts-hss2-01 ~]# mkfs.lustre --reformat --fsname BEFORE
--device-size=10000 --mgs --mdt  --mgsnode=10.2.9.1@o2ib0
/dev/mapper/map0

   Permanent disk data:
Target:     BEFORE-MDTffff
Index:      unassigned
Lustre FS:  BEFORE
Mount type: ldiskfs
Flags:      0x75
              (MDT MGS needs_index first_time update )
Persistent mount opts: iopen_nopriv,user_xattr,errors=remount-ro
Parameters: mgsnode=10.2.9.1@o2ib mdt.group_upcall=/usr/sbin/l_getgroups

device size = 1632256MB
2 6 18
formatting backing filesystem ldiskfs on /dev/mapper/map0
        target name  BEFORE-MDTffff
        4k blocks     2500
        options        -i 4096 -I 512 -q -O
dir_index,extents,uninit_groups -F
mkfs_cmd = mke2fs -j -b 4096 -L BEFORE-MDTffff  -i 4096 -I 512 -q -O
dir_index,extents,uninit_groups -F /dev/mapper/map0 2500
Writing CONFIGS/mountdata
[root@ts-hss2-01 ~]# dmesg -c
LDISKFS-fs: barriers enabled
kjournald2 starting: pid 1388, dev dm-4:8, commit interval 5 seconds
LDISKFS FS on dm-4, internal journal on dm-4:8
LDISKFS-fs: delayed allocation enabled
LDISKFS-fs: file extents enabled
LDISKFS-fs: mballoc enabled
LDISKFS-fs: mounted filesystem dm-4 with ordered data mode
LDISKFS-fs: mballoc: 1 blocks 1 reqs (0 success)
LDISKFS-fs: mballoc: 1 extents scanned, 0 goal hits, 1 2^N hits, 0
breaks, 0 lost
LDISKFS-fs: mballoc: 1 generated and it took 2142
LDISKFS-fs: mballoc: 512 preallocated, 0 discarded


[root@ts-hss2-01 ~]# mount -t lustre /dev/mapper/map0 /mnt/mdt
[root@ts-hss2-01 ~]# dmesg -c
LDISKFS-fs: barriers enabled
kjournald2 starting: pid 1406, dev dm-4:8, commit interval 5 seconds
LDISKFS FS on dm-4, internal journal on dm-4:8
LDISKFS-fs: delayed allocation enabled
LDISKFS-fs: file extents enabled
LDISKFS-fs: mballoc enabled
LDISKFS-fs: mounted filesystem dm-4 with ordered data mode
LDISKFS-fs: mballoc: 0 blocks 0 reqs (0 success)
LDISKFS-fs: mballoc: 0 extents scanned, 0 goal hits, 0 2^N hits, 0
breaks, 0 lost
LDISKFS-fs: mballoc: 0 generated and it took 0
LDISKFS-fs: mballoc: 0 preallocated, 0 discarded
LDISKFS-fs: barriers enabled
kjournald2 starting: pid 1410, dev dm-4:8, commit interval 5 seconds
LDISKFS FS on dm-4, internal journal on dm-4:8
LDISKFS-fs: delayed allocation enabled
LDISKFS-fs: file extents enabled
LDISKFS-fs: mballoc enabled
LDISKFS-fs: mounted filesystem dm-4 with ordered data mode
Lustre: MGS MGS started
Lustre: MGC10.2.9.1@o2ib: Reactivating import
Lustre: Setting parameter BEFORE-MDT0000.mdt.group_upcall in log
BEFORE-MDT0000
Lustre: Enabling user_xattr
Lustre: BEFORE-MDT0000: new disk, initializing
Lustre: BEFORE-MDT0000: Now serving BEFORE-MDT0000 on /dev/mapper/map0
with recovery enabled
Lustre: 1503:0:(lproc_mds.c:271:lprocfs_wr_group_upcall())
BEFORE-MDT0000: group upcall set to /usr/sbin/l_getgroups
Lustre: BEFORE-MDT0000.mdt: set parameter
group_upcall=/usr/sbin/l_getgroups


[root@ts-hss2-01 ~]# umount /mnt/mdt
[root@ts-hss2-01 ~]# dmesg -c
Lustre: Failing over BEFORE-MDT0000
Lustre: Skipped 1 previous similar message
Lustre: *** setting obd BEFORE-MDT0000 device 'dm-4' read-only ***
Turning device dm-4 (0xfd00004) read-only
Lustre: BEFORE-MDT0000: shutting down for failover; client state will be
preserved.
Lustre: MDT BEFORE-MDT0000 has stopped.
LustreError: 1517:0:(ldlm_request.c:1025:ldlm_cli_cancel_req()) Got rc
-108 from cancel RPC: canceling anyway
LustreError: 1517:0:(ldlm_request.c:1587:ldlm_cli_cancel_list())
ldlm_cli_cancel_list: -108
Lustre: MGS has stopped.
LDISKFS-fs: mballoc: 3 blocks 3 reqs (0 success)
LDISKFS-fs: mballoc: 8 extents scanned, 0 goal hits, 0 2^N hits, 0
breaks, 0 lost
LDISKFS-fs: mballoc: 1 generated and it took 2598
LDISKFS-fs: mballoc: 1145 preallocated, 0 discarded
Removing read-only on unknown block (0xfd00004)
Lustre: server umount BEFORE-MDT0000 complete


[root@ts-hss2-01 ~]# tunefs.lustre --writeconf --fsname=AFTER --mgs
--mdt /dev/mapper/map0
checking for existing Lustre data: found CONFIGS/mountdata
Reading CONFIGS/mountdata

   Read previous values:
Target:     BEFORE-MDT0000
Index:      0
Lustre FS:  BEFORE
Mount type: ldiskfs
Flags:      0x5
              (MDT MGS )
Persistent mount opts: iopen_nopriv,user_xattr,errors=remount-ro
Parameters: mgsnode=10.2.9.1@o2ib mdt.group_upcall=/usr/sbin/l_getgroups


   Permanent disk data:
Target:     AFTER-MDT0000
Index:      0
Lustre FS:  AFTER
Mount type: ldiskfs
Flags:      0x105
              (MDT MGS writeconf )
Persistent mount opts: iopen_nopriv,user_xattr,errors=remount-ro
Parameters: mgsnode=10.2.9.1@o2ib mdt.group_upcall=/usr/sbin/l_getgroups

Writing CONFIGS/mountdata
[root@ts-hss2-01 ~]# dmesg -c
LDISKFS-fs: barriers enabled
kjournald2 starting: pid 1539, dev dm-4:8, commit interval 5 seconds
LDISKFS FS on dm-4, internal journal on dm-4:8
LDISKFS-fs: delayed allocation enabled
LDISKFS-fs: file extents enabled
LDISKFS-fs: mballoc enabled
LDISKFS-fs: recovery complete.
LDISKFS-fs: mounted filesystem dm-4 with ordered data mode
LDISKFS-fs: mballoc: 1 blocks 1 reqs (0 success)
LDISKFS-fs: mballoc: 6 extents scanned, 0 goal hits, 0 2^N hits, 0
breaks, 0 lost
LDISKFS-fs: mballoc: 1 generated and it took 2905
LDISKFS-fs: mballoc: 506 preallocated, 0 discarded


[root@ts-hss2-01 ~]# mount -t lustre /dev/mapper/map0 /mnt/mdt
mount.lustre: mount /dev/mapper/map0 at /mnt/mdt failed: Invalid
argument
This may have multiple causes.
Are the mount options correct?
Check the syslog for more info.
[root@ts-hss2-01 ~]# dmesg -c
LDISKFS-fs: barriers enabled
kjournald2 starting: pid 1567, dev dm-4:8, commit interval 5 seconds
LDISKFS FS on dm-4, internal journal on dm-4:8
LDISKFS-fs: delayed allocation enabled
LDISKFS-fs: file extents enabled
LDISKFS-fs: mballoc enabled
LDISKFS-fs: mounted filesystem dm-4 with ordered data mode
LDISKFS-fs: mballoc: 0 blocks 0 reqs (0 success)
LDISKFS-fs: mballoc: 0 extents scanned, 0 goal hits, 0 2^N hits, 0
breaks, 0 lost
LDISKFS-fs: mballoc: 0 generated and it took 0
LDISKFS-fs: mballoc: 0 preallocated, 0 discarded
LDISKFS-fs: barriers enabled
kjournald2 starting: pid 1575, dev dm-4:8, commit interval 5 seconds
LDISKFS FS on dm-4, internal journal on dm-4:8
LDISKFS-fs: delayed allocation enabled
LDISKFS-fs: file extents enabled
LDISKFS-fs: mballoc enabled
LDISKFS-fs: mounted filesystem dm-4 with ordered data mode
Lustre: MGS MGS started
Lustre: MGC10.2.9.1@o2ib: Reactivating import
Lustre: MGS: Logs for fs AFTER were removed by user request.  All
servers must be restarted in order to regenerate the logs.
Lustre: Setting parameter AFTER-MDT0000.mdt.group_upcall in log
AFTER-MDT0000
Lustre: Enabling user_xattr
LustreError: 157-3: Trying to start OBD AFTER-MDT0000_UUID using the
wrong disk BEFORE-MDT0000_UUID. Were the /dev/ assignments rearranged?
LustreError: 1665:0:(mds_fs.c:828:mds_fs_setup()) cannot read last_rcvd:
rc = -22
LustreError: 1665:0:(handler.c:2007:mds_setup()) AFTER-MDT0000: MDS
filesystem method init failed: rc = -22
LustreError: 1665:0:(obd_config.c:372:class_setup()) setup AFTER-MDT0000
failed (-22)
LustreError: 1665:0:(obd_config.c:1199:class_config_llog_handler()) Err
-22 on cfg command:
Lustre:    cmd=cf003 0:AFTER-MDT0000  1:AFTER-MDT0000_UUID  2:0
3:AFTER-MDT0000 
LustreError: 15b-f: MGC10.2.9.1@o2ib: The configuration from log
'AFTER-MDT0000' failed (-22). Make sure this client and the MGS are
running compatible versions of Lustre.
LustreError: 15c-8: MGC10.2.9.1@o2ib: The configuration from log
'AFTER-MDT0000' failed (-22). This may be the result of communication
errors between this node and the MGS, a bad configuration, or other
errors. See the syslog for more information.
LustreError: 1566:0:(obd_mount.c:1124:server_start_targets()) failed to
start server AFTER-MDT0000: -22
LustreError: 1566:0:(obd_mount.c:1653:server_fill_super()) Unable to
start targets: -22
LustreError: 1566:0:(obd_config.c:443:class_cleanup()) Device 4 not
setup
LustreError: 1566:0:(ldlm_request.c:1025:ldlm_cli_cancel_req()) Got rc
-108 from cancel RPC: canceling anyway
LustreError: 1566:0:(ldlm_request.c:1587:ldlm_cli_cancel_list())
ldlm_cli_cancel_list: -108
Lustre: MGS has stopped.
LDISKFS-fs: mballoc: 1 blocks 1 reqs (0 success)
LDISKFS-fs: mballoc: 6 extents scanned, 0 goal hits, 0 2^N hits, 0
breaks, 0 lost
LDISKFS-fs: mballoc: 1 generated and it took 2883
LDISKFS-fs: mballoc: 503 preallocated, 0 discarded
Lustre: 1566:0:(obd_mount.c:1473:server_put_super()) Cleaning orphaned
obd AFTER-mdtlov
Lustre: server umount AFTER-MDT0000 complete
LustreError: 1566:0:(obd_mount.c:2045:lustre_fill_super()) Unable to
mount  (-22)

Roger Spellman
Staff Engineer
Terascala, Inc.
508-588-1501
www.terascala.com <http://www.terascala.com/>

_______________________________________________
Lustre-discuss mailing list
Lustre-discuss at lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss

 

 
