[Lustre-discuss] Problem with write_conf

Nathan Rutman nathan.rutman at oracle.com
Tue Aug 3 11:33:51 PDT 2010


On Aug 3, 2010, at 11:25 AM, Roger Spellman wrote:

> Nathan,
>  
> Thank you.   That works!
>  
> I found that if I change the IP address, I also need to remove the files matching /mnt/mdt/CONFIGS/*-client.

Removing those files is what tunefs.lustre --writeconf on the MDT does when you first mount it after the writeconf.
Running --writeconf on the MDT and all OSTs is the preferred way to change a server NID.
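The writeconf approach to a NID change could be sketched roughly as below. This is a dry-run sketch under stated assumptions, not a tested procedure: the new NID, the OST device paths, the mount points and the `run` helper are all hypothetical, every client and target is assumed to be unmounted already, and `--erase-params` also drops any other stored parameters, which would then have to be supplied again. Whether the combined MGS/MDT needs its own stored mgsnode updated as well is left open here.

```shell
#!/bin/sh
# Dry-run sketch: regenerate the configuration logs after a server NID change.
# All values below are hypothetical placeholders, not taken from a real system.
NEW_MGS_NID="${NEW_MGS_NID:-10.2.9.2@o2ib0}"
MDT_DEV="${MDT_DEV:-/dev/mapper/map0}"
OST_DEVS="${OST_DEVS:-/dev/mapper/ost0 /dev/mapper/ost1}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode (the default), execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

writeconf_all() {
    # Combined MGS/MDT first: flag its configuration logs for regeneration.
    run tunefs.lustre --writeconf "$MDT_DEV"
    # Every OST: replace the stored MGS NID and flag the logs for regeneration.
    for dev in $OST_DEVS; do
        run tunefs.lustre --erase-params --mgsnode="$NEW_MGS_NID" --writeconf "$dev"
    done
}

remount_plan() {
    # Servers come back MGS/MDT first, then the OSTs (mount points assumed).
    run mount -t lustre "$MDT_DEV" /mnt/mdt
    i=0
    for dev in $OST_DEVS; do
        run mount -t lustre "$dev" "/mnt/ost$i"
        i=$((i + 1))
    done
}

writeconf_all
remount_plan
```

With the default DRY_RUN=1 the script only prints the commands, so the plan can be reviewed before anything touches a disk; setting DRY_RUN=0 would execute them.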
 
>  
> The reason is that the OST mounts failed: the OST was still looking for the old IP address.  I grepped for files containing the old IP address and found those client files.
> 
> Is that a safe thing to do?  Please note that my mdt and mgs are on the same LUN.
>  
> Thanks.
>  
> -Roger
>  
>  
> From: Nathan Rutman [mailto:nathan.rutman at oracle.com] 
> Sent: Tuesday, August 03, 2010 2:03 PM
> To: Roger Spellman
> Cc: lustre-discuss at lists.lustre.org
> Subject: Re: [Lustre-discuss] Problem with write_conf
>  
> There's a 'failsafe' feature that  prevents filesystem name changes:
>> LustreError: 157-3: Trying to start OBD AFTER-MDT0000_UUID using the wrong disk BEFORE-MDT0000_UUID. Were the /dev/ assignments rearranged?
>> 
> You'll have to delete the last_rcvd file off the disk for all the servers in the filesystem, as well as tunefs.lustre --writeconf them all to the AFTER name.
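For a single target, the two steps described above (delete last_rcvd, then writeconf to the new name) might look like the following dry-run sketch. The device, mount point and flags are the ones used elsewhere in this thread; the `run` helper is hypothetical, and the location of last_rcvd at the top level of the backing ldiskfs filesystem is an assumption. Removing last_rcvd is destructive, so the sketch only prints commands unless DRY_RUN=0 is set, and the target is assumed to be cleanly unmounted.

```shell
#!/bin/sh
# Dry-run sketch of the rename steps for one target.
# Device, mount point and names mirror the example in this thread.
DEV="${DEV:-/dev/mapper/map0}"
TMP_MNT="${TMP_MNT:-/mnt/mdt}"
NEW_FSNAME="${NEW_FSNAME:-AFTER}"
TARGET_FLAGS="${TARGET_FLAGS:---mgs --mdt}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode (the default), execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

rename_target() {
    # 1. Mount the backing filesystem directly, not as a Lustre target.
    run mount -t ldiskfs "$DEV" "$TMP_MNT"
    # 2. Remove last_rcvd, which still records the old target UUID
    #    (assumed to sit at the top level of the backing filesystem).
    run rm -f "$TMP_MNT/last_rcvd"
    run umount "$TMP_MNT"
    # 3. Rewrite the on-disk configuration with the new filesystem name.
    #    TARGET_FLAGS is intentionally unquoted so it splits into words.
    run tunefs.lustre --writeconf --fsname="$NEW_FSNAME" $TARGET_FLAGS "$DEV"
}

rename_target
```

The same function would be repeated for every server target in the filesystem, with DEV and TARGET_FLAGS adjusted per target.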
>  
> On Aug 2, 2010, at 6:08 PM, Roger Spellman wrote:
> 
> 
>  
> Hi,
> I would like to be able to change a file system name.  Towards that end, I have run the following commands as an experiment:
> 
>   mkfs.lustre --reformat --fsname BEFORE  --device-size=10000 --mgs --mdt  --mgsnode=10.2.9.1@o2ib0 /dev/mapper/map0
>   dmesg -c
>   mount -t lustre /dev/mapper/map0 /mnt/mdt
>   dmesg -c
>   umount /mnt/mdt
>   dmesg -c
>   tunefs.lustre --writeconf --fsname=AFTER --mgs --mdt /dev/mapper/map0
>   dmesg -c
>   mount -t lustre /dev/mapper/map0 /mnt/mdt
>   dmesg -c
> 
> Unfortunately, this does not work.  Can someone please explain the correct sequence of commands to use?  The output of each command is as follows.
> 
> Thanks.
> 
> [root@ts-hss2-01 ~]# mkfs.lustre --reformat --fsname BEFORE  --device-size=10000 --mgs --mdt  --mgsnode=10.2.9.1@o2ib0 /dev/mapper/map0
> 
>    Permanent disk data:
> Target:     BEFORE-MDTffff
> Index:      unassigned
> Lustre FS:  BEFORE
> Mount type: ldiskfs
> Flags:      0x75
>               (MDT MGS needs_index first_time update )
> Persistent mount opts: iopen_nopriv,user_xattr,errors=remount-ro
> Parameters: mgsnode=10.2.9.1@o2ib mdt.group_upcall=/usr/sbin/l_getgroups
> 
> device size = 1632256MB
> 2 6 18
> formatting backing filesystem ldiskfs on /dev/mapper/map0
>         target name  BEFORE-MDTffff
>         4k blocks     2500
>         options        -i 4096 -I 512 -q -O dir_index,extents,uninit_groups -F
> mkfs_cmd = mke2fs -j -b 4096 -L BEFORE-MDTffff  -i 4096 -I 512 -q -O dir_index,extents,uninit_groups -F /dev/mapper/map0 2500
> Writing CONFIGS/mountdata
> [root@ts-hss2-01 ~]# dmesg -c
> LDISKFS-fs: barriers enabled
> kjournald2 starting: pid 1388, dev dm-4:8, commit interval 5 seconds
> LDISKFS FS on dm-4, internal journal on dm-4:8
> LDISKFS-fs: delayed allocation enabled
> LDISKFS-fs: file extents enabled
> LDISKFS-fs: mballoc enabled
> LDISKFS-fs: mounted filesystem dm-4 with ordered data mode
> LDISKFS-fs: mballoc: 1 blocks 1 reqs (0 success)
> LDISKFS-fs: mballoc: 1 extents scanned, 0 goal hits, 1 2^N hits, 0 breaks, 0 lost
> LDISKFS-fs: mballoc: 1 generated and it took 2142
> LDISKFS-fs: mballoc: 512 preallocated, 0 discarded
> 
> 
> [root@ts-hss2-01 ~]# mount -t lustre /dev/mapper/map0 /mnt/mdt
> [root@ts-hss2-01 ~]# dmesg -c
> LDISKFS-fs: barriers enabled
> kjournald2 starting: pid 1406, dev dm-4:8, commit interval 5 seconds
> LDISKFS FS on dm-4, internal journal on dm-4:8
> LDISKFS-fs: delayed allocation enabled
> LDISKFS-fs: file extents enabled
> LDISKFS-fs: mballoc enabled
> LDISKFS-fs: mounted filesystem dm-4 with ordered data mode
> LDISKFS-fs: mballoc: 0 blocks 0 reqs (0 success)
> LDISKFS-fs: mballoc: 0 extents scanned, 0 goal hits, 0 2^N hits, 0 breaks, 0 lost
> LDISKFS-fs: mballoc: 0 generated and it took 0
> LDISKFS-fs: mballoc: 0 preallocated, 0 discarded
> LDISKFS-fs: barriers enabled
> kjournald2 starting: pid 1410, dev dm-4:8, commit interval 5 seconds
> LDISKFS FS on dm-4, internal journal on dm-4:8
> LDISKFS-fs: delayed allocation enabled
> LDISKFS-fs: file extents enabled
> LDISKFS-fs: mballoc enabled
> LDISKFS-fs: mounted filesystem dm-4 with ordered data mode
> Lustre: MGS MGS started
> Lustre: MGC10.2.9.1@o2ib: Reactivating import
> Lustre: Setting parameter BEFORE-MDT0000.mdt.group_upcall in log BEFORE-MDT0000
> Lustre: Enabling user_xattr
> Lustre: BEFORE-MDT0000: new disk, initializing
> Lustre: BEFORE-MDT0000: Now serving BEFORE-MDT0000 on /dev/mapper/map0 with recovery enabled
> Lustre: 1503:0:(lproc_mds.c:271:lprocfs_wr_group_upcall()) BEFORE-MDT0000: group upcall set to /usr/sbin/l_getgroups
> Lustre: BEFORE-MDT0000.mdt: set parameter group_upcall=/usr/sbin/l_getgroups
> 
> 
> [root@ts-hss2-01 ~]# umount /mnt/mdt
> [root@ts-hss2-01 ~]# dmesg -c
> Lustre: Failing over BEFORE-MDT0000
> Lustre: Skipped 1 previous similar message
> Lustre: *** setting obd BEFORE-MDT0000 device 'dm-4' read-only ***
> Turning device dm-4 (0xfd00004) read-only
> Lustre: BEFORE-MDT0000: shutting down for failover; client state will be preserved.
> Lustre: MDT BEFORE-MDT0000 has stopped.
> LustreError: 1517:0:(ldlm_request.c:1025:ldlm_cli_cancel_req()) Got rc -108 from cancel RPC: canceling anyway
> LustreError: 1517:0:(ldlm_request.c:1587:ldlm_cli_cancel_list()) ldlm_cli_cancel_list: -108
> Lustre: MGS has stopped.
> LDISKFS-fs: mballoc: 3 blocks 3 reqs (0 success)
> LDISKFS-fs: mballoc: 8 extents scanned, 0 goal hits, 0 2^N hits, 0 breaks, 0 lost
> LDISKFS-fs: mballoc: 1 generated and it took 2598
> LDISKFS-fs: mballoc: 1145 preallocated, 0 discarded
> Removing read-only on unknown block (0xfd00004)
> Lustre: server umount BEFORE-MDT0000 complete
> 
> 
> [root@ts-hss2-01 ~]# tunefs.lustre --writeconf --fsname=AFTER --mgs --mdt /dev/mapper/map0
> checking for existing Lustre data: found CONFIGS/mountdata
> Reading CONFIGS/mountdata
> 
>    Read previous values:
> Target:     BEFORE-MDT0000
> Index:      0
> Lustre FS:  BEFORE
> Mount type: ldiskfs
> Flags:      0x5
>               (MDT MGS )
> Persistent mount opts: iopen_nopriv,user_xattr,errors=remount-ro
> Parameters: mgsnode=10.2.9.1@o2ib mdt.group_upcall=/usr/sbin/l_getgroups
> 
> 
>    Permanent disk data:
> Target:     AFTER-MDT0000
> Index:      0
> Lustre FS:  AFTER
> Mount type: ldiskfs
> Flags:      0x105
>               (MDT MGS writeconf )
> Persistent mount opts: iopen_nopriv,user_xattr,errors=remount-ro
> Parameters: mgsnode=10.2.9.1@o2ib mdt.group_upcall=/usr/sbin/l_getgroups
> 
> Writing CONFIGS/mountdata
> [root@ts-hss2-01 ~]# dmesg -c
> LDISKFS-fs: barriers enabled
> kjournald2 starting: pid 1539, dev dm-4:8, commit interval 5 seconds
> LDISKFS FS on dm-4, internal journal on dm-4:8
> LDISKFS-fs: delayed allocation enabled
> LDISKFS-fs: file extents enabled
> LDISKFS-fs: mballoc enabled
> LDISKFS-fs: recovery complete.
> LDISKFS-fs: mounted filesystem dm-4 with ordered data mode
> LDISKFS-fs: mballoc: 1 blocks 1 reqs (0 success)
> LDISKFS-fs: mballoc: 6 extents scanned, 0 goal hits, 0 2^N hits, 0 breaks, 0 lost
> LDISKFS-fs: mballoc: 1 generated and it took 2905
> LDISKFS-fs: mballoc: 506 preallocated, 0 discarded
> 
> 
> [root@ts-hss2-01 ~]# mount -t lustre /dev/mapper/map0 /mnt/mdt
> mount.lustre: mount /dev/mapper/map0 at /mnt/mdt failed: Invalid argument
> This may have multiple causes.
> Are the mount options correct?
> Check the syslog for more info.
> [root@ts-hss2-01 ~]# dmesg -c
> LDISKFS-fs: barriers enabled
> kjournald2 starting: pid 1567, dev dm-4:8, commit interval 5 seconds
> LDISKFS FS on dm-4, internal journal on dm-4:8
> LDISKFS-fs: delayed allocation enabled
> LDISKFS-fs: file extents enabled
> LDISKFS-fs: mballoc enabled
> LDISKFS-fs: mounted filesystem dm-4 with ordered data mode
> LDISKFS-fs: mballoc: 0 blocks 0 reqs (0 success)
> LDISKFS-fs: mballoc: 0 extents scanned, 0 goal hits, 0 2^N hits, 0 breaks, 0 lost
> LDISKFS-fs: mballoc: 0 generated and it took 0
> LDISKFS-fs: mballoc: 0 preallocated, 0 discarded
> LDISKFS-fs: barriers enabled
> kjournald2 starting: pid 1575, dev dm-4:8, commit interval 5 seconds
> LDISKFS FS on dm-4, internal journal on dm-4:8
> LDISKFS-fs: delayed allocation enabled
> LDISKFS-fs: file extents enabled
> LDISKFS-fs: mballoc enabled
> LDISKFS-fs: mounted filesystem dm-4 with ordered data mode
> Lustre: MGS MGS started
> Lustre: MGC10.2.9.1@o2ib: Reactivating import
> Lustre: MGS: Logs for fs AFTER were removed by user request.  All servers must be restarted in order to regenerate the logs.
> Lustre: Setting parameter AFTER-MDT0000.mdt.group_upcall in log AFTER-MDT0000
> Lustre: Enabling user_xattr
> LustreError: 157-3: Trying to start OBD AFTER-MDT0000_UUID using the wrong disk BEFORE-MDT0000_UUID. Were the /dev/ assignments rearranged?
> LustreError: 1665:0:(mds_fs.c:828:mds_fs_setup()) cannot read last_rcvd: rc = -22
> LustreError: 1665:0:(handler.c:2007:mds_setup()) AFTER-MDT0000: MDS filesystem method init failed: rc = -22
> LustreError: 1665:0:(obd_config.c:372:class_setup()) setup AFTER-MDT0000 failed (-22)
> LustreError: 1665:0:(obd_config.c:1199:class_config_llog_handler()) Err -22 on cfg command:
> Lustre:    cmd=cf003 0:AFTER-MDT0000  1:AFTER-MDT0000_UUID  2:0  3:AFTER-MDT0000 
> LustreError: 15b-f: MGC10.2.9.1@o2ib: The configuration from log 'AFTER-MDT0000' failed (-22). Make sure this client and the MGS are running compatible versions of Lustre.
> LustreError: 15c-8: MGC10.2.9.1@o2ib: The configuration from log 'AFTER-MDT0000' failed (-22). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
> LustreError: 1566:0:(obd_mount.c:1124:server_start_targets()) failed to start server AFTER-MDT0000: -22
> LustreError: 1566:0:(obd_mount.c:1653:server_fill_super()) Unable to start targets: -22
> LustreError: 1566:0:(obd_config.c:443:class_cleanup()) Device 4 not setup
> LustreError: 1566:0:(ldlm_request.c:1025:ldlm_cli_cancel_req()) Got rc -108 from cancel RPC: canceling anyway
> LustreError: 1566:0:(ldlm_request.c:1587:ldlm_cli_cancel_list()) ldlm_cli_cancel_list: -108
> Lustre: MGS has stopped.
> LDISKFS-fs: mballoc: 1 blocks 1 reqs (0 success)
> LDISKFS-fs: mballoc: 6 extents scanned, 0 goal hits, 0 2^N hits, 0 breaks, 0 lost
> LDISKFS-fs: mballoc: 1 generated and it took 2883
> LDISKFS-fs: mballoc: 503 preallocated, 0 discarded
> Lustre: 1566:0:(obd_mount.c:1473:server_put_super()) Cleaning orphaned obd AFTER-mdtlov
> Lustre: server umount AFTER-MDT0000 complete
> LustreError: 1566:0:(obd_mount.c:2045:lustre_fill_super()) Unable to mount  (-22)
> 
> Roger Spellman
> Staff Engineer
> Terascala, Inc.
> 508-588-1501
> www.terascala.com <http://www.terascala.com/>
> 
