[Lustre-discuss] NID change failure

Dan dan at nerp.net
Mon Jul 20 04:02:28 PDT 2009


Hi all,

We were migrating our MDS to a new machine (we have done this before
successfully on another file system).  I formatted the new RAID 10 with
1.6.7.2, up from 1.6.5.1 on the old MDS.  I copied all files, ran
get/setfattr, then deleted CATALOG and OBJECTS/*.  It mounted without
errors, so I pointed the OSTs at the new MDS with:

tunefs.lustre --erase-param --mgsnode=192.168.0.75@tcp --writeconf /dev/sdb
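
To be explicit about what gets run on each OST, here is a minimal sketch
that only prints the command per device instead of executing it.  The
function name and the OST_DEVICES list are placeholders for illustration;
only /dev/sdb and the .75 NID come from the command above.

```shell
#!/bin/sh
# Sketch: print (do not run) the tunefs.lustre command for each OST device.
# MGS_NID is the new MDS address from this post.
# OST_DEVICES is a placeholder list; add the other OST devices as needed.
MGS_NID="192.168.0.75@tcp"
OST_DEVICES="/dev/sdb"

print_tunefs_cmd() {
    # $1 = OST block device
    printf 'tunefs.lustre --erase-param --mgsnode=%s --writeconf %s\n' \
        "$MGS_NID" "$1"
}

for dev in $OST_DEVICES; do
    print_tunefs_cmd "$dev"
done
```

Printing first makes it easy to check that every OST gets the same NID
before anything is written to disk.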

The OSTs are still running 1.6.4.3 (could that be a problem?) but they
mount and recover as expected, and the logs show they're online.  When I
mount the file system on a client, I see an error in the logs about
mds_uuid=192.168.0.92 (that is the old IP address; .75 is the new
machine).  The mount command hangs and never completes, and this error is
reported to /var/log/messages every 30 seconds or so.

I've taken a good hard look at the manual and searched the mailing list
- our file system is still down, please help.  Thank you,

Dan



