[Lustre-discuss] NID change failure

Andreas Dilger adilger at sun.com
Mon Jul 20 15:09:15 PDT 2009


On Jul 20, 2009  04:02 -0700, Dan wrote:
> We were migrating our MDS to a new machine (done this before
> successfully on another file system).  I formatted the new RAID 10 with
> 1.6.7.2, up from 1.6.5.1 on the old MDS.  Copied all files and ran
> get/setfattr, then deleted CATALOG and OBJECTS/*.  It mounted w/o errors
> so I pointed the OSTs at the new MDS with:
> 
> tunefs.lustre --erase-param --mgsnode=192.168.0.75@tcp --writeconf /dev/sdb
> 
> The OSTs are still running 1.6.4.3 (is that a problem here?) but mount
> and recover as expected; the logs show they're online.  When I mount the file
> system on a client I see an error in the logs about
> mds_uuid=192.168.0.92 (the old IP address; .75 is the new machine).  The
> mount command hangs and never completes, with this error reported
> in /var/log/messages every 30 seconds or so.
> 
> I've taken a good hard look at the manual and searched the mailing list
> - our file system is still down, please help.  Thank you,

You need to run a --writeconf on the MDS also.
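
As a rough sketch of what that could look like on the MDS (not a tested
procedure for this particular system): the device and mount point below
are hypothetical placeholders, since the thread does not give them, and
the `run` helper and `DRY_RUN` switch are made up for illustration so the
script only prints the commands by default. The usual procedure for
regenerating the configuration logs is to have the clients and OSTs
unmounted first, and to bring the MDT back up before the OSTs.

```shell
#!/bin/sh
# Sketch only: regenerate the Lustre configuration logs on the MDS.
# MDT_DEV and MDT_MNT are hypothetical placeholders, not values from this thread.
MDT_DEV="${MDT_DEV:-/dev/MDT_DEVICE}"
MDT_MNT="${MDT_MNT:-/mnt/mdt}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands, anything else = run them

# Illustrative helper: echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# 1. With clients and OSTs already unmounted, stop the MDT.
run umount "$MDT_MNT"
# 2. Rewrite the configuration logs on the MDT.
run tunefs.lustre --writeconf "$MDT_DEV"
# 3. Remount the MDT first; the OSTs and then the clients follow.
run mount -t lustre "$MDT_DEV" "$MDT_MNT"
```

Setting DRY_RUN=0 and the real device and mount point would execute the
commands instead of printing them.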

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.



