<HTML>

<HEAD>

<TITLE>Remounting OSTs on other servers</TITLE>

</HEAD>

<BODY>

<FONT FACE="Calibri, Verdana, Helvetica, Arial"><SPAN STYLE='font-size:11pt'>Lustre Admins,<BR>

<BR>

We are currently upgrading our Red Hat-based Lustre 1.6.7.1 setup.  Previously we had one Lustre server which acted as the MGS, MDS and OSS for a number of unpatched Red Hat Lustre clients.  We have 3 distinct Lustre filesystems, each with an MDT partition/LUN and a number of OST partitions/LUNs, all accessed over a fibre-channel SAN.<BR>

<BR>

We now have 3 extra nodes, and are planning to distribute the OSTs of the various filesystems over these 3 OSSs and retain the 3 MDTs on the original server, which will become purely an MGS/MDS.<BR>

We are not yet ready to implement automatic failover mechanisms, but wish to be able to manually fail over OSTs and MDTs between servers in the event of server failure or for server maintenance.<BR>

<BR>

In preparation for this, I have been testing with a new filesystem, but I am unable to mount an OST on an arbitrary server node if it has previously been mounted on another node.<BR>

<BR>

For example, assume our Lustre MDS is called lustre-mds1 and our 3 OSSs are lustre-oss1, lustre-oss2 and lustre-oss3.  I can create the new filesystem with an MDT mounted on lustre-mds1 and the OSTs mounted on lustre-oss1, and a client can successfully mount the filesystem.  When I unmount the client, then unmount the OSTs and remount them on lustre-oss2, the client can mount the filesystem but cannot access the files, and no metadata is available (see below).  There are also errors on the MDS (see below):<BR>

<BR>

[Client]<BR>

[root@node8 ~]# ls -l /mnt/lustre/test<BR>

total 0<BR>

?--------- ? ? ? ?            ? testfile-node8<BR>

[root@node8 ~]#<BR>

<BR>

<BR>

[MGS/MDS /var/log/messages]:<BR>

Jun  2 17:14:14 lustre1 kernel: LustreError: 6481:0:(socklnd_cb.c:2156:ksocknal_recv_hello()) Error -104 reading HELLO from 130.102.xxx.xxx<BR>

Jun  2 17:14:14 lustre1 kernel: LustreError: 6481:0:(socklnd_cb.c:2156:ksocknal_recv_hello()) Skipped 17 previous similar messages<BR>

Jun  2 17:15:29 lustre1 kernel: LustreError: 11b-b: Connection to 130.102.xxx.xxx@tcp at host 130.102.xxx.xxx on port 988 was reset: is it running a compatible version of Lustre and is 130.102.xxx.xxx@tcp one of its NIDs?<BR>

Jun  2 17:15:29 lustre1 kernel: LustreError: Skipped 17 previous similar messages<BR>

<BR>

(Where the IP was that of the OSS currently mounting the OSTs.  All servers and clients are running 64-bit RHEL5 with lustre 1.6.7.1.)<BR>

<BR>

The same occurs when the OSTs are mounted on lustre-oss3.<BR>

<BR>

However when I remount the OSTs on lustre-oss1, the client can suddenly see the files again:<BR>

<BR>

<BR>

[root@node8 ~]# ls -l /mnt/lustre/test<BR>

total 4<BR>

-rw-r--r-- 1 root root 6 Jun  2 16:29 testfile-node8<BR>

[root@node8 ~]# <BR>

<BR>

<BR>

<BR>

It seems that if I run tunefs.lustre --writeconf on both the MDT and the OSTs of the filesystem, I can then remount the OSTs on a new server and the client can see them.  Of course, I cannot later mount them on yet another server unless I run tunefs.lustre --writeconf again.<BR>

<BR>

This behaviour with tunefs.lustre --writeconf was not always consistent, and on one occasion the filesystem became unmountable anywhere (no matter which OSS mounted it, the clients failed to complete the mount).  When the filesystem was recreated (i.e. mkfs.lustre --reformat ...), the behaviour I described above became reproducible again (for now).  Note that I could not reboot the lustre-mds (MDS) server or restart various services on it, as it is currently in production.<BR>

<BR>

<BR>

Is this as it should be, or is there a better way to fail over LUNs between servers, preferably while keeping the filesystem mounted and available for clients?<BR>

<BR>

<BR>

** I did see sections 4.2.9 to 4.2.11 in the Lustre manual (May 2009), but I am not changing any server NIDs or MGS locations (in fact the MGS is on its own LUN).  I found that merely running the writeconf on the MDT LUN resulted in errors when mounting the OSTs on the OSSs (see below), and I was hoping that there was a way to move the OST LUNs dynamically without unmounting all clients and servers:<BR>

<BR>

Jun  2 16:58:02 lustre1 kernel: LustreError: 13b-9: test2-OST0000 claims to have registered, but this MGS does not know about it, preventing registration.<BR>

<BR>

<BR>

<BR>

<BR>

Thanks in advance for any advice you may have, or pointers to documentation I have overlooked.<BR>

<BR>

Regards,<BR>

<BR>

<BR>

Marcus.<BR>

<BR>

Marcus Schull<BR>

Systems Administrator<BR>

IMB, University of Queensland.<BR>

</SPAN></FONT>

</BODY>

</HTML>