[lustre-discuss] Problems moving an OSS from an old Lustre installation to a new one

Oliver Mangold Oliver.Mangold at EMEA.NEC.COM
Mon Jul 27 22:38:20 PDT 2015


On 28.07.2015 07:17, Massimo Sgaravatto wrote:
>
>
> Some interesting messages found in the syslog of the "moved" OSS:
>
> Jul 24 14:56:25 t2-oss-03 kernel: Lustre: cmswork-OST0003: Received
> MDS connection from 10.60.16.8 at tcp, removing former export from
> 10.60.16.38 at tcp
>
> Jul 24 14:56:27 t2-oss-03 kernel: Lustre: cmswork-OST0003: already
> connected client cmswork-MDT0000-mdtlov_UUID \
> (at 10.60.16.8 at tcp) with handle 0xdb376ec08bf7d020. Rejecting client
> with the same UUID trying to reconnect with\
>  handle 0x6dffb49bb9b3bc70
>
> 10.60.16.8 is the IP of the old MDS
> 10.60.16.38 is the IP of the new MDS
>
>
> For the time being we disabled the OSTs hosted on the "moved" OSS so
> that new objects are not written there.
>
>
> Any idea what the problem is and how we could recover the system?
>
Do I understand correctly that the old MGS/MDS is still up and running?
If so, my reading is that it is still trying to reach an OST at
10.60.16.9 at tcp (that information is stored in the llog on the MGS).
What confuses me, though, is why it would take the new OST for the one
it is looking for: the new OST has a new UUID, so the mismatch should be
detected. In any case, I would shut down the old MGS/MDS first, before
trying to write any more data to the new OST.
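
To see which targets are affected, the OSS syslog could be scanned for
the "Received MDS connection" message quoted above. What follows is only
a rough sketch, not a Lustre tool: the function names are made up for
illustration, the pattern is derived solely from the one message in this
thread, and it assumes each log entry has been joined back onto a single
line. It accepts both "@tcp" (as in a real syslog) and " at tcp" (as
rewritten by the list archive).

```python
import re
from collections import defaultdict

# Pattern for the message quoted in this thread:
#   <fs>-OSTxxxx: Received MDS connection from <ip>@<net>,
#   removing former export from <ip>@<net>
# The separator may appear as "@" or, in archived mail, as " at ".
PATTERN = re.compile(
    r"(?P<target>\S+-OST[0-9a-fA-F]{4}): Received MDS connection from "
    r"(?P<new>\d+(?:\.\d+){3})(?:@| at )\w+, "
    r"removing former export from (?P<old>\d+(?:\.\d+){3})(?:@| at )\w+"
)


def mds_ips_per_target(lines):
    """Map each OST name to the set of MDS IPs seen in matching messages."""
    seen = defaultdict(set)
    for line in lines:
        match = PATTERN.search(line)
        if match:
            seen[match.group("target")].update(
                (match.group("new"), match.group("old"))
            )
    return dict(seen)


def contended_targets(lines):
    """Return only the OSTs that more than one MDS IP is connecting to."""
    return {
        target: sorted(ips)
        for target, ips in mds_ips_per_target(lines).items()
        if len(ips) > 1
    }
```

Passing the first syslog entry quoted above (joined onto one line) to
contended_targets() would report cmswork-OST0003 together with both
10.60.16.8 and 10.60.16.38, i.e. the old and the new MDS competing for
the same OST.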

-- 
Dr. Oliver Mangold
System Analyst
NEC Deutschland GmbH
HPC Division
Raiffeisenstraße 14
70771 Leinfelden-Echterdingen
Germany
Phone: +49 711 78055 13
Mail: oliver.mangold at emea.nec.com


