[Lustre-discuss] Problem re-mounting Lustre on another node
Bernd Schubert
bs_lists at aakef.fastmail.fm
Wed Oct 14 04:47:17 PDT 2009
On Wednesday 14 October 2009, Michael Schwartzkopff wrote:
> Hi,
>
> we have a Lustre 1.8 cluster with openais and pacemaker as the cluster
> manager. When I migrate one Lustre resource from one node to another,
> I get an error. Stopping Lustre on one node is no problem, but the node
> where Lustre should start says:
>
> Oct 14 09:54:28 sososd6 kernel: kjournald starting. Commit interval 5
> seconds Oct 14 09:54:28 sososd6 kernel: LDISKFS FS on dm-4, internal
> journal Oct 14 09:54:28 sososd6 kernel: LDISKFS-fs: recovery complete.
> Oct 14 09:54:28 sososd6 kernel: LDISKFS-fs: mounted filesystem with ordered
> data mode.
> Oct 14 09:54:28 sososd6 multipathd: dm-4: umount map (uevent)
> Oct 14 09:54:39 sososd6 kernel: kjournald starting. Commit interval 5
> seconds Oct 14 09:54:39 sososd6 kernel: LDISKFS FS on dm-4, internal
> journal Oct 14 09:54:39 sososd6 kernel: LDISKFS-fs: mounted filesystem
> with ordered data mode.
> Oct 14 09:54:39 sososd6 kernel: LDISKFS-fs: file extents enabled
> Oct 14 09:54:39 sososd6 kernel: LDISKFS-fs: mballoc enabled
> Oct 14 09:54:39 sososd6 kernel: Lustre: MGC134.171.16.190@tcp: Reactivating
[...]
>
> These logs continue until the cluster software times out and the resource
> agent reports the error. Can anyone help me understand these logs? Thanks.
>
What is your start timeout? Do you see mount in the process list? I guess you
just need to increase the timeout; I usually set at least 10 minutes,
sometimes even 20 minutes. Also see my bug report and, if possible, add
further information yourself.
https://bugzilla.lustre.org/show_bug.cgi?id=20402
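For reference, the start/stop timeouts can be raised in the pacemaker
configuration. A minimal sketch using the crm shell, assuming the target is
managed by the ocf:heartbeat:Filesystem agent; the resource name, device, and
mount point below are placeholders, not taken from this thread:

```shell
# Hypothetical sketch: define a Lustre target with generous start/stop
# timeouts (1200s = 20 minutes, to allow for ldiskfs journal recovery).
# Resource name, device, and directory are placeholders.
crm configure primitive lustre-ost ocf:heartbeat:Filesystem \
    params device="/dev/mapper/lustre-ost" directory="/mnt/lustre-ost" \
           fstype="lustre" \
    op start timeout=1200s \
    op stop timeout=1200s \
    op monitor interval=120s timeout=60s
```

For an already-defined resource, the operation timeouts can instead be
adjusted in place with `crm configure edit`.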
Thanks,
Bernd
--
Bernd Schubert
DataDirect Networks