[Lustre-discuss] Problem re-mounting Lustre on another node
Bernd Schubert
bs_lists at aakef.fastmail.fm
Wed Oct 14 04:47:17 PDT 2009
On Wednesday 14 October 2009, Michael Schwartzkopff wrote:
> Hi,
>
> we have a Lustre 1.8 cluster with openais and pacemaker as the cluster
> manager. When I migrate one Lustre resource from one node to another,
> I get an error. Stopping Lustre on one node is no problem, but the node
> where Lustre should start says:
>
> Oct 14 09:54:28 sososd6 kernel: kjournald starting. Commit interval 5
> seconds Oct 14 09:54:28 sososd6 kernel: LDISKFS FS on dm-4, internal
> journal Oct 14 09:54:28 sososd6 kernel: LDISKFS-fs: recovery complete.
> Oct 14 09:54:28 sososd6 kernel: LDISKFS-fs: mounted filesystem with ordered
> data mode.
> Oct 14 09:54:28 sososd6 multipathd: dm-4: umount map (uevent)
> Oct 14 09:54:39 sososd6 kernel: kjournald starting. Commit interval 5
> seconds Oct 14 09:54:39 sososd6 kernel: LDISKFS FS on dm-4, internal
> journal Oct 14 09:54:39 sososd6 kernel: LDISKFS-fs: mounted filesystem
> with ordered data mode.
> Oct 14 09:54:39 sososd6 kernel: LDISKFS-fs: file extents enabled
> Oct 14 09:54:39 sososd6 kernel: LDISKFS-fs: mballoc enabled
> Oct 14 09:54:39 sososd6 kernel: Lustre: MGC134.171.16.190@tcp: Reactivating
[...]
>
> These logs continue until the cluster software times out and the resource
> agent reports the error. Can anyone help me understand these logs? Thanks.
>
What is your start timeout? Do you see mount in the process list? I guess you
just need to increase the timeout; I usually set at least 10 minutes,
sometimes even 20 minutes. Also see my bug report and, if possible, add
further information yourself.
https://bugzilla.lustre.org/show_bug.cgi?id=20402
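For reference, the start/stop timeouts can be raised in the pacemaker
configuration. A minimal sketch using the crm shell, assuming the target is
managed by the ocf:heartbeat:Filesystem agent; the resource name, device, and
mount point below are placeholders, not taken from this thread:

```shell
# Hypothetical sketch: define a Lustre target with generous start/stop
# timeouts (1200s = 20 minutes, to allow for ldiskfs journal recovery).
# Resource name, device, and directory are placeholders.
crm configure primitive lustre-ost ocf:heartbeat:Filesystem \
    params device="/dev/mapper/lustre-ost" directory="/mnt/lustre-ost" \
           fstype="lustre" \
    op start timeout=1200s \
    op stop timeout=1200s \
    op monitor interval=120s timeout=60s
```

For an already-defined resource, the operation timeouts can instead be
adjusted in place with `crm configure edit`.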
Thanks,
Bernd
--
Bernd Schubert
DataDirect Networks