[Lustre-discuss] Has anyone had experience with heartbeat and drdb providing full redundancy on lustre clusters

Thomas Roth t.roth at gsi.de
Thu Apr 23 05:57:34 PDT 2009


Hi,

we are using Heartbeat+DRBD on our MGS/MDT.
DRBD works fine; we have been using it for backing up the MDT, upgrading
the Lustre version, and of course for failover.
Heartbeat proves to be much trickier. Since our network is rather shaky,
we are suffering from late heartbeats, Heartbeat trying to fail over for
no apparent reason, Stonith for no good reason... In addition, if one
umounts the MDT, it always starts with a delay of 330 sec, and Heartbeat
always gives up on the MDT resource after 20000 ms - no matter what I put
into the cib.xml, no matter the Lustre timeouts. So one always needs to
force the umount or, more likely, a reboot - not a problem with a
reliable Stonith procedure ;-)
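
To put the two numbers side by side, here is a minimal Python sketch. It reads the 330 sec as the delay before the umount proceeds and assumes the 20000 ms figure acts as a timeout on the stop operation. The resource id `resMDT`, the helper names and the 400 s value are hypothetical, and the `<op>` attributes follow the general layout of a CIB operation entry, not our actual cib.xml. As said above, changing cib.xml made no difference here, so this only illustrates the mismatch, it is not a fix.

```python
import xml.etree.ElementTree as ET

UMOUNT_DELAY_S = 330          # delay reported above, in seconds
OBSERVED_TIMEOUT_MS = 20000   # point at which Heartbeat gives up, in milliseconds


def stop_fits(timeout_ms: int, delay_s: int) -> bool:
    """Return True if a stop timeout (ms) is longer than the umount delay (s)."""
    return timeout_ms > delay_s * 1000


def stop_op(resource_id: str, timeout_s: int) -> ET.Element:
    """Build an <op> element carrying a stop timeout (hypothetical id scheme)."""
    return ET.Element("op", {
        "id": f"{resource_id}-stop",   # hypothetical id, not from our cib.xml
        "name": "stop",
        "timeout": f"{timeout_s}s",
    })


# 20000 ms is 20 s, far shorter than the 330 s delay, so the stop cannot finish in time.
print(stop_fits(OBSERVED_TIMEOUT_MS, UMOUNT_DELAY_S))

# An entry one might try: 400 s is an arbitrary value above 330 s.
op = stop_op("resMDT", 400)
print(ET.tostring(op, encoding="unicode"))
```

The first check shows why the stop is abandoned long before the umount can complete; the second shows the kind of operation entry that would have to take effect for the timeout to cover the delay.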

Thomas

Christopher Deneen wrote:
> trying to get a feel if it's worth investing time to implement.

-- 
--------------------------------------------------------------------
Thomas Roth
Department: Informationstechnologie
Location: SB3 1.262
Phone: +49-6159-71 1453  Fax: +49-6159-71 2986

GSI Helmholtzzentrum für Schwerionenforschung GmbH
Planckstraße 1
D-64291 Darmstadt
www.gsi.de

Limited liability company (GmbH)
Registered office: Darmstadt
Commercial register: Amtsgericht Darmstadt, HRB 1528

Managing Director: Professor Dr. Horst Stöcker

Chair of the Supervisory Board: Dr. Beatrix Vierkorn-Rudolph,
Deputy: Ministerialdirigent Dr. Rolf Bernhardt


