[Lustre-discuss] Recommended failover software for Lustre

Christopher J.Walker C.J.Walker at qmul.ac.uk
Mon Jul 16 04:23:15 PDT 2012


The "configuring failover" section in the Whamcloud release of the
Lustre manual seems rather out of date:

http://build.whamcloud.com/job/lustre-manual/lastSuccessfulBuild/artifact/lustre_manual.html#configuringfailover

The Oracle release says much the same thing:
http://wiki.lustre.org/manual/LustreManual20_HTML/ConfiguringFailover.html#50540588_50628

In section 11.1.1 "Power management software", it says:

"For more information about PowerMan, go to:
https://computing.llnl.gov/linux/powerman.html"

That page no longer exists. The link should probably point to
http://code.google.com/p/powerman/


Then, in section 11.2 "Setting up High-Availability (HA) Software with
Lustre", it mentions "Red Hat Cluster Manager" and "Pacemaker".

"Red Hat Cluster Manager" points to
http://wiki.lustre.org/index.php/Using_Red_Hat_Cluster_Manager_with_Lustre

which says "In comparison with other HA solutions, RedHat Cluster as in
RHEL 5.5 is an old HA solution. We recommend using other HA solutions
like Pacemaker, if possible."

The "Pacemaker" link points to
http://wiki.lustre.org/index.php/Using_Pacemaker_with_Lustre

Although the title of this is "Using Pacemaker with Lustre", it starts
off by saying "In modern clusters, OpenAIS, or more specifically, its
communication stack corosync, is used for this task".


In summary:

1) The manual could do with some updating here.

2) I suspect I should be using corosync.

Chris





