[lustre-discuss] Coordinating cluster start and shutdown?

Bertschinger, Thomas Andrew Hjorth bertschinger at lanl.gov
Wed Dec 6 08:00:38 PST 2023


Hello Jan,

You can use the Pacemaker / Corosync high-availability software stack for this. Specifically, its ordering constraints [1] let you control the order in which resources start and stop.
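
As a rough illustration of what an ordering constraint expresses, here is a small Python sketch. It is not Pacemaker's configuration syntax or its implementation, only a model of the idea. The resource names (mgt, mdt0, ost0, ost1) are hypothetical, and the sketch assumes the default symmetrical behaviour, where resources stop in the reverse of their start order.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical resource names. Each pair means "start `first` before `then`".
ORDER_CONSTRAINTS = [
    ("mgt", "mdt0"),   # the MGT must be up before the MDT
    ("mdt0", "ost0"),  # the MDT must be up before each OST
    ("mdt0", "ost1"),
]


def start_order(constraints):
    """Return one start order that satisfies every (first, then) pair."""
    sorter = TopologicalSorter()
    for first, then in constraints:
        # `then` depends on `first`, so `first` is sorted ahead of it.
        sorter.add(then, first)
    return list(sorter.static_order())


def stop_order(constraints):
    """Return the stop order, assumed here to be the reverse of the start order."""
    return list(reversed(start_order(constraints)))


if __name__ == "__main__":
    print("start:", start_order(ORDER_CONSTRAINTS))
    print("stop: ", stop_order(ORDER_CONSTRAINTS))
```

In this model the MGT always starts first and stops last. The relative order of the two OSTs is left open, because no constraint relates them.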

Unfortunately, Pacemaker is probably overkill if you don't need HA: its configuration is complex and difficult to get right, and it significantly complicates system administration. It is also not easy to decouple the Pacemaker service from the Lustre services, so if you stop the Pacemaker service, it will try to stop all of the Lustre services. That coupling might make it inappropriate for use cases that don't involve HA.

Given those downsides, if others in the community have suggestions on simpler means to accomplish this, I'd love to see other tools that can be used here (especially officially supported ones, if they exist).

[1] https://clusterlabs.org/pacemaker/doc/2.1/Pacemaker_Explained/html/constraints.html#specifying-the-order-in-which-resources-should-start-stop

- Thomas Bertschinger

________________________________________
From: lustre-discuss <lustre-discuss-bounces at lists.lustre.org> on behalf of Jan Andersen <jan at comind.io>
Sent: Wednesday, December 6, 2023 3:27 AM
To: lustre
Subject: [EXTERNAL] [lustre-discuss] Coordinating cluster start and shutdown?

Are there any tools for coordinating the start and shutdown of a Lustre filesystem, so that the OSS systems don't attempt to mount disks before the MGT and MDT are online?