[lustre-discuss] Rebuild server

Cowe, Malcolm J malcolm.j.cowe at intel.com
Thu Mar 10 22:05:38 PST 2016

Assuming the rebuilt server keeps the same identity as the original host (same hostname, IP address, etc.), it should just be a matter of restoring the OS, re-installing the Lustre packages, configuring LNet (e.g. /etc/modprobe.d/lustre.conf) and remounting the storage targets. If you've got an HA setup (e.g. Pacemaker + Corosync), then you'll need to restore that as well. Or rather, keep a backup copy of the config so that you can restore it :). There is no need to perform any "rebuild" of Lustre itself; just repair/restore the OS.
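As a rough illustration of the LNet step, here is a minimal Python sketch that checks whether modprobe.d-style text contains an "options lnet" line and extracts its settings, which could serve as a sanity check on a restored /etc/modprobe.d/lustre.conf. The helper name lnet_options and the example value tcp0(eth0) are assumptions made for illustration; your network and interface names will differ.

```python
import shlex


def lnet_options(conf_text):
    """Collect key=value settings from 'options lnet ...' lines.

    conf_text is the content of a modprobe.d-style file, e.g. a restored
    /etc/modprobe.d/lustre.conf. Returns a dict such as
    {"networks": "tcp0(eth0)"}; an empty dict means no LNet options found.
    """
    opts = {}
    for raw in conf_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        tokens = shlex.split(line)
        if len(tokens) >= 2 and tokens[0] == "options" and tokens[1] == "lnet":
            for tok in tokens[2:]:
                key, sep, value = tok.partition("=")
                if sep:
                    opts[key] = value
    return opts


# Illustrative file content only; not taken from the original thread.
EXAMPLE_CONF = """
# LNet configuration
options lnet networks=tcp0(eth0)
"""

if __name__ == "__main__":
    found = lnet_options(EXAMPLE_CONF)
    if "networks" not in found:
        raise SystemExit("no 'options lnet networks=...' line found")
    print(found)
```

On a real server you would read the file from disk instead of using EXAMPLE_CONF, and compare the result with the backup copy of the original config.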

Other than LNet, all the Lustre configuration information is held on the storage targets (MGT, MDT), so you can rebuild the root disks without affecting the Lustre config on the MGT and MDT.

So, in summary: rebuild the root disks (perhaps using a provisioning system such as Kickstart for repeatability), restore the network config, restore the LNet config, restore the HA software if you use it, restore the identity management (e.g. LDAP, passwd, group), then mount the storage as before.
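The final "mount the storage as before" step can be sketched as a dry run. The Python snippet below only prints the mount commands; it does not execute them. The device paths, mount points and the helper name mount_commands are hypothetical placeholders, not values from this thread. The ordering (MGT before MDT) follows the usual Lustre startup order and only matters if the two are separate targets.

```python
import shlex

# Hypothetical targets: replace device paths and mount points with the
# ones your server actually used before the rebuild.
TARGETS = [
    {"role": "mdt", "device": "/dev/mapper/mdt0", "mountpoint": "/mnt/lustre/mdt0"},
    {"role": "mgt", "device": "/dev/mapper/mgt", "mountpoint": "/mnt/lustre/mgt"},
]

# Management target first, then metadata, then object storage.
ROLE_ORDER = {"mgt": 0, "mdt": 1, "ost": 2}


def mount_commands(targets):
    """Return the commands (as argument lists) to mount targets in order."""
    commands = []
    for target in sorted(targets, key=lambda t: ROLE_ORDER[t["role"]]):
        # Make sure the mount point exists, then mount the Lustre target.
        commands.append(["mkdir", "-p", target["mountpoint"]])
        commands.append(
            ["mount", "-t", "lustre", target["device"], target["mountpoint"]]
        )
    return commands


if __name__ == "__main__":
    # Dry run: print the commands for review instead of running them.
    for command in mount_commands(TARGETS):
        print(shlex.join(command))
```

If the targets are managed by Pacemaker, the mounts would normally be left to the restored HA configuration rather than done by hand.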

Malcolm Cowe
High Performance Data Division
Intel Corporation | www.intel.com

-----Original Message-----
From: lustre-discuss [mailto:lustre-discuss-bounces at lists.lustre.org] On Behalf Of Jon Tegner
Sent: Friday, March 11, 2016 4:48 PM
To: lustre-discuss at lists.lustre.org
Subject: [lustre-discuss] Rebuild server


Yesterday I had an incident where the system disk of one of my servers
(MDT/MGS) went down, but the RAID could be rebuilt and the system came
back up.

However, in the event of a complete failure of the system disk (assuming
all relevant "lustre disks" are still intact), is there a clear procedure
to follow in order to rebuild the file system once the OS has been
reinstalled on a new disk?

