[lustre-discuss] Rebuild server
Cowe, Malcolm J
malcolm.j.cowe at intel.com
Thu Mar 10 22:05:38 PST 2016
If one assumes that the rebuild will incorporate the same identity as the original host (same hostname, IP address, etc.), then it should just be a matter of restoring the OS, re-installing the Lustre packages, configuring LNet (e.g. /etc/modprobe.d/lustre.conf) and remounting. If you've got an HA setup (e.g. Pacemaker + Corosync), then you'll need to restore that as well. Or rather, keep a backup copy of the config so that you can restore it :). There is no need to perform any "rebuild" of Lustre itself; just repair/restore the OS.
Other than LNet, all the Lustre configuration information is held on the storage targets (MGT, MDT), so you can rebuild the root disks without affecting the Lustre config on the MGT and MDT.
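One way to confirm this after the OS has been reinstalled, as a rough sketch: tunefs.lustre with --dryrun prints the settings stored on a target without writing to it. The device paths and the show_target_config helper below are hypothetical placeholders, not anything from the original setup, and the script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Hypothetical device paths; replace with the real MGT and MDT block devices.
MGT_DEV="${MGT_DEV:-/dev/mapper/mgt}"
MDT_DEV="${MDT_DEV:-/dev/mapper/mdt0}"

# With DRY_RUN=1 (the default here) commands are printed, not executed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Illustrative helper: report the on-disk Lustre settings of each device given.
# tunefs.lustre --dryrun only reads and prints; it does not modify the target.
show_target_config() {
    for dev in "$@"; do
        run tunefs.lustre --dryrun "$dev"
    done
}

show_target_config "$MGT_DEV" "$MDT_DEV"
```

If the MGS and MDT share a single device, pass that one device only.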
So, in summary: rebuild the root disks (maybe use a provisioning system like Kickstart for repeatability), restore the network config, restore the LNet config, maybe restore the HA software, restore the identity management (e.g. LDAP, passwd, group), then mount the storage as before.
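The last two steps of that summary might look roughly like the sketch below on a non-HA server with separate MGT and MDT devices. The device paths, mount points and the bring_up_server helper are assumptions made for illustration. Mounting a target normally loads the Lustre modules and starts LNet by itself, so the explicit LNet commands are mainly there to check the NIDs before anything is mounted. In an HA setup the cluster manager would do the mounting instead.

```shell
#!/bin/sh
# Hypothetical devices and mount points; substitute your own.
MGT_DEV="${MGT_DEV:-/dev/mapper/mgt}"
MDT_DEV="${MDT_DEV:-/dev/mapper/mdt0}"
MGT_MNT="${MGT_MNT:-/mnt/mgt}"
MDT_MNT="${MDT_MNT:-/mnt/mdt0}"

# With DRY_RUN=1 (the default here) commands are printed, not executed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Illustrative helper: bring LNet up, then mount the targets.
bring_up_server() {
    # Loading lnet applies the "options lnet ..." line restored in
    # /etc/modprobe.d/lustre.conf.
    run modprobe lnet
    run lctl network up
    # The NIDs listed here should match what the server had before the rebuild.
    run lctl list_nids
    run mkdir -p "$MGT_MNT" "$MDT_MNT"
    # Mount the MGT first so the MGS is available when the MDT starts.
    run mount -t lustre "$MGT_DEV" "$MGT_MNT"
    run mount -t lustre "$MDT_DEV" "$MDT_MNT"
}

bring_up_server
```

Run it once as-is to review the printed commands, then with DRY_RUN=0 as root once the paths are correct.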
High Performance Data Division
Intel Corporation | www.intel.com
From: lustre-discuss [mailto:lustre-discuss-bounces at lists.lustre.org] On Behalf Of Jon Tegner
Sent: Friday, March 11, 2016 4:48 PM
To: lustre-discuss at lists.lustre.org
Subject: [lustre-discuss] Rebuild server
Yesterday I had an incident where the system disk of one of my servers
(MDT/MGS) went down, but the RAID could be rebuilt and the system went
back up.
However, in the event of a complete failure of the system disk (assuming
all relevant "lustre disks" are still intact) is there a clear procedure
to follow in order to rebuild the file system once the OS has been
reinstalled on a new disk?