[Lustre-discuss] how to replace a bad OST.
Andreas Dilger
adilger at sun.com
Tue Mar 18 16:08:15 PDT 2008
On Mar 17, 2008 11:29 -0600, Lundgren, Andrew wrote:
> I am trying to learn how to replace a defective OST with a new one.
> Assuming the old OST can not be salvaged.
>
> I have a test cluster that I am working on.
>
> I deactivated the volume on the MGS using:
>
> lctl conf_param content-OST0002-osc.osc.active=0
>
> I unlinked all of the bad files by finding the ones on the bad volume.
>
> I formatted a fresh OST using the index number of the bad device:
>
> mkfs.lustre --reformat --fsname content --ost --mgsnode=4.248.52.81@tcp0 --param="failover.mode=failout" --index=02 /dev/md6

You do not necessarily want to add the new OST in the same slot as the
old one. There are a few complications with doing that, in particular:
- the MDS will think that the new OST has objects up to the last one the
  old OST had, and when the new OST is first started it will recreate
  them. That will take a long time and waste a lot of space on the OST,
  maybe all of the inodes in the whole filesystem.
- if you missed removing some of the bad files by accident, they will
  think that the new OST is the same as the old one. Not fatal, but
  you would probably prefer to get an IO error back instead of just
  a zero-length file.
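
To double-check that no files still reference the old OST before you
reuse its index, something like the following may help. It is only a
sketch: the mount point /mnt/content is an assumption (it is not given
in the original message), the helper names are made up for illustration,
and the exact "lfs find" option for selecting an OST can differ between
Lustre versions, so check "lfs help find" on your client first. The
helpers only print the command so you can inspect it before running it.

```shell
# Build the UUID Lustre uses for an OST: <fsname>-OST<4-hex-digit index>_UUID.
# $1 = filesystem name, $2 = OST index (decimal or 0x-prefixed hex).
ost_uuid() {
    printf '%s-OST%04x_UUID\n' "$1" "$2"
}

# Print (do not run) the lfs command that lists files with objects on
# that OST. $3 = client mount point (hypothetical, adjust to your setup).
find_cmd() {
    printf 'lfs find --obd %s %s\n' "$(ost_uuid "$1" "$2")" "$3"
}
```

For the filesystem above, "find_cmd content 2 /mnt/content" prints the
command to run on a client. Any file it lists still has an object on the
old OST and would show up as zero-length once the new OST takes over
that index.
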
Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.