[lustre-discuss] failed OST recover
serg at parallel.ru
Mon Nov 30 22:44:30 PST 2020
> Many years ago, when I was using Lustre 1.8.x, I suffered the
> same nightmare as you are having now. The following procedure saved me, but
> I am not sure whether it will work for you.
Thank you! I had found this recipe, but it does not work in newer Lustre
versions: ll_recover_lost_found_objs no longer exists. I
have 2.12.2 installed.
As I understand it, its function is now integrated into the lfsck procedure,
but it does not work as I expected.
Can anybody give me a clue how to force this procedure? Should I stop
all clients and run lfsck with the broken OST enabled? I do not want to
experiment while I have tens of users, and one week of Lustre
unavailability without significant results looks very bad for me.
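
For reference, the lfsck run in question would look roughly like the sketch below. It assumes a file system named "lustre" (a hypothetical name) and that the commands are issued on the MDS. The run helper and the DRY_RUN variable are illustrative and not part of Lustre; by default the commands are only printed. Whether the orphan option (-o) actually picks up objects from lost+found on a re-created OST is exactly the open question here.

```shell
#!/bin/sh
# Sketch only: prints the commands unless DRY_RUN=0 is set.
# FSNAME is a hypothetical file system name; adjust to your setup.
FSNAME="${FSNAME:-lustre}"
DRY_RUN="${DRY_RUN:-1}"

# Helper (not a Lustre tool): echo the command in dry-run mode, else run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Start a layout LFSCK from the MDT.
#   -A  start on all targets
#   -o  handle orphan OST objects
#   -r  reset and scan from the beginning
lfsck_start_layout() {
    run lctl lfsck_start -M "${FSNAME}-MDT0000" -t layout -A -o -r
}

# Show the state of the layout scan as reported by the MDT.
lfsck_status_layout() {
    run lctl get_param -n "mdd.${FSNAME}-MDT0000.lfsck_layout"
}

lfsck_start_layout
lfsck_status_layout
```

Setting DRY_RUN=0 runs the commands for real; the status output should show whether the scan is still running or has completed.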
> 1. umount all the clients, umount OST.
> 2. mount OST as ldiskfs:
> mount -t ldiskfs /dev/<OST_device> /mnt
> 3. Run the command:
> ll_recover_lost_found_objs -d <lost+found_dir>
> At that time it restored about 70% of the data.
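
The quoted steps can be sketched as a small script. This applies only to old releases that still ship ll_recover_lost_found_objs (it is not present in 2.12.2). The device path and mount point are placeholders, and the run helper and DRY_RUN variable are illustrative, not Lustre tools. By default the commands are only printed.

```shell
#!/bin/sh
# Sketch of the quoted recipe (Lustre 1.8.x era). Prints commands unless DRY_RUN=0.
# OST_DEV and MNT are placeholders; all clients and the OST must be unmounted first.
OST_DEV="${OST_DEV:-/dev/OST_device}"
MNT="${MNT:-/mnt}"
DRY_RUN="${DRY_RUN:-1}"

# Helper (not a Lustre tool): echo the command in dry-run mode, else run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Mount the OST as ldiskfs, move objects out of lost+found, unmount again.
# Each step runs only if the previous one succeeded.
recover_lost_found() {
    run mount -t ldiskfs "$OST_DEV" "$MNT" &&
    run ll_recover_lost_found_objs -d "$MNT/lost+found" &&
    run umount "$MNT"
}

recover_lost_found
```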
> If you want to remove the files that were lost on the OST, but
> "rm -f <filename>" does not work:
> 1. Record the full paths of the files which you want to remove.
> 2. Unmount all clients, the OST, and the MDT.
> 3. Mount MDT as ldiskfs:
> mount -t ldiskfs /dev/<MDT_device> /mnt
> 4. Go to /mnt/ROOT/. You will find the complete directory tree of
> your Lustre file system, but without the file contents. You can
> remove the files you want from here.
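
A minimal sketch of the removal step, assuming the recorded paths are stored one per line in a text file and that the clients had the file system mounted at /mnt/lustre (both are assumptions, as are the helper names mdt_path and remove_listed). It maps each client path to its location under ROOT on the ldiskfs-mounted MDT and, by default, only prints the rm commands. Editing the MDT directly is risky, so treat this as an illustration of the quoted steps rather than a tested procedure.

```shell
#!/bin/sh
# Sketch only: prints the rm commands unless DRY_RUN=0 is set.
# CLIENT_ROOT (client mount point) and MDT_MNT (ldiskfs mount of the MDT)
# are assumptions; adjust to your setup.
CLIENT_ROOT="${CLIENT_ROOT:-/mnt/lustre}"
MDT_MNT="${MDT_MNT:-/mnt}"
DRY_RUN="${DRY_RUN:-1}"

# Helper (not a Lustre tool): echo the command in dry-run mode, else run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Translate a client-side path into the matching path under ROOT on the MDT.
mdt_path() {
    rel=${1#"$CLIENT_ROOT"/}
    printf '%s/ROOT/%s\n' "$MDT_MNT" "$rel"
}

# Remove every file named in the list file (one client path per line).
# Paths outside CLIENT_ROOT and empty lines are skipped.
remove_listed() {
    while IFS= read -r path; do
        case "$path" in
            "$CLIENT_ROOT"/*) run rm -f -- "$(mdt_path "$path")" ;;
            *) continue ;;
        esac
    done < "$1"
}
```

Usage would be something like "remove_listed /root/files_to_remove.txt" after the MDT has been mounted as ldiskfs on $MDT_MNT.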
> On Mon, Nov 30, 2020 at 01:09:07PM +0300, Sergey Zhumatiy wrote:
>> Please help me resolve this. One OST in my Lustre installation has
>> failed. It lost all filesystem metadata, so I couldn't mount it as a Lustre
>> filesystem. I checked it with e2fsck, and all data was moved into the lost+found
>> folder. Then I moved this folder to other storage, re-created the OST
>> (with the old target index), and put the lost+found folder back.
>> After mounting this OST as Lustre, I started lfsck on the MDS. After several hours I
>> disabled this OST, because no client could work. Then Lustre became healthy,
>> and I started lfs_migrate from this OST.
>> But it seems that the data was not restored by lfsck: lfs_migrate moved only a
>> few files, and the rest fail with 'endpoint not connected'.
>> How can I restore some of the data and delete the unrecoverable data?
>> With respect
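
For the "delete unrecoverable data" part, one possible approach on a client is to list the files that have objects on the failed OST and then unlink the ones that cannot be migrated. This is a sketch under assumptions: the OST name lustre-OST0004_UUID and the mount point /mnt/lustre are hypothetical, and the run helper, DRY_RUN variable, and function names are illustrative. By default the commands are only printed.

```shell
#!/bin/sh
# Sketch only: prints the commands unless DRY_RUN=0 is set.
# OST_UUID and CLIENT_ROOT are hypothetical values; adjust to your setup.
OST_UUID="${OST_UUID:-lustre-OST0004_UUID}"
CLIENT_ROOT="${CLIENT_ROOT:-/mnt/lustre}"
DRY_RUN="${DRY_RUN:-1}"

# Helper (not a Lustre tool): echo the command in dry-run mode, else run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# List all files that have at least one object on the given OST.
list_ost_files() {
    run lfs find --ost "$OST_UUID" "$CLIENT_ROOT"
}

# Unlink every file named in the list file (one path per line).
# unlink is used instead of rm, which may fail on files with missing objects.
unlink_listed() {
    while IFS= read -r f; do
        [ -n "$f" ] || continue
        run unlink "$f"
    done < "$1"
}

list_ost_files
```

With DRY_RUN=0, the output of list_ost_files can be saved to a file, reviewed, and then passed to unlink_listed. Saving the list first also records which files need to be restored from backup.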