[Lustre-discuss] recovery_status inactive
Papp Tamas
tompos at martos.bme.hu
Thu Oct 2 00:51:45 PDT 2008
Wojciech Turek wrote:
> Hi,
>
> COMPLETE means that this particular OST was in recovery and that
> recovery has now finished.
> To force recovery, just unmount the OST and then mount it again. If the
> OST had any clients connected when it was unmounted, it will start the
> recovery process after it is mounted again so that all of those clients
> can reconnect. While the OST is in recovery it refuses all new
> connections from clients, which means that the file system this OST is
> part of will not be accessible until recovery finishes. Recovery
> finishes either when all previously connected clients have reconnected
> or when it times out after a certain amount of time. If one of the
> clients that was connected to the OST crashes, loses power, etc. before
> it gets a chance to reconnect, then recovery will have to time out. If
> you know that the OST cannot recover all previously connected clients
> because one of them isn't there any more, you can avoid waiting for the
> timeout and abort recovery manually:
> lctl --device <OST_device_number> abort_recovery
> You can find OST_device_number by running the 'lctl dl' command.
> You will see a line like this:
> 7 UP obdfilter ddn_data-OST0009 ddn_data-OST0009_UUID 1159
> Here 7 is the number of the OST device.
>
> All this is in the Lustre Operations Manual, so please read it.
>
Of course I've read the manual many times.
The problem is not with the COMPLETE recovery_status, but with INACTIVE.
I haven't found any information about it in the manual.
I wanted to force recovery without unmounting the OST, just to give it
a try.
Anyway, it seems to be working right now; I hope it's OK.
Thanks,
tamas
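
For reference, the device lookup described in the quoted reply could be scripted roughly as follows. This is only a sketch: it assumes that 'lctl dl' prints whitespace-separated fields in the order shown in the sample line (device number, state, type, name, UUID, reference count), and the helper names find_device_number and abort_recovery_command are made up for illustration. It only builds the abort_recovery command line and does not run anything.

```python
def find_device_number(lctl_dl_output, target_name):
    """Return the device number for target_name, or None if it is not listed.

    Assumes each line looks like the sample from the thread:
    "<number> <state> <type> <name> <uuid> <refcount>".
    """
    for line in lctl_dl_output.splitlines():
        fields = line.split()
        # Skip lines that do not match the assumed layout.
        if len(fields) < 4 or not fields[0].isdigit():
            continue
        if fields[3] == target_name:
            return int(fields[0])
    return None


def abort_recovery_command(device_number):
    """Build (but do not run) the lctl command quoted in the thread."""
    return ["lctl", "--device", str(device_number), "abort_recovery"]


if __name__ == "__main__":
    # Sample line copied from the quoted reply, standing in for real 'lctl dl' output.
    sample = "7 UP obdfilter ddn_data-OST0009 ddn_data-OST0009_UUID 1159"
    device = find_device_number(sample, "ddn_data-OST0009")
    if device is not None:
        print(" ".join(abort_recovery_command(device)))
```

On a real server the sample string would be replaced by the captured output of 'lctl dl', and the printed command should be reviewed before running it by hand, since aborting recovery evicts clients that have not yet reconnected.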