[Lustre-discuss] recovery_status inactive

Papp Tamas tompos at martos.bme.hu
Thu Oct 2 00:51:45 PDT 2008


Wojciech Turek wrote:
> Hi,
>
> COMPLETE means that this particular OST was in recovery and that 
> recovery has now finished.
> To force recovery, just unmount the OST and then mount it again. If the 
> OST had any clients connected when it was unmounted, it will start the 
> recovery process once it is mounted again so that all those clients can 
> reconnect to it. While the OST is in recovery, it refuses all new 
> connections from clients, which means that the file system this OST is 
> part of will not be accessible until recovery finishes. Recovery 
> finishes either when all previously connected clients have reconnected 
> or when it times out after a certain amount of time. If one of the 
> clients that was connected to the OST crashes, loses power, etc. before 
> it gets a chance to reconnect, then recovery will have to time out. If 
> you know that the OST cannot recover all previously connected clients 
> because one of them is no longer there, you can avoid waiting for the 
> recovery timeout and abort recovery manually:
> lctl --device <OST_device_number> abort_recovery
> You can find OST_device_number by running the 'lctl dl' command.
> You will see a line like this:
>   7 UP obdfilter ddn_data-OST0009 ddn_data-OST0009_UUID 1159
> Number 7 is the number of the OST device.
>
> All this is in the Lustre Operations Manual, so please read it.
>   
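
For reference, the device lookup described in the quoted text could be scripted roughly as below. This is only a sketch: ost_devno is a hypothetical helper name, and it assumes that 'lctl dl' output always has the same column layout as the sample line quoted above (device number, state, type, name, UUID, refcount).

```sh
#!/bin/sh
# Hypothetical helper: read 'lctl dl'-style output on stdin and print
# the device number of the obdfilter device whose name matches $1.
# Assumes the column layout of the sample line quoted above.
ost_devno() {
    awk -v name="$1" '$3 == "obdfilter" && $4 == name { print $1 }'
}

# Try it on the sample line from the quoted message; prints 7.
printf '  7 UP obdfilter ddn_data-OST0009 ddn_data-OST0009_UUID 1159\n' |
    ost_devno ddn_data-OST0009

# On the OSS itself this would be combined with the commands quoted
# above (not executed here):
#   devno=$(lctl dl | ost_devno ddn_data-OST0009)
#   lctl --device "$devno" abort_recovery
```

Matching on the device name rather than reading the number by eye avoids aborting recovery on the wrong device when several OSTs are listed.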

Of course I've read the manual many times.
The problem is not with the COMPLETE recovery_status, but with INACTIVE.
I haven't found any info about it in the manual.

I wanted to force recovery without unmounting the OST, just to give it 
a try.
Anyway, it seems to be working right now; I hope it's OK.

Thanks,

tamas



