[Lustre-discuss] recovery_status inactive

Wojciech Turek wjt27 at cam.ac.uk
Tue Sep 30 07:58:12 PDT 2008


Hi,

COMPLETE means that this particular target (an MDT in your case) was in
recovery and that recovery has now finished.
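
As a rough illustration of reading these fields, the sketch below pulls
the status and the elapsed recovery time (in seconds) out of a
recovery_status file. The function name recovery_summary is made up for
this example, and it assumes the "key: value" layout and field names
shown in the output quoted further down in this thread; other Lustre
versions may print different fields.

```shell
#!/bin/sh
# Hypothetical helper (not part of Lustre): summarise a recovery_status file.
# Assumes "key: value" lines with the field names seen in this thread.
recovery_summary() {
    awk -F': *' '
        $1 == "status"         { status  = $2 }
        $1 == "recovery_start" { t_start = $2 }
        $1 == "recovery_end"   { t_end   = $2 }
        END {
            if (status == "") status = "UNKNOWN"
            # Print the duration only when both timestamps are present.
            if (t_start != "" && t_end != "")
                printf "%s %d\n", status, t_end - t_start
            else
                printf "%s\n", status
        }
    ' "$1"
}
```

For example, running
recovery_summary /proc/fs/lustre/mds/archive-MDT0000/recovery_status
against the output quoted below should print "COMPLETE 465", i.e. that
recovery took 465 seconds.
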
To force recovery, just unmount the OST and then mount it again. If the
unmounted OST had any clients connected, then after it is mounted again
it will start the recovery process so that all of those clients can
reconnect to it. While the OST is in recovery it refuses all new
connections from clients, which means that the file system this OST is
part of will not be accessible until recovery finishes.

Recovery finishes either when all previously connected clients have
reconnected or when it times out after a certain amount of time. If one
of the clients that was connected to the OST crashes, loses power, etc.
before it gets a chance to reconnect, then recovery will have to time
out. If you know that the OST cannot recover all previously connected
clients because one of them isn't there any more, you can avoid waiting
for the timeout and abort recovery manually:
lctl --device <OST_device_number> abort_recovery
You can find the OST_device_number by running the 'lctl dl' command.
You will see a line like this:
  7 UP obdfilter ddn_data-OST0009 ddn_data-OST0009_UUID 1159
The number 7 at the start of the line is the OST device number.
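
The lookup can also be scripted. The sketch below is only an
illustration: ost_devno is a hypothetical helper name, and it assumes
that 'lctl dl' output always has the device number in the first column
and the device name in the fourth, as in the sample line above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Lustre): print the device number for a
# given device name. Reads 'lctl dl'-style output on stdin and assumes
# column 1 is the device number and column 4 is the device name.
ost_devno() {
    awk -v name="$1" '$4 == name { print $1; exit }'
}

# Usage on the server, as root (commented out so that sourcing this
# file does not touch a live system):
#   devno=$(lctl dl | ost_devno ddn_data-OST0009)
#   [ -n "$devno" ] && lctl --device "$devno" abort_recovery
```

If the name is not found the helper prints nothing, so the -n check
keeps abort_recovery from being run with an empty device number.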

All of this is in the Lustre Operations Manual, so please read it.

Cheers

Wojciech  

Papp Tamás wrote:
> James Braid wrote:
>   
>> 2008/9/30 Papp Tamás <tompos at martos.bme.hu>:
>>   
>>     
>>> What does this mean?
>>>
>>> # cat /proc/fs/lustre/mds/storage-MDT0000/recovery_status
>>> status: INACTIVE
>>>     
>>>       
>> It's normal, it just means recovery is not running (because it's
>> finished or been aborted or whatever)
>>   
>>     
>
> This is another one. This one has never had any problems. Shouldn't it
> look like this?
>
> # cat /proc/fs/lustre/mds/archive-MDT0000/recovery_status
> status: COMPLETE
> recovery_start: 1221309004
> recovery_end: 1221309469
> recovered_clients: 1
> unrecovered_clients: 0
> last_transno: 895550056
> replayed_requests: 0
>
>
>
> How can I force the recovery just for test?
>
> tamas

-- 
Wojciech Turek

Assistant System Manager
High Performance Computing Service
University of Cambridge
Email: wjt27 at cam.ac.uk
Tel: (+)44 1223 763517 



