[Lustre-discuss] How To change server recovery timeout

Fri Nov 9 15:28:24 PST 2007

Wojciech Turek wrote:
> Hi,
>
> This is a lesson for me not to change old habits. I have always used "_",
> but for the latest file system I made an exception because I thought "-"
> looked neater, and here we go.
> Can I change the file system name without reformatting everything? The
> file system with the bad name is in production, and it is essential for me
> to fix it without a long service downtime.

Yes, but you will have to shut everything down.  Run tunefs.lustre
--writeconf on all the servers, then restart them, MGS first.  While
you're at it, you can set the timeout (this can be overridden later with
conf_param):

tunefs.lustre --writeconf --param="sys.timeout=50" /dev/sda
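
A minimal dry-run sketch of that step, assuming a hypothetical list of
target devices (/dev/sda and /dev/sdb are placeholders, and the helper name
writeconf_cmd is made up for illustration). It only prints the command for
each device, so nothing is written to disk:

```shell
#!/bin/sh
# Dry run: print (do not execute) the writeconf command for one target device.
# Arguments: $1 = block device of the target, $2 = timeout in seconds.
writeconf_cmd() {
    dev=$1
    timeout=$2
    printf 'tunefs.lustre --writeconf --param="sys.timeout=%s" %s\n' "$timeout" "$dev"
}

# Hypothetical device list; replace with the real MGS/MDT/OST devices.
for dev in /dev/sda /dev/sdb; do
    writeconf_cmd "$dev" 50
done
```

Review the printed commands, and run them by hand only once every client and
server is unmounted.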

>
> Thanks
>
> Wojciech Turek
>
> On 8 Nov 2007, at 19:04, Nathan Rutman wrote:
>
>> Nathan Rutman wrote:
>>> Wojciech Turek wrote:
>>>
>>>> On 7 Nov 2007, at 22:31, Nathan Rutman wrote:
>>>>
>>>>> Cliff White wrote:
>>>>>
>>>>>> Wojciech Turek wrote:
>>>>>>
>>>>>>> Hi Cliff,
>>>>>>>
>>>>>>> On 7 Nov 2007, at 17:58, Cliff White wrote:
>>>>>>>
>>>>>>>> Wojciech Turek wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>> Our lustre environment is:
>>>>>>>>> 2.6.9-55.0.9.EL_lustre.1.6.3smp
>>>>>>>>> I would like to change the recovery timeout from its default value
>>>>>>>>> of 250s to something longer.
>>>>>>>>> I tried the example from the manual:
>>>>>>>>> set_timeout <secs> Sets the timeout (obd_timeout) for a server
>>>>>>>>> to wait before failing recovery.
>>>>>>>>> We ran this experiment on our test Lustre installation, which
>>>>>>>>> has one OST.
>>>>>>>>> storage02 is our OSS:
>>>>>>>>> [root at storage02 ~]# lctl dl
>>>>>>>>>   0 UP mgc MGC10.143.245.3 at tcp 31259d9b-e655-cdc4-c760-45d3df426d86 5
>>>>>>>>>   1 UP ost OSS OSS_uuid 3
>>>>>>>>>   2 UP obdfilter home-md-OST0001 home-md-OST0001_UUID 7
>>>>>>>>> [root at storage02 ~]# lctl --device 2 set_timeout 600
>>>>>>>>> set_timeout has been deprecated. Use conf_param instead.
>>>>>>>>> e.g. conf_param lustre-MDT0000 obd_timeout=50
>>>>>>>>>
>>>>> Sorry about this bad help message.  It's wrong.
>>>>>
>>>>>>>>> usage: conf_param obd_timeout=<secs>
>>>>>>>>> run <command> after connecting to device <devno>
>>>>>>>>> --device <devno> <command [args ...]>
>>>>>>>>> [root at storage02 ~]# lctl --device 1 conf_param obd_timeout=600
>>>>>>>>> No device found for name MGS: Invalid argument
>>>>>>>>> error: conf_param: No such device
>>>>>>>>> It looks like I need to run this command from the MGS node, so I
>>>>>>>>> then moved to the MGS server, called storage03:
>>>>>>>>> [root at storage03 ~]# lctl dl
>>>>>>>>>   0 UP mgs MGS MGS 9
>>>>>>>>>   1 UP mgc MGC10.143.245.3 at tcp f51a910b-a08e-4be6-5ada-b602a5ca9ab3 5
>>>>>>>>>   2 UP mdt MDS MDS_uuid 3
>>>>>>>>>   3 UP lov home-md-mdtlov home-md-mdtlov_UUID 4
>>>>>>>>>   4 UP mds home-md-MDT0000 home-md-MDT0000_UUID 5
>>>>>>>>>   5 UP osc home-md-OST0001-osc home-md-mdtlov_UUID 5
>>>>>>>>> [root at storage03 ~]# lctl device 5
>>>>>>>>> [root at storage03 ~]# lctl conf_param obd_timeout=600
>>>>>>>>> error: conf_param: Function not implemented
>>>>>>>>> [root at storage03 ~]# lctl --device 5 conf_param obd_timeout=600
>>>>>>>>> error: conf_param: Function not implemented
>>>>>>>>> [root at storage03 ~]# lctl help conf_param
>>>>>>>>> conf_param: set a permanent config param. This command must be 
>>>>>>>>> run on the MGS node
>>>>>>>>> usage: conf_param <target.keyword=val> ...
>>>>>>>>> [root at storage03 ~]# lctl conf_param home-md-MDT0000.obd_timeout=600
>>>>>>>>> error: conf_param: Invalid argument
>>>>>>>>> [root at storage03 ~]#
>>>>>>>>> I searched the whole of /proc/*/lustre for a file that might store
>>>>>>>>> this timeout value, but found nothing.
>>>>>>>>> Could someone advise how to change the recovery timeout value?
>>>>>>>>> Cheers,
>>>>>>>>> Wojciech Turek
>>>>>>>>>
>>>>>>>> It looks like your file system is named 'home' - you can 
>>>>>>>> confirm with
>>>>>>>> tunefs.lustre --print <MDS device> | grep "Lustre FS"
>>>>>>>>
>>>>>>>> The correct command (Run on the MGS) would be
>>>>>>>> # lctl conf_param home.sys.timeout=<val>
>>>>>>>>
>>>>>>>> Example:
>>>>>>>> [root at ft4 ~]# tunefs.lustre --print /dev/sdb |grep "Lustre FS"
>>>>>>>> Lustre FS:  lustre
>>>>>>>> [root at ft4 ~]# cat /proc/sys/lustre/timeout
>>>>>>>> 130
>>>>>>>> [root at ft4 ~]# lctl conf_param lustre.sys.timeout=150
>>>>>>>> [root at ft4 ~]# cat /proc/sys/lustre/timeout
>>>>>>>> 150
>>>>>>>>
>>>>>>> Thanks for your email. I am afraid your tips aren't very helpful
>>>>>>> in this case. As stated in the subject, I am asking about the
>>>>>>> recovery timeout.
>>>>>>> You can find it, for example, in
>>>>>>> /proc/fs/lustre/obdfilter/<OST>/recovery_status while one of
>>>>>>> your OSTs is in the recovery state. By default this timeout is 250s.
>>>>>>> You, however, are talking about the system obd timeout (according
>>>>>>> to CFS documentation chapter 4.1.2), which is not what I am
>>>>>>> concerned with.
>>>>>>>
>>>>>>> Anyway, I tried your example just to see if it works, and again I
>>>>>>> am afraid it doesn't work for me; see below.
>>>>>>> I have a combined MGS and MDS configuration.
>>>>>>>
>>>>>>> [root at storage03 ~]# df
>>>>>>> Filesystem           1K-blocks      Used Available Use% Mounted on
>>>>>>> /dev/sda1             10317828   3452824   6340888  36% /
>>>>>>> /dev/sda6              7605856     49788   7169708   1% /local
>>>>>>> /dev/sda3              4127108     41000   3876460   2% /tmp
>>>>>>> /dev/sda2              4127108    753668   3163792  20% /var
>>>>>>> /dev/dm-2            1845747840 447502120 1398245720  25% /mnt/sdb
>>>>>>> /dev/dm-1            6140723200 4632947344 1507775856  76% /mnt/sdc
>>>>>>> /dev/dm-3            286696376   1461588 268850900   1% /mnt/home-md/mdt
>>>>>>> [root at storage03 ~]# tunefs.lustre --print /dev/dm-3 |grep "Lustre FS"
>>>>>>> Lustre FS:  home-md
>>>>>>> Lustre FS:  home-md
>>>>>>> [root at storage03 ~]# cat /proc/sys/lustre/timeout
>>>>>>> 100
>>>>>>> [root at storage03 ~]# lctl conf_param home-md.sys.timeout=150
>>>>>>> error: conf_param: Invalid argument
>>>>>>> [root at storage03 ~]#
>>>>>>>
>>>>> You need to do this on the MGS node, with the MGS running.
>>>>>
>>>>> mgs> lctl conf_param testfs.sys.timeout=150
>>>>> anynode> cat /proc/sys/lustre/timeout
>>>>>
>>>> This isn't working for me. In my production configuration I have 
>>>> MGS combined with MDT on the same server. My lustre configuration 
>>>> consists of two file systems.
>>>> [root at mds01 ~]# tunefs.lustre --print /dev/dm-0
>>>> checking for existing Lustre data: found CONFIGS/mountdata
>>>> Reading CONFIGS/mountdata
>>>>
>>>>    Read previous values:
>>>> Target:     ddn-home-MDT0000
>>>> Index:      0
>>>> Lustre FS:  ddn-home
>>>> Mount type: ldiskfs
>>>> Flags:      0x5
>>>>               (MDT MGS )
>>>> Persistent mount opts: errors=remount-ro,iopen_nopriv,user_xattr
>>>> Parameters: failover.node=10.143.245.202 at tcp mgsnode=10.143.245.202 at tcp
>>>>
>>>>
>>>>    Permanent disk data:
>>>> Target:     ddn-home-MDT0000
>>>> Index:      0
>>>> Lustre FS:  ddn-home
>>>> Mount type: ldiskfs
>>>> Flags:      0x5
>>>>               (MDT MGS )
>>>> Persistent mount opts: errors=remount-ro,iopen_nopriv,user_xattr
>>>> Parameters: failover.node=10.143.245.202 at tcp mgsnode=10.143.245.202 at tcp
>>>>
>>>> exiting before disk write.
>>>> [root at mds01 ~]# tunefs.lustre --print /dev/dm-1
>>>> checking for existing Lustre data: found CONFIGS/mountdata
>>>> Reading CONFIGS/mountdata
>>>>
>>>>    Read previous values:
>>>> Target:     ddn-data-MDT0000
>>>> Index:      0
>>>> Lustre FS:  ddn-data
>>>> Mount type: ldiskfs
>>>> Flags:      0x1
>>>>               (MDT )
>>>> Persistent mount opts: errors=remount-ro,iopen_nopriv,user_xattr
>>>> Parameters: mgsnode=10.143.245.201 at tcp failover.node=10.143.245.202 at tcp
>>>>
>>>>
>>>>    Permanent disk data:
>>>> Target:     ddn-data-MDT0000
>>>> Index:      0
>>>> Lustre FS:  ddn-data
>>>> Mount type: ldiskfs
>>>> Flags:      0x1
>>>>               (MDT )
>>>> Persistent mount opts: errors=remount-ro,iopen_nopriv,user_xattr
>>>> Parameters: mgsnode=10.143.245.201 at tcp failover.node=10.143.245.202 at tcp
>>>>
>>>> exiting before disk write.
>>>> [root at mds01 ~]# 
>>>> As you can see above, the MGS is on /dev/dm-0, combined with the MDT
>>>> for the ddn-home file system.
>>>> If I try the command line from your example, I get this:
>>>> [root at mds01 ~]# lctl conf_param ddn-home.sys.timeout=200
>>>> error: conf_param: Invalid argument
>>>>
>>>> Server mds01 is definitely the MGS node. What is wrong here, then? I
>>>> can think of only two possible reasons for the problem. One is that
>>>> the file system name contains a "-" character; however, I didn't find
>>>> anything in the documentation saying that this character is not
>>>> allowed. The other is that the MGS is combined with the MDS.
>>>>
>>>> The syslog contains the following messages:
>>>>
>>>> Nov  7 18:38:35 mds01 kernel: LustreError: 3273:0:(mgs_llog.c:1957:mgs_setparam()) No filesystem targets for ddn.  cfg_device from lctl is 'ddn-home'
>>>> Nov  7 18:38:35 mds01 kernel: LustreError: 3273:0:(mgs_handler.c:605:mgs_iocontrol()) setparam err -22
>>>> Nov  7 18:39:46 mds01 kernel: LustreError: 3274:0:(mgs_llog.c:1957:mgs_setparam()) No filesystem targets for ddn.  cfg_device from lctl is 'ddn-data'
>>>> Nov  7 18:39:46 mds01 kernel: LustreError: 3274:0:(mgs_handler.c:605:mgs_iocontrol()) setparam err -22
>>>> Nov  7 18:39:54 mds01 kernel: LustreError: 3275:0:(mgs_llog.c:1957:mgs_setparam()) No filesystem targets for ddn.  cfg_device from lctl is 'ddn-data'
>>>> Nov  7 18:39:54 mds01 kernel: LustreError: 3275:0:(mgs_handler.c:605:mgs_iocontrol()) setparam err -22
>>>> Nov  7 18:40:01 mds01 kernel: LustreError: 3282:0:(mgs_llog.c:1957:mgs_setparam()) No filesystem targets for ddn.  cfg_device from lctl is 'ddn-data'
>>>> Nov  7 18:40:01 mds01 kernel: LustreError: 3282:0:(mgs_handler.c:605:mgs_iocontrol()) setparam err -22
>>>> Nov  7 18:41:06 mds01 kernel: LustreError: 3305:0:(mgs_llog.c:1957:mgs_setparam()) No filesystem targets for ddn.  cfg_device from lctl is 'ddn-data'
>>>> Nov  7 18:41:06 mds01 kernel: LustreError: 3305:0:(mgs_handler.c:605:mgs_iocontrol()) setparam err -22
>>>> Nov  7 18:41:15 mds01 kernel: LustreError: 3306:0:(mgs_llog.c:1957:mgs_setparam()) No filesystem targets for ddn.  cfg_device from lctl is 'ddn-home'
>>>> Nov  7 18:41:15 mds01 kernel: LustreError: 3306:0:(mgs_handler.c:605:mgs_iocontrol()) setparam err -22
>>>>
>>>> From the above, it looks like only the first part of the file system
>>>> name, "ddn", is recognized, and "-home" or "-data" is dropped.
>>>>
>>>> Please advise.
>>>>
>>>> Wojciech Turek
>>>>
>>>
>>> You seem to have found a bug.  I just tried this myself and it 
>>> doesn't work with a "-" in the name.  Maybe use a '.' instead until 
>>> we fix it.
>>>
>> Argh, sorry, that doesn't work with conf_param either.  But an 
>> underscore '_' does.  I'm filing a bug report...
>>
>
> Mr Wojciech Turek
> Assistant System Manager
> University of Cambridge
> High Performance Computing service 
> email: wjt27 at cam.ac.uk <mailto:wjt27 at cam.ac.uk>
> tel. +441223763517
>
>
>
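
The conf_param failure discussed in this thread can be checked for before
a file system is named. This is a small sketch that assumes only what the
thread reports: names containing "-" or "." fail with conf_param, and "_"
works. The function names check_fsname and prefix_before_dash are made up
for illustration. The prefix helper only mimics what the syslog messages
suggest (the text before the first "-" is used); it is not taken from the
Lustre source.

```shell
#!/bin/sh
# Print "ok" if the name avoids '-' and '.', the two characters the thread
# reports as failing with conf_param; print "bad" otherwise.
check_fsname() {
    case $1 in
        ''|*[-.]*) echo bad; return 1 ;;
        *) echo ok ;;
    esac
}

# Show the part of the name before the first '-', which is what the
# "No filesystem targets for ddn" log lines appear to show being used.
prefix_before_dash() {
    printf '%s\n' "${1%%-*}"
}

for name in ddn-home ddn_home; do
    printf '%s: %s\n' "$name" "$(check_fsname "$name")"
done
```

For ddn-home the prefix helper yields "ddn", matching the name shown in the
LustreError lines.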