[Lustre-discuss] Lustre clients getting evicted

Brock Palen brockp at umich.edu
Fri Feb 8 10:09:10 PST 2008



Brock Palen
Center for Advanced Computing
brockp at umich.edu
(734)936-1985


On Feb 7, 2008, at 11:09 PM, Tom.Wang wrote:
>> MDT dmesg:
>>
>> LustreError: 9042:0:(ldlm_lib.c:1442:target_send_reply_msg()) @@@ processing error (-107)  req@000001002b52b000 x445020/t0 o400-><?>@<?>:-1 lens 128/0 ref 0 fl Interpret:/0/0  rc -107/0
>> LustreError: 0:0:(ldlm_lockd.c:210:waiting_locks_callback()) ### lock callback timer expired: evicting client 2faf3c9e-26fb-64b7-ca6c-7c5b09374e67@NET_0x200000aa4008d_UUID nid 10.164.0.141@tcp ns: mds-nobackup-MDT0000_UUID lock: 00000100476df240/0xbc269e05c512de3a lrc: 1/0,0 mode: CR/CR res: 11240142/324715850 bits 0x5 rrc: 2 type: IBT flags: 20 remote: 0x4e54bc800174cd08 expref: 372 pid 26925
>>
> The client was evicted because this lock could not be released on the
> client in time. Could you provide the stack trace of the client at that
> time?
>
> I assume increasing obd_timeout could fix your problem. Then maybe
> you should wait for 1.6.5 to be released; it includes a new feature,
> adaptive_timeout, which adjusts the timeout value according to network
> congestion and server load. That should help with your problem.

Waiting for the next version of Lustre might be the best thing. I
had upped the timeout a few days back, but the next day I had errors
on the MDS box. I have switched it back:

lctl conf_param nobackup-MDT0000.sys.timeout=300

I would love to give you that trace, but I don't know how to get it.
Is there a debug option to turn on in the clients?
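
In case it helps with matching evictions to client nodes, here is a minimal Python sketch that pulls the client UUID, NID, namespace, lock mode and resource out of an eviction line like the second one in the MDT dmesg above. The regular expression, the function name parse_eviction and the field names are assumptions based on that single log line (with the archive's " at " written back as "@"), not a documented Lustre log format.

```python
import re

# Pattern inferred from the one eviction line quoted in this thread; the
# field layout is an assumption, not a documented Lustre log format.
EVICT_RE = re.compile(
    r"evicting client (?P<uuid>[0-9a-f-]+)@(?P<net>\S+)\s+"
    r"nid (?P<nid>\S+)\s+ns: (?P<ns>\S+)\s+.*?"
    r"mode: (?P<mode>\S+)\s+res: (?P<res>\S+)"
)


def parse_eviction(line):
    """Return a dict of fields from an eviction message, or None if no match."""
    m = EVICT_RE.search(line)
    return m.groupdict() if m else None


# The eviction message from the MDT dmesg, joined back onto one line.
SAMPLE = (
    "LustreError: 0:0:(ldlm_lockd.c:210:waiting_locks_callback()) ### "
    "lock callback timer expired: evicting client "
    "2faf3c9e-26fb-64b7-ca6c-7c5b09374e67@NET_0x200000aa4008d_UUID "
    "nid 10.164.0.141@tcp ns: mds-nobackup-MDT0000_UUID "
    "lock: 00000100476df240/0xbc269e05c512de3a lrc: 1/0,0 mode: CR/CR "
    "res: 11240142/324715850 bits 0x5 rrc: 2 type: IBT flags: 20 "
    "remote: 0x4e54bc800174cd08 expref: 372 pid 26925"
)

if __name__ == "__main__":
    info = parse_eviction(SAMPLE)
    print(info["nid"], info["mode"], info["res"])
```

Feeding the MDS dmesg through this line by line would give the NID of each evicted client, which is the node to look at when collecting a trace.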
  


