[Lustre-discuss] Lustre clients getting evicted

Tom.Wang Tom.Wang at Sun.COM
Mon Feb 11 14:13:08 PST 2008


Hi, Aaron

FYI, the patch in 14360 is unlikely to help with your problem, since the
problem here seems to be that the OST load is too high or the OST is
stuck somewhere, so we need more information. We have actually seen
similar problems before when doing unlinks. If increasing obd_timeout
helps, that is good. But if it is not too much trouble, could you
provide the stack trace and console messages from the OST at that time?
That will help us figure out what happened there.

Thanks
WangDi
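
For anyone following the thread, here is a minimal Python sketch of the two
steps discussed above: raising the timeout, and asking the kernel on the OSS
to dump task stacks so that the console log contains the stack traces. It
assumes the timeout tunable is exposed at /proc/sys/lustre/timeout (check
this path against your Lustre version) and that it is run as root. The
function names are illustrative only and are not part of Lustre.

```python
from pathlib import Path

# Assumed location of the obd_timeout tunable; verify it on your own nodes.
LUSTRE_TIMEOUT = Path("/proc/sys/lustre/timeout")
# Standard Linux magic-SysRq trigger file.
SYSRQ_TRIGGER = Path("/proc/sysrq-trigger")


def read_timeout(path=LUSTRE_TIMEOUT):
    """Return the current timeout in seconds as an int."""
    return int(Path(path).read_text().strip())


def set_timeout(seconds, path=LUSTRE_TIMEOUT):
    """Write a new timeout in seconds and return the previous value."""
    if seconds <= 0:
        raise ValueError("timeout must be a positive number of seconds")
    previous = read_timeout(path)
    Path(path).write_text(f"{seconds}\n")
    return previous


def dump_task_stacks(path=SYSRQ_TRIGGER):
    """Send SysRq 't' so the kernel logs the stack of every task."""
    Path(path).write_text("t")
```

After calling dump_task_stacks() on the OSS while the problem is occurring,
the traces should appear in the kernel log (dmesg or the serial console),
which is the kind of output requested above.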

Aaron Knister wrote:
> So far it's helped. If this doesn't fix it, I'm going to apply the
> patch mentioned here - https://bugzilla.lustre.org/attachment.cgi?id=14006&action=edit
> I'll let you know how it goes. If you'd like a copy of the patched
> version, let me know. Are you running RHEL/SLES? What version of the OS
> and Lustre?
>
> -Aaron
>
> On Feb 11, 2008, at 4:17 PM, Brock Palen wrote:
>
>   
>>>> I've increased the timeout to 300 seconds and it has helped
>>>> marginally.
>>>>         
>>> Hi Aaron;
>>>
>>> We set the timeout to a large value (1000 seconds) on our 400-node
>>> cluster (mostly o2ib, some tcp clients).  Until we did this, we had
>>> loads of evictions.  In our case, it solved the problem.
>>>       
>> This feels excessive.  But at this point I guess I'll try it.
>>
>>     
>>> Cheers,
>>> Craig
>>> _______________________________________________
>>> Lustre-discuss mailing list
>>> Lustre-discuss at lists.lustre.org
>>> http://lists.lustre.org/mailman/listinfo/lustre-discuss
>>>
>>>
>>>       
>
> Aaron Knister
> Associate Systems Analyst
> Center for Ocean-Land-Atmosphere Studies
>
> (301) 595-7000
> aaron at iges.org
>



