[Lustre-discuss] Cannot send after transport endpoint shutdown (-108)

Aaron Knister aaron at iges.org
Wed Mar 5 15:00:40 PST 2008


Are the clients SuSE, Red Hat, or another distro? I can't get OFED
1.2.5.4 to build on RHEL 5, but I'm working on that.

On Mar 5, 2008, at 2:03 PM, Frank Leers wrote:

> On Wed, 2008-03-05 at 13:37 -0500, Aaron Knister wrote:
>> Could you tell me what version of OFED was being used? Was it the
>> version that ships with the kernel?
>
> OFED version is 1.2.5.4
>
>>
>> -Aaron
>>
>> On Mar 5, 2008, at 11:33 AM, Frank Leers wrote:
>>
>>> On Wed, 2008-03-05 at 11:08 -0500, Aaron Knister wrote:
>>>> That's very strange. What interconnect is that site using?
>>>>
>>>
>>> Not really strange, but -
>>>
>>> SDR IB/OFED
>>>
>>> lustre 1.6.4.2
>>> 2.6.18.8 clients
>>> 2.6.9-55.0.9 servers
>>>
>>>> My versions are -
>>>>
>>>> Lustre  - 1.6.4.2
>>>> Kernel (servers) - 2.6.18-8.1.14.el5_lustre.1.6.4.2smp
>>>> Kernel (clients) - 2.6.18-53.1.13.el5
>>>>
>>>>
>>>>
>>>> On Mar 5, 2008, at 11:03 AM, Frank Leers wrote:
>>>>
>>>>> On Tue, 2008-03-04 at 22:04 +0100, Brian J. Murrell wrote:
>>>>>> On Tue, 2008-03-04 at 15:55 -0500, Aaron S. Knister wrote:
>>>>>>> I think I tried that before and it didn't help, but I will try it
>>>>>>> again. Thanks for the suggestion.
>>>>>>
>>>>>> Just so you guys know, 1000 seconds for the obd_timeout is very, very
>>>>>> large!  As you could probably guess, we have some very, very big Lustre
>>>>>> installations and to the best of my knowledge none of them are using
>>>>>> anywhere near that.  AFAIK (and perhaps a Sun engineer with closer
>>>>>> experience to some of these very large clusters might correct me) the
>>>>>> largest value that the largest clusters are using is in the
>>>>>> neighbourhood of 300s.  There has to be some other problem at play here
>>>>>> that you need 1000s.
>>>>>
>>>>> I can confirm that at a recent large installation with several thousand
>>>>> clients, the default of 100 is in effect.
>>>>>
>>>>>>
>>>>>> Can you both please report your lustre and kernel versions?  I know you
>>>>>> said "latest" Aaron, but some version numbers might be more solid to go
>>>>>> on.
>>>>>>
>>>>>> b.
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> Lustre-discuss mailing list
>>>>>> Lustre-discuss at lists.lustre.org
>>>>>> http://lists.lustre.org/mailman/listinfo/lustre-discuss
>>>>>
>>>>
>>>> Aaron Knister
>>>> Associate Systems Analyst
>>>> Center for Ocean-Land-Atmosphere Studies
>>>>
>>>> (301) 595-7000
>>>> aaron at iges.org
>>>>
>>>>
>>>>
>>>>
>>>
>>
>

Aaron Knister
Associate Systems Analyst
Center for Ocean-Land-Atmosphere Studies

(301) 595-7000
aaron at iges.org






