[Lustre-discuss] ksocknal_process_receive() Error -14 / Error -14 on read from ...

Gerdjan Busker busker at busker.org
Thu Mar 12 23:58:16 PDT 2009


Isaac Huang wrote:
> On Thu, Mar 12, 2009 at 03:29:40PM +0000, Gerd wrote:
>   
>> Hi,
>>
>> We have a 1.6.6 installation using InfiniBand attached DDN OST storage
>> and OSS'es connected to the network with 10GE adapters.  When running
>> iozone with ~40 1GE attached clients we see the following on the clients:
>>  ......
>> And this on the OSS:
>>
>> Mar 12 14:42:46 cs04r-sc-oss01-01 kernel: LustreError:
>> 5469:0:(socklnd_cb.c:1291:ksocknal_process_receive()) [ffff81001f6fc000]
>> Error -14 on read from 12345-172.23.98.133@tcp ip 172.23.98.133:1021
>> Mar 12 14:42:46 cs04r-sc-oss01-01 kernel: LustreError:
>> 5469:0:(socklnd_cb.c:1291:ksocknal_process_receive()) Skipped 5 previous
>> similar messages
>>     
>
> Interesting: the socket read function from the TCP stack returned
> EFAULT, which usually has to do with userland memory access and
> permissions.
>
> Have you seen this error on other servers? What was the kernel version of
> the OSS? When did it happen during the iozone test? Were you able to
> reproduce it? Do you have any network-related security module (e.g.
> an LSM) running on the server?
>   
This is on kernel 2.6.18-92.1.10.el5_lustre.1.6.6.  We see this error
shortly after starting iozone, on all 8 OSSes and (I believe) on most
clients.  I was thinking it might be some kernel limit being hit, but we
have done iozone runs where this doesn't happen.
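
For anyone reading the log later: the "-14" is a negated errno value, as
Isaac's EFAULT reading implies.  A minimal sketch for decoding such codes
with Python's standard errno module follows.  The helper name
decode_kernel_error is made up for illustration and is not part of Lustre
or any of its tools.

```python
import errno
import os


def decode_kernel_error(code: int) -> str:
    """Map a negated kernel return code (e.g. -14) to its symbolic errno name.

    Hypothetical helper for illustration only; not a Lustre utility.
    """
    return errno.errorcode.get(abs(code), "UNKNOWN")


if __name__ == "__main__":
    code = -14  # value taken from the "Error -14 on read from ..." log line
    # os.strerror() gives the platform's human-readable description.
    print(decode_kernel_error(code), "-", os.strerror(abs(code)))
```

This only names the error; it says nothing about why the socket read
returned it.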

Gerd.





