[Lustre-discuss] lustre + nfs + alphas

Aaron Knister aaron at iges.org
Wed Dec 12 08:27:25 PST 2007


Yes, it turns out it's bug 14379. I applied the provided patches and
everything works fine now. Thanks for the follow-up!

-Aaron

On Dec 12, 2007, at 11:23 AM, Oleg Drokin wrote:

> Hello!
>
> On Dec 11, 2007, at 6:51 PM, Aaron S. Knister wrote:
>
>> This is the strangest problem I have seen. I have a Lustre
>> filesystem mounted on a Linux server, and it's being exported to
>> various Alpha systems. The Alphas mount it just fine; however, under
>> heavy load the NFS server stops responding, as does the Lustre
>> mount on the export server. The weird thing is that if I mount the
>> NFS export on another NFS server and run the same benchmark
>> (bonnie), everything is fine. The Lustre mount on the export server
>> can take a real pounding (I've seen it push 300MB/sec), so I don't
>> know why NFS is crashing it.
>> On the NFS export server I see these messages:
>> Lustre: 4224:0:(o2iblnd_cb.c:412:kiblnd_handle_rx()) PUT_NACK from 192.168.64.70@o2ib
>> LustreError: 4400:0:(client.c:969:ptlrpc_expire_one_request()) @@@ timeout (sent at 1197415542, 100s ago)  req@ffff810827bfbc00 x38827/t0 o36->data-MDT0000_UUID@192.168.64.70@o2ib:12 lens 14256/672 ref 1 fl Rpc:/0/0 rc 0/-22
>> Lustre: data-MDT0000-mdc-ffff81082d702000: Connection to service data-MDT0000 via nid 192.168.64.70@o2ib was lost; in progress operations using this service will wait for recovery to complete.
>
> Any messages on the MDS at this time?
>
> Bye,
>    Oleg

Aaron Knister
Associate Systems Administrator/Web Designer
Center for Research on Environment and Water

(301) 595-7001
aaron at iges.org





