[Lustre-discuss] lustre + nfs + alphas
Aaron Knister
aaron at iges.org
Wed Dec 12 08:27:25 PST 2007
Yes, it turns out it's bug 14379. I applied the provided patches and
everything works fine now. Thanks for the follow-up!
-Aaron
On Dec 12, 2007, at 11:23 AM, Oleg Drokin wrote:
> Hello!
>
> On Dec 11, 2007, at 6:51 PM, Aaron S. Knister wrote:
>
>> This is the strangest problem I have seen. I have a Lustre
>> filesystem mounted on a Linux server, and it's being exported to
>> various Alpha systems. The Alphas mount it just fine; however, under
>> heavy load the NFS server stops responding, as does the Lustre
>> mount on the export server. The weird thing is that if I mount the
>> NFS export on another NFS server and run the same benchmark
>> (bonnie), everything is fine. The Lustre mount on the export server
>> can take a real pounding (I've seen it push 300MB/sec), so I don't
>> know why NFS is crashing it.
>> On the NFS export server I see these messages:
>> Lustre: 4224:0:(o2iblnd_cb.c:412:kiblnd_handle_rx()) PUT_NACK from
>> 192.168.64.70@o2ib
>> LustreError: 4400:0:(client.c:969:ptlrpc_expire_one_request()) @@@
>> timeout (sent at 1197415542, 100s ago) req@ffff810827bfbc00
>> x38827/t0 o36->data-MDT0000_UUID@192.168.64.70@o2ib:12 lens
>> 14256/672 ref 1 fl Rpc:/0/0 rc 0/-22
>> Lustre: data-MDT0000-mdc-ffff81082d702000: Connection to service
>> data-MDT0000 via nid 192.168.64.70@o2ib was lost; in progress
>> operations using this service will wait for recovery to complete.
>
> Any messages on the MDS at this time?
>
> Bye,
> Oleg
Aaron Knister
Associate Systems Administrator/Web Designer
Center for Research on Environment and Water
(301) 595-7001
aaron at iges.org