[Lustre-discuss] fstab mount fails often
Wang Yibin
wang.yibin at oracle.com
Tue Nov 16 03:56:59 PST 2010
Hi,
On 2010-11-16, at 7:25 PM, Arne Brutschy wrote:
> Hello,
>
>> From the log, we can see that either your MGS node was not ready for
>> connection yet, or there's a network error between the client and the MGS node.
>
> No errors on either the server or the client. What else can it be? Maybe the
> switch is bad; I can see RX errors on most of its interfaces.
The switch could be the culprit: the error message shows that the client failed to send its request to the MGS, and the network send status was -113 (-EHOSTUNREACH, no route to host).
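To make the status codes in the log easier to read, here is a minimal Python sketch that maps the two negative values seen there (-113 and -108) to their Linux errno names. The table is hardcoded with the Linux values because the log comes from a Linux kernel; the names `LINUX_ERRNO`, `STATUS_RE` and `decode_statuses` are only illustrative and not part of any Lustre tool.

```python
import re

# Linux errno values for the two codes that appear in the log.
# Hardcoded on purpose: errno numbers differ between operating systems,
# and the messages were produced by a Linux kernel.
LINUX_ERRNO = {
    108: ("ESHUTDOWN", "Cannot send after transport endpoint shutdown"),
    113: ("EHOSTUNREACH", "No route to host"),
}

# A negative status is a '-' preceded by whitespace or '(' and followed by
# whitespace, ')' or end of line. This skips the '-' inside NIDs such as
# "12345-10.1.1.1".
STATUS_RE = re.compile(r"(?<=[\s(])-(\d{1,3})(?=[\s)]|$)")


def decode_statuses(line):
    """Return (code, name, description) for each known status in a log line."""
    decoded = []
    for match in STATUS_RE.finditer(line):
        code = int(match.group(1))
        if code in LINUX_ERRNO:
            name, description = LINUX_ERRNO[code]
            decoded.append((code, name, description))
    return decoded
```

Feeding it the `LNetPut()` line gives EHOSTUNREACH, and the "Unable to mount (-108)" line gives ESHUTDOWN, which is the same text mount.lustre printed.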
I suggest you re-examine your network.
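One rough way to check for an intermittent network fault from a client is to open a TCP connection to the MGS repeatedly and count the failures. The sketch below assumes the MGS listens on the default LNet TCP acceptor port 988 (an assumption, check your own configuration); the function names are hypothetical. It only tests plain TCP reachability, not LNet or the MGS service itself.

```python
import socket
import time


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and "no route to host".
        return False


def count_failures(host, port, attempts=10, timeout=3.0, delay=1.0):
    """Try to connect `attempts` times and return how many attempts failed."""
    failures = 0
    for attempt in range(attempts):
        if not can_connect(host, port, timeout):
            failures += 1
        if delay and attempt < attempts - 1:
            time.sleep(delay)  # pause between probes
    return failures
```

For example, `count_failures("10.1.1.1", 988, attempts=20)` run on a compute node while the mount is failing would show whether some of the connection attempts are being dropped. A non-zero count on an otherwise idle network would point towards the switch or cabling.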
>
>> Were you rebooting the MGS at the moment?
>
> No. It's something that happens regularly.
>
>> Since you said there are no errors on the interface, you need to check
>> the lnet connection and also verify that the MGS/MDT are up and running.
>
> As far as I can tell, everything seems to be set up correctly. I have
> quite a simple setup (single network, single interface gbe).
>
> Thanks
> Arne
>
>> On 2010-11-15, at 11:32 PM, Arne Brutschy wrote:
>>
>>> Hi all,
>>>
>>> I am mounting lustre through an fstab entry. This fails quite often, and the
>>> nodes end up without the lustre mount. Even when I log in, it takes 2-3
>>> tries to get it to mount. This is what I get:
>>>
>>> mount /lustre
>>> mount.lustre: mount 10.1.1.1@tcp0:/lustre at /lustre failed: Cannot send after transport endpoint shutdown
>>>
>>> This is /var/log/messages:
>>>
>>> Nov 15 16:27:43 compute-1-10 kernel: LustreError: 2124:0:(lib-move.c:2441:LNetPut()) Error sending PUT to 12345-10.1.1.1@tcp: -113
>>> Nov 15 16:27:43 compute-1-10 kernel: LustreError: 2124:0:(events.c:66:request_out_callback()) @@@ type 4, status -113 req@d73d7c00 x1352468062535684/t0 o250->MGS@MGC10.1.1.1@tcp_0:26/25 lens 368/584 e 0 to 1 dl 1289834868 ref 2 fl Rpc:N/0/0 rc 0/0
>>> Nov 15 16:27:43 compute-1-10 kernel: LustreError: 29069:0:(client.c:858:ptlrpc_import_delay_req()) @@@ IMP_INVALID req@d73d7800 x1352468062535685/t0 o101->MGS@MGC10.1.1.1@tcp_0:26/25 lens 296/544 e 0 to 1 dl 0 ref 1 fl Rpc:/0/0 rc 0/0
>>> Nov 15 16:27:43 compute-1-10 kernel: LustreError: 15c-8: MGC10.1.1.1@tcp: The configuration from log 'lustre-client' failed (-108). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
>>> Nov 15 16:27:43 compute-1-10 kernel: LustreError: 29069:0:(llite_lib.c:1176:ll_fill_super()) Unable to process log: -108
>>> Nov 15 16:27:43 compute-1-10 kernel: LustreError: 29069:0:(obd_mount.c:2045:lustre_fill_super()) Unable to mount (-108)
>>>
>>> I have no errors on the interface, so I assume this is a timing problem.
>>> Can I improve this through some timeout setting?
>>>
>>> Cheers,
>>> Arne
>>>
>>> _______________________________________________
>>> Lustre-discuss mailing list
>>> Lustre-discuss at lists.lustre.org
>>> http://lists.lustre.org/mailman/listinfo/lustre-discuss
>>
>
> --
> Arne Brutschy
> Ph.D. Student Email arne.brutschy(AT)ulb.ac.be
> IRIDIA CP 194/6 Web iridia.ulb.ac.be/~abrutschy
> Universite' Libre de Bruxelles Tel +32 2 650 2273
> Avenue Franklin Roosevelt 50 Fax +32 2 650 2715
> 1050 Bruxelles, Belgium (Fax at IRIDIA secretary)
>