[Lustre-discuss] fstab mount fails often

Wang Yibin wang.yibin at oracle.com
Tue Nov 16 03:56:59 PST 2010


Hi,

On 2010-11-16, at 7:25 PM, Arne Brutschy wrote:

> Hello,
> 
>> From the log, we can see that either your MGS node was not ready for
>> connections yet, or there's a network error between the client and the MGS node.
> 
> No error on the server nor on the client. What else can it be? Maybe the
> switch is bad, I can see RX errors on most of its interfaces.

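The switch could be the culprit - the error messages show that the client failed to send its request to the MGS. The network send status was -113, i.e. -EHOSTUNREACH (no route to host), and the -108 (-ESHUTDOWN) errors that follow are a consequence of that failed connection.
I suggest you reexamine the network of your system, starting with the switch ports that report RX errors.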
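
If it helps when reading such logs, here is a minimal Python sketch (not part of Lustre) that pulls the negative status codes out of a log line and names them. The table holds only the two Linux errno values seen in this thread, and the names `LINUX_ERRNO` and `decode_statuses` are made up for illustration.

```python
import re

# Linux errno numbers for the two codes in this thread.
# Assumption: Linux numbering; other platforms use different values.
LINUX_ERRNO = {
    108: ("ESHUTDOWN", "Cannot send after transport endpoint shutdown"),
    113: ("EHOSTUNREACH", "No route to host"),
}

# A status code is a '-' followed by digits that is not glued to a
# preceding word character, dot, or dash (so the '-10' inside the NID
# '12345-10.1.1.1' is skipped).
STATUS_RE = re.compile(r"(?<![\w.-])-(\d+)\b")


def decode_statuses(line):
    """Return a list of (code, name, description) for each status in a line."""
    results = []
    for match in STATUS_RE.finditer(line):
        number = int(match.group(1))
        name, text = LINUX_ERRNO.get(number, ("UNKNOWN", "not in the table above"))
        results.append((-number, name, text))
    return results


if __name__ == "__main__":
    for sample in (
        "Error sending PUT to 12345-10.1.1.1@tcp: -113",
        "Unable to mount  (-108)",
    ):
        print(decode_statuses(sample))
```

Run against the messages quoted below, the LNetPut() line decodes to EHOSTUNREACH, and the later -108 lines decode to ESHUTDOWN, which is the same text that mount.lustre prints.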

> 
>> Were you rebooting the MGS at the moment?
> 
> No. It's something that happens regularly.
> 
>> Since you said there are no errors on the interface, you need to check
>> the lnet connection and also verify that the MGS/MDT are up and running.
> 
> As far as I can tell, everything seems to be set up correctly. I have
> quite a simple setup (single network, single interface gbe).
> 
> Thanks
> Arne
> 
>> On 2010-11-15, at 11:32 PM, Arne Brutschy wrote:
>> 
>>> Hi all,
>>> 
>>> I am mounting lustre through an fstab entry. This fails quite often, and the
>>> nodes end up without the lustre mount. Even when I log in, it takes 2-3
>>> tries to get it to mount. This is what I get:
>>> 
>>>       mount /lustre
>>>       mount.lustre: mount 10.1.1.1@tcp0:/lustre at /lustre failed: Cannot send after transport endpoint shutdown
>>> 
>>> This is /var/log/messages:
>>> 
>>>       Nov 15 16:27:43 compute-1-10 kernel: LustreError: 2124:0:(lib-move.c:2441:LNetPut()) Error sending PUT to 12345-10.1.1.1@tcp: -113
>>>       Nov 15 16:27:43 compute-1-10 kernel: LustreError: 2124:0:(events.c:66:request_out_callback()) @@@ type 4, status -113  req@d73d7c00 x1352468062535684/t0 o250->MGS@MGC10.1.1.1@tcp_0:26/25 lens 368/584 e 0 to 1 dl 1289834868 ref 2 fl Rpc:N/0/0 rc 0/0
>>>       Nov 15 16:27:43 compute-1-10 kernel: LustreError: 29069:0:(client.c:858:ptlrpc_import_delay_req()) @@@ IMP_INVALID  req@d73d7800 x1352468062535685/t0 o101->MGS@MGC10.1.1.1@tcp_0:26/25 lens 296/544 e 0 to 1 dl 0 ref 1 fl Rpc:/0/0 rc 0/0
>>>       Nov 15 16:27:43 compute-1-10 kernel: LustreError: 15c-8: MGC10.1.1.1@tcp: The configuration from log 'lustre-client' failed (-108). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
>>>       Nov 15 16:27:43 compute-1-10 kernel: LustreError: 29069:0:(llite_lib.c:1176:ll_fill_super()) Unable to process log: -108
>>>       Nov 15 16:27:43 compute-1-10 kernel: LustreError: 29069:0:(obd_mount.c:2045:lustre_fill_super()) Unable to mount  (-108)
>>> 
>>> I have no errors on the interface, so I assume this is a timing problem.
>>> Can I improve this through some timeout setting?
>>> 
>>> Cheers,
>>> Arne
>>> 
>>> _______________________________________________
>>> Lustre-discuss mailing list
>>> Lustre-discuss at lists.lustre.org
>>> http://lists.lustre.org/mailman/listinfo/lustre-discuss
>> 
> 
> -- 
> Arne Brutschy
> Ph.D. Student                    Email    arne.brutschy(AT)ulb.ac.be
> IRIDIA CP 194/6                  Web      iridia.ulb.ac.be/~abrutschy
> Universite' Libre de Bruxelles   Tel      +32 2 650 2273
> Avenue Franklin Roosevelt 50     Fax      +32 2 650 2715
> 1050 Bruxelles, Belgium          (Fax at IRIDIA secretary)