[Lustre-discuss] fstab mount fails often

Arne Brutschy arne.brutschy at ulb.ac.be
Tue Nov 16 03:25:50 PST 2010


Hello,

> From the log, we can see that either your MGS node was not ready for
> connections yet, or there's a network error between the client and the MGS node.

There are no errors on the server or on the client. What else could it be?
Maybe the switch is bad; I can see RX errors on most of its interfaces.

> Were you rebooting the MGS at the moment?

No. It's something that happens regularly.

> Since you said there are no errors on the interface, you need to check
> the LNet connection and also verify that the MGS/MDT are up and running.

As far as I can tell, everything seems to be set up correctly. I have
quite a simple setup (a single network and a single GbE interface).
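
For reference, the check-then-mount sequence suggested above could be
sketched roughly as follows. This is only a sketch under assumptions:
`retry` is a hypothetical helper (not part of Lustre), the attempt counts
and delays are arbitrary, and the NID is the one from the mount error
quoted below. `lctl ping` is the usual way to test LNet reachability of
a NID from a client.

```shell
#!/bin/sh
# Hypothetical helper: run a command up to $1 times, sleeping $2 seconds
# between failed attempts. Returns 0 on the first success, 1 otherwise.
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while [ "$n" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $n/$attempts failed: $*" >&2
        if [ "$n" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        n=$((n + 1))
    done
    return 1
}

# Example usage (assumed values, not executed here): wait until the MGS
# NID answers an LNet ping, then try the fstab mount a few times.
#   retry 5 10 lctl ping 10.1.1.1@tcp0 && retry 3 5 mount /lustre
```

If the ping itself only succeeds after several attempts, that would point
at the network or at LNet coming up late rather than at the mount step.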

Thanks
Arne

> On 2010-11-15, at 11:32 PM, Arne Brutschy wrote:
> 
> > Hi all,
> > 
> > I am mounting Lustre through an fstab entry. This fails quite often, and the
> > nodes end up without the Lustre mount. Even when I log in, it takes 2-3
> > tries to get it to mount. This is what I get:
> > 
> >        mount /lustre
> >        mount.lustre: mount 10.1.1.1@tcp0:/lustre at /lustre failed: Cannot send after transport endpoint shutdown
> > 
> > This is /var/log/messages:
> > 
> >        Nov 15 16:27:43 compute-1-10 kernel: LustreError: 2124:0:(lib-move.c:2441:LNetPut()) Error sending PUT to 12345-10.1.1.1@tcp: -113
> >        Nov 15 16:27:43 compute-1-10 kernel: LustreError: 2124:0:(events.c:66:request_out_callback()) @@@ type 4, status -113  req@d73d7c00 x1352468062535684/t0 o250->MGS@MGC10.1.1.1@tcp_0:26/25 lens 368/584 e 0 to 1 dl 1289834868 ref 2 fl Rpc:N/0/0 rc 0/0
> >        Nov 15 16:27:43 compute-1-10 kernel: LustreError: 29069:0:(client.c:858:ptlrpc_import_delay_req()) @@@ IMP_INVALID  req@d73d7800 x1352468062535685/t0 o101->MGS@MGC10.1.1.1@tcp_0:26/25 lens 296/544 e 0 to 1 dl 0 ref 1 fl Rpc:/0/0 rc 0/0
> >        Nov 15 16:27:43 compute-1-10 kernel: LustreError: 15c-8: MGC10.1.1.1@tcp: The configuration from log 'lustre-client' failed (-108). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
> >        Nov 15 16:27:43 compute-1-10 kernel: LustreError: 29069:0:(llite_lib.c:1176:ll_fill_super()) Unable to process log: -108
> >        Nov 15 16:27:43 compute-1-10 kernel: LustreError: 29069:0:(obd_mount.c:2045:lustre_fill_super()) Unable to mount  (-108)
> > 
> > I have no errors on the interface, so I assume this is a timing problem.
> > Can I improve this through some timeout setting?
> > 
> > Cheers,
> > Arne
> > 
> > _______________________________________________
> > Lustre-discuss mailing list
> > Lustre-discuss at lists.lustre.org
> > http://lists.lustre.org/mailman/listinfo/lustre-discuss
> 

-- 
Arne Brutschy
Ph.D. Student                    Email    arne.brutschy(AT)ulb.ac.be
IRIDIA CP 194/6                  Web      iridia.ulb.ac.be/~abrutschy
Universite' Libre de Bruxelles   Tel      +32 2 650 2273
Avenue Franklin Roosevelt 50     Fax      +32 2 650 2715
1050 Bruxelles, Belgium          (Fax at IRIDIA secretary)



