[Lustre-discuss] "up" a router that is marked "down"

Michael Kluge Michael.Kluge at tu-dresden.de
Tue Jan 25 22:33:15 PST 2011


Hi Jeremy,

yup, it's marked "==== obsolete (DANGEROUS) ====", whatever, it did the
trick :)
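
For anyone finding this in the archive, here is a rough sketch of the sequence from this thread as a small shell helper: ping the router first (Mike's suggestion, quoted below), and only force the route "up" with set_route (Jeremy's suggestion) if it is still listed as down. The function names, the LCTL override variable and the assumed `lctl show_route` line format ("net ... gw <nid> up|down") are my own assumptions for illustration, not something taken from the Lustre documentation, so please check them against your version before relying on this.

```shell
#!/bin/sh
# Sketch only: helper names and the assumed show_route output format are
# illustrative assumptions, not part of Lustre itself.
# LCTL can be overridden (e.g. with a stub) to try the logic without lctl.
LCTL="${LCTL:-lctl}"

# route_is_down GATEWAY_NID
# Succeeds if show_route lists the gateway as "down". Assumes lines like:
#   net tcp1 hops 1 gw 10.0.0.1@o2ib down
route_is_down() {
    "$LCTL" show_route | awk -v gw="$1" '
        { for (i = 1; i < NF; i++)
              if ($i == "gw" && $(i + 1) == gw && $(i + 2) == "down")
                  found = 1 }
        END { exit !found }'
}

# revive_route GATEWAY_NID
# Ping the router first; force the route "up" only if it is still down.
revive_route() {
    gw="$1"
    "$LCTL" ping "$gw" ||
        echo "lctl ping $gw failed (possibly a protocol error)" >&2
    if route_is_down "$gw"; then
        # set_route is flagged obsolete/DANGEROUS in lctl, use with care.
        "$LCTL" set_route "$gw" up
    fi
}
```

On a client this would be called as `revive_route 10.0.0.1@o2ib`, where the NID is a made-up example. It does not replace the "dead_router_check_interval" lnet module parameter mentioned below, which is still the thing to set at the next maintenance window.
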


Thanks a lot, Michael



On Tuesday, 25.01.2011, at 18:55 -0500, Jeremy Filizetti wrote:
> Though I think it's marked as development or experimental in the Lustre
> documentation or source, "lctl set_route" has worked fine for me in the
> past with no issues.
>  
> lctl set_route <nid> up
>  
> is the syntax I believe.
>  
> Jeremy
> 
> 
> On Tue, Jan 25, 2011 at 9:52 AM, Michael Kluge
> <Michael.Kluge at tu-dresden.de> wrote:
>         Jason, Michael,
>         
>         thanks a lot for your replies. I pinged everyone from all directions,
>         but the router is still marked "down" on the client. I even removed
>         and re-added the router entry via
>         lctl --net tcp1 del_route xyz@o2ib and
>         lctl --net tcp1 add_route xyz@o2ib. No luck. So I think I'll wait
>         for the next maintenance window. Oh, and I forgot to mention that
>         the servers run 1.6.7.2, the router as well, and the clients 1.8.5.
>         Works well so far.
>         
>         
>         Thanks, Michael
>         
>         
>         On Tuesday, 25.01.2011, at 15:12 +0100, Temple Jason wrote:
>         
>         > I've found that even with the Protocol Error, it still works.
>         >
>         > -Jason
>         >
>         > -----Original Message-----
>         > From: lustre-discuss-bounces at lists.lustre.org [mailto:lustre-discuss-bounces at lists.lustre.org] On Behalf Of Michael Shuey
>         > Sent: Tuesday, 25 January 2011 14:45
>         > To: Michael Kluge
>         > Cc: Lustre Diskussionsliste
>         > Subject: Re: [Lustre-discuss] "up" a router that is marked "down"
>         >
>         > You'll want to add the "dead_router_check_interval" lnet module
>         > parameter as soon as you are able.  As near as I can tell, without
>         > that there's no automatic check to make sure the router is alive.
>         >
>         > I've had some success in getting machines to recognize that a router
>         > is alive again by doing an lctl ping of their side of a router (e.g.,
>         > on a tcp0 client, `lctl ping <routerIP>@tcp0`, then `lctl ping
>         > <routerIP>@o2ib0` from an o2ib0 client).  If you have a server/client
>         > version mismatch, where lctl ping returns a protocol error, you may be
>         > out of luck.
>         >
>         > --
>         > Mike Shuey
>         >
>         >
>         >
>         > On Tue, Jan 25, 2011 at 8:38 AM, Michael Kluge
>         > <Michael.Kluge at tu-dresden.de> wrote:
>         > > Hi list,
>         > >
>         > > if a Lustre router is down, comes back to life and the servers do not
>         > > actively test the routers periodically: is it possible to mark a Lustre
>         > > router as "up"? Or to tell the servers to ping the router?
>         > >
>         > > Or can I enable the "router pinger" in a live system without unloading
>         > > and loading the Lustre kernel modules?
>         > >
>         > >
>         > > Regards, Michael
>         > >
>         > > --
>         > >
>         > > Michael Kluge, M.Sc.
>         > >
>         > > Technische Universität Dresden
>         > > Center for Information Services and
>         > > High Performance Computing (ZIH)
>         > > D-01062 Dresden
>         > > Germany
>         > >
>         > > Contact:
>         > > Willersbau, Room A 208
>         > > Phone:  (+49) 351 463-34217
>         > > Fax:    (+49) 351 463-37773
>         > > e-mail: michael.kluge at tu-dresden.de
>         > > WWW:    http://www.tu-dresden.de/zih
>         > >
>         > > _______________________________________________
>         > > Lustre-discuss mailing list
>         > > Lustre-discuss at lists.lustre.org
>         > > http://lists.lustre.org/mailman/listinfo/lustre-discuss
>         > >
>         > >
> 

-- 

Michael Kluge, M.Sc.

Technische Universität Dresden
Center for Information Services and
High Performance Computing (ZIH)
D-01062 Dresden
Germany

Contact:
Willersbau, Room A 208
Phone:  (+49) 351 463-34217
Fax:    (+49) 351 463-37773
e-mail: michael.kluge at tu-dresden.de
WWW:    http://www.tu-dresden.de/zih