[Lustre-discuss] lustre can not mounted problem

Changer Van changerv at gmail.com
Sun Jan 13 21:36:00 PST 2008


Yes, the subnet manager had crashed.
I rebooted the InfiniBand switch, and everything is fine now.

Regards,
Changer
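
For readers who find this thread with the same symptom: in InfiniBand, a port
in the Initialize state has a physical link but has not yet been configured by
a subnet manager, which is consistent with the sm_lid and port_lid of 0x0000 in
the vstat output quoted below. A minimal Python sketch of that check follows.
The function names parse_vstat_ports and diagnose are hypothetical, the parser
assumes the key=value layout shown in this thread, and PORT_ACTIVE is assumed
to be the state name vstat reports for a healthy port.

```python
# Sample taken from the vstat output quoted in this thread.
SAMPLE = """\
1 HCA found:
        hca_id=InfiniHost_III_Ex0
        num_phys_ports=2
                port=1
                port_state=PORT_INITIALIZE
                sm_lid=0x0000
                port_lid=0x0000
                port=2
                port_state=PORT_DOWN
                sm_lid=0x0000
                port_lid=0x0000
"""


def parse_vstat_ports(text):
    """Return one dict per 'port=N' stanza found in vstat-style output."""
    ports = []
    current = None
    for raw in text.splitlines():
        # Tolerate mail quoting ("> ") and indentation in pasted output.
        line = raw.lstrip("> \t").strip()
        if "=" not in line:
            continue
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip()
        if key == "port":
            current = {"port": int(value)}
            ports.append(current)
        elif current is not None:
            # Per-port attributes follow the 'port=N' line.
            current[key] = value
    return ports


def diagnose(port):
    """Map a parsed port stanza to a short human-readable diagnosis."""
    state = port.get("port_state", "")
    sm_lid = int(port.get("sm_lid", "0x0"), 16)
    if state == "PORT_ACTIVE":  # assumed name of the healthy state
        return "active"
    if state == "PORT_INITIALIZE" and sm_lid == 0:
        return "link up, waiting for subnet manager"
    if state == "PORT_DOWN":
        return "no physical link"
    return "unknown state: " + state


if __name__ == "__main__":
    for p in parse_vstat_ports(SAMPLE):
        print(p["port"], diagnose(p))
```

A port reported as waiting for the subnet manager points at the fabric (switch
or subnet manager host) rather than at the local card, which matches what
happened here.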


On 1/9/08, Aaron Knister <aaron at iges.org> wrote:
>
>  I don't know if the Voltaire IB stack is the same as OFED but I'm
> guessing it has a subnet manager. Check that. I've had similar issues when
> my subnet manager has crashed.
>
>  On Jan 9, 2008, at 3:08 AM, Changer Van wrote:
>
>  The network connection is down. I cannot ping the other nodes.
> I ran the vstat command and found that one port_state is
> 'PORT_INITIALIZE'.
> What does 'PORT_INITIALIZE' mean? Does it mean my IB card is broken?
>
> 1 HCA found:
>         hca_id=InfiniHost_III_Ex0
>         pci_location={BUS=0x20,DEV/FUNC=0x00}
>         vendor_id=0x02C9
>         vendor_part_id=0x6282
>         hw_ver=0xA0
>         fw_ver=5.1.400
>         PSID=MT_0140000001
>         num_phys_ports=2
>                 port=1
>                 port_state=PORT_INITIALIZE
>                 sm_lid=0x0000
>                 port_lid=0x0000
>                 port_lmc=0x00
>                 max_mtu=2048
>                 port=2
>                 port_state=PORT_DOWN
>                 sm_lid=0x0000
>                 port_lid=0x0000
>                 port_lmc=0x00
>                 max_mtu=2048
> --
> Regards,
> Changer
>
>
> On Jan 9, 2008 3:27 AM, Klaus Steden <klaus.steden at thomson.net> wrote:
>
> >
> > If you're using IPoIB, you can use standard TCP/IP diagnostic tools the
> > same way you would on an Ethernet link (ifconfig, ping, traceroute, telnet,
> > etc.)
> >
> > If you're using a copper-to-optical converter in your data path as well,
> > the Emcore MIAs have link lights on them which will tell you if a physical
> > link is present (check the documentation). I know with STP InfiniBand
> > connectors, there is some ambiguity about terminology with some vendors and
> > manufacturers, and the fibre arrangement doesn't provide a lot of wiggle
> > room.
> >
> > Klaus
> >
> > On 1/7/08 7:56 PM, "Changer Van" <changerv at gmail.com> did etch on stone
> > tablets:
> >
> >
> >
> > On Jan 8, 2008 1:35 AM, Isaac Huang <He.Huang at sun.com> wrote:
> >
> > On Mon, Jan 07, 2008 at 06:20:52PM +0800, Changer Van wrote:
> > >    ......
> > >    # dmesg
> > >
> > >    LustreError: 4273:0:(viblnd.c:1890:kibnal_startup())
> > >             Can't find an active port on InfiniHost_III_Ex0
> >
> > It meant that viblnd couldn't find a port whose link state was active
> > on the HCA InfiniHost_III_Ex0, i.e. no link on the device was usable.
> >
> > Were there any other error messages from viblnd before this one?
> >
> > There were no other error messages, only a related message
> > like 'ADDRCONF(NETDEV_UP):ipoib0: link is not ready'.
> >
> > Did you see this problem on just one node?
> >
> > Four nodes cannot mount the Lustre file system.
> > The other nodes can mount Lustre but log the following error
> > messages:
> >
> > # dmesg
> > divert: not allocating divert_blk for non-ethernet device ipoib0
> > ERROR   : IPOIB_UD : ipoib_ud_find_dev_by_dst:(ipoib_ud_arp.c):
> >      ip_route_output_key(127.0.0.1) failed
> > new: ipoib_allow_arp_joins: 1
> > ERROR   : IPOIB_UD : ipoib_ud_find_dev_by_dst:(ipoib_ud_arp.c):
> >      ip_route_output_key(11.0.0.4) failed
> > ERROR   : IPOIB_UD : ipoib_ud_find_dev_by_dst:(ipoib_ud_arp.c):
> >      ip_route_output_key(11.0.0.4) failed
> > ERROR   : IPOIB_UD : ipoib_ud_find_dev_by_dst:(ipoib_ud_arp.c):
> >      ip_route_output_key(11.0.0.4) failed
> >
> > How can I check the link on the device? Thanks in advance.
> >
> >
> >
> >
>
>
> _______________________________________________
> Lustre-discuss mailing list
> Lustre-discuss at clusterfs.com
> https://mail.clusterfs.com/mailman/listinfo/lustre-discuss
>
>
>   Aaron Knister
> Associate Systems Analyst
>  Center for Ocean-Land-Atmosphere Studies
>
>
> (301) 595-7000
> aaron at iges.org


