[Lustre-discuss] Lustre can not be mounted issue
Changer Van
changerv at gmail.com
Thu Jan 17 01:46:10 PST 2008
Hi all,
The Lustre FS crashed after the entire system was rebooted.
Some of the error messages follow:
On n01-ib0 (one of the clients)
-------------------------------
# /usr/sbin/lconf --node n01-ib0 /etc/lustre/config.xml
MDC: MDC_n01.local_mds_master2_MNT_n01-ib0_2
23503_MNT_n01-ib0_2_a7896cb070 mds_master2_UUID
MDC: MDC_n01.local_mds_master2_MNT_n01-ib0_2
23503_MNT_n01-ib0_2_a7896cb070
! /usr/sbin/lctl (255): IOC_PORTAL_DEL_UUID failed: Invalid argument
# dmesg
ERROR : IPOIB_UD : ipoib_ud_find_dev_by_dst:(ipoib_ud_arp.c)
:ip_route_output_key(127.0.0.1) failed
... ...
LustreError: 5208:0:(client.c:947:ptlrpc_expire_one_request())
@@@ timeout (sent at 1200501512, 5s ago) req@0000010117c34600
x1/t0 o38->mds_master2_UUID@s03-ib0_UUID:12 lens 240/272 ref 1 fl
Rpc:/0/0 rc 0/0
Lustre: Changing connection for MDC_n01.local_mds_master2_MNT_n01-ib0_2
to s04-ib0_UUID/11.0.0.249@vib
LustreError: 5208:0:(client.c:947:ptlrpc_expire_one_request())
@@@ timeout (sent at 1200501517, 5s ago) req@0000010117c34600
x3/t0 o38->mds_master2_UUID@s04-ib0_UUID:12 lens 240/272 ref 1 fl
Rpc:/0/0 rc 0/0
Lustre: Changing connection for MDC_n01.local_mds_master2_MNT_n01-ib0_2
to s03-ib0_UUID/11.0.0.250@vib
... ...
Lustre: Skipped 39 previous similar messages
ERROR : AD_TAVOR : vvi_mlx_poll_for_completion:(adaptor_tavor.c):VLT:
completion_status: 10 (MLX: 12, syndrom: 129), total err num: 5
(not print flush errors)
LustreError: 4941:0:(events.c:53:request_out_callback()) @@@ type 4, status
-113 req@000001011294f200 x2794/t0 o38->mds_master2_UUID@s03-ib0_UUID:12
lens 240/272 ref 2 fl Rpc:/0/0 rc 0/0
LustreError: 4941:0:(events.c:53:request_out_callback())
Skipped 14 previous similar messages
LustreError: 23731:0:(obd_config.c:333:class_cleanup()) OBD
MDC_n01.local_mds_master2_MNT_n01-ib0_2 is still busy with 5 references
You should stop active file system users, or use the --force option to
cleanup.
LustreError: 23731:0:(obd_config.c:234:class_detach()) OBD device 2 still
set up
LustreError: 23732:0:(lustre_peer.c:148:class_del_uuid()) delete
non-existent
uuid s03-ib0_UUID
On s03-ib0 (failover MDS with s04-ib0)
--------------------------------------
# traceroute 11.0.0.1
traceroute to 11.0.0.1 (11.0.0.1), 30 hops max, 46 byte packets
1 n01-ib0 (11.0.0.1) 0.149 ms 0.086 ms 0.088 ms
# dmesg
ERROR : IPOIB_UD : ipoib_ud_find_dev_by_dst:(ipoib_ud_arp.c):
ip_route_output_key(127.0.0.1) failed
new: ipoib_allow_arp_joins: 1
Linux Kernel Card Services
options: [pci] [cardbus] [pm]
ERROR : IPOIB_UD : ipoib_ud_find_dev_by_dst:(ipoib_ud_arp.c):
ip_route_output_key(11.0.0.4) failed
... ...
Lustre: Added LNI 11.0.0.250@vib [8/128]
Lustre: 4362:0:(lib-move.c:1644:lnet_parse_put()) Dropping PUT from
12345-11.0.0.3@vib portal 12 match 3734 offset 0 length 240: 2
Lustre: 4362:0:(lib-move.c:1644:lnet_parse_put()) Dropping PUT from
12345-11.0.0.15@vib portal 12 match 3736 offset 0 length 240: 2
Error messages like 'ip_route_output_key(*) failed' suggest that
the routing or IPoIB interface configuration is probably wrong.
However, both the IPoIB interface configuration and the node routing
table seem to be OK. Any help would be greatly appreciated.
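
For anyone who wants to check the same thing, here is a minimal sketch of how the addresses behind the 'ip_route_output_key(*) failed' messages can be pulled out of the dmesg output so that each one can be compared against the routing table. The function name failed_route_targets is made up for illustration, and the sed pattern assumes the message format shown in the logs above.

```shell
#!/bin/sh
# Hypothetical helper: print each distinct address that appears in an
# 'ip_route_output_key(<addr>) failed' line read from standard input.
failed_route_targets() {
    sed -n 's/.*ip_route_output_key(\([0-9.]*\)) failed.*/\1/p' | sort -u
}

# Example use on an affected node (assumes iproute2 is installed):
#   dmesg | failed_route_targets | while read -r addr; do
#       ip route get "$addr"
#   done
```

The output is only a starting point: it shows which destinations the IPoIB driver could not route, not why the lookup failed.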
--
Regards,
Changer