[Lustre-discuss] OSS errors

Johnlya johnlya at gmail.com
Mon Aug 4 00:06:24 PDT 2008


When the client runs low on system resources, the OSS logs the
following errors:

Lustre: lenovo-OST0002: haven't heard from client
24bdc118-cf78-9d56-190c-bb9a2836bd41 (at 192.168.1.251@tcp) in 227
seconds. I think it's dead, and I am evicting it.
Lustre: 6613:0:(ldlm_lib.c:525:target_handle_reconnect())
lenovo-OST0000: 24bdc118-cf78-9d56-190c-bb9a2836bd41 reconnecting
Lustre: 6613:0:(ldlm_lib.c:525:target_handle_reconnect()) Skipped 1
previous similar message
LustreError: 5881:0:(ldlm_resource.c:767:ldlm_resource_add())
lvbo_init failed for resource 2207359: rc -2
LustreError: 6969:0:(ldlm_lock.c:430:__ldlm_handle2lock())
ASSERTION(lock->l_resource != NULL) failed
LustreError: 6969:0:(tracefile.c:432:libcfs_assertion_failed()) LBUG
Lustre: 6969:0:(linux-debug.c:167:libcfs_debug_dumpstack()) showing
stack for process 6969
ldlm_cn_13    R  running task       0  6969      1          6970  6968
(L-TLB)
0000000000000000 ffffffffa031b4c9 0000010005fe1a00 0000000000000000
       00000100bffab240 ffffffffa01ee45e 0000010005eed598
0000000000000001
       0000010082899ea0 0000000000000000
Call Trace:<ffffffffa031b4c9>{:ptlrpc:ptlrpc_server_handle_request
+2457}
       <ffffffffa01ee45e>{:libcfs:lcw_update_time+30}
<ffffffff80133855>{__wake_up_common+67}
       <ffffffffa031dba5>{:ptlrpc:ptlrpc_main+3989}
<ffffffffa031c110>{:ptlrpc:ptlrpc_retry_rqbds+0}
       <ffffffffa031c110>{:ptlrpc:ptlrpc_retry_rqbds+0}
<ffffffffa031c110>{:ptlrpc:ptlrpc_retry_rqbds+0}
       <ffffffff80110de3>{child_rip+8}
<ffffffffa031cc10>{:ptlrpc:ptlrpc_main+0}
       <ffffffff80110ddb>{child_rip+0}
LustreError: dumping log to /tmp/lustre-log.1216640103.6969
Lustre: 6495:0:(ldlm_lib.c:525:target_handle_reconnect())
lenovo-OST0002: 440eafce-9f15-16a6-4764-7f54d92f9204 reconnecting
Lustre: 6495:0:(ldlm_lib.c:525:target_handle_reconnect()) Skipped 2
previous similar messages
Lustre: 6495:0:(ldlm_lib.c:760:target_handle_connect())
lenovo-OST0002: refuse reconnection from
440eafce-9f15-16a6-4764-7f54d92f9204@192.168.1.102@tcp to
0x0000010058fde000; still busy with 2 active RPCs
LustreError: 6495:0:(ldlm_lib.c:1536:target_send_reply_msg()) @@@
processing error (-16)  req@0000010137f4b400 x68793117/t0
o8->440eafce-9f15-16a6-4764-7f54d92f9204@NET_0x20000c0a80166_UUID:0/0
lens 304/200 e 0 to 0 dl 1216640303 ref 1 fl Interpret:/0/0 rc -16/0
Lustre: Request x103723701 sent from lenovo-OST0002 to NID
192.168.1.102@tcp 20s ago has timed out (limit 20s).
Lustre: Skipped 6 previous similar messages
LustreError: 138-a: lenovo-OST0002: A client on nid 192.168.1.102@tcp
was evicted due to a lock glimpse callback to 192.168.1.102@tcp timed
out: rc -110
Lustre: 0:0:(watchdog.c:130:lcw_cb()) Watchdog triggered for pid 6969:
it was inactive for 600s
Lustre: 0:0:(linux-debug.c:167:libcfs_debug_dumpstack()) showing stack
for process 6969
ldlm_cn_13    D 0000000000000001     0  6969      1          6970
6968 (L-TLB)
000001008289db38 0000000000000046 0000000000000000 ffffffffa0201728
       0000000000000700 000001008289dac8 00000000000001b0
00000000a01e4a58
       0000010083163030 00000000000002d9
Call Trace:<ffffffffa01e9014>{:libcfs:libcfs_debug_dumplog+292}
       <ffffffffa01e4bb6>{:libcfs:lbug_with_loc+182}
<ffffffffa01ebb44>{:libcfs:libcfs_assertion_failed+84}
       <ffffffffa02d44e8>{:ptlrpc:__ldlm_handle2lock+328}
       <ffffffffa03141f4>{:ptlrpc:lustre_msg_set_timeout+52}
       <ffffffffa03124c7>{:ptlrpc:lustre_msg_get_flags+87}
       <ffffffffa02f182d>{:ptlrpc:ldlm_request_cancel+525}
       <ffffffffa030fd79>{:ptlrpc:lustre_pack_reply+41}
<ffffffffa0315890>{:ptlrpc:lustre_swab_ldlm_request+0}
       <ffffffffa02f2e34>{:ptlrpc:ldlm_handle_cancel+532}
       <ffffffffa0312dcf>{:ptlrpc:lustre_msg_get_opc+95}
<ffffffffa030f1af>{:ptlrpc:lustre_msg_get_conn_cnt+95}
       <ffffffffa02f53ba>{:ptlrpc:ldlm_cancel_handler+730}
       <ffffffffa03192f1>{:ptlrpc:ptlrpc_check_req+17}
<ffffffffa0312baf>{:ptlrpc:lustre_msg_get_handle+79}
       <ffffffffa031b4c9>{:ptlrpc:ptlrpc_server_handle_request+2457}
       <ffffffffa01ee45e>{:libcfs:lcw_update_time+30}
<ffffffff80133855>{__wake_up_common+67}
       <ffffffffa031dba5>{:ptlrpc:ptlrpc_main+3989}
<ffffffffa031c110>{:ptlrpc:ptlrpc_retry_rqbds+0}
       <ffffffffa031c110>{:ptlrpc:ptlrpc_retry_rqbds+0}
<ffffffffa031c110>{:ptlrpc:ptlrpc_retry_rqbds+0}
       <ffffffff80110de3>{child_rip+8}
<ffffffffa031cc10>{:ptlrpc:ptlrpc_main+0}
       <ffffffff80110ddb>{child_rip+0}
LustreError: dumping log to /tmp/lustre-log.1216640703.6969
LustreError: 6701:0:(ldlm_resource.c:767:ldlm_resource_add())
lvbo_init failed for resource 3973448: rc -2
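
For anyone reading the log above, the "rc -N" values are negated Linux
errno codes. The following is a minimal sketch, not part of any Lustre
tool, that pulls them out of a pasted log and names them. The function
name decode_rc and the small lookup table are my own assumptions for
illustration; the table only covers the three codes that appear in this
log.

```python
import re

# Linux errno names for the return codes seen in the log above.
# Hardcoded (rather than taken from the errno module) so the result
# does not depend on the platform running this script.
LINUX_ERRNO = {2: "ENOENT", 16: "EBUSY", 110: "ETIMEDOUT"}

# Matches "rc -2", "rc -16/0", "rc -110", capturing the digits.
RC_PATTERN = re.compile(r"\brc -(\d+)")


def decode_rc(log_text):
    """Return sorted (code, name) pairs for each distinct 'rc -N' in log_text."""
    codes = {int(m.group(1)) for m in RC_PATTERN.finditer(log_text)}
    return [(-c, LINUX_ERRNO.get(c, "unknown")) for c in sorted(codes)]


# Three fragments copied from the log in this post.
sample = """\
lvbo_init failed for resource 2207359: rc -2
lens 304/200 e 0 to 0 dl 1216640303 ref 1 fl Interpret:/0/0 rc -16/0
out: rc -110
"""

for code, name in decode_rc(sample):
    print(code, name)
```

Read this way, the lvbo_init failures report "no such file or directory"
(-2), the refused reconnection reports "busy" (-16), and the glimpse
callback eviction reports a timeout (-110).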


