[Lustre-discuss] lustre error

Papp Tamás tompos at martos.bme.hu
Fri Feb 22 02:57:28 PST 2008


Dear All,

Yesterday evening our cluster stopped.
Two of our nodes tried to take the resource from each other; as far as 
I could tell, neither of them could see the other side.

I stopped heartbeat and the resources, started them again, and the 
cluster came back online and worked fine.

This morning I saw this in the logs:

Feb 22 03:25:07 node4 kernel: Lustre: 
7:0:(linux-debug.c:98:libcfs_run_upcall()) Invoked LNET upcall 
/usr/lib/lustre/lnet_upcall ROUTER_NOTIFY,192.168.0.139@tcp,down,1203647043
Feb 22 03:25:16 node4 kernel: Lustre: 
7:0:(linux-debug.c:98:libcfs_run_upcall()) Invoked LNET upcall 
/usr/lib/lustre/lnet_upcall ROUTER_NOTIFY,192.168.0.15@tcp,down,1203647045
Feb 22 03:25:17 node4 kernel: Lustre: 
7:0:(linux-debug.c:98:libcfs_run_upcall()) Invoked LNET upcall 
/usr/lib/lustre/lnet_upcall ROUTER_NOTIFY,192.168.0.17@tcp,down,1203647044
Feb 22 03:25:24 node4 kernel: Lustre: 
7:0:(linux-debug.c:98:libcfs_run_upcall()) Invoked LNET upcall 
/usr/lib/lustre/lnet_upcall ROUTER_NOTIFY,192.168.0.179@tcp,down,1203647064
Feb 22 03:25:24 node4 kernel: Lustre: 
7:0:(linux-debug.c:98:libcfs_run_upcall()) Skipped 2 previous similar 
messages
Feb 22 03:25:29 node4 kernel: Lustre: 
7:0:(linux-debug.c:98:libcfs_run_upcall()) Invoked LNET upcall 
/usr/lib/lustre/lnet_upcall ROUTER_NOTIFY,192.168.0.11@tcp,down,1203647123
Feb 22 03:25:33 node4 kernel: LustreError: 
4567:0:(acceptor.c:442:lnet_acceptor()) Error -11 reading connection 
request from 192.168.0.13
Feb 22 03:25:43 node4 kernel: LustreError: 
4567:0:(acceptor.c:442:lnet_acceptor()) Error -11 reading connection 
request from 192.168.0.17
Feb 22 03:25:59 node4 kernel: LustreError: 
4567:0:(acceptor.c:442:lnet_acceptor()) Error -11 reading connection 
request from 192.168.0.13
Feb 22 03:26:04 node4 kernel: LustreError: 
4567:0:(acceptor.c:442:lnet_acceptor()) Error -11 reading connection 
request from 192.168.0.179
Feb 22 03:26:04 node4 kernel: LustreError: 
4564:0:(socklnd_cb.c:2160:ksocknal_recv_hello()) Error -11 reading HELLO 
from 192.168.0.139
Feb 22 03:26:09 node4 kernel: LustreError: 
4567:0:(acceptor.c:442:lnet_acceptor()) Error -11 reading connection 
request from 192.168.0.120
Feb 22 03:26:13 node4 kernel: LustreError: 
4567:0:(acceptor.c:442:lnet_acceptor()) Error -11 reading connection 
request from 192.168.0.11
Feb 22 03:26:14 node4 kernel: Lustre: 
4816:0:(ldlm_lib.c:497:target_handle_reconnect()) hallmark-OST0004: 
2a02ce4a-c2cf-36f6-1cf1-82a5c4b22459 reconnecting
Feb 22 03:26:14 node4 kernel: Lustre: 
4671:0:(ldlm_lib.c:497:target_handle_reconnect()) hallmark-OST0004: 
3e64ed95-8693-9c34-a32e-b803bda9017c reconnecting
Feb 22 03:26:29 node4 kernel: Lustre: 
4750:0:(ldlm_lib.c:497:target_handle_reconnect()) hallmark-OST0004: 
c286201b-ac3e-07d6-a17b-985129e6b10d reconnecting
Feb 22 03:26:32 node4 kernel: Lustre: 
4675:0:(ldlm_lib.c:497:target_handle_reconnect()) hallmark-OST0004: 
3b5cafac-fa5a-1040-3749-3b9530401684 reconnecting
Feb 22 03:26:35 node4 kernel: Lustre: 
4665:0:(ldlm_lib.c:497:target_handle_reconnect()) hallmark-OST0004: 
9220135a-ffbc-0c99-6187-eb7c05c7e008 reconnecting
Feb 22 03:26:36 node4 kernel: Lustre: 
4785:0:(ldlm_lib.c:497:target_handle_reconnect()) hallmark-OST0004: 
e3dad450-aa12-6959-a62b-48dd320936ff reconnecting
Feb 22 03:26:43 node4 kernel: Lustre: 
4795:0:(ldlm_lib.c:497:target_handle_reconnect()) hallmark-OST0004: 
9762fef8-bb47-ca87-d2cd-7c439607c523 reconnecting
Feb 22 03:26:44 node4 kernel: Lustre: 
4814:0:(ldlm_lib.c:497:target_handle_reconnect()) hallmark-OST0004: 
9c0b2b34-745e-23f2-dd10-3a60add8b9b5 reconnecting
Feb 22 03:26:48 node4 kernel: Lustre: 
4821:0:(ldlm_lib.c:497:target_handle_reconnect()) hallmark-OST0004: 
6c2b9a02-028e-f8bb-3cd2-aa10433721ce reconnecting
Feb 22 03:26:50 node4 kernel: Lustre: 
4781:0:(ldlm_lib.c:497:target_handle_reconnect()) hallmark-OST0004: 
87c70d18-3f76-7ba3-7c88-1c171e6acb08 reconnecting
Feb 22 03:26:54 node4 kernel: Lustre: 
4738:0:(ldlm_lib.c:497:target_handle_reconnect()) hallmark-OST0004: 
5b1bc354-6528-1343-0f2b-6a449c0cfe3e reconnecting
Feb 22 03:26:58 node4 kernel: Lustre: 
4819:0:(ldlm_lib.c:497:target_handle_reconnect()) hallmark-OST0004: 
bf28f4b2-f9aa-5d83-a1a3-84964a8b525c reconnecting
Feb 22 03:27:11 node4 kernel: Lustre: 
4769:0:(ldlm_lib.c:497:target_handle_reconnect()) hallmark-OST0004: 
948dcff3-b2da-b501-c2fc-3b9fcf85115b reconnecting
Feb 22 03:27:16 node4 kernel: Lustre: 
4659:0:(ldlm_lib.c:497:target_handle_reconnect()) hallmark-OST0004: 
8100747d-014b-deba-dd95-23973440bc17 reconnecting
Feb 22 03:27:38 node4 kernel: Lustre: 
4655:0:(ldlm_lib.c:497:target_handle_reconnect()) hallmark-OST0004: 
f1ba7827-0ffe-69e3-3809-e602b55aab49 reconnecting
Feb 22 04:00:50 node4 kernel: Lustre: 
6:0:(linux-debug.c:98:libcfs_run_upcall()) Invoked LNET upcall 
/usr/lib/lustre/lnet_upcall ROUTER_NOTIFY,192.168.0.12@tcp,down,1203649199
Feb 22 04:00:50 node4 kernel: Lustre: 
6:0:(linux-debug.c:98:libcfs_run_upcall()) Skipped 1 previous similar 
message
Feb 22 04:00:51 node4 kernel: Lustre: 
6:0:(linux-debug.c:98:libcfs_run_upcall()) Invoked LNET upcall 
/usr/lib/lustre/lnet_upcall ROUTER_NOTIFY,192.168.0.120@tcp,down,1203649172
Feb 22 04:01:01 node4 kernel: LustreError: 
4562:0:(socklnd_cb.c:2160:ksocknal_recv_hello()) Error -11 reading HELLO 
from 192.168.0.179
Feb 22 04:01:04 node4 kernel: Lustre: 
7:0:(linux-debug.c:98:libcfs_run_upcall()) Invoked LNET upcall 
/usr/lib/lustre/lnet_upcall ROUTER_NOTIFY,192.168.0.22@tcp,down,1203649206
Feb 22 04:01:04 node4 kernel: Lustre: 
7:0:(linux-debug.c:98:libcfs_run_upcall()) Skipped 1 previous similar 
message
Feb 22 04:01:20 node4 kernel: LustreError: 
4563:0:(socklnd_cb.c:2160:ksocknal_recv_hello()) Error -11 reading HELLO 
from 192.168.0.11


On the other side:

Feb 22 03:25:46 node3 kernel: LustreError: 
16228:0:(ldlm_lib.c:576:target_handle_connect()) @@@ UUID 
'hallmark-OST0004_UUID' is not available  for connect (no target)
req@e8c5d800 x79341/t0 o8-><?>@<?>:-1 lens 304/0 ref 0 fl 
Interpret:/0/0 rc 0/0
Feb 22 03:25:46 node3 kernel: LustreError: 
16228:0:(ldlm_lib.c:576:target_handle_connect()) Skipped 14 previous 
similar messages
Feb 22 03:25:46 node3 kernel: LustreError: 
16228:0:(ldlm_lib.c:1363:target_send_reply_msg()) @@@ processing error 
(-19) req@e8c5d800 x79341/t0 o8-><?>@<?>:-1
lens 304/0 ref 0 fl Interpret:/0/0 rc -19/0
Feb 22 03:25:46 node3 kernel: LustreError: 
16228:0:(ldlm_lib.c:1363:target_send_reply_msg()) Skipped 30 previous 
similar messages
Feb 22 03:26:06 node3 kernel: LustreError: 
16609:0:(ldlm_lib.c:576:target_handle_connect()) @@@ UUID 
'hallmark-OST0004_UUID' is not available  for connect (no target)
req@f7fcec2c x300361/t0 o8-><?>@<?>:-1 lens 304/0 ref 0 fl 
Interpret:/0/0 rc 0/0
Feb 22 03:26:06 node3 kernel: LustreError: 
16609:0:(ldlm_lib.c:1363:target_send_reply_msg()) @@@ processing error 
(-19) req@f7fcec2c x300361/t0 o8-><?>@<?>:-1
lens 304/0 ref 0 fl Interpret:/0/0 rc -19/0
Feb 22 03:26:49 node3 kernel: LustreError: 
16606:0:(ldlm_lib.c:576:target_handle_connect()) @@@ UUID 
'hallmark-OST0004_UUID' is not available  for connect (no target)
req@c19f0400 x357320/t0 o8-><?>@<?>:-1 lens 304/0 ref 0 fl 
Interpret:/0/0 rc 0/0
Feb 22 03:26:49 node3 kernel: LustreError: 
16606:0:(ldlm_lib.c:576:target_handle_connect()) Skipped 9 previous 
similar messages
Feb 22 03:26:49 node3 kernel: LustreError: 
16606:0:(ldlm_lib.c:1363:target_send_reply_msg()) @@@ processing error 
(-19) req@c19f0400 x357320/t0 o8-><?>@<?>:-1
lens 304/0 ref 0 fl Interpret:/0/0 rc -19/0
Feb 22 03:26:49 node3 kernel: LustreError: 
16606:0:(ldlm_lib.c:1363:target_send_reply_msg()) Skipped 9 previous 
similar messages
Feb 22 04:01:30 node3 kernel: LustreError: 
16228:0:(ldlm_lib.c:576:target_handle_connect()) @@@ UUID 
'hallmark-OST0004_UUID' is not available  for connect (no target)
req@d607e200 x301042/t0 o8-><?>@<?>:-1 lens 304/0 ref 0 fl 
Interpret:/0/0 rc 0/0
Feb 22 04:01:30 node3 kernel: LustreError: 
16228:0:(ldlm_lib.c:576:target_handle_connect()) Skipped 2 previous 
similar messages
Feb 22 04:01:30 node3 kernel: LustreError: 
16228:0:(ldlm_lib.c:1363:target_send_reply_msg()) @@@ processing error 
(-19) req@d607e200 x301042/t0 o8-><?>@<?>:-1
lens 304/0 ref 0 fl Interpret:/0/0 rc -19/0
Feb 22 04:01:30 node3 kernel: LustreError: 
16228:0:(ldlm_lib.c:1363:target_send_reply_msg()) Skipped 2 previous 
similar messages
Feb 22 04:01:45 node3 kernel: LustreError: 
16610:0:(ldlm_lib.c:576:target_handle_connect()) @@@ UUID 
'hallmark-OST0004_UUID' is not available  for connect (no target)
req@c1b13a00 x127933/t0 o8-><?>@<?>:-1 lens 304/0 ref 0 fl 
Interpret:/0/0 rc 0/0
Feb 22 04:01:45 node3 kernel: LustreError: 
16610:0:(ldlm_lib.c:576:target_handle_connect()) Skipped 4 previous 
similar messages
Feb 22 04:01:45 node3 kernel: LustreError: 
16610:0:(ldlm_lib.c:1363:target_send_reply_msg()) @@@ processing error 
(-19) req@c1b13a00 x127933/t0 o8-><?>@<?>:-1
lens 304/0 ref 0 fl Interpret:/0/0 rc -19/0
Feb 22 04:01:45 node3 kernel: LustreError: 
16610:0:(ldlm_lib.c:1363:target_send_reply_msg()) Skipped 4 previous 
similar messages



And so on, a couple of times. After that:

Feb 22 11:16:20 node4 kernel: Lustre: hallmark-OST0004: haven't heard 
from client 11e65f33-019b-c3cc-17d9-2ccf559a86cd (at 192.168.0.173@tcp) 
in 227 seconds. I think it's dead, and I am evicting it.
Feb 22 11:19:12 node4 kernel: Lustre: 
7:0:(linux-debug.c:98:libcfs_run_upcall()) Invoked LNET upcall 
/usr/lib/lustre/lnet_upcall ROUTER_NOTIFY,192.168.0.183@tcp,down,1203675510
Feb 22 11:19:13 node4 kernel: Lustre: 
7:0:(linux-debug.c:98:libcfs_run_upcall()) Invoked LNET upcall 
/usr/lib/lustre/lnet_upcall ROUTER_NOTIFY,192.168.0.187@tcp,down,1203675501
Feb 22 11:19:21 node4 kernel: Lustre: 
7:0:(linux-debug.c:98:libcfs_run_upcall()) Invoked LNET upcall 
/usr/lib/lustre/lnet_upcall ROUTER_NOTIFY,192.168.0.150@tcp,down,1203675540
Feb 22 11:19:25 node4 kernel: Lustre: 
7:0:(linux-debug.c:98:libcfs_run_upcall()) Invoked LNET upcall 
/usr/lib/lustre/lnet_upcall ROUTER_NOTIFY,192.168.0.184@tcp,down,1203675509
Feb 22 11:19:31 node4 kernel: Lustre: 
7:0:(linux-debug.c:98:libcfs_run_upcall()) Invoked LNET upcall 
/usr/lib/lustre/lnet_upcall ROUTER_NOTIFY,192.168.0.130@tcp,down,1203675493
Feb 22 11:19:38 node4 kernel: LustreError: 
4567:0:(acceptor.c:442:lnet_acceptor()) Error -11 reading connection 
request from 192.168.0.139
Feb 22 11:19:41 node4 kernel: Lustre: 
7:0:(linux-debug.c:98:libcfs_run_upcall()) Invoked LNET upcall 
/usr/lib/lustre/lnet_upcall ROUTER_NOTIFY,192.168.0.106@tcp,down,1203675499
Feb 22 11:19:43 node4 kernel: LustreError: 
4567:0:(acceptor.c:442:lnet_acceptor()) Error -11 reading connection 
request from 192.168.0.16
Feb 22 11:19:48 node4 kernel: LustreError: 
4567:0:(acceptor.c:442:lnet_acceptor()) Error -11 reading connection 
request from 192.168.0.12
Feb 22 11:19:53 node4 kernel: LustreError: 
4567:0:(acceptor.c:442:lnet_acceptor()) Error -11 reading connection 
request from 192.168.0.68
Feb 22 11:19:58 node4 kernel: LustreError: 
4567:0:(acceptor.c:442:lnet_acceptor()) Error -11 reading connection 
request from 192.168.0.187
Feb 22 11:20:03 node4 kernel: LustreError: 
4567:0:(acceptor.c:442:lnet_acceptor()) Error -11 reading connection 
request from 192.168.0.183
Feb 22 11:20:10 node4 kernel: LustreError: 
4563:0:(socklnd_cb.c:2160:ksocknal_recv_hello()) Error -11 reading HELLO 
from 192.168.0.166
Feb 22 11:20:12 node4 kernel: LustreError: 
4562:0:(socklnd_cb.c:2160:ksocknal_recv_hello()) Error -11 reading HELLO 
from 192.168.0.17
Feb 22 11:20:12 node4 kernel: LustreError: 
4567:0:(acceptor.c:442:lnet_acceptor()) Error -11 reading connection 
request from 192.168.0.168
Feb 22 11:20:17 node4 kernel: LustreError: 
4567:0:(acceptor.c:442:lnet_acceptor()) Error -11 reading connection 
request from 192.168.0.139
Feb 22 11:20:22 node4 kernel: LustreError: 
4567:0:(acceptor.c:442:lnet_acceptor()) Error -11 reading connection 
request from 192.168.0.18
Feb 22 11:20:27 node4 kernel: LustreError: 
4567:0:(acceptor.c:442:lnet_acceptor()) Error -11 reading connection 
request from 192.168.0.68
Feb 22 11:20:32 node4 kernel: LustreError: 
4567:0:(acceptor.c:442:lnet_acceptor()) Error -11 reading connection 
request from 192.168.0.138
Feb 22 11:20:42 node4 kernel: LustreError: 
4567:0:(acceptor.c:442:lnet_acceptor()) Error -11 reading connection 
request from 192.168.0.112
Feb 22 11:20:42 node4 kernel: LustreError: 
4567:0:(acceptor.c:442:lnet_acceptor()) Skipped 1 previous similar message
Feb 22 11:20:47 node4 kernel: Lustre: 
4810:0:(ldlm_lib.c:497:target_handle_reconnect()) hallmark-OST0004: 
88e387a4-d83e-de76-51e7-6db0118d556e reconnecting
Feb 22 11:20:53 node4 kernel: Lustre: 
4749:0:(ldlm_lib.c:497:target_handle_reconnect()) hallmark-OST0004: 
152f0c05-d8cd-99d2-7d79-248cf7c45cf2 reconnecting
Feb 22 11:20:55 node4 kernel: Lustre: 
4789:0:(ldlm_lib.c:497:target_handle_reconnect()) hallmark-OST0004: 
f1ba7827-0ffe-69e3-3809-e602b55aab49 reconnecting
Feb 22 11:20:55 node4 kernel: Lustre: 
4789:0:(ldlm_lib.c:497:target_handle_reconnect()) Skipped 11 previous 
similar messages
Feb 22 11:20:56 node4 kernel: Lustre: 
4680:0:(ldlm_lib.c:497:target_handle_reconnect()) hallmark-OST0004: 
72c636f0-a9e8-646c-2052-94898f85d173 reconnecting
Feb 22 11:20:57 node4 kernel: Lustre: 
4758:0:(ldlm_lib.c:497:target_handle_reconnect()) hallmark-OST0004: 
87c70d18-3f76-7ba3-7c88-1c171e6acb08 reconnecting
Feb 22 11:20:59 node4 kernel: Lustre: 
4815:0:(ldlm_lib.c:497:target_handle_reconnect()) hallmark-OST0004: 
092b01c8-de04-7b82-e833-f00921db6dce reconnecting
Feb 22 11:21:01 node4 kernel: Lustre: 
4812:0:(ldlm_lib.c:497:target_handle_reconnect()) hallmark-OST0004: 
5b1bc354-6528-1343-0f2b-6a449c0cfe3e reconnecting
Feb 22 11:21:01 node4 kernel: Lustre: 
4812:0:(ldlm_lib.c:497:target_handle_reconnect()) Skipped 3 previous 
similar messages
Feb 22 11:21:05 node4 kernel: Lustre: 
4801:0:(ldlm_lib.c:497:target_handle_reconnect()) hallmark-OST0004: 
bf28f4b2-f9aa-5d83-a1a3-84964a8b525c reconnecting
Feb 22 11:21:10 node4 kernel: Lustre: 
4787:0:(ldlm_lib.c:709:target_handle_connect()) hallmark-OST0004: refuse 
reconnection from 8b167c6e-719d-f424-deaf-ff06f26cccc5@192.168.0.106@tcp 
to 0xe9c77000/3
Feb 22 11:21:10 node4 kernel: LustreError: 
4787:0:(ldlm_lib.c:1363:target_send_reply_msg()) @@@ processing error 
(-16) req@e8539a00 x78923/t0 
o8->8b167c6e-719d-f424-deaf-ff06f26cccc5@NET_0x20000c0a8006a_UUID:-1 
lens 304/200 ref 0 fl Interpret:/0/0 rc -16/0
Feb 22 11:21:20 node4 kernel: Lustre: 
4731:0:(ldlm_lib.c:497:target_handle_reconnect()) hallmark-OST0004: 
56ad5d46-5237-688d-38d0-88655ff809bc reconnecting
Feb 22 11:21:20 node4 kernel: Lustre: 
4731:0:(ldlm_lib.c:497:target_handle_reconnect()) Skipped 4 previous 
similar messages
Feb 22 11:21:53 node4 kernel: Lustre: hallmark-OST0004: haven't heard 
from client 4493c464-67d6-6825-7062-f932d392c1df (at 192.168.0.168@tcp) 
in 222 seconds. I think it's dead, and I am evicting it.
Feb 22 11:21:53 node4 kernel: Lustre: hallmark-OST0004: haven't heard 
from client 9762fef8-bb47-ca87-d2cd-7c439607c523 (at 192.168.0.158@tcp) 
in 212 seconds. I think it's dead, and I am evicting it.

Other side:

Feb 22 11:16:21 node3 kernel: Lustre: hallmark-OST0003: haven't heard 
from client 11e65f33-019b-c3cc-17d9-2ccf559a86cd (at 192.168.0.173@tcp) 
in 227 seconds. I think it's dead, and I am evicting it.
Feb 22 11:20:13 node3 kernel: LustreError: 
16617:0:(ldlm_lib.c:576:target_handle_connect()) @@@ UUID 
'hallmark-OST0004_UUID' is not available  for connect (no target)
req@f06ce600 x182177/t0 o8-><?>@<?>:-1 lens 304/0 ref 0 fl 
Interpret:/0/0 rc 0/0
Feb 22 11:20:13 node3 kernel: LustreError: 
16617:0:(ldlm_lib.c:1363:target_send_reply_msg()) @@@ processing error 
(-19) req@f06ce600 x182177/t0 o8-><?>@<?>:-1
lens 304/0 ref 0 fl Interpret:/0/0 rc -19/0
Feb 22 11:20:13 node3 kernel: LustreError: 
16607:0:(ldlm_lib.c:576:target_handle_connect()) @@@ UUID 
'hallmark-OST0004_UUID' is not available  for connect (no target)
req@e6b56000 x9091/t0 o8-><?>@<?>:-1 lens 304/0 ref 0 fl 
Interpret:/0/0 rc 0/0
Feb 22 11:20:13 node3 kernel: LustreError: 
16607:0:(ldlm_lib.c:1363:target_send_reply_msg()) @@@ processing error 
(-19) req@e6b56000 x9091/t0 o8-><?>@<?>:-1
lens 304/0 ref 0 fl Interpret:/0/0 rc -19/0
Feb 22 11:20:13 node3 kernel: LustreError: 
16604:0:(ldlm_lib.c:576:target_handle_connect()) @@@ UUID 
'hallmark-OST0004_UUID' is not available  for connect (no target)
req@ee1f2200 x40504/t0 o8-><?>@<?>:-1 lens 304/0 ref 0 fl 
Interpret:/0/0 rc 0/0
Feb 22 11:20:13 node3 kernel: LustreError: 
16604:0:(ldlm_lib.c:1363:target_send_reply_msg()) @@@ processing error 
(-19) req@ee1f2200 x40504/t0 o8-><?>@<?>:-1
lens 304/0 ref 0 fl Interpret:/0/0 rc -19/0
Feb 22 11:20:17 node3 kernel: LustreError: 
16597:0:(ldlm_lib.c:576:target_handle_connect()) @@@ UUID 
'hallmark-OST0004_UUID' is not available  for connect (no target)
req@ee1f2c00 x145702/t0 o8-><?>@<?>:-1 lens 304/0 ref 0 fl 
Interpret:/0/0 rc 0/0
Feb 22 11:20:17 node3 kernel: LustreError: 
16597:0:(ldlm_lib.c:1363:target_send_reply_msg()) @@@ processing error 
(-19) req@ee1f2c00 x145702/t0 o8-><?>@<?>:-1
lens 304/0 ref 0 fl Interpret:/0/0 rc -19/0
Feb 22 11:20:18 node3 kernel: LustreError: 
16605:0:(ldlm_lib.c:576:target_handle_connect()) @@@ UUID 
'hallmark-OST0004_UUID' is not available  for connect (no target)
req@d88bd800 x38335/t0 o8-><?>@<?>:-1 lens 304/0 ref 0 fl 
Interpret:/0/0 rc 0/0
Feb 22 11:20:18 node3 kernel: LustreError: 
16605:0:(ldlm_lib.c:576:target_handle_connect()) Skipped 1 previous 
similar message
Feb 22 11:20:18 node3 kernel: LustreError: 
16605:0:(ldlm_lib.c:1363:target_send_reply_msg()) @@@ processing error 
(-19) req@d88bd800 x38335/t0 o8-><?>@<?>:-1
lens 304/0 ref 0 fl Interpret:/0/0 rc -19/0
Feb 22 11:20:18 node3 kernel: LustreError: 
16605:0:(ldlm_lib.c:1363:target_send_reply_msg()) Skipped 1 previous 
similar message
Feb 22 11:20:20 node3 kernel: LustreError: 
16612:0:(ldlm_lib.c:576:target_handle_connect()) @@@ UUID 
'hallmark-OST0004_UUID' is not available  for connect (no target)
req@ee1f2200 x142768/t0 o8-><?>@<?>:-1 lens 304/0 ref 0 fl 
Interpret:/0/0 rc 0/0
Feb 22 11:20:20 node3 kernel: LustreError: 
16612:0:(ldlm_lib.c:576:target_handle_connect()) Skipped 4 previous 
similar messages
Feb 22 11:20:20 node3 kernel: LustreError: 
16612:0:(ldlm_lib.c:1363:target_send_reply_msg()) @@@ processing error 
(-19) req@ee1f2200 x142768/t0 o8-><?>@<?>:-1
lens 304/0 ref 0 fl Interpret:/0/0 rc -19/0
Feb 22 11:20:20 node3 kernel: LustreError: 
16612:0:(ldlm_lib.c:1363:target_send_reply_msg()) Skipped 4 previous 
similar messages
Feb 22 11:20:27 node3 kernel: LustreError: 
16611:0:(ldlm_lib.c:576:target_handle_connect()) @@@ UUID 
'hallmark-OST0004_UUID' is not available  for connect (no target)
req@c3c3f000 x7268/t0 o8-><?>@<?>:-1 lens 304/0 ref 0 fl 
Interpret:/0/0 rc 0/0
Feb 22 11:20:27 node3 kernel: LustreError: 
16611:0:(ldlm_lib.c:576:target_handle_connect()) Skipped 1 previous 
similar message
Feb 22 11:20:27 node3 kernel: LustreError: 
16611:0:(ldlm_lib.c:1363:target_send_reply_msg()) @@@ processing error 
(-19) req@c3c3f000 x7268/t0 o8-><?>@<?>:-1
lens 304/0 ref 0 fl Interpret:/0/0 rc -19/0
Feb 22 11:20:27 node3 kernel: LustreError: 
16611:0:(ldlm_lib.c:1363:target_send_reply_msg()) Skipped 1 previous 
similar message
Feb 22 11:20:32 node3 kernel: LustreError: 
16608:0:(ldlm_lib.c:576:target_handle_connect()) @@@ UUID 
'hallmark-OST0004_UUID' is not available  for connect (no target)
req@c3c3f000 x171/t0 o8-><?>@<?>:-1 lens 304/0 ref 0 fl 
Interpret:/0/0 rc 0/0
Feb 22 11:20:32 node3 kernel: LustreError: 
16608:0:(ldlm_lib.c:576:target_handle_connect()) Skipped 6 previous 
similar messages
Feb 22 11:20:32 node3 kernel: LustreError: 
16608:0:(ldlm_lib.c:1363:target_send_reply_msg()) @@@ processing error 
(-19) req@c3c3f000 x171/t0 o8-><?>@<?>:-1
lens 304/0 ref 0 fl Interpret:/0/0 rc -19/0
Feb 22 11:20:32 node3 kernel: LustreError: 
16608:0:(ldlm_lib.c:1363:target_send_reply_msg()) Skipped 6 previous 
similar messages
Feb 22 11:20:46 node3 kernel: LustreError: 
16604:0:(ldlm_lib.c:576:target_handle_connect()) @@@ UUID 
'hallmark-OST0004_UUID' is not available  for connect (no target)
req@ee1f2200 x18416/t0 o8-><?>@<?>:-1 lens 304/0 ref 0 fl 
Interpret:/0/0 rc 0/0
Feb 22 11:20:46 node3 kernel: LustreError: 
16604:0:(ldlm_lib.c:576:target_handle_connect()) Skipped 7 previous 
similar messages
Feb 22 11:20:46 node3 kernel: LustreError: 
16604:0:(ldlm_lib.c:1363:target_send_reply_msg()) @@@ processing error 
(-19) req@ee1f2200 x18416/t0 o8-><?>@<?>:-1
lens 304/0 ref 0 fl Interpret:/0/0 rc -19/0
Feb 22 11:20:46 node3 kernel: LustreError: 
16604:0:(ldlm_lib.c:1363:target_send_reply_msg()) Skipped 7 previous 
similar messages
Feb 22 11:21:08 node3 kernel: LustreError: 
16599:0:(ldlm_lib.c:576:target_handle_connect()) @@@ UUID 
'hallmark-OST0004_UUID' is not available  for connect (no target)
req@d2334800 x380465/t0 o8-><?>@<?>:-1 lens 240/0 ref 0 fl 
Interpret:/0/0 rc 0/0
Feb 22 11:21:08 node3 kernel: LustreError: 
16599:0:(ldlm_lib.c:576:target_handle_connect()) Skipped 3 previous 
similar messages
Feb 22 11:21:08 node3 kernel: LustreError: 
16599:0:(ldlm_lib.c:1363:target_send_reply_msg()) @@@ processing error 
(-19) req@d2334800 x380465/t0 o8-><?>@<?>:-1
lens 240/0 ref 0 fl Interpret:/0/0 rc -19/0
Feb 22 11:21:08 node3 kernel: LustreError: 
16599:0:(ldlm_lib.c:1363:target_send_reply_msg()) Skipped 3 previous 
similar messages
Feb 22 11:22:23 node3 kernel: LustreError: 
16227:0:(ldlm_lib.c:576:target_handle_connect()) @@@ UUID 
'hallmark-OST0004_UUID' is not available  for connect (no target)
req@f7c9bc2c x11418/t0 o8-><?>@<?>:-1 lens 304/0 ref 0 fl 
Interpret:/0/0 rc 0/0
Feb 22 11:22:23 node3 kernel: LustreError: 
16227:0:(ldlm_lib.c:576:target_handle_connect()) Skipped 1 previous 
similar message
Feb 22 11:22:23 node3 kernel: LustreError: 
16227:0:(ldlm_lib.c:1363:target_send_reply_msg()) @@@ processing error 
(-19) req@f7c9bc2c x11418/t0 o8-><?>@<?>:-1
lens 304/0 ref 0 fl Interpret:/0/0 rc -19/0
Feb 22 11:22:23 node3 kernel: LustreError: 
16227:0:(ldlm_lib.c:1363:target_send_reply_msg()) Skipped 1 previous 
similar message
Feb 22 11:23:49 node3 kernel: Lustre: hallmark-OST0003: haven't heard 
from client 9762fef8-bb47-ca87-d2cd-7c439607c523 (at 192.168.0.158@tcp) 
in 227 seconds. I think it's dead, and I am evicting it.



The cluster is online now, but what is going on? What is the 
ROUTER_NOTIFY message, and why were the LNET connections lost?

I just can't figure out what is going on.
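
To make the excerpts above easier to read, here is a minimal Python sketch that decodes two recurring pieces of the logs: the argument string passed to the LNET upcall and the negative error codes. The field layout of ROUTER_NOTIFY (NID, state, Unix timestamp) is an assumption inferred from the log lines themselves, not taken from LNET documentation, and the helper names are hypothetical. The errno names are the standard Linux values for 11, 16 and 19.

```python
import re
from datetime import datetime, timezone

# Standard Linux errno values for the codes that appear in the logs.
LINUX_ERRNO = {
    11: "EAGAIN (try again)",
    16: "EBUSY (device or resource busy)",
    19: "ENODEV (no such device)",
}

# Assumed layout: ROUTER_NOTIFY,<nid>,<up|down>,<unix timestamp>
UPCALL_RE = re.compile(
    r"ROUTER_NOTIFY,(?P<nid>[^,\s]+),(?P<state>up|down),(?P<when>\d+)"
)


def parse_router_notify(line):
    """Return (nid, state, UTC datetime) or None if the line has no upcall."""
    match = UPCALL_RE.search(line)
    if match is None:
        return None
    when = datetime.fromtimestamp(int(match.group("when")), tz=timezone.utc)
    return match.group("nid"), match.group("state"), when


def explain_errno(code):
    """Map a kernel-style negative return code to its Linux errno name."""
    return LINUX_ERRNO.get(abs(code), "unknown")


if __name__ == "__main__":
    sample = ("/usr/lib/lustre/lnet_upcall "
              "ROUTER_NOTIFY,192.168.0.139@tcp,down,1203647043")
    print(parse_router_notify(sample))
    for rc in (-11, -19, -16):
        print(rc, explain_errno(rc))
```

Read this way, the -11 in the acceptor and HELLO lines is EAGAIN, the -19 on node3 is ENODEV (matching the "no target" text), and the -16 on the refused reconnection is EBUSY.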

Thank you very much.

tamas


