[Lustre-discuss] Unable to handle kernel NULL pointer dereference at 0000000000000000 RIP

Huang Qiulan huangql at ihep.ac.cn
Sat Sep 12 23:00:47 PDT 2009


Dear list,


   Over the past few days we have been seeing an unusual OSS crash. We
restarted the OSS and let recovery run with the default settings, but the OSS
crashed again a short time later, while it was still in recovery. We then
rebooted it again and aborted recovery with the command:

lctl --device N abort_recovery
   However, the OSS crashed again with the following log. We have no idea what
is causing it. Any ideas or help would be greatly appreciated.


Sep 13 00:44:52 boss15 kernel: LustreError: 23119:0:(ldlm_lib.c:1619:target_send_reply_msg()) @@@ processing error (-19)  req at 000001045c2b5600 x1464287/t0 o8-><?>@<?>:0/0 lens 240/0 e 0 to 0 dl 1252774392 ref 1 fl Interpret:/0/0 rc -19/0
Sep 13 00:44:52 boss15 kernel: LustreError: 23119:0:(ldlm_lib.c:1619:target_send_reply_msg()) Skipped 188 previous similar messages
Sep 13 00:44:59 boss15 kernel: LustreError: 23123:0:(ldlm_lib.c:819:target_handle_connect()) besfs-OST0034: denying connection for new client 202.122.33.82 at tcp (8e8a925f-f4cc-58c1-851a-b22bc2d63f3c): 141 clients in recovery for 1199s
Sep 13 00:44:59 boss15 kernel: LustreError: 23123:0:(ldlm_lib.c:819:target_handle_connect()) Skipped 2 previous similar messages
Sep 13 00:46:10 boss15 kernel: LustreError: 24613:0:(filter.c:3630:filter_iocontrol()) aborting recovery for device besfs-OST0034
Sep 13 00:46:10 boss15 kernel: Lustre: besfs-OST0034: recovery period over; 115 clients never reconnected after 371s (281 clients did)
Sep 13 00:46:10 boss15 kernel: LustreError: 24613:0:(genops.c:1061:class_disconnect_stale_exports()) besfs-OST0034: disconnecting 115 stale clients
Sep 13 00:46:10 boss15 kernel: Lustre: besfs-OST0034: sending delayed replies to recovered clients
Sep 13 00:46:10 boss15 kernel: Lustre: besfs-OST0034: received MDS connection from 192.168.50.32 at tcp
Sep 13 00:46:10 boss15 kernel: Lustre: 22989:0:(filter.c:2830:filter_destroy_precreated()) besfs-OST0034: deleting orphan objects from 5027 to 5207
Sep 13 00:46:13 boss15 kernel: LustreError: 24625:0:(filter.c:3630:filter_iocontrol()) aborting recovery for device besfs-OST0035
Sep 13 00:46:13 boss15 kernel: Lustre: besfs-OST0035: recovery period over; 112 clients never reconnected after 365s (284 clients did)
Sep 13 00:46:13 boss15 kernel: LustreError: 24625:0:(genops.c:1061:class_disconnect_stale_exports()) besfs-OST0035: disconnecting 112 stale clients
Sep 13 00:46:13 boss15 kernel: Lustre: besfs-OST0035: sending delayed replies to recovered clients
Sep 13 00:46:13 boss15 kernel: Lustre: besfs-OST0035: received MDS connection from 192.168.50.32 at tcp
Sep 13 00:46:13 boss15 kernel: Lustre: 23044:0:(filter.c:2830:filter_destroy_precreated()) besfs-OST0035: deleting orphan objects from 5172 to 5612
Sep 13 00:46:14 boss15 kernel: LustreError: 23109:0:(filter.c:1396:filter_destroy_internal()) destroying objid 4902 ino 107326972 nlink 14727 count 1
Sep 13 00:46:14 boss15 kernel: LustreError: 23109:0:(filter.c:1402:filter_destroy_internal()) error unlinking objid 4902: rc -1
Sep 13 00:46:16 boss15 kernel: Unable to handle kernel NULL pointer dereference at 0000000000000000 RIP:
Sep 13 00:46:16 boss15 kernel: <ffffffff801ee7f2>{__memset+50}
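
For anyone comparing several such logs, the per-OST recovery summaries can be pulled out with a small script. This is only a minimal sketch, assuming the "recovery period over" message format exactly as it appears in the log above; the function name summarize_recovery is hypothetical and not part of Lustre.

```python
import re

# Matches the "recovery period over" summary line as it appears in the log above.
# The message format is assumed from this one log and may differ between Lustre versions.
RECOVERY_RE = re.compile(
    r"(?P<target>\S+): recovery period over; "
    r"(?P<missing>\d+) clients never reconnected after (?P<secs>\d+)s "
    r"\((?P<ok>\d+) clients did\)"
)


def summarize_recovery(log_text):
    """Return one dict per OST recovery summary found in log_text."""
    # Collapse all whitespace so messages wrapped across lines by a mailer still match.
    flat = " ".join(log_text.split())
    return [
        {
            "target": m["target"],
            "missing": int(m["missing"]),
            "seconds": int(m["secs"]),
            "reconnected": int(m["ok"]),
        }
        for m in RECOVERY_RE.finditer(flat)
    ]
```

Run against the log above, this would report 115 clients that never reconnected for besfs-OST0034 and 112 for besfs-OST0035.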



Thanks,
Sarea


