[Lustre-discuss] Lustre 1.8 reboot problem

Pedro Lorente pedro.lorente at i2cat.net
Mon May 18 08:33:24 PDT 2009


Hi,

I recently updated Lustre to 1.8. To avoid problems with the upgrade, I 
reinstalled everything from scratch (OS, Lustre, ...). My system is 
based on Debian Lenny with kernel 2.6.22, which patched without 
problems. I set up 1 MGS/MDT and 2 OSTs on 2 servers (one for MGS/MDT 
and an OST, the other for an OST only).

I'm running some tests to verify the robustness of Lustre against random 
crashes of a server or the network. In this case one OST is down, and 
when I remount it, it enters recovery status. At that moment I issue a 
"reboot" on the server, and it crashes, dumping this message:

LustreError: dumping log to /tmp/lustre-log.1242667129.4109
BUG: unable to handle kernel paging request at virtual address f8b1ac60
 printing eip:
f8b1ac60
*pde = 02113067
*pte = 00000000
Oops: 0000 [#1]
SMP
Modules linked in: ksocklnd ptlrpc obdclass lvfs lnet libcfs qla2xxx
CPU:    0
EIP:    0060:[<f8b1ac60>]    Tainted: G  R    VLI
EFLAGS: 00010246   (2.6.22.19 #9)
EIP is at 0xf8b1ac60
eax: f6877800   ebx: 0000000f   ecx: 00000000   edx: f73eda00
esi: f6d014c0   edi: f6877800   ebp: f6dd1f44   esp: f6dd1e90
ds: 007b   es: 007b   fs: 00d8  gs: 0000  ss: 0068
Process ll_ost_00 (pid: 3881, ti=f6dd0000 task=f6d08a50 task.ti=f6dd0000)
Stack: f93cf620 00000000 2dfd6400 00000000 00000003 f8b73c10 00000000 00000000
       00000001 f70cbf5c c03d78f5 f6dd1f44 00000000 00000000 c03d9406 00000001
       f6d08a50 c0118fd0 c20104a0 ac7f8700 f73eda00 f73edb38 f6d01400 00000086
Call Trace:
 [<f93cf620>] ptlrpc_server_handle_request+0xac0/0x1ee0 [ptlrpc]
 [<c03d78f5>] schedule+0x2d5/0xa20
 [<c03d9406>] __down+0xe6/0x100
 [<c0118fd0>] default_wake_function+0x0/0x10
 [<f8b6788d>] lc_watchdog_touch_ms+0xad/0x280 [libcfs]
 [<f8b668f1>] lc_watchdog_disable+0x81/0x260 [libcfs]
 [<c0115e28>] __wake_up+0x38/0x50
 [<f93d2dce>] ptlrpc_main+0x82e/0x2270 [ptlrpc]
 [<c0118fd0>] default_wake_function+0x0/0x10
 [<f93d25a0>] ptlrpc_main+0x0/0x2270 [ptlrpc]
 [<c0103773>] kernel_thread_helper+0x7/0x14
 =======================
Code:  Bad EIP value.
EIP: [<f8b1ac60>] 0xf8b1ac60 SS:ESP 0068:f6dd1e90
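
For anyone trying to reproduce the timing: the reboot in this test is issued while the OST is still in recovery. A minimal Python sketch of how that state could be checked before rebooting is shown here. It assumes the OST exposes a recovery_status file of "key: value" lines under /proc/fs/lustre/obdfilter/ (the exact path depends on the OST name, so it is left to the caller). The helper names and the sample text are illustrative, not output captured from the servers above.

```python
def parse_recovery_status(text):
    """Parse 'key: value' lines into a dict (assumed file layout)."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def is_recovering(text):
    """Return True if the status field reports an ongoing recovery."""
    return parse_recovery_status(text).get("status") == "RECOVERING"


def read_status(path):
    """Read a recovery_status file; the path is supplied by the caller."""
    with open(path) as handle:
        return handle.read()


# Illustrative sample only, not taken from a real server.
SAMPLE = "status: RECOVERING\nconnected_clients: 0/1\n"
```

Calling is_recovering(read_status(path)) on the OSS just before the reboot would confirm whether the crash only occurs while recovery is still in progress.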

Can you tell me the cause, or a solution?
I have some experience with Lustre, but I'm very confused by the new 
version.

Thank you in advance!

