[Lustre-discuss] SLES 11 SP1 Client rpms built but not working
Andreas Dilger
adilger at whamcloud.com
Tue May 10 13:48:00 PDT 2011
On May 9, 2011, at 11:38, <peter.chiu at stfc.ac.uk> wrote:
> The rpms lustre-modules, lustre and lustre-tests were then installed smoothly without any complaints.
>
> But the subsequent “modprobe lustre” will return a “Killed” message, with no lustre module loaded.
>
> dmesg also reveals “BUG: unable to handle kernel NULL pointer dereference at 0000000000000008”
>
> A second modprobe lustre command will then hang, again with no module loaded.
> Subsequently the client is not able to mount the lustre storage.
>
> Can anyone shed some light as to what has gone wrong here please?
>
> ./configure --with-linux=/usr/src/linux --with-linux-obj=/usr/src/linux-2.6.32.29-0.3-obj/x86_64/xen
Are you sure that "/usr/src/linux" points to the same source as "/usr/src/linux-2.6.32.29-0.3-obj"? Is it a symlink? Normally the source and -obj directories have nearly identical pathnames, differing only in the "-obj" suffix.
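One rough way to answer that question is to resolve the symlink and compare it with the source tree implied by the -obj path. This is only a sketch: the helper names expected_src_for_obj and check_kernel_tree are made up for illustration, and it assumes GNU readlink (for the -f option) and the usual SUSE layout where the -obj directory name is the source directory name plus "-obj".

```shell
#!/bin/sh
# Hypothetical helper: strip "-obj" and everything after it, e.g.
#   /usr/src/linux-2.6.32.29-0.3-obj/x86_64/xen -> /usr/src/linux-2.6.32.29-0.3
expected_src_for_obj() {
    printf '%s\n' "${1%%-obj*}"
}

# Hypothetical helper: report whether the --with-linux path ($1) resolves to
# the source tree implied by the --with-linux-obj path ($2).
check_kernel_tree() {
    src=$(readlink -f "$1")                             # follow symlinks
    want=$(readlink -f "$(expected_src_for_obj "$2")")  # tree the -obj dir implies
    if [ "$src" = "$want" ]; then
        echo "match: $src"
    else
        echo "MISMATCH: $1 resolves to $src, expected $want"
    fi
}
```

With the paths from the configure line this would be called as: check_kernel_tree /usr/src/linux /usr/src/linux-2.6.32.29-0.3-obj/x86_64/xen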
> > [ 168.647996] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
> > [ 168.648066] Pid: 3445, comm: modprobe Tainted: G N 2.6.32.29-0.3-xen #1
> 0000000000000400
> > [ 168.648110] Process modprobe (pid: 3445, threadinfo ffff88007efa4000, task ffff88007e9100c0)
> > [ 168.648129] Call Trace:
> > [ 168.648138] [<ffffffff80038588>] try_to_wake_up+0x48/0x420
> > [ 168.648143] [<ffffffff8005b2e8>] up+0x48/0x50
> > [ 168.648153] [<ffffffffa0230d92>] LNetInit+0x92/0xc0 [lnet]
> > [ 168.648167] [<ffffffffa02430ac>] init_lnet+0x4c/0x280 [lnet]
> > [ 168.648178] [<ffffffff80004045>] do_one_initcall+0x35/0x1b0
> > [ 168.648184] [<ffffffff8006d154>] sys_init_module+0xe4/0x270
> > [ 168.648189] [<ffffffff80007458>] system_call_fastpath+0x16/0x1b
> > [ 168.648194] [<00007f3f40bc9f7a>] 0x7f3f40bc9f7a
>
> I have tried Lustre-1.8.4, but got the same result.
> I have also tried to follow the 1.8 Operations Manual to locate the diagnostic tools, but the link wiki.lustre.org is no longer valid.
An oops during module insertion is a pretty serious error, and I'd suspect the build environment before any particular Lustre code.
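As one possible sanity check of the build environment (a sketch, not a complete diagnosis), the kernel release recorded in the built module can be compared with the running kernel. The function name vermagic_matches is hypothetical; the only assumption about the data is that the first word of a module's vermagic string is the kernel release it was built for.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the first word of a vermagic string ($1)
# equals the given kernel release ($2).
vermagic_matches() {
    [ "${1%% *}" = "$2" ]
}

# Intended use on the client (not run here):
#   vermagic_matches "$(modinfo -F vermagic lnet)" "$(uname -r)" \
#       && echo "lnet was built for the running kernel" \
#       || echo "lnet was built for a different kernel"
```

A match here does not prove the build is sound, since the configuration in the -obj directory can still differ from the running kernel, but a mismatch would point straight at the build setup.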
Cheers, Andreas
--
Andreas Dilger
Principal Engineer
Whamcloud, Inc.