[Lustre-discuss] Lustre v2.1 RHEL 6.1 build does not work

Jon Zhu jon.zhu at gmail.com
Fri Dec 16 18:42:35 PST 2011


Hi, Oleg

I just rebuilt the SUSE Lustre 2.1 client under a Xen kernel environment. The
rebuild itself completed without problems, but when I installed the package on
the build machine I got the following errors:

 ip-10-32-62-48://usr/src/packages/RPMS/x86_64 # modprobe lnet
FATAL: Error inserting lnet
(/lib/modules/2.6.32.46-0.3-ec2/updates/kernel/net/lustre/lnet.ko): Invalid
module format
ip-10-32-62-48://usr/src/packages/RPMS/x86_64 # uname -a
Linux ip-10-32-62-48 2.6.32.46-0.3-ec2 #1 SMP 2011-09-29 17:49:31 +0200
x86_64 x86_64 x86_64 GNU/Linux
ip-10-32-62-48://usr/src/packages/RPMS/x86_64 # modinfo
/lib/modules/2.6.32.46-0.3-ec2/updates/kernel/net/lustre/lnet.ko
filename:
/lib/modules/2.6.32.46-0.3-ec2/updates/kernel/net/lustre/lnet.ko
license:        GPL
description:    Portals v3.1
author:         Peter J. Braam <braam at clusterfs.com>
srcversion:     D64519A3761B3BB8EDF41F1
depends:        libcfs
vermagic:       2.6.32.46-0.3-ec2 SMP mod_unload modversions Xen
parm:           accept:Accept connections (secure|all|none) (charp)
parm:           accept_port:Acceptor's port (same on all nodes) (int)
parm:           accept_backlog:Acceptor's listen backlog (int)
parm:           accept_timeout:Acceptor's timeout (seconds) (int)
parm:           forwarding:Explicitly enable/disable forwarding between
networks (charp)
parm:           tiny_router_buffers:# of 0 payload messages to buffer in
the router (int)
parm:           small_router_buffers:# of small (1 page) messages to buffer
in the router (int)
parm:           large_router_buffers:# of large messages to buffer in the
router (int)
parm:           peer_buffer_credits:# router buffer credits per peer (int)
parm:           auto_down:Automatically mark peers down on comms error (int)
parm:           check_routers_before_use:Assume routers are down and ping
them before use (int)
parm:           avoid_asym_router_failure:Avoid asymmetrical failures:
reserved, use at your own risk (int)
parm:           dead_router_check_interval:Seconds between dead router
health checks (<= 0 to disable) (int)
parm:           live_router_check_interval:Seconds between live router
health checks (<= 0 to disable) (int)
parm:           router_ping_timeout:Seconds to wait for the reply to a
router health query (int)
parm:           config_on_load:configure network at module load (int)
parm:           local_nid_dist_zero:Reserved (int)
parm:           ip2nets:LNET network <- IP table (charp)
parm:           networks:local networks (charp)
parm:           routes:routes to non-local networks (charp)

Any idea what causes this? I also see an error message in the system log:
Dec 17 01:51:41 ip-10-32-62-48 kernel: [ 3744.865201] libcfs: no symbol
version for module_layout
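
In the modinfo output above, the vermagic release string matches what uname
reports, so a plain kernel-release mismatch seems unlikely to be the whole
story. The module_layout message points more toward symbol version (CRC)
information missing at build time, though that is an assumption rather than
something confirmed here. As a minimal sketch for ruling out the first case,
the Python snippet below compares the vermagic line against a kernel release
string. The function names and the embedded sample text are hypothetical
helpers for illustration, not part of Lustre or modinfo.

```python
import re

# Sample text copied from the modinfo output above (trimmed to a few fields).
SAMPLE = """license:        GPL
depends:        libcfs
vermagic:       2.6.32.46-0.3-ec2 SMP mod_unload modversions Xen
"""


def parse_vermagic(modinfo_text):
    """Return the vermagic fields as a list of strings, or None if absent."""
    match = re.search(r"^vermagic:[ \t]+(.+)$", modinfo_text, re.MULTILINE)
    return match.group(1).split() if match else None


def release_matches(modinfo_text, running_release):
    """True if the first vermagic field equals the running kernel release."""
    fields = parse_vermagic(modinfo_text)
    return bool(fields) and fields[0] == running_release


if __name__ == "__main__":
    # The release string here is the one shown by `uname -a` above.
    print(release_matches(SAMPLE, "2.6.32.46-0.3-ec2"))
```

On a live system the two inputs would come from running modinfo on the lnet.ko
path and from `uname -r`. If the releases match, as they do here, the next
thing to look at is probably how symbol versions were generated during the
build.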

Thanks,
-Jon.


On Wed, Dec 7, 2011 at 1:24 AM, Oleg Drokin <green at whamcloud.com> wrote:

> Hello!
>
>   Yes, we do plan to have 3.x support. The 2.6.38 support is almost
> finished and we will start looking at 3.x next.
>
> Bye,
>    Oleg
> On Nov 29, 2011, at 3:11 PM, Jon Zhu wrote:
>
> > Hi, Oleg
> >
> > Does Lustre 2.1 support Ubuntu clients? From the following build map I
> can see that only the Ubuntu 10.04 client is supported, but the kernel needs
> to be patched. Is there any plan to support a patchless kernel client on
> Ubuntu? The new Ubuntu Oneiric 11.10 release uses kernel 3.0; is there any
> plan to support that in the near future?
> > http://build.whamcloud.com/job/lustre-b2_1/?
> >
> > Thanks,
> > -Jon.
> > jon.zhu at gmail.com
> >
> > On Sun, Oct 2, 2011 at 4:31 PM, Jon Zhu <jon.zhu at gmail.com> wrote:
> > Thanks a lot, the workaround works.
> >
> > -Jon.
> >
> >
> >
> >
> > On Sun, Oct 2, 2011 at 3:47 PM, Oleg Drokin <green at whamcloud.com> wrote:
> > Hello!
> >
> >    Last time I hit this (some years ago), a simple touch
> ldiskfs/Module.symvers helped. I don't remember what the issue was or how
> it was properly fixed, though.
> >
> > Bye,
> >    Oleg
> > On Oct 2, 2011, at 1:57 PM, Jon Zhu wrote:
> >
> > > Hi, Oleg
> > >
> > > I encountered the following error while building the Lustre 2.1 release on
> Redhat 2.1. Do you have any idea what causes it?
> > >
> > > make[1]: *** No rule to make target
> `/build/lustre-release/ldiskfs/Module.symvers', needed by `Module.symvers'.
>  Stop.
> > >
> > >
> > > Full build log :
> > > ....
> > > + /usr/lib/rpm/redhat/brp-java-repack-jars
> > > Processing files: lustre-iokit-1.2-201110021351.noarch
> > > Executing(%doc): /bin/sh -e /var/tmp/rpm-tmp.IOpHyP
> > > + umask 022
> > > + cd /build/kernel/rpmbuild/BUILD
> > > + cd lustre-iokit-1.2
> > > +
> DOCDIR=/build/kernel/rpmbuild/BUILDROOT/lustre-iokit-1.2-201110021351.x86_64/usr/share/doc/lustre-iokit-1.2
> > > + export DOCDIR
> > > + rm -rf
> /build/kernel/rpmbuild/BUILDROOT/lustre-iokit-1.2-201110021351.x86_64/usr/share/doc/lustre-iokit-1.2
> > > + /bin/mkdir -p
> /build/kernel/rpmbuild/BUILDROOT/lustre-iokit-1.2-201110021351.x86_64/usr/share/doc/lustre-iokit-1.2
> > > + cp -pr obdfilter-survey/README.obdfilter-survey
> /build/kernel/rpmbuild/BUILDROOT/lustre-iokit-1.2-201110021351.x86_64/usr/share/doc/lustre-iokit-1.2
> > > + cp -pr ior-survey/README.ior-survey
> /build/kernel/rpmbuild/BUILDROOT/lustre-iokit-1.2-201110021351.x86_64/usr/share/doc/lustre-iokit-1.2
> > > + cp -pr ost-survey/README.ost-survey
> /build/kernel/rpmbuild/BUILDROOT/lustre-iokit-1.2-201110021351.x86_64/usr/share/doc/lustre-iokit-1.2
> > > + cp -pr sgpdd-survey/README.sgpdd-survey
> /build/kernel/rpmbuild/BUILDROOT/lustre-iokit-1.2-201110021351.x86_64/usr/share/doc/lustre-iokit-1.2
> > > + cp -pr stats-collect/README.lstats.sh /build/kernel/rpmbuild/BUILDROOT/lustre-iokit-1.2-201110021351.x86_64/usr/share/doc/lustre-iokit-1.2
> > > + exit 0
> > > Provides: lustre-iokit = 1.2
> > > Requires(rpmlib): rpmlib(CompressedFileNames) <= 3.0.4-1
> rpmlib(FileDigests) <= 4.6.0-1 rpmlib(PayloadFilesHavePrefix) <= 4.0-1
> rpmlib(VersionedDependencies) <= 3.0.3-1
> > > Requires: /bin/bash /bin/sh /usr/bin/perl perl(File::Path)
> perl(Getopt::Long) perl(Getopt::Std) perl(POSIX)
> > > Checking for unpackaged file(s): /usr/lib/rpm/check-files
> /build/kernel/rpmbuild/BUILDROOT/lustre-iokit-1.2-201110021351.x86_64
> > > Wrote:
> /build/kernel/rpmbuild/SRPMS/lustre-iokit-1.2-201110021351.src.rpm
> > > Wrote:
> /build/kernel/rpmbuild/RPMS/noarch/lustre-iokit-1.2-201110021351.noarch.rpm
> > > Executing(%clean): /bin/sh -e /var/tmp/rpm-tmp.0eXePd
> > > + umask 022
> > > + cd /build/kernel/rpmbuild/BUILD
> > > + cd lustre-iokit-1.2
> > > + /bin/rm -rf
> /build/kernel/rpmbuild/BUILDROOT/lustre-iokit-1.2-201110021351.x86_64
> > > + exit 0
> > > make[1]: Leaving directory `/build/lustre-release/lustre-iokit'
> > > Finished rpms in lustre-iokit
> > > make[1]: Entering directory `/build/lustre-release'
> > > make[1]: *** No rule to make target
> `/build/lustre-release/ldiskfs/Module.symvers', needed by `Module.symvers'.
>  Stop.
> > > make[1]: Leaving directory `/build/lustre-release'
> > > make: *** [rpms] Error 2
> > >
> > >
> > > Thanks,
> > > -Jon.
> > >
> > >
> > > On Thu, Sep 29, 2011 at 11:34 PM, Oleg Drokin <green at whamcloud.com>
> wrote:
> > > Hello!
> > >
> > >   There is nothing special, same as rhel6.1:
> > >   unpack the lustre source, run autogen.sh, run configure and provide
> the path to the linux kernel source for your distro (need to patch first
> too), make.
> > >
> > > Bye,
> > >    Oleg
> > > On Sep 29, 2011, at 11:21 PM, Jon Zhu wrote:
> > >
> > > > Hi, Oleg
> > > >
> > > > Is there a procedure for building the v2.1 GA code on CentOS 5.6
> (xen)? On the Whamcloud wiki I can only find instructions for building
> v2.1 on RHEL 6.1 or v1.8 on CentOS 5.6.
> > > >
> > > > BTW, congratulations on the 2.1 release!
> > > >
> > > > Regards,
> > > >
> > > > Jon Zhu
> > > > Sent from Google Mail
> > > >
> > > >
> > > >
> > > >
> > > > On Fri, Jun 24, 2011 at 2:43 PM, Oleg Drokin <green at whamcloud.com>
> wrote:
> > > > Hello!
> > > >
> > > > On Jun 23, 2011, at 9:51 PM, Jon Zhu wrote:
> > > >
> > > > > I still got some crashes when running further I/O tests with the
> build. Here are some system messages containing call stack info that may be
> useful to you in finding the bug:
> > > >
> > > > > Jun 23 21:46:12 ip-10-112-59-173 kernel: ------------[ cut here
> ]------------
> > > > > Jun 23 21:46:12 ip-10-112-59-173 kernel: WARNING: at
> kernel/sched.c:7087 __cond_resched_lock+0x8e/0xb0() (Not tainted)
> > > > > Jun 23 21:46:12 ip-10-112-59-173 kernel: Modules linked in:
> lustre(U) lov(U) osc(U) lquota(U) mdc(U) fid(U) fld(U) ksocklnd(U)
> ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) ldiskfs(U) sha256_generic
> cryptd aes_x86_64 aes_generic cbc dm_crypt autofs4 ipv6 microcode
> xen_netfront ext4 mbcache jbd2 xen_blkfront dm_mod [last unloaded:
> scsi_wait_scan]
> > > > > Jun 23 21:46:12 ip-10-112-59-173 kernel: Pid: 1421, comm:
> mount.lustre Not tainted 2.6.32.lustre21 #6
> > > > > Jun 23 21:46:12 ip-10-112-59-173 kernel: Call Trace:
> > > > > Jun 23 21:46:12 ip-10-112-59-173 kernel: [<ffffffff81069c37>] ?
> warn_slowpath_common+0x87/0xc0
> > > > > Jun 23 21:46:12 ip-10-112-59-173 kernel: [<ffffffff81007671>] ?
> __raw_callee_save_xen_save_fl+0x11/0x1e
> > > > > Jun 23 21:46:12 ip-10-112-59-173 kernel: [<ffffffff81069c8a>] ?
> warn_slowpath_null+0x1a/0x20
> > > > > Jun 23 21:46:12 ip-10-112-59-173 kernel: [<ffffffff810654fe>] ?
> __cond_resched_lock+0x8e/0xb0
> > > > > Jun 23 21:46:12 ip-10-112-59-173 kernel: [<ffffffff811a53b7>] ?
> shrink_dcache_for_umount_subtree+0x187/0x340
> > > > > Jun 23 21:46:12 ip-10-112-59-173 kernel: [<ffffffff811a55a6>] ?
> shrink_dcache_for_umount+0x36/0x60
> > > > > Jun 23 21:46:12 ip-10-112-59-173 kernel: [<ffffffff8118f4ff>] ?
> generic_shutdown_super+0x1f/0xe0
> > > > > Jun 23 21:46:12 ip-10-112-59-173 kernel: [<ffffffff8118f5f1>] ?
> kill_block_super+0x31/0x50
> > > > > Jun 23 21:46:12 ip-10-112-59-173 kernel: [<ffffffff811906b5>] ?
> deactivate_super+0x85/0xa0
> > > > > Jun 23 21:46:12 ip-10-112-59-173 kernel: [<ffffffff811ac5af>] ?
> mntput_no_expire+0xbf/0x110
> > > > > Jun 23 21:46:12 ip-10-112-59-173 kernel: [<ffffffffa0273f8e>] ?
> unlock_mntput+0x3e/0x60 [obdclass]
> > > > > Jun 23 21:46:12 ip-10-112-59-173 kernel: [<ffffffffa0277a98>] ?
> server_kernel_mount+0x268/0xe80 [obdclass]
> > > > > Jun 23 21:46:12 ip-10-112-59-173 kernel: [<ffffffffa0280d40>] ?
> lustre_fill_super+0x0/0x1290 [obdclass]
> > > > > Jun 23 21:46:12 ip-10-112-59-173 kernel: [<ffffffffa0279070>] ?
> lustre_init_lsi+0xd0/0x5b0 [obdclass]
> > > > > Jun 23 21:46:12 ip-10-112-59-173 kernel: [<ffffffff810ac71d>] ?
> lock_release+0xed/0x220
> > > > > Jun 23 21:46:12 ip-10-112-59-173 kernel: [<ffffffffa0280fd0>] ?
> lustre_fill_super+0x290/0x1290 [obdclass]
> > > > > Jun 23 21:46:12 ip-10-112-59-173 kernel: [<ffffffff8118ee20>] ?
> set_anon_super+0x0/0x110
> > > > > Jun 23 21:46:12 ip-10-112-59-173 kernel: [<ffffffffa0280d40>] ?
> lustre_fill_super+0x0/0x1290 [obdclass]
> > > > > Jun 23 21:46:12 ip-10-112-59-173 kernel: [<ffffffff8119035f>] ?
> get_sb_nodev+0x5f/0xa0
> > > > > Jun 23 21:46:12 ip-10-112-59-173 kernel: [<ffffffffa0272885>] ?
> lustre_get_sb+0x25/0x30 [obdclass]
> > > > > Jun 23 21:46:12 ip-10-112-59-173 kernel: [<ffffffff8118ffbb>] ?
> vfs_kern_mount+0x7b/0x1b0
> > > > > Jun 23 21:46:12 ip-10-112-59-173 kernel: [<ffffffff81190162>] ?
> do_kern_mount+0x52/0x130
> > > > > Jun 23 21:46:12 ip-10-112-59-173 kernel: [<ffffffff811ae647>] ?
> do_mount+0x2e7/0x870
> > > > > Jun 23 21:46:12 ip-10-112-59-173 kernel: [<ffffffff811aec60>] ?
> sys_mount+0x90/0xe0
> > > > > Jun 23 21:46:12 ip-10-112-59-173 kernel: [<ffffffff8100b132>] ?
> system_call_fastpath+0x16/0x1b
> > > > > Jun 23 21:46:12 ip-10-112-59-173 kernel: ---[ end trace
> a8fb737c71bfba13 ]---
> > > >
> > > > This is not a crash; I guess it's just a warning about scheduling in
> an inappropriate context, and the kernel will continue to work.
> > > > Interesting that I have never seen anything like that in rhel5 xen
> kernels, perhaps it's something with rhel6.1 xen?
> > > >
> > > > Bye,
> > > >    Oleg
> > > > --
> > > > Oleg Drokin
> > > > Senior Software Engineer
> > > > Whamcloud, Inc.
> > > >
> > > >
> > >
> > > --
> > > Oleg Drokin
> > > Senior Software Engineer
> > > Whamcloud, Inc.
> > >
> > >
> >
> > --
> > Oleg Drokin
> > Senior Software Engineer
> > Whamcloud, Inc.
> >
> >
> >
>
> --
> Oleg Drokin
> Senior Software Engineer
> Whamcloud, Inc.
>
>