[lustre-discuss] Lustre-2.10.5 problem

Tung-Han Hsieh thhsieh at twcp1.phys.ntu.edu.tw
Tue Sep 25 02:33:01 PDT 2018


Dear Andreas,

Thank you very much for your kind reply.

When I run "modprobe lustre", dmesg only shows:

[191843.804416] LNet: HW NUMA nodes: 2, HW CPU cores: 24, npartitions: 2
[191844.582597] Lustre: Lustre: Build Version: 2.10.0

and I get the message "ERROR: could not insert 'lustre': No such device"
on the command line. When I check "lsmod", I see the following
Lustre modules loaded:

Module                  Size  Used by
lnet                  388690  0 
libcfs                214791  1 lnet

When I run "modprobe obdclass", the result is exactly the same.
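
A minimal sketch of a check for the kernel-mismatch case raised in the
quoted reply below: compare the kernel release recorded in a module's
"vermagic" field with the running kernel. The helper names
kernel_of_vermagic and vermagic_matches are made up for illustration;
"modinfo -F vermagic" and "uname -r" are the standard tools. This only
rules a build mismatch in or out, it does not explain "No such device"
by itself.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Lustre or kmod).

# kernel_of_vermagic VERMAGIC
# Print the first word of a vermagic string such as
# "3.12.72 SMP mod_unload ", which is the kernel release it was built for.
kernel_of_vermagic() {
    printf '%s\n' "${1%% *}"
}

# vermagic_matches VERMAGIC RELEASE
# Succeed (exit status 0) if the module was built for RELEASE.
vermagic_matches() {
    [ "$(kernel_of_vermagic "$1")" = "$2" ]
}

# Typical use on the affected node (left commented out here):
#   if vermagic_matches "$(modinfo -F vermagic obdclass)" "$(uname -r)"; then
#       echo "obdclass was built for the running kernel"
#   else
#       echo "obdclass was built for a different kernel"
#   fi
```

If the two releases differ, the modules under /lib/modules were most
likely built against a different kernel tree than the one that is booted.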

I also tried to recompile Lustre-2.10.5 with the options:

    ./configure --prefix=/opt/lustre \
                --with-linux=/path/to/linux-3.12.72 \
                --disable-server

to make the situation simpler. But I still get exactly the same error.

BTW, my Linux OS is Debian 8.10, with kmod version 18-3, udev
version 215-17+deb8u7, linux kernel 3.12.72, gcc-4.9.2.

==========================================================================

Then I wondered whether this error is due to the version of the
Linux OS, so I tried to compile Lustre-2.10.5 again with the options:

    ./configure --prefix=/opt/lustre \
                --with-linux=/path/to/linux-4.9.110 \
                --disable-server

on a newer machine: Linux OS Debian 9.5, with kmod version 23-2,
udev version 232-25+deb9u4, linux kernel 4.9.110, gcc-4.9.2. I needed
to comment out a few lines like:

    .setxattr       = ll_setxattr,
    .getxattr       = ll_getxattr,
    .listxattr      = ll_listxattr,
    .removexattr    = ll_removexattr,

in "lustre/llite/symlink.c", "lustre/llite/namei.c", and
"lustre/llite/file.c" in order to successfully build the lustre source
code. This time I can successfully run:

	modprobe lustre
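
A rough way to see which of the xattr members listed above a given
kernel tree still declares is to search its include/linux/fs.h. This is
only a sketch: the helper name has_inode_op is made up for illustration,
and the grep looks at the whole header rather than parsing
"struct inode_operations" specifically.

```shell
#!/bin/sh
# Hypothetical helper (not part of Lustre or the kernel build system).

# has_inode_op FS_H MEMBER
# Succeed if the header FS_H declares a function pointer named MEMBER,
# i.e. contains the literal text "(*MEMBER)".
has_inode_op() {
    grep -q "(\*$2)" "$1"
}

# Typical use against the kernel tree given to --with-linux
# (left commented out here):
#   for op in setxattr getxattr listxattr removexattr; do
#       if has_inode_op /path/to/linux-4.9.110/include/linux/fs.h "$op"; then
#           echo "$op: declared"
#       else
#           echo "$op: not declared"
#       fi
#   done
```

Members reported as "not declared" are the ones the build cannot assign
in that kernel; members still declared may not need to be commented out.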

So, is the problem that my Linux system (or its utilities) is too old?
Is there a list of "System Requirements" for running Lustre-2.10.5?

ps. I suggest that the "System Requirements" be documented in
    the release notes of the Lustre software. Every time I want to
    upgrade the Lustre system in my clusters, I have to spend a lot
    of time *guessing* the correct version combination of the system,
    the 3rd-party libraries (e.g., ZFS), and Lustre itself to make
    everything work. Unfortunately, this information is not always
    easy to find.


Best Regards,

T.H.Hsieh



On Tue, Sep 25, 2018 at 07:38:00AM +0000, Andreas Dilger wrote:
> What does dmesg tell you?  Normally it will report some module has incorrect symbols, which means you compiled against a different version of the kernel source, OFED/MOFED libraries, etc.
> 
> > On Sep 25, 2018, at 05:14, Tung-Han Hsieh <thhsieh at twcp1.phys.ntu.edu.tw> wrote:
> > 
> > Dear All,
> > 
> > I found that my lustre-2.10.5 with ZFS (either 0.7.9 or 0.7.11)
> > cannot load the "lustre" modules because it cannot load the
> > "obdclass.ko" module. The error message is the following:
> > 
> > # modprobe -v -v obdclass
> > insmod /lib/modules/3.12.72/updates/fs/lustre/obdclass.ko
> > libkmod: INFO ../libkmod/libkmod-module.c:829 kmod_module_insert_module: Failed to insert module '/lib/modules/3.12.72/updates/fs/lustre/obdclass.ko': No such device
> > ERROR: could not insert 'obdclass': No such device
> > libkmod: INFO ../libkmod/libkmod.c:319 kmod_unref: context 0x7fb945d321e0 released
> > 
> > Could anyone suggest how to debug ?
> > 
> > Thanks very much.
> > 
> > 
> > T.H.Hsieh
> > 
> > 
> > On Tue, Sep 25, 2018 at 12:14:00AM +0800, Tung-Han Hsieh wrote:
> >> Dear Nathaniel,
> >> 
> >> Thank you very much for your kind reply. Indeed I modified the
> >> lustre-2.10.5 codes:
> >> 
> >>    lustre/osd-zfs/osd_object.c
> >>    lustre/osd-zfs/osd_xattr.c
> >> 
> >> for the declaration:
> >> 
> >>    inode_timespec_t now;
> >> 
> >> Similar to what you have done in your patch. So I can compile
> >> lustre-2.10.5 cleanly with zfs-0.7.11. Sorry I forgot to mention.
> >> 
> >> But my problem is still there. Actually I just tried:
> >> 
> >> 1. Applying your patch to the original lustre-2.10.5 code, and
> >>   recompile with spl-0.7.11 and zfs-0.7.11. But loading "lustre"
> >>   module still gives "no such device" error.
> >> 
> >> 2. I recompile the original lustre-2.10.5 with spl-0.7.9 and
> >>   zfs-0.7.9. They can be compiled cleanly. But again I got the
> >>   "no such device" error when loading "lustre" module.
> >> 
> >> I wonder whether I have overlooked a trivial step, something
> >> like one (or some) of the utilities in /opt/lustre/sbin/* needing
> >> to be linked to /sbin/ or /usr/sbin/ ....
> >> 
> >> Any suggestions are very much appreciated.
> >> 
> >> Thank you very much.
> >> 
> >> 
> >> T.H.Hsieh
> >> 
> >> 
> >> On Mon, Sep 24, 2018 at 01:21:19PM +0000, Nathaniel Clark wrote:
> >>> Hello Tung-Han,
> >>> 
> >>> ZFS 0.7.11 doesn’t compile cleanly with Lustre, yet.
> >>> 
> >>> There’s a ticket for adding ZFS 0.7.11 support to lustre:
> >>> https://jira.whamcloud.com/browse/LU-11393
> >>> 
> >>> It has patches for master (pre-2.12) and a separate patch for 2.10.
> >>> 
> >>> —
> >>> Nathaniel Clark <nclark at whamcloud.com>
> >>> Senior Engineer
> >>> Whamcloud / DDN
> >>> 
> >>> On Sep 24, 2018, at 2:15 PM, Tung-Han Hsieh <thhsieh at twcp1.phys.ntu.edu.tw> wrote:
> >>> 
> >>> Dear All,
> >>> 
> >>> I am trying to install Lustre version 2.10.5 with ZFS-0.7.11
> >>> from source code. After compilation and installation, I tried
> >>> to load the "lustre" module, but encountered the following
> >>> error:
> >>> 
> >>> # modprobe lustre
> >>> could not load module 'lustre': no such device
> >>> 
> >>> My procedure of installation is the following:
> >>> 
> >>> 1. Compile vanilla kernel 3.12.72 downloaded from:
> >>>  https://mirrors.edge.kernel.org/pub/linux/kernel/v3.x/linux-3.12.72.tar.gz
> >>> 
> >>> 2. Compile spl-0.7.11 downloaded from:
> >>>  https://github.com/zfsonlinux/zfs/releases/download/zfs-0.7.11/spl-0.7.11.tar.gz
> >>> 
> >>>  with the following steps:
> >>>  # ./configure --prefix=/opt/lustre --with-linux=/path/to/linux-3.12.72
> >>>  # make
> >>>  # make install
> >>> 
> >>> 3. Compile zfs-0.7.11 downloaded from:
> >>>  https://github.com/zfsonlinux/zfs/releases/download/zfs-0.7.11/zfs-0.7.11.tar.gz
> >>> 
> >>>  with the following steps:
> >>>  # ./configure --prefix=/opt/lustre \
> >>>                --with-linux=/path/to/linux-3.12.72 \
> >>>                --with-spl=/path/to/spl-0.7.11
> >>>  # make
> >>>  # make install
> >>> 
> >>> 4. Compile lustre downloaded from:
> >>>  https://downloads.whamcloud.com/public/lustre/lustre-2.10.5/sles12sp3/client/SRPMS/lustre-2.10.5-1.src.rpm
> >>> 
> >>>  Then I unpack the SRPM by the command:
> >>>  # rpm2cpio lustre-2.10.5-1.src.rpm | cpio --extract --make-directories
> >>> 
> >>>  and compile it by the following:
> >>>  # ./configure --prefix=/opt/lustre \
> >>>                --with-linux=/path/to/linux-3.12.72 \
> >>>                --with-spl=/path/to/spl-0.7.11 \
> >>>                --with-zfs=/path/to/zfs-0.7.11 \
> >>>                --with-o2ib=no \
> >>>                --disable-ldiskfs
> >>>  # make
> >>>  # make install
> >>> 
> >>> 5. I have made sure the following settings and utilities are correct:
> >>>  - PATH contains /opt/lustre/bin and /opt/lustre/sbin
> >>>  - /sbin/mount.lustre exists.
> >>>  - /sbin/mount.zfs exists.
> >>>  - /usr/sbin/l_getidentity exists.
> >>>  - /usr/sbin/ko2iblnd-probe exists.
> >>>  - /etc/modprobe.d/lustre.conf contains:
> >>>    options lnet networks=tcp
> >>>  - /etc/modprobe.d/ko2iblnd.conf contains:
> >>>    alias ko2iblnd-opa ko2iblnd
> >>>    options ko2iblnd-opa peer_credits=128 peer_credits_hiw=64 credits=1024 concurrent_sends=256 ntx=2048 map_on_demand=32 fmr_pool_size=2048 fmr_flush_trigger=512 fmr_cache=1
> >>>    install ko2iblnd /usr/sbin/ko2iblnd-probe
> >>> 
> >>> Then I tried to run "modprobe lustre", and it reports a "no such device" error.
> >>> 
> >>> I tried to replace Lustre-2.10.5 by Lustre-2.9 downloaded from:
> >>> 
> >>> https://downloads.whamcloud.com/public/lustre/lustre-2.9.0/sles12sp1/client/SRPMS/lustre-2.9.0-1.src.rpm
> >>> 
> >>> and followed exactly the same installation steps. Everything works fine.
> >>> 
> >>> Could anyone suggest what I have missed for lustre-2.10.5, or suggest
> >>> how to debug this?
> >>> 
> >>> Thanks very much.
> >>> 
> >>> 
> >>> T.H.Hsieh
> >>> _______________________________________________
> >>> lustre-discuss mailing list
> >>> lustre-discuss at lists.lustre.org
> >>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
> >>> 
> 
> Cheers, Andreas
> ---
> Andreas Dilger
> CTO Whamcloud
> 
> 
> 
> 



