[Lustre-discuss] lo2iblnd and Mellanox IB question

Jerome, Ron Ron.Jerome at ssc-spc.gc.ca
Tue Nov 20 14:15:40 PST 2012


I've had to rebuild against the Mellanox OFED every time I change Lustre or OFED versions.  It's a bit of a catch-22, because you have to build the Mellanox OFED against the Lustre kernel, install the Mellanox OFED, and then rebuild the Lustre modules against the Mellanox OFED.  The procedure I use is as follows:

* install upgraded Lustre kernel and kernel-devel rpms
* rebuild Mellanox OFED against Lustre kernel 
	- mount -o loop MLNX_OFED.iso /root/mnt
	- /root/mnt/docs/mlnx_add_kernel_support.sh -i /root/MLNX_OFED.iso
* install Mellanox OFED from rebuilt MLNX_OFED.iso
* install kernel-ib-devel from rebuilt MLNX_OFED.iso
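
Before running mlnx_add_kernel_support.sh it can help to confirm that the build tree for the Lustre kernel is actually in place. The following is only a sketch: it assumes kernel-devel provides /lib/modules/<version>/build, and that the Lustre kernel carries "lustre" in its version string as the Whamcloud rpms do. The function name check_build_tree and its optional ROOT argument are hypothetical, not part of any Lustre or Mellanox tool.

```shell
# check_build_tree KVER [ROOT]
# Succeeds only if KVER looks like a Lustre kernel and its build tree exists.
# ROOT is an optional prefix (hypothetical, handy for testing); default is "/".
check_build_tree() {
    kver=$1
    root=${2:-}
    case $kver in
        *lustre*) ;;   # assumption: Lustre kernels have "lustre" in the version
        *) echo "kernel $kver does not look like a Lustre kernel"; return 1 ;;
    esac
    if [ ! -d "$root/lib/modules/$kver/build" ]; then
        echo "no build tree for $kver (is kernel-devel installed?)"
        return 1
    fi
    echo "ok: $root/lib/modules/$kver/build"
}

# Typical use on the target host:
#   check_build_tree "$(uname -r)"
```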

Now rebuild the lustre-modules RPM to get a ko2iblnd.ko that is compatible with the Mellanox kernel-ib drivers:

* cd /usr/src/lustre-x.x.x
* ./configure --with-o2ib=/usr/src/openib
* make rpms
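
The Lustre rebuild steps above could be wrapped in a small script. This is a sketch, not a tested build recipe: the helper names run and rebuild_lustre_modules and the DRY_RUN variable are made up for illustration, and the paths are the ones from the steps above (lustre-x.x.x is left as a placeholder). By default it only prints the commands; set DRY_RUN=0 to actually execute them.

```shell
# run CMD...: print the command, and execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    [ "${DRY_RUN:-1}" = "1" ] || "$@"
}

# rebuild_lustre_modules SRC_DIR O2IB_DIR
# Body runs in a subshell so the cd does not affect the caller.
rebuild_lustre_modules() (
    run cd "$1" &&
    run ./configure --with-o2ib="$2" &&
    run make rpms
)

# Example (prints the three commands without running them):
#   rebuild_lustre_modules /usr/src/lustre-x.x.x /usr/src/openib
```

Printing each command first makes it easy to see which step failed when the configure or make stage stops partway.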


Ron. 
-----Original Message-----
From: lustre-discuss-bounces at lists.lustre.org [mailto:lustre-discuss-bounces at lists.lustre.org] On Behalf Of Ms. Megan Larko
Sent: November 20, 2012 4:21 PM
To: Lustre User Discussion Mailing List
Subject: [Lustre-discuss] lo2iblnd and Mellanox IB question

Hello to Everyone!

I have a question to which I think I know the answer, but I am seeking
confirmation (re-assurance?).

I have built a RHEL 6.2 system with lustre-2.1.2.   I am using the
rpms from the Whamcloud site for linux kernel
2.6.32-220.17.1.el6_lustre.x86_64 along with the version-matching
lustre, lustre-modules, lustre-ldiskfs, and kernel-devel.   I also
have from the Whamcloud site
kernel-ib-1.8.5-2.6.32-220.17.1.el6_lustre.x86_64 and the related
kernel-ib-devel for same.

The lustre file system works properly for TCP.

I would like to use InfiniBand.   The system has a new Mellanox card
for which mlxn1 firmware and drivers were installed.   After this was
done (I cannot speak to before) the IB network will come up on boot
and copy and ping in a traditional network fashion.

Hard Part:  I would like to run the lustre file system on the IB (ib0).
I re-created the lustre network to use /etc/modprobe.d/lustre.conf
pointing to o2ib in place of tcp0.   I rebuilt the mgs/mdt and all
osts to use the IB network (the mgs/mds --failnode=[new_IB_addr] and
the osts point to mgs on IB net).   When I "modprobe lustre" to start
the system I receive error messages stating that there are
Input/Output errors on lustre modules fld.ko, fid.ko, mdc.ko, osc.ko,
and lov.ko.   The lustre.ko cannot be started.   A look in
/var/log/messages reveals many "Unknown symbol" and "disagrees about
version of symbol" messages from the ko2iblnd module.
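
To see which symbols the kernel is complaining about, the log can be reduced to a unique list of symbol names. This is a minimal sketch: it assumes the kernel log lines have the usual form "<module>: Unknown symbol <name>" or "<module>: disagrees about version of symbol <name>", and the function name symbol_errors is hypothetical.

```shell
# symbol_errors LOGFILE MODULE
# Prints the unique symbol names that MODULE failed to resolve, one per line.
symbol_errors() {
    grep -E "$2: (Unknown symbol|disagrees about version of symbol) " "$1" |
        sed -E 's/.*(Unknown symbol|disagrees about version of symbol) ([A-Za-z0-9_]+).*/\2/' |
        sort -u
}

# Example:
#   symbol_errors /var/log/messages ko2iblnd
```

If every name in the list is an InfiniBand or RDMA symbol, that points at a mismatch between ko2iblnd.ko and the installed IB stack rather than at the Lustre kernel itself.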

A "modprobe --dump-modversions /path/to/kernel/ko2iblnd.ko"  shows it
pointing to the Module.symvers of the lustre kernel.
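
The symbol versions the module expects can be compared directly against a given Module.symvers. This is a sketch under stated assumptions: the --dump-modversions output has one "CRC symbol" pair per line, Module.symvers has the CRC in its first column and the symbol name in its second, and the function name compare_crcs and the /tmp file are hypothetical. Symbols that the chosen Module.symvers does not list (core kernel symbols, for example) are skipped.

```shell
# compare_crcs MODVERSIONS_FILE SYMVERS_FILE
# Prints one line per symbol whose CRC in the module differs from the CRC
# recorded in Module.symvers. Symbols absent from Module.symvers are skipped.
compare_crcs() {
    awk '
        # First file read: Module.symvers (CRC, symbol, module, export type).
        NR == FNR { crc[$2] = $1; next }
        # Second file read: dumped modversions (CRC, symbol).
        ($2 in crc) && crc[$2] != $1 {
            print "mismatch " $2 " module=" $1 " symvers=" crc[$2]
        }
    ' "$2" "$1"
}

# Example:
#   modprobe --dump-modversions /path/to/kernel/ko2iblnd.ko > /tmp/ko2iblnd.crc
#   compare_crcs /tmp/ko2iblnd.crc /usr/src/ofa_kernel/Module.symvers
```

Any "mismatch" line means the module was built against different symbol versions than the ones in that Module.symvers.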

Am I correct in thinking that because of the specific Mellanox IB
hardware I have (with its own /usr/src/ofa_kernel/Module.symvers
file), that I have to build Lustre-2.1.2 from tarball to use the
"configure --with-o2ib=/usr/src/ofa_kernel...."  mandating that this
system use the ofa_kernel-1.8.5  modules and not the OFED 1.8.5 from
the kernel-ib rpms  to which Lustre defaults in the Linux kernel?

Is a rebuild of lustre from source mandatory, or is there a way in
which I may point to the appropriate symbols needed by the
ko2iblnd.ko?

Enjoy the Thanksgiving holiday for those U.S. readers.    To everyone
else in the world, have a great weekend!

Megan Larko
Hewlett-Packard
_______________________________________________
Lustre-discuss mailing list
Lustre-discuss at lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


