[Lustre-discuss] ko2iblnd and Mellanox IB question

Jeff Johnson jeff.johnson at aeoncomputing.com
Wed Nov 21 09:36:23 PST 2012


Megan,

You will have to rebuild Lustre from source. You will also need the
Mellanox IB driver source installed so the Lustre build process can
pick up the necessary bits from the Mellanox source.
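
For illustration, here is a minimal sketch of how the configure step for
such a rebuild might be assembled. It assumes the Mellanox source tree
lives in /usr/src/ofa_kernel (the path mentioned in the quoted message
below). The helper names are hypothetical. --with-o2ib and --with-linux
are Lustre configure options, but confirm them against ./configure --help
for your release.

```python
import os
import shlex


def has_ofed_symvers(ofa_kernel_dir):
    """Return True if the OFED tree carries a Module.symvers file."""
    return os.path.isfile(os.path.join(ofa_kernel_dir, "Module.symvers"))


def lustre_configure_argv(ofa_kernel_dir, linux_src=None):
    """Build the argument list for Lustre's ./configure (hypothetical helper)."""
    argv = ["./configure", "--with-o2ib=" + ofa_kernel_dir]
    if linux_src:
        # Optional: point the build at a specific kernel source tree.
        argv.append("--with-linux=" + linux_src)
    return argv


# Assumed location of the Mellanox OFED kernel source.
ofa_dir = "/usr/src/ofa_kernel"
if not has_ofed_symvers(ofa_dir):
    print("warning: no Module.symvers under " + ofa_dir)
print(" ".join(shlex.quote(arg) for arg in lustre_configure_argv(ofa_dir)))
```

The sketch only prints the command line. Run the printed command from the
top of the unpacked Lustre source tree, then build as usual.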

The issue you are seeing is exactly what you think it is. The WC builds 
use the RHEL in-kernel IB driver. I have even had issues with MDS/OSS 
boxes running RHEL in-kernel IB and clients running Mellanox or OFED IB 
drivers. Even though IB is a "standard" you really need to have 
everything, from core to edge, talking the same driver.
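
One way to confirm this kind of mismatch is to compare the symbol CRCs
that ko2iblnd.ko expects with the ones the installed IB stack exports.
The sketch below is an assumption-laden illustration, not a standard
tool. It takes the text printed by "modprobe --dump-modversions" and the
contents of a Module.symvers file, both of which start each line with a
CRC followed by a symbol name. The function names and the sample CRC
values are made up for the example.

```python
def parse_crc_table(text):
    """Map symbol name -> CRC for lines shaped like '<0xCRC> <symbol> ...'."""
    table = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0].startswith("0x"):
            table[fields[1]] = int(fields[0], 16)
    return table


def find_mismatches(modversions_text, symvers_text):
    """Return {symbol: (crc_module_expects, crc_stack_exports)} where they differ.

    Symbols missing from symvers_text (for example core kernel symbols that
    the IB stack does not export) are skipped rather than reported.
    """
    wanted = parse_crc_table(modversions_text)
    exported = parse_crc_table(symvers_text)
    return {
        sym: (crc, exported[sym])
        for sym, crc in wanted.items()
        if sym in exported and exported[sym] != crc
    }


# Illustrative data only: these CRC values are invented.
sample_modversions = "0x11111111\tib_create_cq\n0x22222222\tprintk\n"
sample_symvers = (
    "0x99999999\tib_create_cq\tdrivers/infiniband/core/ib_core\tEXPORT_SYMBOL\n"
)

for sym, (want, have) in find_mismatches(sample_modversions, sample_symvers).items():
    print(f"{sym}: module expects {want:#010x}, stack exports {have:#010x}")
```

Any symbol reported here is one the kernel would reject with a "disagrees
about version of symbol" message when the module is loaded.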

I recently did nearly the same config you have: RHEL 6.2 x86_64, MLX 
OFED, Lustre 2.1.3.

Alternatively, you could run your Mellanox IB HCA on the RHEL in-kernel 
IB drivers and not have to recompile anything.

--Jeff


On 11/20/12 1:20 PM, Ms. Megan Larko wrote:
> Hello to Everyone!
>
> I have a question to which I think I know the answer, but I am seeking
> confirmation (re-assurance?).
>
> I have built a RHEL 6.2 system with lustre-2.1.2.   I am using the
> rpms from the Whamcloud site for linux kernel
> 2.6.32_220.17.1.el6_lustre.x86_64 along with the version-matching
> lustre, lustre-modules, lustre-ldiskfs, and kernel-devel.   I also
> have from the Whamcloud site
> kernel-ib-1.8.5-2.6.32-220.17.1.el6_lustre.x86_64 and the related
> kernel-ib-devel for same.
>
> The lustre file system works properly for TCP.
>
> I would like to use InfiniBand.   The system has a new Mellanox card
> for which mlxn1 firmware and drivers were installed.   After this was
> done (I cannot speak to before) the IB network will come up on boot
> and copy and ping in a traditional network fashion.
>
> Hard Part:  I would like to run the lustre file system on the IB (ib0).
> I re-created the lustre network to use /etc/modprobe.d/lustre.conf
> pointing to o2ib in place of tcp0.   I rebuilt the mgs/mdt and all
> osts to use the IB network (the mgs/mds --failnode=[new_IB_addr] and
> the osts point to mgs on IB net).   When I "modprobe lustre" to start
> the system I receive error messages stating that there are
> Input/Output errors on lustre modules fld.ko, fid.ko, mdc.ko, osc.ko,
> and lov.ko.   The lustre.ko cannot be started.   A look in
> /var/log/messages reveals many "Unknown symbol" and "Disagrees about
> version of symbol"  from the ko2iblnd module.
>
> A "modprobe --dump-modversions /path/to/kernel/ko2iblnd.ko"  shows it
> pointing to the Module.symvers of the lustre kernel.
>
> Am I correct in thinking that because of the specific Mellanox IB
> hardware I have (with its own /usr/src/ofa_kernel/Module.symvers
> file), that I have to build Lustre-2.1.2 from tarball to use the
> "configure --with-o2ib=/usr/src/ofa_kernel...."  mandating that this
> system use the ofa_kernel-1.8.5  modules and not the OFED 1.8.5 from
> the kernel-ib rpms  to which Lustre defaults in the Linux kernel?
>
> Is a rebuild of lustre from source mandatory or is there a way in
> which I may point to the appropriate symbols needed by the
> ko2iblnd.ko?
>
> Enjoy the Thanksgiving holiday for those U.S. readers.    To everyone
> else in the world, have a great weekend!
>
> Megan Larko
> Hewlett-Packard


-- 
------------------------------
Jeff Johnson
Co-Founder
Aeon Computing

jeff.johnson at aeoncomputing.com
www.aeoncomputing.com
t: 858-412-3810 x101   f: 858-412-3845
m: 619-204-9061

/* New Address */
4170 Morena Boulevard, Suite D - San Diego, CA 92117
