[Lustre-discuss] lo2iblnd and Mellanox IB question
jeff.johnson at aeoncomputing.com
Wed Nov 21 09:36:23 PST 2012
You will have to rebuild Lustre from source. You will also need the
Mellanox IB driver source installed so that the Lustre build process
can pick up the headers and symbol versions it needs from that source.
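For illustration, here is a minimal sketch of how the configure step could be assembled, assuming the Mellanox source lives in /usr/src/ofa_kernel (the path mentioned in the original question). The helper name lustre_configure_args and the optional linux_src argument are hypothetical and only there for the example; --with-o2ib and --with-linux are the configure options in play.

```python
import os


def lustre_configure_args(o2ib_src="/usr/src/ofa_kernel", linux_src=None):
    """Build the argument list for Lustre's ./configure (hypothetical helper).

    o2ib_src  -- assumed location of the Mellanox/OFED kernel source tree
    linux_src -- optional kernel source tree, passed through as --with-linux
    """
    # The OFED tree must carry its own Module.symvers, otherwise the
    # Lustre build has nothing to match ko2iblnd's symbol versions against.
    symvers = os.path.join(o2ib_src, "Module.symvers")
    if not os.path.isfile(symvers):
        raise FileNotFoundError("no Module.symvers under " + o2ib_src)

    args = ["./configure", "--with-o2ib=" + o2ib_src]
    if linux_src:
        args.append("--with-linux=" + linux_src)
    return args
```

The returned list could be handed to subprocess.run(args, cwd=<your Lustre source directory>, check=True) before running the usual make step; treat it as a starting point, not a tested build recipe.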
The issue you are seeing is exactly what you think it is. The Whamcloud
(WC) builds use the RHEL in-kernel IB driver. I have even had issues
with MDS/OSS boxes running RHEL in-kernel IB and clients running
Mellanox or OFED IB drivers. Even though IB is a "standard" you really
need to have everything, from core to edge, running the same driver.
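If you want to see the mismatch for yourself, something along these lines should do it. This is only a sketch: it assumes you have saved the output of "modprobe --dump-modversions" for ko2iblnd.ko (one CRC and symbol per line) and that Module.symvers uses the usual layout with the CRC in the first column and the symbol name in the second. The function names are made up for the example.

```python
def parse_crc_table(text):
    """Map symbol name -> CRC for lines that start with '0xCRC symbol'."""
    table = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank or malformed lines.
        if len(fields) < 2 or not fields[0].startswith("0x"):
            continue
        table[fields[1]] = int(fields[0], 16)
    return table


def crc_disagreements(module_versions, symvers):
    """Return {symbol: (crc_module_wants, crc_symvers_provides)}.

    Only symbols present in both tables are compared. Symbols the
    symvers file does not list at all (for example core kernel
    exports) are skipped rather than reported.
    """
    return {
        sym: (wanted, symvers[sym])
        for sym, wanted in module_versions.items()
        if sym in symvers and symvers[sym] != wanted
    }


if __name__ == "__main__":
    # Illustrative data only; the CRC values here are invented.
    wanted = parse_crc_table("0x11111111\tib_register_client\n")
    provided = parse_crc_table(
        "0x33333333\tib_register_client\tdrivers/infiniband/core/ib_core\tEXPORT_SYMBOL\n"
    )
    for sym, (w, p) in crc_disagreements(wanted, provided).items():
        print("%s: module wants %#010x, symvers has %#010x" % (sym, w, p))
```

Any symbol it reports is one that would produce a "disagrees about version of symbol" line in /var/log/messages when ko2iblnd is loaded against that driver stack.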
I recently did nearly the same config you have: RHEL 6.2 x86_64,
Mellanox OFED, Lustre 2.1.3.
You could opt to run your Mellanox IB HCA using the RHEL in-kernel IB
drivers and not have to recompile anything.
On 11/20/12 1:20 PM, Ms. Megan Larko wrote:
> Hello to Everyone!
> I have a question to which I think I know the answer, but I am seeking
> confirmation (re-assurance?).
> I have built a RHEL 6.2 system with lustre-2.1.2. I am using the
> rpms from the Whamcloud site for linux kernel
> 2.6.32_220.17.1.el6_lustre.x86_64 along with the version-matching
> lustre, lustre-modules, lustre-ldiskfs, and kernel-devel. I also
> have from the Whamcloud site
> kernel-ib-1.8.5-2.6.32-220.17.1.el6_lustre.x86_64 and the related
> kernel-ib-devel for same.
> The lustre file system works properly for TCP.
> I would like to use InfiniBand. The system has a new Mellanox card
> for which mlxn1 firmware and drivers were installed. After this was
> done (I cannot speak to before) the IB network will come up on boot
> and copy and ping in a traditional network fashion.
> Hard Part: I would like to run the lustre file system on the IB (ib0).
> I re-created the lustre network to use /etc/modprobe.d/lustre.conf
> pointing to o2ib in place of tcp0. I rebuilt the mgs/mdt and all
> osts to use the IB network (the mgs/mds --failnode=[new_IB_addr] and
> the osts point to mgs on IB net). When I "modprobe lustre" to start
> the system I receive error messages stating that there are
> Input/Output errors on lustre modules fld.ko, fid.ko, mdc.ko, osc.ko,
> and lov.ko. The lustre.ko cannot be started. A look in
> /var/log/messages reveals many "Unknown symbol" and "Disagrees about
> version of symbol" from the ko2iblnd module.
> A "modprobe --dump-modversions /path/to/kernel/ko2iblnd.ko" shows it
> pointing to the Module.symvers of the lustre kernel.
> Am I correct in thinking that because of the specific Mellanox IB
> hardware I have (with its own /usr/src/ofa_kernel/Module.symvers
> file), that I have to build Lustre-2.1.2 from tarball to use the
> "configure --with-o2ib=/usr/src/ofa_kernel...." mandating that this
> system use the ofa_kernel-1.8.5 modules and not the OFED 1.8.5 from
> the kernel-ib rpms to which Lustre defaults in the Linux kernel?
> Is a rebuild of lustre from source mandatory or is there a way in
> which I may point to the appropriate symbols needed by the
> Enjoy the Thanksgiving holiday for those U.S. readers. To everyone
> else in the world, have a great weekend!
> Megan Larko