[Lustre-discuss] Installing Lustre 1.8.2 on CentOS 5.4 with OFED 1.4.2

Marco Aurelio L Gomes mgomes at tpn.usp.br
Mon Mar 22 08:44:21 PDT 2010


Hi Michael,

Thanks for your reply. I followed your instructions and compiled
OFED 1.4.2 successfully, but when I tried to modprobe lustre, I
got errors complaining about symbol versions:

[root at masternode1 modules]# modprobe lustre
WARNING: Error inserting osc (/lib/modules/2.6.18-164.11.1.el5/kernel/fs/lustre/osc.ko): Input/output error
WARNING: Error inserting mdc (/lib/modules/2.6.18-164.11.1.el5/kernel/fs/lustre/mdc.ko): Input/output error
WARNING: Error inserting lov (/lib/modules/2.6.18-164.11.1.el5/kernel/fs/lustre/lov.ko): Input/output error
FATAL: Error inserting lustre (/lib/modules/2.6.18-164.11.1.el5/kernel/fs/lustre/lustre.ko): Input/output error

The contents of dmesg are in the attached file.

Is there another step that I need to follow to get Lustre 1.8.2 working
on kernel 2.6.18-164.11.1.el5 with OFED 1.4.2? And is it possible to get
Lustre 1.8.2 working with OFED 1.5?

Regards,

Marco Gomes
Systems/HPC-Cluster
Numerical Offshore Tank
+55 11 3777-4142 r.250
+55 11 3091-5350 r.250

On Sun, 2010-03-21 at 12:01 -0600, 
Message: 1
Date: Sun, 21 Mar 2010 19:00:46 +0100
From: Michael Mayer <mmayer at hpce.nec.com>
Subject: Re: [Lustre-discuss] Installing Lustre 1.8.2 on CentOS 5.4
	with OFED 1.4.2
To: lustre-discuss at lists.lustre.org
Message-ID: <4BA65ECE.1010309 at hpce.nec.com>
Content-Type: text/plain; charset="utf-8"

Hi Marco,

OFED 1.4.2 and the RHEL5.4 kernel do not work together because OFED
1.4.2 lacks official RHEL5.4 support (I think OFED 1.4.2 was even
released before RHEL5.4, so it only has backports up to and including
RHEL5.3). The RHEL5.4 kernels do not differ much from the RHEL5.3 ones
but have moved a bit closer to upstream, so a couple of the RHEL5.3
backports in OFED 1.4.2 are no longer needed.

We had the same problem here, and upon inspection of the error messages
it turned out that the errors in the OFED 1.4.2 compilation can be fixed
in a rather simple way: I have created an OFED patch and a modified
spec file (attached).

So, please install the default ofa_kernel source rpm in your system, 
replace the spec file and copy the patch file to the SOURCES directory, 
then run a "rpmbuild -bs ofa_kernel.spec" and finally replace the 
default source rpm with the new ofa_kernel source rpm.

After that, OFED compilation should work without errors.
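
The steps above can be sketched as a short shell script. This is only a sketch under assumptions: the SRPM file name, the patch file name, and the location of the unpacked OFED tree are hypothetical placeholders, and /usr/src/redhat is assumed to be the rpmbuild top directory. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of the SRPM rebuild described above. All file names and paths
# below are placeholders (assumptions), not taken from the original post.
TOPDIR=${TOPDIR:-/usr/src/redhat}                 # assumed rpmbuild top directory
OFED_SRPMS=${OFED_SRPMS:-/root/OFED-1.4.2/SRPMS}  # assumed location of the OFED SRPMS
SRPM=${SRPM:-ofa_kernel-1.4.2-1.src.rpm}          # hypothetical SRPM file name
PATCH=${PATCH:-ofa_kernel-rhel5.4.patch}          # hypothetical name of the attached patch
NEW_SPEC=${NEW_SPEC:-./ofa_kernel.spec}           # the modified spec file
RUN=${RUN-echo}                                   # dry run by default; export RUN= to execute

rebuild_ofa_srpm() {
    # 1. Install the stock source rpm (unpacks into SOURCES/ and SPECS/).
    $RUN rpm -ivh "$OFED_SRPMS/$SRPM"
    # 2. Replace the spec file and copy the patch into SOURCES.
    $RUN cp "$NEW_SPEC" "$TOPDIR/SPECS/ofa_kernel.spec"
    $RUN cp "$PATCH" "$TOPDIR/SOURCES/"
    # 3. Build a new source rpm from the modified spec.
    $RUN rpmbuild -bs "$TOPDIR/SPECS/ofa_kernel.spec"
    # 4. Replace the stock source rpm with the rebuilt one. This assumes the
    #    rebuilt SRPM keeps the same file name; adjust if the spec changes it.
    $RUN cp "$TOPDIR/SRPMS/$SRPM" "$OFED_SRPMS/$SRPM"
}

rebuild_ofa_srpm
```

Reviewing the printed commands first and then running the script again with RUN set to an empty value keeps the rebuild from overwriting the stock source rpm by accident.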

Cheers,

Michael.

On 20/03/10 00:31, Marco Aurelio L Gomes wrote:
> Hi all,
>
> I'm trying to set up my Lustre environment using CentOS 5.4
> (kernel-2.6.18-164.11.1.el5-x86_64 from the updates repository), Lustre
> 1.8.2 and OFED 1.4.2. I saw in the Lustre support matrix that 1.8.2
> supports OFED up to 1.4.2, but when I tried to compile with this
> release, I saw a lot of errors complaining about variable redefinitions
> in the ofa_kernel build (see the attached ofa_kernel_rpmbuild.log,
> though I think this is not related to this list). So I tried to
> compile with OFED 1.5, the latest stable release of OFED, which compiles
> fine; but when I boot a client and run:
>
> modprobe lustre
>
> I got the following errors:
>
> [root at masternode1 ~]# modprobe lustre
> WARNING: Error inserting osc (/lib/modules/2.6.18-164.11.1.el5/kernel/fs/lustre/osc.ko): Input/output error
> WARNING: Error inserting mdc (/lib/modules/2.6.18-164.11.1.el5/kernel/fs/lustre/mdc.ko): Input/output error
> WARNING: Error inserting lov (/lib/modules/2.6.18-164.11.1.el5/kernel/fs/lustre/lov.ko): Input/output error
> FATAL: Error inserting lustre (/lib/modules/2.6.18-164.11.1.el5/kernel/fs/lustre/lustre.ko): Input/output error
>
> and at dmesg:
>
> Lustre: OBD class driver, http://www.lustre.org/
> Lustre:     Lustre Version: 1.8.2
> Lustre:     Build Version: 1.8.2-20100122201357-PRISTINE-2.6.18-164.11.1.el5
> ko2iblnd: disagrees about version of symbol ib_fmr_pool_unmap
> ko2iblnd: Unknown symbol ib_fmr_pool_unmap
> ko2iblnd: disagrees about version of symbol ib_create_cq
> ko2iblnd: Unknown symbol ib_create_cq
> ko2iblnd: disagrees about version of symbol rdma_resolve_addr
> ko2iblnd: Unknown symbol rdma_resolve_addr
> ko2iblnd: disagrees about version of symbol ib_reg_phys_mr
> ko2iblnd: Unknown symbol ib_reg_phys_mr
> ko2iblnd: disagrees about version of symbol ib_create_fmr_pool
> ko2iblnd: Unknown symbol ib_create_fmr_pool
> ko2iblnd: disagrees about version of symbol ib_dereg_mr
> ko2iblnd: Unknown symbol ib_dereg_mr
> ko2iblnd: disagrees about version of symbol rdma_reject
> ko2iblnd: Unknown symbol rdma_reject
> ko2iblnd: disagrees about version of symbol rdma_disconnect
> ko2iblnd: Unknown symbol rdma_disconnect
> ko2iblnd: disagrees about version of symbol rdma_resolve_route
> ko2iblnd: Unknown symbol rdma_resolve_route
> ko2iblnd: disagrees about version of symbol rdma_bind_addr
> ko2iblnd: Unknown symbol rdma_bind_addr
> ko2iblnd: disagrees about version of symbol rdma_create_qp
> ko2iblnd: Unknown symbol rdma_create_qp
> ko2iblnd: disagrees about version of symbol ib_destroy_cq
> ko2iblnd: Unknown symbol ib_destroy_cq
> ko2iblnd: disagrees about version of symbol rdma_create_id
> ko2iblnd: Unknown symbol rdma_create_id
> ko2iblnd: disagrees about version of symbol rdma_listen
> ko2iblnd: Unknown symbol rdma_listen
> ko2iblnd: disagrees about version of symbol rdma_destroy_qp
> ko2iblnd: Unknown symbol rdma_destroy_qp
> ko2iblnd: disagrees about version of symbol ib_query_device
> ko2iblnd: Unknown symbol ib_query_device
> ko2iblnd: disagrees about version of symbol ib_get_dma_mr
> ko2iblnd: Unknown symbol ib_get_dma_mr
> ko2iblnd: disagrees about version of symbol ib_alloc_pd
> ko2iblnd: Unknown symbol ib_alloc_pd
> ko2iblnd: disagrees about version of symbol rdma_connect
> ko2iblnd: Unknown symbol rdma_connect
> ko2iblnd: disagrees about version of symbol ib_modify_qp
> ko2iblnd: Unknown symbol ib_modify_qp
> ko2iblnd: disagrees about version of symbol rdma_destroy_id
> ko2iblnd: Unknown symbol rdma_destroy_id
> ko2iblnd: disagrees about version of symbol rdma_accept
> ko2iblnd: Unknown symbol rdma_accept
> ko2iblnd: disagrees about version of symbol ib_dealloc_pd
> ko2iblnd: Unknown symbol ib_dealloc_pd
> ko2iblnd: disagrees about version of symbol ib_fmr_pool_map_phys
> ko2iblnd: Unknown symbol ib_fmr_pool_map_phys
> LustreError: 4572:0:(api-ni.c:1043:lnet_startup_lndnis()) Can't load LND o2ib, module ko2iblnd, rc=256
> LustreError: 4572:0:(events.c:729:ptlrpc_init_portals()) network initialisation failed
>
> At this point, I thought this was related to the OFED compilation. I
> only compiled OFED because I didn't find a kernel-ib package on the
> Lustre download site. I would like to know if someone has had the same
> problem on their setup and if there is a workaround to get it working.
>
> Below I give more information about the Lustre environment.
>
> client:
> CentOS 5.4 (2.6.18-164.11.1.el5)
> Lustre 1.8.2
> lustre-client-modules-1.8.2-2.6.18_164.11.1.el5_lustre.1.8.2
> lustre-client-1.8.2-2.6.18_164.11.1.el5_lustre.1.8.2
> OFED 1.5
>
> Thanks in advance for the help, and sorry for my bad English.
>
> Regards,
>
> Marco Gomes
> Systems/HPC-Cluster
> Numerical Offshore Tank
> (11) 3777-4142 #250
> (11) 3091-5350 #250
-------------- next part --------------
Lustre: OBD class driver, http://www.lustre.org/
Lustre:     Lustre Version: 1.8.2
Lustre:     Build Version: 1.8.2-20100122201357-PRISTINE-2.6.18-164.11.1.el5
ko2iblnd: disagrees about version of symbol ib_fmr_pool_unmap
ko2iblnd: Unknown symbol ib_fmr_pool_unmap
ko2iblnd: disagrees about version of symbol ib_create_cq
ko2iblnd: Unknown symbol ib_create_cq
ko2iblnd: disagrees about version of symbol rdma_resolve_addr
ko2iblnd: Unknown symbol rdma_resolve_addr
ko2iblnd: disagrees about version of symbol ib_reg_phys_mr
ko2iblnd: Unknown symbol ib_reg_phys_mr
ko2iblnd: disagrees about version of symbol ib_create_fmr_pool
ko2iblnd: Unknown symbol ib_create_fmr_pool
ko2iblnd: disagrees about version of symbol ib_dereg_mr
ko2iblnd: Unknown symbol ib_dereg_mr
ko2iblnd: disagrees about version of symbol rdma_reject
ko2iblnd: Unknown symbol rdma_reject
ko2iblnd: disagrees about version of symbol rdma_disconnect
ko2iblnd: Unknown symbol rdma_disconnect
ko2iblnd: disagrees about version of symbol rdma_resolve_route
ko2iblnd: Unknown symbol rdma_resolve_route
ko2iblnd: disagrees about version of symbol rdma_bind_addr
ko2iblnd: Unknown symbol rdma_bind_addr
ko2iblnd: disagrees about version of symbol rdma_create_qp
ko2iblnd: Unknown symbol rdma_create_qp
ko2iblnd: disagrees about version of symbol ib_destroy_cq
ko2iblnd: Unknown symbol ib_destroy_cq
ko2iblnd: disagrees about version of symbol rdma_create_id
ko2iblnd: Unknown symbol rdma_create_id
ko2iblnd: disagrees about version of symbol rdma_listen
ko2iblnd: Unknown symbol rdma_listen
ko2iblnd: disagrees about version of symbol rdma_destroy_qp
ko2iblnd: Unknown symbol rdma_destroy_qp
ko2iblnd: disagrees about version of symbol ib_query_device
ko2iblnd: Unknown symbol ib_query_device
ko2iblnd: disagrees about version of symbol ib_get_dma_mr
ko2iblnd: Unknown symbol ib_get_dma_mr
ko2iblnd: disagrees about version of symbol ib_alloc_pd
ko2iblnd: Unknown symbol ib_alloc_pd
ko2iblnd: disagrees about version of symbol rdma_connect
ko2iblnd: Unknown symbol rdma_connect
ko2iblnd: disagrees about version of symbol ib_modify_qp
ko2iblnd: Unknown symbol ib_modify_qp
ko2iblnd: disagrees about version of symbol rdma_destroy_id
ko2iblnd: Unknown symbol rdma_destroy_id
ko2iblnd: disagrees about version of symbol rdma_accept
ko2iblnd: Unknown symbol rdma_accept
ko2iblnd: disagrees about version of symbol ib_dealloc_pd
ko2iblnd: Unknown symbol ib_dealloc_pd
ko2iblnd: disagrees about version of symbol ib_fmr_pool_map_phys
ko2iblnd: Unknown symbol ib_fmr_pool_map_phys
LustreError: 4508:0:(api-ni.c:1043:lnet_startup_lndnis()) Can't load LND o2ib, module ko2iblnd, rc=256
LustreError: 4508:0:(events.c:729:ptlrpc_init_portals()) network initialisation failed
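
To cross-check the mismatches listed above, the affected symbol names can be pulled out of the dmesg text. The following is a minimal sketch, assuming the log has been saved to a file; the function name and the file name dmesg.txt are hypothetical.

```shell
#!/bin/sh
# List each symbol for which ko2iblnd reports a version disagreement.
# Reads dmesg-style text on stdin and prints one symbol per line, sorted
# and de-duplicated. Works on quoted ("> ") lines as well.
mismatched_symbols() {
    sed -n 's/^.*ko2iblnd: disagrees about version of symbol \([A-Za-z0-9_]*\).*$/\1/p' | sort -u
}

# Example (dmesg.txt is a placeholder file name):
#   mismatched_symbols < dmesg.txt
```

Each name can then be compared between what the module expects (for example with modprobe --dump-modversions on ko2iblnd.ko, where the installed modprobe supports that option) and what the installed OFED build exports in its Module.symvers file. Differing checksums for the same symbol would suggest that ko2iblnd was built against a different InfiniBand stack than the one that is loaded.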


