[Lustre-discuss] Installing Lustre 1.8.2 on CentOS 5.4 with OFED 1.4.2

Michael Mayer mmayer at hpce.nec.com
Sun Mar 21 11:00:46 PDT 2010


Hi Marco,

OFED 1.4.2 does not build against the RHEL5.4 kernel because it lacks 
official RHEL5.4 support. I think OFED 1.4.2 was even released before 
RHEL5.4, so it only has backports up to and including RHEL5.3. The 
RHEL5.4 kernels do not differ much from the RHEL5.3 ones but have moved 
a bit closer to upstream, so a couple of the RHEL5.3 backports in OFED 
1.4.2 are no longer needed, which would explain the redefinition errors 
you are seeing.

We had the same problem here. Inspecting the error messages showed that 
the errors in the OFED 1.4.2 compilation can be fixed in a rather simple 
way: I have created an OFED patch and a modified spec file (both 
attached).

So please install the default ofa_kernel source rpm on your system, 
replace its spec file with the attached one, copy the patch file to the 
SOURCES directory, then run "rpmbuild -bs ofa_kernel.spec", and finally 
replace the default source rpm with the newly built ofa_kernel source 
rpm.
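
As a rough sketch, the steps above might look like the following on a 
CentOS 5 build host. The script only prints the commands so they can be 
reviewed before anything is run. The /usr/src/redhat build tree, the 
location of the unpacked OFED SRPMS directory and the name of the stock 
source rpm are assumptions, not taken from this mail; ofa_kernel.spec and 
rhel-5.4.patch are the two attachments (the archive may store them under 
generic attachment names, so they may need renaming after download).

```shell
#!/bin/sh
# Sketch: print (not run) the ofa_kernel source rpm rebuild steps.
# Every default below is an assumption; override it to match your system.
TOPDIR="${TOPDIR:-/usr/src/redhat}"             # assumed rpmbuild tree on CentOS 5
OFED_SRPMS="${OFED_SRPMS:-./OFED-1.4.2/SRPMS}"  # hypothetical unpacked OFED location
SRPM="${SRPM:-ofa_kernel-1.4.2.src.rpm}"        # hypothetical stock source rpm name

plan_ofa_rebuild() {
    # 1. Install the stock source rpm (unpacks into SPECS/ and SOURCES/).
    echo "rpm -ivh $OFED_SRPMS/$SRPM"
    # 2. Replace the spec file with the modified one.
    echo "cp ofa_kernel.spec $TOPDIR/SPECS/ofa_kernel.spec"
    # 3. Copy the patch next to the other sources.
    echo "cp rhel-5.4.patch $TOPDIR/SOURCES/"
    # 4. Build a new source rpm only (-bs).
    echo "rpmbuild -bs $TOPDIR/SPECS/ofa_kernel.spec"
    # 5. Put the rebuilt source rpm where the OFED installer looks for it.
    #    Its exact file name depends on the spec file, hence the glob.
    echo "cp $TOPDIR/SRPMS/ofa_kernel-*.src.rpm $OFED_SRPMS/"
}

plan_ofa_rebuild
```

If the rebuilt source rpm ends up with a different file name than the 
stock one, the stock one would have to be removed from the SRPMS 
directory by hand so that only the new one is left.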

After that, OFED compilation should work without errors.
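
Once OFED and the Lustre modules are installed, one way to check whether 
a symbol version mismatch like the one in the dmesg output quoted below 
is still present is to pull the affected symbol names out of the kernel 
log. This is only a small sketch; the function name is made up for 
illustration, and it simply filters the "disagrees about version of 
symbol" lines from whatever log text it is given on stdin.

```shell
#!/bin/sh
# Sketch: list the symbols a module "disagrees about" in kernel log text.
# list_mismatched_symbols is a hypothetical helper name, not a Lustre or
# OFED tool. It reads log text on stdin and prints one symbol per line.
list_mismatched_symbols() {
    sed -n 's/.*disagrees about version of symbol \([A-Za-z0-9_]*\).*/\1/p' |
        sort -u
}

# Typical use after a failed "modprobe lustre":
#   dmesg | list_mismatched_symbols
```

An empty result means the log contains no such lines. A non-empty list 
suggests that ko2iblnd was built against a different InfiniBand stack 
than the one currently loaded.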

Cheers,

Michael.

On 20/03/10 00:31, Marco Aurelio L Gomes wrote:
> Hi all,
>
> I'm trying to set up my Lustre environment using CentOS 5.4
> (kernel-2.6.18-164.11.1.el5-x86_64 from the updates repository), Lustre
> 1.8.2 and OFED 1.4.2. I saw in the Lustre support matrix that 1.8.2
> works at most with OFED 1.4.2, but when I tried to compile with this
> release, I saw a lot of errors complaining about variable redefinitions
> in the ofa_kernel build (see the attached ofa_kernel_rpmbuild.log,
> though I think this is not related to this list). So I tried to compile
> with OFED 1.5, the latest stable release of OFED, which compiles fine;
> but when I boot a client and run:
>
> modprobe lustre
>
> I got the following errors:
>
> [root at masternode1 ~]# modprobe lustre
> WARNING: Error inserting osc
> (/lib/modules/2.6.18-164.11.1.el5/kernel/fs/lustre/osc.ko): Input/output
> error
> WARNING: Error inserting mdc
> (/lib/modules/2.6.18-164.11.1.el5/kernel/fs/lustre/mdc.ko): Input/output
> error
> WARNING: Error inserting lov
> (/lib/modules/2.6.18-164.11.1.el5/kernel/fs/lustre/lov.ko): Input/output
> error
> FATAL: Error inserting lustre
> (/lib/modules/2.6.18-164.11.1.el5/kernel/fs/lustre/lustre.ko):
> Input/output error
>
> and in dmesg:
>
> Lustre: OBD class driver, http://www.lustre.org/
> Lustre:     Lustre Version: 1.8.2
> Lustre:     Build Version:
> 1.8.2-20100122201357-PRISTINE-2.6.18-164.11.1.el5
> ko2iblnd: disagrees about version of symbol ib_fmr_pool_unmap
> ko2iblnd: Unknown symbol ib_fmr_pool_unmap
> ko2iblnd: disagrees about version of symbol ib_create_cq
> ko2iblnd: Unknown symbol ib_create_cq
> ko2iblnd: disagrees about version of symbol rdma_resolve_addr
> ko2iblnd: Unknown symbol rdma_resolve_addr
> ko2iblnd: disagrees about version of symbol ib_reg_phys_mr
> ko2iblnd: Unknown symbol ib_reg_phys_mr
> ko2iblnd: disagrees about version of symbol ib_create_fmr_pool
> ko2iblnd: Unknown symbol ib_create_fmr_pool
> ko2iblnd: disagrees about version of symbol ib_dereg_mr
> ko2iblnd: Unknown symbol ib_dereg_mr
> ko2iblnd: disagrees about version of symbol rdma_reject
> ko2iblnd: Unknown symbol rdma_reject
> ko2iblnd: disagrees about version of symbol rdma_disconnect
> ko2iblnd: Unknown symbol rdma_disconnect
> ko2iblnd: disagrees about version of symbol rdma_resolve_route
> ko2iblnd: Unknown symbol rdma_resolve_route
> ko2iblnd: disagrees about version of symbol rdma_bind_addr
> ko2iblnd: Unknown symbol rdma_bind_addr
> ko2iblnd: disagrees about version of symbol rdma_create_qp
> ko2iblnd: Unknown symbol rdma_create_qp
> ko2iblnd: disagrees about version of symbol ib_destroy_cq
> ko2iblnd: Unknown symbol ib_destroy_cq
> ko2iblnd: disagrees about version of symbol rdma_create_id
> ko2iblnd: Unknown symbol rdma_create_id
> ko2iblnd: disagrees about version of symbol rdma_listen
> ko2iblnd: Unknown symbol rdma_listen
> ko2iblnd: disagrees about version of symbol rdma_destroy_qp
> ko2iblnd: Unknown symbol rdma_destroy_qp
> ko2iblnd: disagrees about version of symbol ib_query_device
> ko2iblnd: Unknown symbol ib_query_device
> ko2iblnd: disagrees about version of symbol ib_get_dma_mr
> ko2iblnd: Unknown symbol ib_get_dma_mr
> ko2iblnd: disagrees about version of symbol ib_alloc_pd
> ko2iblnd: Unknown symbol ib_alloc_pd
> ko2iblnd: disagrees about version of symbol rdma_connect
> ko2iblnd: Unknown symbol rdma_connect
> ko2iblnd: disagrees about version of symbol ib_modify_qp
> ko2iblnd: Unknown symbol ib_modify_qp
> ko2iblnd: disagrees about version of symbol rdma_destroy_id
> ko2iblnd: Unknown symbol rdma_destroy_id
> ko2iblnd: disagrees about version of symbol rdma_accept
> ko2iblnd: Unknown symbol rdma_accept
> ko2iblnd: disagrees about version of symbol ib_dealloc_pd
> ko2iblnd: Unknown symbol ib_dealloc_pd
> ko2iblnd: disagrees about version of symbol ib_fmr_pool_map_phys
> ko2iblnd: Unknown symbol ib_fmr_pool_map_phys
> LustreError: 4572:0:(api-ni.c:1043:lnet_startup_lndnis()) Can't load LND
> o2ib, module ko2iblnd, rc=256
> LustreError: 4572:0:(events.c:729:ptlrpc_init_portals()) network
> initialisation failed
>
> At this point, I think this is related to the OFED compilation. I only
> compiled OFED because I didn't find a kernel-ib package on the Lustre
> download site. I would like to know if someone has had the same problem
> on their setup and if there is a workaround to get it working.
>
> Below is more information about the Lustre environment.
>
> client:
> CentOS 5.4 (2.6.18-164.11.1.el5)
> Lustre 1.8.2
> lustre-client-modules-1.8.2-2.6.18_164.11.1.el5_lustre.1.8.2
> lustre-client-1.8.2-2.6.18_164.11.1.el5_lustre.1.8.2
> OFED 1.5
>
> Thanks in advance for the help and sorry for my bad English.
>
> Regards,
>
> Marco Gomes
> Systems/HPC-Cluster
> Numerical Offshore Tank
> (11) 3777-4142 #250
> (11) 3091-5350 #250

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: ofa_kernel.spec
URL: <http://lists.lustre.org/pipermail/lustre-discuss-lustre.org/attachments/20100321/6c1f24a4/attachment.txt>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: rhel-5.4.patch
URL: <http://lists.lustre.org/pipermail/lustre-discuss-lustre.org/attachments/20100321/6c1f24a4/attachment.asc>

