[lustre-discuss] Issues compiling Lustre client with Intel IB drivers
Rafa Griman
rafagriman at gmail.com
Mon Jul 6 00:45:07 PDT 2015
Hi all :)
I'm having issues compiling the Lustre client with the IB drivers from
Intel. Some details:
CentOS 6.5 x86_64 with kernel version (uname -r):
2.6.32-431.29.2.el6.x86_64.
Lustre versions 2.4.3 and 2.7.0
QLogic IB HCAs, output from lspci:
15:00.0 InfiniBand: QLogic Corp. IBA7322 QDR InfiniBand HCA (rev 02)
1f:00.0 InfiniBand: QLogic Corp. IBA7322 QDR InfiniBand HCA (rev 02)
Steps I followed:
1.- Install the Intel IB drivers IntelIB-IFS.RHEL6-x86_64.7.3.0.0.26.
These are the drivers we have already tested here and know to work.
I use the INSTALL script that comes with these drivers and install
almost everything (I leave out MVAPICH and OpenMPI). This is what
gets installed:
OFED IB Stack 3.5.2.34
True Scale HCA Libs 3.3.0.0.9703
OFED mlx4 Driver 3.5.2.34
IB Tools 7.3.0.0.26
OFED IB Development 3.5.2.34
FastFabric 7.3.0.0.26
OFED IP over IB 3.5.2.34
IFS FM 7.3.0.0.15
SHMEM 3.3-9703.1177_rhel6_qlc
OFED uDAPL 3.5.2.34
OFED SRP 3.5.2.34
And these are the autostart options:
OFED IB Stack (openibd) [Enable ]
OFED mlx4 Driver (openibd) [Enable ]
IB Port Monitor (iba_mon) [Enable ]
S20 Port Tuner (s20tune) [Disable]
Distributed SA (dist_sa) [Disable]
OFED IP over IB (openibd) [Enable ]
OFED SDP () [Enable ]
IFS FM (ifs_fm) [Enable ]
OFED RDS (openibd) [Enable ]
OFED SRP (openibd) [Enable ]
The installer says I don't need firmware updates: "Firmware is not
required for the Intel HCA(s) in this system." So I reboot (as
recommended) and get IB up and running (right now I'm only using one IB
port):
# ip a s
ib0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65520 qdisc pfifo_fast state UP qlen 256
    link/infiniband 80:00:00:03:fe:80:00:00:00:00:00:00:00:11:75:00:00:76:e0:10 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
    inet 10.34.200.60/16 brd 10.34.255.255 scope global ib0
    inet6 fe80::211:7500:76:e010/64 scope link
       valid_lft forever preferred_lft forever
# ping 10.34.200.59
PING 10.34.200.59 (10.34.200.59) 56(84) bytes of data.
64 bytes from 10.34.200.59: icmp_seq=1 ttl=64 time=3.09 ms
64 bytes from 10.34.200.59: icmp_seq=2 ttl=64 time=0.162 ms
64 bytes from 10.34.200.59: icmp_seq=3 ttl=64 time=0.216 ms
# ibstatus
Infiniband device 'qib0' port 1 status:
default gid: fe80:0000:0000:0000:0011:7500:0076:e010
base lid: 0x14
sm lid: 0x1
state: 4: ACTIVE
phys state: 5: LinkUp
rate: 40 Gb/sec (4X QDR)
link_layer: InfiniBand
# ibswitches
Switch : 0x00066a00e3005724 ports 36 "QLogic 12200 GUID=0x00066a00e3005724" enhanced port 0 lid 1 lmc 0
I've also ssh'ed into other nodes via IB ... so I guess it's working ;)
This driver installation creates the following directories in /usr/src:
compat-rdma
compat-rdma-3.5
openib -> compat-rdma
2.- Lustre client compilation. I run:
./configure --with-linux=/lib/modules/`uname -r`/source \
    --disable-server --with-o2ib=/usr/src/compat-rdma
and get this error:
configure: error: can't compile with OpenIB gen2 headers under
/usr/src/compat-rdma
Any hints?
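A possible next diagnostic step, as a sketch rather than something taken from the run above: configure scripts generated by autoconf normally record each failed test compile in config.log, so the lines just before the error message usually contain the real compiler complaint. The helper name show_configure_failure is hypothetical, and the sketch assumes GNU grep (for -B), which is what CentOS ships.

```shell
#!/bin/sh
# Hypothetical helper: show_configure_failure is an illustrative name,
# not part of Lustre or autoconf.
# Prints each line of a log that contains a fixed string, preceded by
# some lines of leading context, all with line numbers.
show_configure_failure() {
    log="$1"            # path to the log file, e.g. config.log
    pattern="$2"        # fixed string to search for
    context="${3:-40}"  # number of context lines to print before each match
    grep -n -F -B "$context" -- "$pattern" "$log"
}
```

Run from the Lustre source directory, it could be called as: show_configure_failure config.log "OpenIB gen2 headers". The 40-line default is an arbitrary choice; raise it if the compiler output is longer.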
I've looked around on the web and read some posts about upgrading to
later versions. So I tried using 2.7.0, ./configure seems to work, but
when I "make rpms", I get this:
CC: gcc
LD: /usr/bin/ld -m elf_x86_64
CPPFLAGS: -include /root/rpmbuild/BUILD/lustre-2.7.0/config.h
-I/root/rpmbuild/BUILD/lustre-2.7.0/libcfs/include
-I/root/rpmbuild/BUILD/lustre-2.7.0/lnet/include
-I/root/rpmbuild/BUILD/lustre-2.7.0/lustre/include
LLCPPFLAGS: -D_LARGEFILE64_SOURCE=1
CFLAGS: -g -O2 -Werror -Werror
EXTRA_KCFLAGS: -include /root/rpmbuild/BUILD/lustre-2.7.0/config.h -g
-I/root/rpmbuild/BUILD/lustre-2.7.0/libcfs/include
-I/root/rpmbuild/BUILD/lustre-2.7.0/lnet/include
-I/root/rpmbuild/BUILD/lustre-2.7.0/lustre/include
LLCFLAGS: -g -Wall -fPIC -D_GNU_SOURCE
Type 'make' to build Lustre.
+ make -j8 -s
Making all in .
In file included from
/root/rpmbuild/BUILD/lustre-2.7.0/libcfs/include/libcfs/libcfs.h:73,
from
/root/rpmbuild/BUILD/lustre-2.7.0/lnet/klnds/o2iblnd/o2iblnd.h:80,
from
/root/rpmbuild/BUILD/lustre-2.7.0/lnet/klnds/o2iblnd/o2iblnd.c:42:
/root/rpmbuild/BUILD/lustre-2.7.0/libcfs/include/libcfs/curproc.h:98:
error: conflicting types for 'uid_eq'
/usr/src/compat-rdma/include/linux/compat-3.5.h:258: note: previous
definition of 'uid_eq' was here
make[8]: *** [/root/rpmbuild/BUILD/lustre-2.7.0/lnet/klnds/o2iblnd/o2iblnd.o]
Error 1
make[7]: *** [/root/rpmbuild/BUILD/lustre-2.7.0/lnet/klnds/o2iblnd] Error 2
make[6]: *** [/root/rpmbuild/BUILD/lustre-2.7.0/lnet/klnds] Error 2
make[5]: *** [/root/rpmbuild/BUILD/lustre-2.7.0/lnet] Error 2
make[5]: *** Waiting for unfinished jobs....
make[4]: *** [_module_/root/rpmbuild/BUILD/lustre-2.7.0] Error 2
make[3]: *** [modules] Error 2
make[2]: *** [all-recursive] Error 1
make[1]: *** [all] Error 2
error: Bad exit status from /var/tmp/rpm-tmp.wZrCy1 (%build)
It seems to me there's some kind of conflict between the Intel IB drivers
I'm using and the Lustre code. Has anyone run into this before? Any fixes,
recommendations, tips, ...? What driver version do you recommend?
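To compare the two declarations named in the compiler error, something along these lines could be used. It is only a sketch: show_symbol is a hypothetical helper name, and it assumes GNU grep (for -H and -w).

```shell
#!/bin/sh
# Hypothetical helper: show_symbol is an illustrative name, not a Lustre
# or OFED tool.
# Prints every line that mentions a symbol as a whole word, prefixed with
# the file name and line number.
show_symbol() {
    symbol="$1"   # identifier to look for, e.g. uid_eq
    shift         # remaining arguments are the files to search
    grep -n -H -w -- "$symbol" "$@"
}
```

With the paths from the build output above, the call would be: show_symbol uid_eq /root/rpmbuild/BUILD/lustre-2.7.0/libcfs/include/libcfs/curproc.h /usr/src/compat-rdma/include/linux/compat-3.5.h. That puts both definitions side by side, together with any preprocessor lines on which uid_eq itself appears.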
If you need any more details or info, please let me know.
Thanks !!!
Rafa