[lustre-discuss] MOFED & Lustre 2.14.51 - install fails with dependency failure related to ksym/MOFED

Laura Hild lsh at jlab.org
Fri May 21 12:26:56 PDT 2021


Hi Pinkesh-

Not sure how relevant this is for server builds, but when I've built Lustre clients against MOFED, I've had to use mlnx_add_kernel_support.sh --kmp rather than use the "kver" packages, in order to avoid ksym dependency errors when installing.
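
For readers who have not used that script, the invocation might look roughly like the sketch below. It only prints a command line and runs nothing. The helper name kmp_rebuild_cmd and its three arguments (unpacked MOFED bundle directory, kernel version, kernel source path) are illustrative assumptions, and the flags should be checked against the script's own help output for your MOFED release.

```shell
# Print (do not run) a KMP-style rebuild command.
# $1: directory of the unpacked MOFED bundle (hypothetical path, caller-supplied)
# $2: target kernel version
# $3: path to the matching kernel sources/headers
kmp_rebuild_cmd() {
    mofed_dir=$1
    kver=$2
    ksrc=$3
    printf '%s/mlnx_add_kernel_support.sh --mlnx_ofed %s --kmp --kernel %s --kernel-sources %s --make-tgz\n' \
        "$mofed_dir" "$mofed_dir" "$kver" "$ksrc"
}
```

Printing the command first makes it easy to review the kernel version and source path before committing to a long rebuild.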

-Laura


________________________________
From: lustre-discuss <lustre-discuss-bounces at lists.lustre.org> on behalf of Pinkesh Valdria via lustre-discuss <lustre-discuss at lists.lustre.org>
Sent: Friday, May 21, 2021 06:05
To: lustre-discuss at lists.lustre.org <lustre-discuss at lists.lustre.org>
Subject: [EXTERNAL] [lustre-discuss] MOFED & Lustre 2.14.51 - install fails with dependency failure related to ksym/MOFED


Sorry for the long email; I wanted to share enough detail for the community to provide guidance. I am building all Lustre packages for Oracle Linux 7.9 (RHCK) and MOFED 5.3-1.0.0.1 using the steps described here: https://wiki.lustre.org/Compiling_Lustre



Oracle Linux 7.9 – Kernel:  3.10.0-1160.15.2.el7.x86_64



I was able to create the RPM packages listed below on a build node that has the same OS, kernel version, MOFED version and MLNX CX-5 card, but when I try to install them on my Lustre nodes, I get a dependency failure related to ksym/MOFED packages (more details below).



  1.  LDISKFS and Patching the Linux Kernel
  2.  MOFED rpms
  3.  Lustre server rpms
  4.  Lustre client rpms



After all RPMs were created, I created a local repo and added it to all Lustre nodes:



cat > /etc/yum.repos.d/lustre.repo << EOF
[hpddLustreserver]
name=OL-Lustre-Server
baseurl=file:///home/opc/releases/lustre-server/
gpgcheck=0

[e2fsprogs]
name=CentOS- - Ldiskfs
baseurl=https://downloads.whamcloud.com/public/e2fsprogs/latest/el7/
gpgcheck=0

[hpddLustreclient]
name=OL-Lustre-Client
baseurl=file:///home/opc/releases/lustre-client/
gpgcheck=0

[LustreKernel]
name=LustreKernel
baseurl=file:///home/opc/releases/lustre-kernel/
gpgcheck=0

[MOFED]
name=MOFED
baseurl=file:///home/opc/releases/mofed/
gpgcheck=0
EOF
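
Each file:// baseurl above needs repository metadata before yum can use it. A minimal sketch of generating it, assuming the createrepo tool is installed; the helper name index_repos is hypothetical and the directories are the ones from the baseurl lines:

```shell
# Run createrepo on every directory passed as an argument.
# Arguments that are not directories are skipped with a warning on stderr.
index_repos() {
    for dir in "$@"; do
        if [ ! -d "$dir" ]; then
            echo "skipping $dir: not a directory" >&2
            continue
        fi
        # createrepo writes a repodata/ subdirectory into $dir
        createrepo "$dir" || return 1
    done
}
```

For example: index_repos /home/opc/releases/lustre-server /home/opc/releases/lustre-client /home/opc/releases/lustre-kernel /home/opc/releases/mofed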





MOFED is installed and configured on those nodes, and I was able to validate it using the IMB-MPI1 PingPong test.

show_gids

mlx5_0 1              2              0000:0000:0000:0000:0000:ffff:c0a8:a985            192.168.169.133              v1           enp94s0f0





Dependency failure: on the OSS nodes, I ran the following to install all Lustre packages:

sudo yum install lustre-tests -y



[opc@inst-dwnv3-topical-goblin ~]$ sudo yum install -y lustre-tests

Loaded plugins: langpacks, ulninfo

LustreKernel                                             | 2.9 kB  00:00:00

MOFED                                                    | 2.9 kB  00:00:00

e2fsprogs                                                | 2.9 kB  00:00:00

hpddLustreclient                                         | 2.9 kB  00:00:00

hpddLustreserver                                         | 2.9 kB  00:00:00

MOFED/primary_db

--> Running transaction check

---> Package lustre-tests.x86_64 0:2.14.51-1.el7 will be installed

--> Processing Dependency: lustre-devel = 2.14.51 for package: lustre-tests-2.14.51-1.el7.x86_64

--> Processing Dependency: kmod-lustre-tests = 2.14.51 for package: lustre-tests-2.14.51-1.el7.x86_64

--> Processing Dependency: kmod-lustre = 2.14.51 for package: lustre-tests-2.14.51-1.el7.x86_64

--> Processing Dependency: lustre-iokit for package: lustre-tests-2.14.51-1.el7.x86_64

--> Processing Dependency: liblustreapi.so.1()(64bit) for package: lustre-tests-2.14.51-1.el7.x86_64

--> Processing Dependency: liblnetconfig.so.4()(64bit) for package: lustre-tests-2.14.51-1.el7.x86_64

--> Running transaction check

---> Package kmod-lustre.x86_64 0:2.14.51-1.el7 will be installed

…..

….

---> Package libcom_err.x86_64 0:1.45.4-3.0.5.el7 will be updated

---> Package libcom_err.x86_64 0:1.46.2.wc1-0.el7 will be an update

---> Package libss.x86_64 0:1.45.4-3.0.5.el7 will be updated

---> Package libss.x86_64 0:1.46.2.wc1-0.el7 will be an update

--> Finished Dependency Resolution

Error: Package: kmod-lustre-2.14.51-1.el7.x86_64 (hpddLustreserver)

           Requires: ksym(ib_map_mr_sg) = 0xcd1ffb73

Error: Package: kmod-lustre-2.14.51-1.el7.x86_64 (hpddLustreserver)

           Requires: ksym(rdma_resolve_route) = 0xc2064869

Error: Package: kmod-lustre-2.14.51-1.el7.x86_64 (hpddLustreserver)

           Requires: ksym(ib_unregister_event_handler) = 0xc58881d0

Error: Package: kmod-lustre-2.14.51-1.el7.x86_64 (hpddLustreserver)

           Requires: ksym(ib_query_port) = 0x6889b87f

Error: Package: kmod-lustre-2.14.51-1.el7.x86_64 (hpddLustreserver)

           Requires: ksym(rdma_disconnect) = 0x49262e62

Error: Package: kmod-lustre-2.14.51-1.el7.x86_64 (hpddLustreserver)

           Requires: ksym(rdma_connect_locked) = 0x7eaa4a8a

….

…. All ib/rdma related errors similar to above for kmod-lustre.x

….

Error: Package: kmod-lustre-2.14.51-1.el7.x86_64 (hpddLustreserver)

           Requires: ksym(ib_destroy_cq_user) = 0x5671830b

You could try using --skip-broken to work around the problem

** Found 3 pre-existing rpmdb problem(s), 'yum check' output follows:

oracle-cloud-agent-1.11.1-5104.el7.x86_64 is a duplicate with oracle-cloud-agent-1.8.2-3843.el7.x86_64

rdma-core-devel-52mlnx1-1.53100.x86_64 has missing requires of pkgconfig(libnl-3.0)

rdma-core-devel-52mlnx1-1.53100.x86_64 has missing requires of pkgconfig(libnl-route-3.0)

[opc@inst-dwnv3-topical-goblin ~]$
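
Each "Requires: ksym(name) = checksum" line in the output above has to be matched exactly by a Provides entry from some installed or available package. A rough sketch for listing the unmatched ones from two saved text files follows; the helper name missing_ksyms and the file names in the usage note are hypothetical, and the inputs are assumed to be in the "ksym(name) = 0x..." form that rpm prints.

```shell
# Print every ksym() requirement that has no identical ksym() provide.
# $1: file with one provided capability per line
# $2: file with one required capability per line
missing_ksyms() {
    provides_file=$1
    requires_file=$2
    awk '
        # Ignore anything that is not a kernel symbol capability.
        $1 !~ /^ksym\(/ { next }
        # First file: remember each provided symbol with its checksum.
        FILENAME == ARGV[1] { have[$1 " " $3] = 1; next }
        # Second file: report requirements with no exact match.
        !(($1 " " $3) in have) { print $1, $2, $3 }
    ' "$provides_file" "$requires_file"
}
```

The two files could be produced with, for instance, "rpm -qp --provides" on the MOFED and kernel RPMs and "rpm -qp --requires" on the kmod-lustre RPM, each redirected to a file, then compared with: missing_ksyms provides.txt requires.txt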











RPMS from:  LDISKFS and Patching the Linux Kernel

ls lustre-kernel/RPMS/

  *   bpftool-3.10.0-1160.15.2.el7_lustre.x86_64.rpm
  *   bpftool-debuginfo-3.10.0-1160.15.2.el7_lustre.x86_64.rpm
  *   kernel-3.10.0-1160.15.2.el7_lustre.x86_64.rpm
  *   kernel-debug-3.10.0-1160.15.2.el7_lustre.x86_64.rpm
  *   kernel-debug-debuginfo-3.10.0-1160.15.2.el7_lustre.x86_64.rpm
  *   kernel-debug-devel-3.10.0-1160.15.2.el7_lustre.x86_64.rpm
  *   kernel-debuginfo-3.10.0-1160.15.2.el7_lustre.x86_64.rpm
  *   kernel-debuginfo-common-x86_64-3.10.0-1160.15.2.el7_lustre.x86_64.rpm
  *   kernel-devel-3.10.0-1160.15.2.el7_lustre.x86_64.rpm
  *   kernel-headers-3.10.0-1160.15.2.el7_lustre.x86_64.rpm
  *   kernel-tools-3.10.0-1160.15.2.el7_lustre.x86_64.rpm
  *   kernel-tools-debuginfo-3.10.0-1160.15.2.el7_lustre.x86_64.rpm
  *   kernel-tools-libs-3.10.0-1160.15.2.el7_lustre.x86_64.rpm
  *   kernel-tools-libs-devel-3.10.0-1160.15.2.el7_lustre.x86_64.rpm
  *   perf-3.10.0-1160.15.2.el7_lustre.x86_64.rpm
  *   perf-debuginfo-3.10.0-1160.15.2.el7_lustre.x86_64.rpm
  *   python-perf-3.10.0-1160.15.2.el7_lustre.x86_64.rpm
  *   python-perf-debuginfo-3.10.0-1160.15.2.el7_lustre.x86_64.rpm





MOFED rpms



Steps followed:

Download the source from the MLNX site: MLNX_OFED_SRC-5.3-1.0.0.1.tgz

tar -zvxf $HOME/MLNX_OFED_SRC-5.3-1.0.0.1.tgz

cd MLNX_OFED_SRC-5.3-1.0.0.1/

./install.pl --build-only --kernel-only \
--kernel 3.10.0-1160.15.2.el7.x86_64 \
--kernel-sources /usr/src/kernels/3.10.0-1160.15.2.el7.x86_64



cp RPMS/*/*/*.rpm  $HOME/releases/mofed



Question: I am passing the regular kernel (3.10.0-1160.15.2.el7.x86_64) and its source, not the Lustre-patched kernel, as input to the MOFED install command above. I hope that is correct.





  *   kernel-mft-4.16.3-12.kver.3.10.0_1160.15.2.el7.x86_64.x86_64.rpm
  *   knem-1.1.4.90mlnx1-OFED.5.1.2.5.0.1.ol7u9.x86_64.rpm
  *   knem-modules-1.1.4.90mlnx1-OFED.5.1.2.5.0.1.kver.3.10.0_1160.15.2.el7.x86_64.x86_64.rpm
  *   mlnx-nfsrdma-5.3-OFED.5.3.0.3.8.1.kver.3.10.0_1160.15.2.el7.x86_64.x86_64.rpm
  *   mlnx-nfsrdma-debuginfo-5.3-OFED.5.3.0.3.8.1.kver.3.10.0_1160.15.2.el7.x86_64.x86_64.rpm
  *   mlnx-ofa_kernel-5.3-OFED.5.3.1.0.0.1.ol7u9.x86_64.rpm
  *   mlnx-ofa_kernel-debuginfo-5.3-OFED.5.3.1.0.0.1.ol7u9.x86_64.rpm
  *   mlnx-ofa_kernel-devel-5.3-OFED.5.3.1.0.0.1.ol7u9.x86_64.rpm
  *   mlnx-ofa_kernel-modules-5.3-OFED.5.3.1.0.0.1.kver.3.10.0_1160.15.2.el7.x86_64.x86_64.rpm
  *   ofed-scripts-5.3-OFED.5.3.1.0.0.x86_64.rpm





Lustre Server packages



./configure --enable-server \
--with-linux=/usr/src/kernels/*_lustre.x86_64 \
--with-o2ib=/usr/src/ofa_kernel/default



make rpms
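
The --with-linux value above relies on a shell glob, which only behaves as intended when exactly one patched kernel tree is present on the build node. A small sketch of a guard for that, assuming paths without spaces; the helper name single_dir is hypothetical:

```shell
# Expand a glob pattern and succeed only if it names exactly one directory.
# $1: the pattern, passed quoted so that it is expanded inside the function
single_dir() {
    # Left unquoted on purpose: the shell expands the pattern here.
    set -- $1
    if [ "$#" -ne 1 ] || [ ! -d "$1" ]; then
        echo "expected exactly one directory, got: $*" >&2
        return 1
    fi
    printf '%s\n' "$1"
}
```

For example, linux_dir=$(single_dir '/usr/src/kernels/*_lustre.x86_64') would fail early if no tree or several trees match, and the result could then be passed as --with-linux="$linux_dir".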





  *   kmod-lustre-2.14.51-1.el7.x86_64.rpm
  *   kmod-lustre-osd-ldiskfs-2.14.51-1.el7.x86_64.rpm
  *   kmod-lustre-tests-2.14.51-1.el7.x86_64.rpm
  *   lustre-2.14.51-1.el7.x86_64.rpm
  *   lustre-2.14.51-1.src.rpm
  *   lustre-debuginfo-2.14.51-1.el7.x86_64.rpm
  *   lustre-devel-2.14.51-1.el7.x86_64.rpm
  *   lustre-iokit-2.14.51-1.el7.x86_64.rpm
  *   lustre-osd-ldiskfs-mount-2.14.51-1.el7.x86_64.rpm
  *   lustre-resource-agents-2.14.51-1.el7.x86_64.rpm
  *   lustre-tests-2.14.51-1.el7.x86_64.rpm





Lustre Client packages



./configure --disable-server --enable-client \
--with-linux=/usr/src/kernels/*_lustre.x86_64 \
--with-o2ib=/usr/src/ofa_kernel/default



make rpms



  *   kmod-lustre-client-2.14.51-1.el7.x86_64.rpm
  *   kmod-lustre-client-tests-2.14.51-1.el7.x86_64.rpm
  *   lustre-2.14.51-1.src.rpm
  *   lustre-client-2.14.51-1.el7.x86_64.rpm
  *   lustre-client-debuginfo-2.14.51-1.el7.x86_64.rpm
  *   lustre-client-devel-2.14.51-1.el7.x86_64.rpm
  *   lustre-client-tests-2.14.51-1.el7.x86_64.rpm
  *   lustre-iokit-2.14.51-1.el7.x86_64.rpm







Thanks,

Pinkesh Valdria

Principal Solutions Architect – HPC

Oracle Cloud Infrastructure

+65-8932-3639 (m) - Singapore

+1-425-205-7834 (m) - USA

