<div dir="ltr"><div dir="ltr"><div dir="ltr"><div>Hello Martin,</div><div><br></div><div>Thank you for the hint.<br><br>I rebuilt using the suggested parameter, but the warnings persist.<br></div><div><br></div><div>Additionally, the system still fails to boot with the Lustre kernel.<br></div><div><br></div><div>We noticed that the initramfs for the Lustre kernel does not include the megaraid_sas module, which the system needs to drive the Dell PERC H330 controller. This may be the cause of the boot failure.</div><div><br></div><div>

<span style="color:rgb(29,28,29);font-family:Monaco,Menlo,Consolas,"Courier New",monospace;font-size:12px;font-style:normal;font-variant-ligatures:none;font-variant-caps:normal;font-weight:400;letter-spacing:normal;text-align:left;text-indent:0px;text-transform:none;word-spacing:0px;white-space:pre-wrap;background-color:rgba(29,28,29,0.04);text-decoration-style:initial;text-decoration-color:initial;display:inline;float:none">[root@mds2 ~]# lsinitrd /boot/initramfs-4.18.0-553.27.1.el8_lustre.x86_64.img | grep megaraid_sas
[root@mds2 ~]# </span> <br></div><div><br></div><div>The initramfs of the stock el8_10 kernel installed via dnf, however, does include the module.</div><div><br></div><div>

<span style="color:rgb(29,28,29);font-family:Monaco,Menlo,Consolas,"Courier New",monospace;font-size:12px;font-style:normal;font-variant-ligatures:none;font-variant-caps:normal;font-weight:400;letter-spacing:normal;text-align:left;text-indent:0px;text-transform:none;word-spacing:0px;white-space:pre-wrap;background-color:rgba(29,28,29,0.04);text-decoration-style:initial;text-decoration-color:initial;display:inline;float:none">[root@mds2 ~]# lsinitrd /boot/initramfs-4.18.0-553.27.1.el8_10.x86_64.img | grep megaraid_sas
-rw-r--r--   1 root     root        72560 Jan 15  2024 usr/lib/modules/4.18.0-553.27.1.el8_10.x86_64/kernel/drivers/scsi/megaraid/megaraid_sas.ko.xz
[root@mds2 ~]# </span>
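<div>A minimal sketch of how the check above could be scripted, and how the initramfs might be rebuilt with the driver added. The function names are hypothetical, and it assumes megaraid_sas.ko actually exists under /lib/modules for the Lustre kernel (something like <span style="font-family:monospace">modinfo -k 4.18.0-553.27.1.el8_lustre.x86_64 megaraid_sas</span> should confirm that). The script only prints the dracut command; it does not change anything.</div>

```shell
#!/usr/bin/env bash
# Sketch: check whether a driver is present in an initramfs listing and,
# if not, print the dracut command that would rebuild the image with it.
# Function names are hypothetical; nothing here modifies the system.

# Reads an `lsinitrd` listing on stdin; succeeds if <module>.ko
# (optionally compressed, e.g. .ko.xz) appears in it.
listing_has_module() {
    local module="$1"
    grep -Eq "/${module}\.ko(\.[a-z0-9]+)?\$"
}

# Prints (does not run) the rebuild command for a given kernel version.
rebuild_command() {
    local kver="$1" module="$2"
    printf 'dracut --force --add-drivers %s /boot/initramfs-%s.img %s\n' \
        "$module" "$kver" "$kver"
}

kver="4.18.0-553.27.1.el8_lustre.x86_64"
module="megaraid_sas"
img="/boot/initramfs-${kver}.img"

# Only inspect the image if lsinitrd is available and the file is readable.
if command -v lsinitrd >/dev/null 2>&1 && [ -r "$img" ]; then
    if lsinitrd "$img" | listing_has_module "$module"; then
        echo "$module is already in $img"
    else
        echo "$module is missing from $img; to rebuild, run as root:"
        rebuild_command "$kver" "$module"
    fi
fi
```

<div>Whether a rebuilt initramfs is enough to get past the emergency shell is untested here; it only addresses the missing-driver hypothesis.</div>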

<br></div><div><br></div><div>I'm still struggling to get it installed.</div><div><br></div><div><br clear="all"></div><div><div dir="ltr" class="gmail_signature"><div dir="ltr"><div></div>---<br><div><i><b>Carlos Adean</b></i></div><div><a href="https://www.linea.org.br" target="_blank">www.linea.org.br</a></div></div></div></div><br></div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Apr 23, 2025 at 09:22, Audet, Martin <<a href="mailto:Martin.Audet@cnrc-nrc.gc.ca" target="_blank">Martin.Audet@cnrc-nrc.gc.ca</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div>




<div dir="ltr">
<div id="m_8997662621852205494m_8633720231178926790divtagdefaultwrapper" style="font-size:12pt;color:rgb(0,0,0);font-family:Calibri,Helvetica,sans-serif" dir="ltr">
<p>Hello,</p>
<p><br>
</p>
<p>I think I had a similar problem a long time ago, and it was solved by adding the "--kmp" option to the "<span>mlnx_add_kernel_support.sh" script when compiling the MOFED RPMs. Without this option, the MOFED RPMs compile without problems, and so do the Lustre RPMs, but later, when installing the Lustre RPMs, we get a bunch of problems related to symbols.</span></p>
<p><span><br>
</span></p>
<p><span>Here is how I compile the MOFED RPMs (using the root account):</span></p>
<p><span><br>
</span></p>
<blockquote style="margin:0px 0px 0px 40px;border:medium;padding:0px">
<p><span><span style="font-family:Consolas,Courier,monospace"># mount_dir is the temporary mount directory</span></span></p>
<p><span><span style="font-family:Consolas,Courier,monospace"># ofed_iso  is the MOFED .iso file</span></span></p>
<p><span><span style="font-family:Consolas,Courier,monospace">#<br>
mkdir -p -- $mount_dir</span><br>
</span></p>
<p><span><span><span style="font-family:Consolas,Courier,monospace">mount -o ro,loop $ofed_iso $mount_dir</span><br>
</span></span></p>
<p><span><span><span><span style="font-family:Consolas,Courier,monospace">$mount_dir/mlnx_add_kernel_support.sh -y --make-tgz --kmp -k $(uname -r) -m $mount_dir</span><br>
</span></span></span></p>
<p><span><span><span><span style="font-family:Consolas,Courier,monospace">#</span></span></span></span></p>
<p><span><span><span><span style="font-family:Consolas,Courier,monospace"># The compiled RPMs are now under /tmp</span></span></span></span></p>
<p><span><span><span><span style="font-family:Consolas,Courier,monospace"># ex: /tmp/<span>MLNX_OFED_LINUX-<span>24.10-2.1.8.0-rhel8.10.x86_64</span></span>-ext.tgz</span></span></span></span></p>
<p><span><span><br>
</span></span></p>
</blockquote>
<p><span style="font-size:12pt">It seems that the pre-compiled RPMs distributed by Mellanox/NVIDIA are always generated with --kmp, but when using mlnx_add_kernel_support.sh, this option must be specified explicitly. In addition, it seems that with the
 newer DOCA OFED, the <span style="font-family:Calibri,Helvetica,sans-serif,EmojiFont,"Apple Color Emoji","Segoe UI Emoji",NotoColorEmoji,"Segoe UI Symbol","Android Emoji",EmojiSymbols;font-size:16px">script equivalent to mlnx_add_kernel_support.sh
 always adds the --kmp option on RHEL and similar distributions.</span></span></p>
<p><span style="font-size:12pt"><span style="font-family:Calibri,Helvetica,sans-serif,EmojiFont,"Apple Color Emoji","Segoe UI Emoji",NotoColorEmoji,"Segoe UI Symbol","Android Emoji",EmojiSymbols;font-size:16px"><br>
</span></span></p>
<p>I hope it helps,</p>
<p><br>
</p>
<p>Martin</p>
<div style="color:rgb(0,0,0)">
<hr style="display:inline-block;width:98%">
<div id="m_8997662621852205494m_8633720231178926790divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" color="#000000" style="font-size:11pt"><b>From:</b> lustre-discuss <<a href="mailto:lustre-discuss-bounces@lists.lustre.org" target="_blank">lustre-discuss-bounces@lists.lustre.org</a>> on behalf of Carlos Adean via lustre-discuss <<a href="mailto:lustre-discuss@lists.lustre.org" target="_blank">lustre-discuss@lists.lustre.org</a>><br>
<b>Sent:</b> April 22, 2025 11:09 PM<br>
<b>To:</b> <a href="mailto:lustre-discuss@lists.lustre.org" target="_blank">lustre-discuss@lists.lustre.org</a><br>
<b>Cc:</b> Eloir Troyack<br>
<b>Subject:</b> EXT: [lustre-discuss] Installing lustre 2.15.6 server on rhel-8.10 fails</font>
<div> </div>
</div>
<div>
<div dir="ltr">
<div>Hello all,</div>
<div><br>
</div>
<div>My RHEL 8 system is Rocky Linux 8.10, running kernel 4.18.0-553.27.1.el8_10. I also have OFED driver version 24.10-2.1.8.0 installed for the InfiniBand interface (I tried without OFED before).</div>
<div></div>
<div><span style="font-family:monospace"><br>
</span></div>
<div>The installation of "kmod-lustre-2.15.6-1.el8" and "kmod-lustre-osd-ldiskfs-2.15.6-1" always produces the warning messages below.</div>
<div><span style="font-family:monospace"><br>
</span></div>
<div><span style="font-family:monospace"># dnf --nogpgcheck --enablerepo=lustre-server install kmod-lustre kmod-lustre-osd-ldiskfs lustre-osd-ldiskfs-mount lustre lustre-resource-agents</span></div>
<span style="font-family:monospace">[...]</span>
<div><span style="font-family:monospace">depmod: WARNING: /lib/modules/4.18.0-553.27.1.el8_lustre.x86_64/extra/lustre/net/ko2iblnd.ko needs unknown symbol __ib_alloc_pd<br>
depmod: WARNING: /lib/modules/4.18.0-553.27.1.el8_lustre.x86_64/extra/lustre/net/ko2iblnd.ko needs unknown symbol rdma_resolve_addr<br>
depmod: WARNING: /lib/modules/4.18.0-553.27.1.el8_lustre.x86_64/extra/lustre/net/ko2iblnd.ko needs unknown symbol ib_dereg_mr_user<br>
depmod: WARNING: /lib/modules/4.18.0-553.27.1.el8_lustre.x86_64/extra/lustre/net/ko2iblnd.ko needs unknown symbol rdma_reject<br>
depmod: WARNING: /lib/modules/4.18.0-553.27.1.el8_lustre.x86_64/extra/lustre/net/ko2iblnd.ko needs unknown symbol rdma_disconnect<br>
depmod: WARNING: /lib/modules/4.18.0-553.27.1.el8_lustre.x86_64/extra/lustre/net/ko2iblnd.ko needs unknown symbol __rdma_create_kernel_id<br>
depmod: WARNING: /lib/modules/4.18.0-553.27.1.el8_lustre.x86_64/extra/lustre/net/ko2iblnd.ko needs unknown symbol ib_register_event_handler<br>
depmod: WARNING: /lib/modules/4.18.0-553.27.1.el8_lustre.x86_64/extra/lustre/net/ko2iblnd.ko needs unknown symbol rdma_resolve_route<br>
depmod: WARNING: /lib/modules/4.18.0-553.27.1.el8_lustre.x86_64/extra/lustre/net/ko2iblnd.ko needs unknown symbol ib_unregister_event_handler<br>
depmod: WARNING: /lib/modules/4.18.0-553.27.1.el8_lustre.x86_64/extra/lustre/net/ko2iblnd.ko needs unknown symbol rdma_bind_addr<br>
depmod: WARNING: /lib/modules/4.18.0-553.27.1.el8_lustre.x86_64/extra/lustre/net/ko2iblnd.ko needs unknown symbol rdma_create_qp<br>
depmod: WARNING: /lib/modules/4.18.0-553.27.1.el8_lustre.x86_64/extra/lustre/net/ko2iblnd.ko needs unknown symbol ib_map_mr_sg<br>
depmod: WARNING: /lib/modules/4.18.0-553.27.1.el8_lustre.x86_64/extra/lustre/net/ko2iblnd.ko needs unknown symbol ib_query_port<br>
depmod: WARNING: /lib/modules/4.18.0-553.27.1.el8_lustre.x86_64/extra/lustre/net/ko2iblnd.ko needs unknown symbol rdma_notify<br>
depmod: WARNING: /lib/modules/4.18.0-553.27.1.el8_lustre.x86_64/extra/lustre/net/ko2iblnd.ko needs unknown symbol rdma_listen<br>
depmod: WARNING: /lib/modules/4.18.0-553.27.1.el8_lustre.x86_64/extra/lustre/net/ko2iblnd.ko needs unknown symbol rdma_destroy_qp<br>
depmod: WARNING: /lib/modules/4.18.0-553.27.1.el8_lustre.x86_64/extra/lustre/net/ko2iblnd.ko needs unknown symbol __ib_create_cq<br>
depmod: WARNING: /lib/modules/4.18.0-553.27.1.el8_lustre.x86_64/extra/lustre/net/ko2iblnd.ko needs unknown symbol ib_alloc_mr<br>
depmod: WARNING: /lib/modules/4.18.0-553.27.1.el8_lustre.x86_64/extra/lustre/net/ko2iblnd.ko needs unknown symbol rdma_connect_locked<br>
depmod: WARNING: /lib/modules/4.18.0-553.27.1.el8_lustre.x86_64/extra/lustre/net/ko2iblnd.ko needs unknown symbol rdma_set_reuseaddr<br>
depmod: WARNING: /lib/modules/4.18.0-553.27.1.el8_lustre.x86_64/extra/lustre/net/ko2iblnd.ko needs unknown symbol ib_destroy_cq_user<br>
depmod: WARNING: /lib/modules/4.18.0-553.27.1.el8_lustre.x86_64/extra/lustre/net/ko2iblnd.ko needs unknown symbol ib_modify_qp<br>
depmod: WARNING: /lib/modules/4.18.0-553.27.1.el8_lustre.x86_64/extra/lustre/net/ko2iblnd.ko needs unknown symbol ib_dma_virt_map_sg<br>
depmod: WARNING: /lib/modules/4.18.0-553.27.1.el8_lustre.x86_64/extra/lustre/net/ko2iblnd.ko needs unknown symbol rdma_destroy_id<br>
depmod: WARNING: /lib/modules/4.18.0-553.27.1.el8_lustre.x86_64/extra/lustre/net/ko2iblnd.ko needs unknown symbol rdma_accept<br>
depmod: WARNING: /lib/modules/4.18.0-553.27.1.el8_lustre.x86_64/extra/lustre/net/ko2iblnd.ko needs unknown symbol ib_dealloc_pd_user</span></div>
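<div>The depmod warnings above all come from a single module, ko2iblnd.ko, and differ only in the symbol name. As a rough sketch (the function name missing_symbols is made up for illustration), a small filter can reduce saved depmod output to a sorted list of unresolved symbols, which is easier to compare between builds:</div>

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads depmod output on stdin and prints each
# unresolved symbol once, sorted in C-locale order.
missing_symbols() {
    sed -n 's/^depmod: WARNING: .* needs unknown symbol \([A-Za-z0-9_]*\).*$/\1/p' \
        | LC_ALL=C sort -u
}

# Example with three lines taken from the output above.
missing_symbols <<'EOF'
depmod: WARNING: /lib/modules/4.18.0-553.27.1.el8_lustre.x86_64/extra/lustre/net/ko2iblnd.ko needs unknown symbol __ib_alloc_pd
depmod: WARNING: /lib/modules/4.18.0-553.27.1.el8_lustre.x86_64/extra/lustre/net/ko2iblnd.ko needs unknown symbol rdma_resolve_addr
depmod: WARNING: /lib/modules/4.18.0-553.27.1.el8_lustre.x86_64/extra/lustre/net/ko2iblnd.ko needs unknown symbol rdma_reject
EOF
```

<div>In practice the input could be a saved transcript of the dnf run, for example <span style="font-family:monospace">missing_symbols &lt; dnf-install.log</span> (the file name is an assumption).</div>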
<div><span style="font-family:monospace">[...]<br>
Installed:<br>
  kernel-core-4.18.0-553.27.1.el8_lustre.x86_64    kmod-lustre-2.15.6-1.el8.x86_64    kmod-lustre-osd-ldiskfs-2.15.6-1.el8.x86_64    lustre-2.15.6-1.el8.x86_64    lustre-osd-ldiskfs-mount-2.15.6-1.el8.x86_64  
<br>
  lustre-resource-agents-2.15.6-1.el8.x86_64   <br>
</span></div>
<div><span style="font-family:monospace"><br>
</span></div>
<div><span style="font-family:monospace">Completed!</span></div>
<div><br>
</div>
<div><br>
</div>
<div>After rebooting, the server drops into an emergency shell because it can't find the LVM devices. This issue only occurs with the Lustre kernel; the other installed kernels boot normally.</div>
<div></div>
<div><br>
</div>
<div></div>
<div><br>
</div>
<div>Any hints on how to proceed?</div>
<br>
<div><br clear="all">
</div>
<div>
<div dir="ltr" class="gmail_signature">
<div dir="ltr">
<div></div>
---<br>
<div><i><b>Carlos Adean</b></i></div>
<div><a href="https://www.linea.org.br" id="m_8997662621852205494m_8633720231178926790LPlnk223764" target="_blank">www.linea.org.br</a></div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>

</div></blockquote></div>