<div dir="ltr"><div dir="ltr">Thanks for the response. I just used the defaults on my initial attempt, but yes, I was using o2ib, as this is what is implemented on all the physical servers. If I need to use a different module as you indicate, how would I do that? Via /etc/modprobe.d/lnet.conf or in another file?<br clear="all"><div><div dir="ltr" class="gmail_signature"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div><br></div><div>Regards</div><div><br></div><div>Sid Young</div><div>W: <a href="https://off-grid-engineering.com" target="_blank">https://off-grid-engineering.com</a></div><div><br></div></div></div></div></div></div></div></div></div></div></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><br>
<br>
<br><br><br>---------- Forwarded message ----------<br>From: Michael DiDomenico <<a href="mailto:mdidomenico4@gmail.com" target="_blank">mdidomenico4@gmail.com</a>><br>To: <br>Cc: lustre-discuss <<a href="mailto:lustre-discuss@lists.lustre.org" target="_blank">lustre-discuss@lists.lustre.org</a>><br>Bcc: <br>Date: Fri, 25 Oct 2024 12:47:07 -0400<br>Subject: Re: [lustre-discuss] Lustre 2.15.5 in a Virtual Machine<br>lustre in a vm certainly works as i have many running under vmware and<br>
mounting lustre<br>
<br>
but i'm a little confused on your message. are you trying to bind the<br>
lustre client via infiniband or tcp/ip? if the latter (assumed based<br>
on the ens nic prefix), you need to use the ksocklnd not the kiblnd<br>
module<br>
<br>
<br>
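A minimal sketch of what the TCP variant might look like, assuming ens224 is a plain Ethernet interface in the VM with no RDMA device behind it.<br>
The LND is selected by the network type in the networks string (tcp maps to ksocklnd, o2ib to ko2iblnd), so the same /etc/modprobe.d/lnet.conf file would carry the change.<br>
Whether servers that only have @o2ib NIDs are reachable from a tcp client depends on the site setup and is not covered here.<br>

```
# /etc/modprobe.d/lnet.conf
# Assumption: ens224 carries ordinary TCP/IP traffic only (no InfiniBand/RoCE device).
# "tcp" selects the socket LND (ksocklnd) instead of ko2iblnd.
options lnet networks="tcp(ens224)"
```

The runtime equivalent would presumably be the lnetctl command from the original post with the net type swapped: lnetctl net add --net tcp --if ens224<br>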
On Thu, Oct 24, 2024 at 3:17 AM Sid Young <<a href="mailto:sid.young@gmail.com" target="_blank">sid.young@gmail.com</a>> wrote:<br>
><br>
> G'Day all,<br>
><br>
> I'm trying to get lustre to bind to a 100G Mellanox card shared between VMs but it fails with the following errors in dmesg:<br>
><br>
> [ 406.474952] Lustre: Lustre: Build Version: 2.15.5<br>
> [ 406.604652] LNetError: 92384:0:(o2iblnd.c:2838:kiblnd_dev_failover()) Failed to bind ens224:10.140.93.72 to device(0000000000000000): -19<br>
> [ 406.604704] LNetError: 92384:0:(o2iblnd.c:3355:kiblnd_startup()) ko2iblnd: Can't initialize device: rc = -19<br>
> [ 407.655888] LNetError: 105-4: Error -100 starting up LNI o2ib<br>
> [ 407.656729] LustreError: 92384:0:(events.c:639:ptlrpc_init_portals()) network initialisation failed<br>
> [ 559.741846] LNetError: 92993:0:(lib-move.c:2255:lnet_handle_find_routed_path()) peer 10.140.93.42@o2ib has no available nets<br>
> [ 594.480161] LNetError: 93225:0:(o2iblnd.c:2838:kiblnd_dev_failover()) Failed to bind ens224:10.140.93.72 to device(0000000000000000): -19<br>
> [ 594.480213] LNetError: 93225:0:(o2iblnd.c:3355:kiblnd_startup()) ko2iblnd: Can't initialize device: rc = -19<br>
> [ 595.498493] LNetError: 105-4: Error -100 starting up LNI o2ib<br>
> [ 707.825127] LNetError: 93691:0:(o2iblnd.c:2838:kiblnd_dev_failover()) Failed to bind ens224:10.140.93.72 to device(0000000000000000): -19<br>
> [ 707.825182] LNetError: 93691:0:(o2iblnd.c:3355:kiblnd_startup()) ko2iblnd: Can't initialize device: rc = -19<br>
> [ 708.843933] LNetError: 105-4: Error -100 starting up LNI o2ib<br>
> [ 789.779769] LNetError: 93930:0:(o2iblnd.c:2838:kiblnd_dev_failover()) Failed to bind ens224:10.140.93.72 to device(0000000000000000): -19<br>
> [ 789.779820] LNetError: 93930:0:(o2iblnd.c:3355:kiblnd_startup()) ko2iblnd: Can't initialize device: rc = -19<br>
> [ 790.828974] LNetError: 105-4: Error -100 starting up LNI o2ib<br>
> [root@hpc-vm-02 2.15.5]#<br>
><br>
> The VM has two network interfaces, ens192 and ens224; both are operational with TCP traffic.<br>
><br>
> /etc/modprobe.d/lnet.conf<br>
> options lnet networks="o2ib(ens224) 10.140.93.*"<br>
><br>
> [root@hpc-vm-02 2.15.5]# lnetctl net add --net o2ib --if ens224<br>
> add:<br>
> - net:<br>
> errno: -100<br>
> descr: "cannot add network: Network is down"<br>
> [root@hpc-vm-02 2.15.5]#<br>
><br>
><br>
> Any ideas where I might look?<br>
> Are virtual machines even supported with Lustre?<br>
> OS is VMware 7U3 on HP DL385 with 256 cores and 512GB RAM.<br><br>
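The return codes in the dmesg output above are negated Linux errno values, so decoding them can help narrow down where to look.<br>
The sketch below is illustrative only: the table, the regular expression, and the explain helper are hypothetical names, and the table lists just the two codes that appear in this log.<br>
Reading -19 (ENODEV) from kiblnd_dev_failover as "no RDMA device behind ens224" is an inference that fits the reply in this thread, not something the log states outright.<br>

```python
import re

# Linux errno values (asm-generic numbering). Hardcoded so the result does not
# depend on the platform running this script; only codes seen in the log are listed.
LINUX_ERRNO = {
    19: ("ENODEV", "No such device"),
    100: ("ENETDOWN", "Network is down"),
}

# Matches the negative return code in the three line shapes from the log:
#   "... to device(0000000000000000): -19"
#   "... Can't initialize device: rc = -19"
#   "... Error -100 starting up LNI o2ib"
CODE_RE = re.compile(r"(?:Error |rc = |\): )-(\d+)")


def explain(line):
    """Return (code, name, text) for the return code in a log line, or None."""
    m = CODE_RE.search(line)
    if m is None:
        return None
    code = int(m.group(1))
    name, text = LINUX_ERRNO.get(code, ("UNKNOWN", "not in table"))
    return code, name, text


if __name__ == "__main__":
    sample = [
        "LNetError: 92384:0:(o2iblnd.c:3355:kiblnd_startup()) ko2iblnd: Can't initialize device: rc = -19",
        "LNetError: 105-4: Error -100 starting up LNI o2ib",
    ]
    for line in sample:
        print(explain(line))
```

The second code matches the "Network is down" text that lnetctl printed for errno -100, which suggests the LNI failure is a consequence of the earlier ENODEV rather than a separate problem.<br>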
</blockquote></div></div>