[lustre-discuss] lustre-discuss Digest, Vol 223, Issue 7

Sid Young sid.young at gmail.com
Sun Oct 27 04:46:54 PDT 2024


Thanks for the response. I used only the defaults on my initial attempt, but
yes, I was using o2ib, as that is what is implemented on all the physical
servers. If I need to use a different module, as you indicate, how would I do
that? Via /etc/modprobe.d/lnet.conf, or in another file?
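
For reference, a minimal sketch of what the TCP variant might look like,
assuming the interface name ens224 from the original post and the default
"tcp" network name. The variable names are only illustrative, and nothing
here has been confirmed on the list. As far as I understand it, choosing a
tcp network makes LNet use the socket LND (ksocklnd) instead of ko2iblnd.

```shell
# Illustrative values: ens224 comes from the original post, "tcp" is an assumption.
IFACE="ens224"
NET="tcp"

# The line that would go in /etc/modprobe.d/lnet.conf in place of the o2ib one.
CONF_LINE="options lnet networks=\"${NET}(${IFACE})\""
echo "$CONF_LINE"

# Alternatively, the network can be added at runtime (as root, with the
# Lustre modules unloaded first). Shown as comments, not executed here:
#   lustre_rmmod
#   modprobe lnet
#   lnetctl lnet configure
#   lnetctl net add --net tcp --if ens224
#   lnetctl net show
```

Note that the servers in the log are addressed as o2ib NIDs, so a client on a
tcp network would presumably only reach them if they also expose a tcp NID or
if an LNet router sits in between.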

Regards

Sid Young
W: https://off-grid-engineering.com


> ---------- Forwarded message ----------
> From: Michael DiDomenico <mdidomenico4 at gmail.com>
> To:
> Cc: lustre-discuss <lustre-discuss at lists.lustre.org>
> Bcc:
> Date: Fri, 25 Oct 2024 12:47:07 -0400
> Subject: Re: [lustre-discuss] Lustre 2.15.5 in a Virtual Machine
> lustre in a vm certainly works as i have many running under vmware and
> mounting lustre
>
> but i'm a little confused on your message.  are you trying to bind the
> lustre client via infiniband or tcp/ip?  if the latter (assumed based
> on the ens nic prefix), you need to use the ksocklnd not the kiblnd
> module
>
>
> On Thu, Oct 24, 2024 at 3:17 AM Sid Young <sid.young at gmail.com> wrote:
> >
> > G'Day all,
> >
> > I'm trying to get lustre to bind to a 100G Mellanox card shared between
> > VMs but it fails with the following errors in dmesg:
> >
> > [  406.474952] Lustre: Lustre: Build Version: 2.15.5
> > [  406.604652] LNetError: 92384:0:(o2iblnd.c:2838:kiblnd_dev_failover()) Failed to bind ens224:10.140.93.72 to device(0000000000000000): -19
> > [  406.604704] LNetError: 92384:0:(o2iblnd.c:3355:kiblnd_startup()) ko2iblnd: Can't initialize device: rc = -19
> > [  407.655888] LNetError: 105-4: Error -100 starting up LNI o2ib
> > [  407.656729] LustreError: 92384:0:(events.c:639:ptlrpc_init_portals()) network initialisation failed
> > [  559.741846] LNetError: 92993:0:(lib-move.c:2255:lnet_handle_find_routed_path()) peer 10.140.93.42 at o2ib has no available nets
> > [  594.480161] LNetError: 93225:0:(o2iblnd.c:2838:kiblnd_dev_failover()) Failed to bind ens224:10.140.93.72 to device(0000000000000000): -19
> > [  594.480213] LNetError: 93225:0:(o2iblnd.c:3355:kiblnd_startup()) ko2iblnd: Can't initialize device: rc = -19
> > [  595.498493] LNetError: 105-4: Error -100 starting up LNI o2ib
> > [  707.825127] LNetError: 93691:0:(o2iblnd.c:2838:kiblnd_dev_failover()) Failed to bind ens224:10.140.93.72 to device(0000000000000000): -19
> > [  707.825182] LNetError: 93691:0:(o2iblnd.c:3355:kiblnd_startup()) ko2iblnd: Can't initialize device: rc = -19
> > [  708.843933] LNetError: 105-4: Error -100 starting up LNI o2ib
> > [  789.779769] LNetError: 93930:0:(o2iblnd.c:2838:kiblnd_dev_failover()) Failed to bind ens224:10.140.93.72 to device(0000000000000000): -19
> > [  789.779820] LNetError: 93930:0:(o2iblnd.c:3355:kiblnd_startup()) ko2iblnd: Can't initialize device: rc = -19
> > [  790.828974] LNetError: 105-4: Error -100 starting up LNI o2ib
> > [root at hpc-vm-02 2.15.5]#
> >
> > The VM has two network interfaces, ens192 and ens224; both are operational
> > with TCP traffic.
> >
> > /etc/modprobe.d/lnet.conf
> > options lnet networks="o2ib(ens224) 10.140.93.*"
> >
> > [root at hpc-vm-02 2.15.5]# lnetctl net add --net o2ib --if ens224
> > add:
> >     - net:
> >           errno: -100
> >           descr: "cannot add network: Network is down"
> > [root at hpc-vm-02 2.15.5]#
> >
> >
> > Any ideas where I might look?
> > Are virtual machines even supported with Lustre?
> > OS is VMware 7U3 on HP DL385 with 256 cores and 512GB RAM.
>
>