[lustre-discuss] error while configuring lnet

Mannthey, Keith keith.mannthey at intel.com
Sun Nov 12 08:36:42 PST 2017


Parag,
  Can you confirm your Lustre build was built against MOFED?

  If you check “dmesg” do you see lnet symbol errors?

  http://wiki.lustre.org/Compiling_Lustre#Mellanox_InfiniBand
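
A minimal sketch of that dmesg check, assuming the usual kernel wording for module symbol failures ("Unknown symbol" and "disagrees about version of symbol"). The function name ib_symbol_errors is made up for illustration and is not part of Lustre or MOFED.

```shell
#!/bin/sh
# Filter kernel log text (read from stdin) down to the lines that typically
# show a module was built against a different IB stack than the one installed.
# Assumption: the kernel reports such mismatches with the phrases
# "Unknown symbol" or "disagrees about version of symbol".
ib_symbol_errors() {
    grep -E 'Unknown symbol|disagrees about version of symbol'
}

# Typical use on the affected node (run as root):
#   dmesg | ib_symbol_errors
```

If this prints lines naming ko2iblnd, a rebuild of Lustre against the installed MOFED is the likely fix; no output does not by itself prove the build matches.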

Thanks,
Keith
From: Brett Lee [mailto:brettlee.lustre at gmail.com]
Sent: Sunday, November 12, 2017 7:05 AM
To: Parag Khuraswar <parag_k at citilindia.com>
Cc: Mannthey, Keith <keith.mannthey at intel.com>; lustre-discuss at lists.lustre.org
Subject: Re: [lustre-discuss] error while configuring lnet

Hi Parag - I've been away from this field for a while, but this message may be the most helpful at this point:

modprobe: ERROR: could not insert 'ko2iblnd': Invalid argument

Last I knew, there were two IB releases - Mellanox/MOFED and Intel (formerly QLogic) - and the Lustre kernel you use needs to be compiled against the IB release that is installed on the node.

So, I see a couple of possible reasons why the Lustre IB module will not load:
1.  Since you are using MOFED, the kernel you have may instead support Intel IB (thus the invalid argument).
2.  One of the options provided may not be valid for the module, or may conflict with another provided option:

require_privileged_port=0 use_privileged_port=0 timeout=150 retry_count=7 map_on_demand=32 peer_credits=63 concurrent_sends=63 ntx=32768 credits=32768 fmr_pool_size=8193

As item 1 is probably a yes/no question, and item 2 a matrix of option combinations, confirming that the kernel you have supports the module you are trying to load seems to be a better starting point. :)

As Intel HPDD seems to provide a kernel that matches the version you are using, item 1 seems likely.
https://downloads.hpdd.intel.com/public/lustre/lustre-2.9.0/el7.3.1611/server/RPMS/x86_64/
If that is the case, a couple of options include using the Intel IB, or compiling a Lustre kernel that supports the Mellanox OFED.
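
For item 2, a rough sketch of how the option names could be checked against what the module declares. It assumes `modinfo -p` prints one "name:description (type)" line per parameter; the function name unknown_opts is hypothetical. It checks names only, so a valid name with a bad value or a conflicting combination would still pass.

```shell
#!/bin/sh
# unknown_opts PARAM_LIST OPTION [OPTION ...]
# PARAM_LIST: text in `modinfo -p` format, one "name:description" per line.
# Prints every name=value option whose name does not appear in PARAM_LIST.
unknown_opts() {
    params=$1
    shift
    for opt in "$@"; do
        name=${opt%%=*}    # strip "=value" to get the parameter name
        if ! printf '%s\n' "$params" | grep -q -- "^${name}:"; then
            printf '%s\n' "$opt"
        fi
    done
}

# Typical use on the affected node:
#   unknown_opts "$(modinfo -p ko2iblnd)" timeout=150 retry_count=7 map_on_demand=32
```

An empty result only means every option name is known to the module, which is why confirming item 1 first is still the cheaper test.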

Hope this helps.
Brett

On Sun, Nov 12, 2017 at 1:16 AM, Parag Khuraswar <parag_k at citilindia.com> wrote:
Hi Brett,

I am using MOFED “MLNX_OFED_LINUX-4.1-1.0.2.0” with kernel “3.10.0-514.el7.x86_64”

Output of modprobe -v ko2iblnd:

[root@mds1 ~]# modprobe -v ko2iblnd
install /usr/sbin/ko2iblnd-probe require_privileged_port=0 use_privileged_port=0 timeout=150 retry_count=7 map_on_demand=32 peer_credits=63 concurrent_sends=63 ntx=32768 credits=32768 fmr_pool_size=8193
insmod /lib/modules/3.10.0-514.el7.x86_64/extra/lustre/net/ko2iblnd.ko require_privileged_port=0 use_privileged_port=0 timeout=150 retry_count=7 map_on_demand=32 peer_credits=63 concurrent_sends=63 ntx=32768 credits=32768 fmr_pool_size=8193
modprobe: ERROR: could not insert 'ko2iblnd': Invalid argument
modprobe: ERROR: Error running install command for ko2iblnd
modprobe: ERROR: could not insert 'ko2iblnd': Operation not permitted
[root@mds1 ~]#


Regards,
Parag


From: Brett Lee [mailto:brettlee.lustre at gmail.com]
Sent: Saturday, November , 2017 8:08 PM
To: Parag Khuraswar
Cc: Mannthey, Keith; lustre-discuss at lists.lustre.org
Subject: Re: [lustre-discuss] error while configuring lnet

Hi Parag,

You may need to confirm that the in-kernel IB and the IB in the kernel module "match" (are compatible).  I think that loading the module (`sudo modprobe -v ko2iblnd`) may be sufficient to verify the match (it's been a while, others may correct me).

Please indicate which kernel and which IB you are using.

Brett

On Fri, Nov 10, 2017 at 8:30 PM, Parag Khuraswar <parag_k at citilindia.com> wrote:
Hi Keith,

Below are the errors I get while adding the LNet network and mounting the MDT.

dmesg logs while adding lnet
=========================================
[317831.432182] LNetError: 28362:0:(api-ni.c:1861:lnet_startup_lndnet()) Can't load LND o2ib, module ko2iblnd, rc=256
=========================================



dmesg logs while mounting mdt
==========================================
[290476.172602] LNetError: 23040:0:(api-ni.c:1861:lnet_startup_lndnet()) Can't load LND o2ib, module ko2iblnd, rc=256
[317478.730515] LDISKFS-fs (dm-1): mounted filesystem with ordered data mode. Opts: errors=remount-ro
[317480.166277] LDISKFS-fs (dm-1): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc
[317480.313296] LustreError: 28268:0:(ldlm_lib.c:483:client_obd_setup()) can't add initial connection
[317480.313600] LustreError: 28268:0:(obd_config.c:608:class_setup()) setup MGC10.2.1.204@o2ib failed (-2)
[317480.313603] LustreError: 28268:0:(obd_mount.c:202:lustre_start_simple()) MGC10.2.1.204@o2ib setup error -2
[317480.313632] LustreError: 28268:0:(obd_mount_server.c:1573:server_put_super()) no obd home-MDT0000
[317480.313635] LustreError: 28268:0:(obd_mount_server.c:132:server_deregister_mount()) home-MDT0000 not registered
[317480.433934] Lustre: server umount home-MDT0000 complete
[317480.433940] LustreError: 28268:0:(obd_mount.c:1504:lustre_fill_super()) Unable to mount  (-2)
==========================================
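
A small sketch for reading the numbers in these logs. It assumes rc=256 is the raw wait status of the modprobe helper the kernel spawns (so 256 would mean modprobe exited with status 1), and it maps the (-2) above and the -22 that lnetctl reports elsewhere in this thread to their Linux errno names. Both function names are illustrative only.

```shell
#!/bin/sh
# Print the exit code encoded in a wait-style status such as rc=256.
# Assumption: rc is a raw wait status, so the exit code sits in bits 8-15.
helper_exit_code() {
    echo $(( ($1 >> 8) & 255 ))
}

# Map the two negative return codes seen in this thread to Linux errno names.
errno_name() {
    case "$1" in
        -2)  echo "ENOENT (No such file or directory)" ;;
        -22) echo "EINVAL (Invalid argument)" ;;
        *)   echo "unknown" ;;
    esac
}

helper_exit_code 256   # prints 1
errno_name -22         # prints EINVAL (Invalid argument)
errno_name -2          # prints ENOENT (No such file or directory)
```

Read this way, the MGC setup failure with -2 looks like a consequence of the o2ib LND never loading rather than a separate problem.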


Regards,
Parag


From: Mannthey, Keith [mailto:keith.mannthey at intel.com]
Sent: Saturday, November , 2017 2:06 AM
To: Parag Khuraswar; lustre-discuss at lists.lustre.org
Subject: RE: [lustre-discuss] error while configuring lnet

If you have an ib0 device, check dmesg for more hints on what is going wrong.

Thanks,
Keith
From: Parag Khuraswar [mailto:parag_k at citilindia.com]
Sent: Friday, November 10, 2017 10:59 AM
To: Mannthey, Keith <keith.mannthey at intel.com>; lustre-discuss at lists.lustre.org
Subject: RE: [lustre-discuss] error while configuring lnet

Hi,

Basically, I am trying to add an LNet network. The delete was just a test to see whether it works or not.
Mainly, I want to add the o2ib network, which gives the error “Invalid argument”.
==================
[root@mds2 ~]# lnetctl net add --net o2ib --if ib0
add:
    - net:
          errno: -22
          descr: "cannot add network: Invalid argument"
================
I really cannot understand which argument in my command is invalid.
I am able to ping over the ib0 network.

Regards,
Parag


From: Mannthey, Keith [mailto:keith.mannthey at intel.com]
Sent: Friday, November , 2017 10:51 PM
To: Parag Khuraswar; lustre-discuss at lists.lustre.org
Subject: RE: [lustre-discuss] error while configuring lnet

What are you trying to accomplish?

From below:

10.1.1.205@tcp is on 0@lo, not eno1, and in general you should not need the “--if” option to delete a fabric.

Try: # lnetctl net del --net tcp

Can you do a normal ping over ib0?

“dmesg” can sometimes provide greater detail about errors like this.

Thanks,
Keith


From: lustre-discuss [mailto:lustre-discuss-bounces at lists.lustre.org] On Behalf Of Parag Khuraswar
Sent: Friday, November 10, 2017 9:10 AM
To: lustre-discuss at lists.lustre.org
Subject: [lustre-discuss] error while configuring lnet

Hi,

I am trying to add an LNet network but am getting the error below.
======================
[root@mds2 ~]# lnetctl net show
net:
    - net type: lo
      local NI(s):
        - nid: 0@lo
          status: up
    - net type: tcp
      local NI(s):
        - nid: 10.1.1.205@tcp
          status: up
[root@mds2 ~]# lnetctl net add --net o2ib --if ib0
add:
    - net:
          errno: -22
          descr: "cannot add network: Invalid argument"
[root@mds2 ~]# lnetctl net del --net tcp --if eno1
del:
    - net:
          errno: -22
          descr: "cannot del network: Invalid argument"
[root@mds2 ~]# lctl list_nids
10.1.1.205@tcp
[root@mds2 ~]#
====================================


Regards,
Parag


