[lustre-discuss] Lnet not starting

Chris Horn hornc at cray.com
Fri Oct 13 12:24:09 PDT 2017


I’m not sure what you mean by “shows as partial”. I can’t find a (systemd) lustre.service file that is packaged with community Lustre. Did you create your own? I would say it is good practice to load the modules explicitly, even though it shouldn’t be strictly necessary: performing a “mount -t lustre…” should pull in any necessary modules. If that isn’t happening, maybe you just need to run depmod.
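
To illustrate the depmod and explicit module-loading suggestion, here is a minimal shell sketch. It is not something shipped with Lustre: the function names missing_modules and ensure_lustre_modules are hypothetical, and the sketch assumes the modules of interest are lnet and lustre, as named in the question.

```shell
#!/bin/sh
# Hypothetical helpers; not part of any Lustre package.

# Read a /proc/modules-style listing on stdin (the module name is the
# first field) and print each module named in the arguments that is
# not present in that listing.
missing_modules() {
    loaded=$(awk '{print $1}')
    for mod in "$@"; do
        printf '%s\n' "$loaded" | grep -Fqx "$mod" || printf '%s\n' "$mod"
    done
}

# Rebuild the module dependency map, then load whatever is still missing.
# Needs root; defined here but deliberately not called.
ensure_lustre_modules() {
    depmod -a || return 1
    for mod in $(missing_modules lnet lustre < /proc/modules); do
        modprobe "$mod" || return 1
    done
}
```

Calling ensure_lustre_modules as root from a boot-time script would avoid typing the modprobe commands after every reboot. On RHEL 6, executable scripts named *.modules under /etc/sysconfig/modules/ are run at boot, which is one place such a call could live.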

Chris Horn

From: Ravi Konila <ravibhatk at gmail.com>
Reply-To: Ravi Konila <ravibhatk at gmail.com>
Date: Friday, October 13, 2017 at 9:29 AM
To: Chris Horn <hornc at cray.com>, 'Lustre User Discussion Mailing List' <lustre-discuss at lists.lustre.org>
Subject: Re: [lustre-discuss] Lnet not starting

Hi Chris

Following up on my email below:
why does “service lustre status” show as partial?

Thanks for the info. I have finally been able to install the Lustre servers (MGS and MDS). I used the native IB drivers that came with RHEL 6.7.
My question is: do I need to run modprobe lustre and modprobe lnet every time the Lustre server reboots?
What I observed is that “service lustre start” does not come up without modprobe lustre.

Any suggestions?

Regards

Ravi Konila
Sr. Technical Consultant
From: Chris Horn
Sent: Friday, October 13, 2017 12:18 AM
To: Ravi Konila ; Parag Khuraswar ; 'Lustre User Discussion Mailing List'
Subject: Re: [lustre-discuss] Lnet not starting

The pre-built rpms are most likely compiled against the in-kernel IB drivers. If you’re using the MOFED drivers you’ll need to recompile Lustre. The instructions here may help you out http://wiki.lustre.org/Compiling_Lustre
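
As a rough sketch of what the rebuild involves: Lustre's configure script accepts a --with-o2ib option that points the build at an external OFED source tree. The helper name o2ib_flag is hypothetical, and the default path /usr/src/ofa_kernel/default is an assumption about where MOFED placed its kernel sources, so verify it on your system before relying on it.

```shell
#!/bin/sh
# Hypothetical helper: print the configure flag that points the Lustre
# build at an external OFED tree, or fail if the tree is not there.
# The default path is an assumption; check where your MOFED install
# placed its kernel sources.
o2ib_flag() {
    ofed_src=${1:-/usr/src/ofa_kernel/default}
    if [ -d "$ofed_src" ]; then
        printf '%s\n' "--with-o2ib=$ofed_src"
    else
        printf 'OFED sources not found at %s\n' "$ofed_src" >&2
        return 1
    fi
}

# From the top of a Lustre source tree, something like:
#   ./configure $(o2ib_flag) && make rpms
```

The wiki page above remains the reference for the full procedure, including the kernel source options that this sketch leaves out.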

Chris Horn

From: Ravi Konila <ravibhatk at gmail.com>
Reply-To: Ravi Konila <ravibhatk at gmail.com>
Date: Thursday, October 12, 2017 at 1:33 PM
To: Chris Horn <hornc at cray.com>, Parag Khuraswar <parag_k at citilindia.com>, 'Lustre User Discussion Mailing List' <lustre-discuss at lists.lustre.org>
Subject: Re: [lustre-discuss] Lnet not starting

Hi

I am using pre-built rpms.

Regards

Ravi Konila
From: Chris Horn
Sent: Thursday, October 12, 2017 10:51 PM
To: Ravi Konila ; Parag Khuraswar ; 'Lustre User Discussion Mailing List'
Subject: Re: [lustre-discuss] Lnet not starting

Are you compiling Lustre yourself or using pre-built rpms?

Chris Horn

From: Ravi Konila <ravibhatk at gmail.com>
Reply-To: Ravi Konila <ravibhatk at gmail.com>
Date: Thursday, October 12, 2017 at 11:40 AM
To: Chris Horn <hornc at cray.com>, Parag Khuraswar <parag_k at citilindia.com>, 'Lustre User Discussion Mailing List' <lustre-discuss at lists.lustre.org>
Subject: Re: [lustre-discuss] Lnet not starting

Hi Chris

I installed RHEL 6.7, MLNX_OFED_LINUX-3.4-1.0.0.0-rhel6.7-x86_64 and then Lustre 2.8 on my Lustre MDS/MGT/OSS servers.

My ib0 is working fine and I can ping other nodes.
My lustre.conf file contains:
options lnet networks=o2ib(ib0)

With this, if I run “service lnet start”, it fails with the error:
LNET configure error 22: Invalid argument

dmesg gives the output below (I captured only the last line, but there are many lines reporting symbol errors or similar):

LNetError: 16770:0:(api-ni.c:1276:lnet_startup_lndni()) Can't load LND o2ib, module ko2iblnd, rc=256

If I specify tcp in lustre.conf, it works fine.

I have reinstalled Lustre and then the Mellanox OFED driver, but the problem is still the same: I am not able to bring up InfiniBand with Lustre LNet.

Regards

Ravi Konila
Sr. Technical Consultant



From: Chris Horn
Sent: Thursday, October 12, 2017 9:02 PM
To: Ravi Konila ; Parag Khuraswar ; 'Lustre User Discussion Mailing List'
Subject: Re: [lustre-discuss] Lnet not starting

dmesg output should provide more information about the “Invalid argument” error that you are seeing, but my guess would be that Lustre was compiled against a different IB stack than what you have installed.
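
One way to pull that information out of dmesg is sketched below. It assumes the kernel reports module-loading failures in its usual "Unknown symbol" and "disagrees about version of symbol" forms; the function name unresolved_symbols is hypothetical, and ko2iblnd is used as the default module only because it is the one named in the LNetError message in this thread.

```shell
#!/bin/sh
# Hypothetical helper: read kernel log text on stdin and list, once each,
# the symbols that the given module (default: ko2iblnd) could not resolve
# or whose version it disagrees about.
unresolved_symbols() {
    mod=${1:-ko2iblnd}
    grep -E "${mod}: (Unknown symbol|disagrees about version of symbol) " |
        sed 's/.*symbol \([A-Za-z0-9_]*\).*/\1/' |
        sort -u
}

# Typical use:
#   dmesg | unresolved_symbols ko2iblnd
```

If the symbols listed belong to the InfiniBand or RDMA layers, that would be consistent with Lustre having been built against a different IB stack than the one installed.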

Chris Horn

From: lustre-discuss <lustre-discuss-bounces at lists.lustre.org> on behalf of Ravi Konila <ravibhatk at gmail.com>
Reply-To: Ravi Konila <ravibhatk at gmail.com>
Date: Thursday, October 12, 2017 at 7:58 AM
To: Parag Khuraswar <parag_k at citilindia.com>, 'Lustre User Discussion Mailing List' <lustre-discuss at lists.lustre.org>
Subject: Re: [lustre-discuss] Lnet not starting

Hi Parag

I am facing the same issue too, with RHEL 6.7 and Lustre 2.8. I am now also trying RHEL 7.3 and Lustre 2.10.0.
I am planning to install the Mellanox OFED driver with the 3.4 stack. It looks like there is some problem with the OFED 4.x stack and Lustre 2.10.0.
Let me try that and update.
When I run “service lnet start” it gives LNET configure error 22: Invalid argument
but it works fine with tcp.

Regards

Ravi Konila
Sr. Technical Consultant
Maruti Suzuki India Ltd


From: Parag Khuraswar
Sent: Thursday, October 12, 2017 6:11 PM
To: 'Lustre User Discussion Mailing List'
Subject: [lustre-discuss] Lnet not starting

Hi,

I am installing Lustre 2.10.0 on RHEL 7.3.
IB is working fine but LNet is not coming up. The Lustre service is running.
ibstat also shows the link as up and active.
The lustre and lnet modules are also loaded.

Regards,
Parag

