[lustre-discuss] lnet fails to start on reboot

Mannthey, Keith keith.mannthey at intel.com
Mon Aug 13 14:27:50 PDT 2018


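Are you sure the fabric is up when lnet starts at boot? The "Network is down" error (errno -100) in your log, together with the manual import succeeding later, points in that direction. Double-check the order in which your services start, and make sure LNet waits for the fabric/network before starting.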
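
One way to enforce that ordering is a systemd drop-in for lnet.service. The following is only a sketch under a few assumptions: the drop-in file name is made up for illustration, network-online.target is only meaningful if a wait-online service (for example NetworkManager-wait-online.service) is enabled on the node, and the unit that brings up an InfiniBand fabric differs between installations, so that line is left commented out.

```ini
# /etc/systemd/system/lnet.service.d/wait-for-network.conf
# (hypothetical file name; any *.conf in this directory is read)
[Unit]
# Pull in network-online.target and start lnet only after it is reached.
Wants=network-online.target
After=network-online.target
# Assumption: on an InfiniBand fabric the RDMA stack is started by its own
# unit (often rdma.service or openibd.service). Uncomment and adjust this
# to whatever unit your system actually uses.
#After=rdma.service
```

After creating the file, run systemctl daemon-reload and reboot to see whether the import in lnet.service now succeeds. systemd-analyze critical-chain lnet.service shows which units lnet was actually ordered after.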

 Thanks,
 Keith 



> -----Original Message-----
> From: lustre-discuss [mailto:lustre-discuss-bounces at lists.lustre.org] On Behalf
> Of David Rackley
> Sent: Monday, August 13, 2018 2:14 PM
> To: lustre-discuss at lists.lustre.org
> Cc: sciops <sciops at jlab.org>
> Subject: [lustre-discuss] lnet fails to start on reboot
> 
> Hello,
> I have built and installed Lustre client 2.10.4-1 with CentOS 7.3
> (3.10.0-514.el7.x86_64) and on reboot lnet fails with:
>  root at scissd1801:~] systemctl status lnet.service
> ● lnet.service - lnet management
>    Loaded: loaded (/usr/lib/systemd/system/lnet.service; enabled; vendor preset: disabled)
>    Active: failed (Result: exit-code) since Mon 2018-08-13 16:54:31 EDT; 16min ago
>   Process: 2334 ExecStart=/usr/sbin/lnetctl import /etc/lnet.conf (code=exited, status=254)
>   Process: 2331 ExecStart=/usr/sbin/lnetctl lnet configure (code=exited, status=0/SUCCESS)
>   Process: 2071 ExecStart=/usr/sbin/modprobe lnet (code=exited, status=0/SUCCESS)
>  Main PID: 2334 (code=exited, status=254)
> 
> Aug 13 16:54:31 scissd1801 lnetctl[2334]: - net:
> Aug 13 16:54:31 scissd1801 lnetctl[2334]: errno: -100
> Aug 13 16:54:31 scissd1801 lnetctl[2334]: descr: "cannot add network: Network is down"
> Aug 13 16:54:31 scissd1801 lnetctl[2334]: - numa_range:
> Aug 13 16:54:31 scissd1801 lnetctl[2334]: errno: 0
> Aug 13 16:54:31 scissd1801 lnetctl[2334]: descr: "success"
> Aug 13 16:54:31 scissd1801 systemd[1]: lnet.service: main process exited, code=exited, status=254/n/a
> Aug 13 16:54:31 scissd1801 systemd[1]: Failed to start lnet management.
> Aug 13 16:54:31 scissd1801 systemd[1]: Unit lnet.service entered failed state.
> Aug 13 16:54:31 scissd1801 systemd[1]: lnet.service failed.
> 
> The /etc/lnet.conf file exists, and when I manually run /usr/sbin/lnetctl
> import /etc/lnet.conf it succeeds: lnet works and I can mount Lustre as
> expected.
> 
> Any ideas?
> 
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> David Rackley         |      ********   **  **    **  ********  ********
> CC Sci Comp Sys Admin |        **      **  ***   **  **    **  **
> rackley at jlab.org      |       **      **  ** *  **  ********  ******
>                       |      **  *   **  **  * **  **    **  **
> Phone: 757.269.7041   |     **  ******  **   ***  **    **  **
> FAX:   757.269.6248   | TJNAF - Thomas Jefferson National Accelerator Facility
> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> _______________________________________________
> lustre-discuss mailing list
> lustre-discuss at lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org

