[lustre-discuss] lnet fails to start on reboot
Mannthey, Keith
keith.mannthey at intel.com
Mon Aug 13 14:27:50 PDT 2018
Are you sure the fabric is up when lnet starts at boot? Double-check the order in which your services start and make sure LNet waits for the fabric/network before starting.
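As a rough sketch of what that ordering could look like, a systemd drop-in along these lines would delay lnet.service until the network is reported online. The drop-in file name is hypothetical, and it assumes that whatever brings up the LNet interface is covered by network-online.target (which in turn needs a wait-online service such as NetworkManager-wait-online.service to be enabled). An InfiniBand fabric may need an additional After= on the unit that starts the RDMA stack on your system.

```ini
# /etc/systemd/system/lnet.service.d/wait-for-network.conf
# (hypothetical drop-in name; not part of the Lustre packages)
[Unit]
# Assumption: the interface listed in /etc/lnet.conf is up once
# network-online.target is reached.
Wants=network-online.target
After=network-online.target
```

After adding it, run systemctl daemon-reload and reboot; systemd-analyze critical-chain lnet.service shows what the unit actually waited for.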
Thanks,
Keith
> -----Original Message-----
> From: lustre-discuss [mailto:lustre-discuss-bounces at lists.lustre.org] On Behalf
> Of David Rackley
> Sent: Monday, August 13, 2018 2:14 PM
> To: lustre-discuss at lists.lustre.org
> Cc: sciops <sciops at jlab.org>
> Subject: [lustre-discuss] lnet fails to start on reboot
>
> Hello,
> I have built and installed Lustre client 2.10.4-1 on CentOS 7.3
> (kernel 3.10.0-514.el7.x86_64), and on reboot lnet fails with:
> root@scissd1801:~] systemctl status lnet.service
> ● lnet.service - lnet management
>    Loaded: loaded (/usr/lib/systemd/system/lnet.service; enabled; vendor preset: disabled)
>    Active: failed (Result: exit-code) since Mon 2018-08-13 16:54:31 EDT; 16min ago
>   Process: 2334 ExecStart=/usr/sbin/lnetctl import /etc/lnet.conf (code=exited, status=254)
>   Process: 2331 ExecStart=/usr/sbin/lnetctl lnet configure (code=exited, status=0/SUCCESS)
>   Process: 2071 ExecStart=/usr/sbin/modprobe lnet (code=exited, status=0/SUCCESS)
>  Main PID: 2334 (code=exited, status=254)
>
> Aug 13 16:54:31 scissd1801 lnetctl[2334]: - net:
> Aug 13 16:54:31 scissd1801 lnetctl[2334]: errno: -100
> Aug 13 16:54:31 scissd1801 lnetctl[2334]: descr: "cannot add network: Network is down"
> Aug 13 16:54:31 scissd1801 lnetctl[2334]: - numa_range:
> Aug 13 16:54:31 scissd1801 lnetctl[2334]: errno: 0
> Aug 13 16:54:31 scissd1801 lnetctl[2334]: descr: "success"
> Aug 13 16:54:31 scissd1801 systemd[1]: lnet.service: main process exited, code=exited, status=254/n/a
> Aug 13 16:54:31 scissd1801 systemd[1]: Failed to start lnet management.
> Aug 13 16:54:31 scissd1801 systemd[1]: Unit lnet.service entered failed state.
> Aug 13 16:54:31 scissd1801 systemd[1]: lnet.service failed.
>
> The /etc/lnet.conf file exists, and when I manually run /usr/sbin/lnetctl
> import /etc/lnet.conf it succeeds: lnet works and I can mount Lustre as
> expected.
>
> Any ideas?
>
> David Rackley
> CC Sci Comp Sys Admin
> rackley at jlab.org
> Phone: 757.269.7041
> FAX: 757.269.6248
> TJNAF - Thomas Jefferson National Accelerator Facility