[lustre-discuss] systemd lnet/rdma conflict

Mohr Jr, Richard Frank rmohr at utk.edu
Thu Jul 16 13:34:34 PDT 2020



> On Jul 16, 2020, at 2:46 PM, Christopher Benjamin Coffey <Chris.Coffey at nau.edu> wrote:
> 
> 
> I'm trying to get Lustre and RDMA set up on an el8 system. I can't get systemd to shut down the two services, lnet and rdma, correctly without hanging the system. I've tried many things in the rdma.service and lnet.service files, but the issue persists. My service files are below. Does anyone know how to fix this?

Yup, ran into the same thing.  See suggestion below.

> 
> ---------
> [Unit]
> Description=lnet management
> 
> Requires=network-online.target
> After=network-online.target rdma.service
> Wants=rdma.service
> 
> ConditionPathExists=!/proc/sys/lnet/
> 
> [Service]
> Type=oneshot
> RemainAfterExit=true
> ExecStart=/sbin/modprobe lnet
> ExecStart=/usr/sbin/lnetctl lnet configure
> ExecStart=/usr/sbin/lnetctl import /etc/lnet.conf
> ExecStop=/usr/sbin/lnetctl lnet unconfigure
> ExecStop=/usr/sbin/lustre_rmmod
> TimeoutStopSec=30
> 
> [Install]
> WantedBy=multi-user.target


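Try adding “BindsTo=rdma.service” to the [Unit] section of the lnet service file.  This should force the lnet service to be stopped whenever the rdma service is stopped.  Since the file already has After=rdma.service, and systemd stops units in the reverse of their start order, lnet should then be shut down before rdma.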
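
As a rough sketch (not tested on your system), the [Unit] section of your lnet service file would then look something like this.  Everything except the BindsTo= line is copied from your file; BindsTo= already implies a Requires=-style dependency, so the Wants= line becomes redundant but should be harmless to keep:

---------
[Unit]
Description=lnet management

Requires=network-online.target
After=network-online.target rdma.service
Wants=rdma.service
# Added line: stop lnet whenever rdma.service is stopped
BindsTo=rdma.service

ConditionPathExists=!/proc/sys/lnet/
---------

After editing the file, run "systemctl daemon-reload" so systemd picks up the change.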

--
Rick Mohr
Senior HPC System Administrator
Joint Institute for Computational Sciences
University of Tennessee





