<div dir="ltr">On my systems, which use standard Ethernet (I'm in the cloud), I see no issues rebooting with 2.9. I did have a problem with the lnet driver not being able to grab the port at boot-up, so I backported the lnet systemd unit file from 2.10 to get around that.</div><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Aug 10, 2017 at 9:44 AM, Ben Evans <span dir="ltr"><<a href="mailto:bevans@cray.com" target="_blank">bevans@cray.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Are the InfiniBand drivers disappearing first? I know that used to be an<br>
issue.<br>
<br>
-Ben<br>
<br>
On 8/10/17, 8:59 AM, "lustre-discuss on behalf of Michael Di Domenico"<br>
<<a href="mailto:lustre-discuss-bounces@lists.lustre.org">lustre-discuss-bounces@lists.<wbr>lustre.org</a> on behalf of<br>
<div class="HOEnZb"><div class="h5"><a href="mailto:mdidomenico4@gmail.com">mdidomenico4@gmail.com</a>> wrote:<br>
<br>
>does anyone else have issues with issuing 'reboot' while having a lustre<br>
>mount?<br>
><br>
>we're running v2.9 clients on our workstations, but when a user goes<br>
>to reboot the machine (from the gui) the system stalls under systemd<br>
>while i presume it's attempting to unmount the filesystem.<br>
><br>
>what i see on the console is: systemd kicks in and starts unmounting<br>
>all the nfs shares we have, which works fine. but then it gets to lustre<br>
>and starts throwing connection errors on the console. it's almost as<br>
>if systemd raced itself stopping lustre, whereby lnet got yanked out<br>
>from under the mount before the unmount actually finished.<br>
><br>
>after five minutes or so, it looks like systemd threw in the towel and<br>
>gave up trying to unmount, but the system is stuck still trying to<br>
>execute more shutdown tasks.<br>
><br>
>when we mount lustre on the workstations, i have a script that figures<br>
>some stuff out, issues a service lnet start, and then issues a mount<br>
>command. this all works fine, but i'm not sure if that's why systemd<br>
>can't figure out what to do correctly.<br>
><br>
>and since this is during a shutdown phase, debugging this is<br>
>difficult. any thoughts?<br>
>_____________________________<wbr>__________________<br>
>lustre-discuss mailing list<br>
><a href="mailto:lustre-discuss@lists.lustre.org">lustre-discuss@lists.lustre.<wbr>org</a><br>
><a href="http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org" rel="noreferrer" target="_blank">http://lists.lustre.org/<wbr>listinfo.cgi/lustre-discuss-<wbr>lustre.org</a><br>
<br>
</div></div></blockquote></div><br></div>