[lustre-discuss] lnet routing issue - 2.12.5 client with 2.10.3 server

Mark Lundie mark.lundie at manchester.ac.uk
Tue Dec 1 04:58:27 PST 2020


Hi Aurélien,

Many thanks! Sorry I missed that. I'll try disabling discovery as suggested.

Thanks,

Mark
________________________________
From: Degremont, Aurelien <degremoa at amazon.com>
Sent: 01 December 2020 12:42
To: Mark Lundie <mark.lundie at manchester.ac.uk>; fırat yılmaz <firatyilmazz at gmail.com>
Cc: lustre-discuss at lists.lustre.org <lustre-discuss at lists.lustre.org>
Subject: Re: [lustre-discuss] lnet routing issue - 2.12.5 client with 2.10.3 server


This is a known issue; see https://jira.whamcloud.com/browse/LU-11840 and https://jira.whamcloud.com/browse/LU-13548



Aurélien



From: lustre-discuss <lustre-discuss-bounces at lists.lustre.org> on behalf of Mark Lundie <mark.lundie at manchester.ac.uk>
Date: Tuesday, 1 December 2020 at 13:16
To: fırat yılmaz <firatyilmazz at gmail.com>
Cc: "lustre-discuss at lists.lustre.org" <lustre-discuss at lists.lustre.org>
Subject: RE: [EXTERNAL] [lustre-discuss] lnet routing issue - 2.12.5 client with 2.10.3 server






Hi Firat,

Thanks for your reply. Apologies if I am being silly here, but there is no route configured for that network. We have the networks tcp (10.110.0.0/16) and tcp1 (10.10.0.0/16). The servers have interfaces on both, but the clients only have an interface on tcp1. I'm not sure why the client is trying to route to 10.110.0.21@tcp:



client # mount /net/lustre/
mount.lustre: mount hmeta1@tcp1:hmeta2@tcp1:/lustre at /net/lustre failed: Input/output error
Is the MGS running?



hmeta1 resolves to 10.10.0.91, on tcp1.



Thanks,



Mark

________________________________

From: fırat yılmaz <firatyilmazz at gmail.com>
Sent: 01 December 2020 11:55
To: Mark Lundie <mark.lundie at manchester.ac.uk>
Cc: lustre-discuss at lists.lustre.org <lustre-discuss at lists.lustre.org>
Subject: Re: [lustre-discuss] lnet routing issue - 2.12.5 client with 2.10.3 server



Hi Mark,



[Tue Dec  1 11:07:55 2020] LNetError: 2127:0:(lib-move.c:1999:lnet_handle_find_routed_path()) no route to 10.110.0.21@tcp from <?>



I would suggest checking lnetctl routing show, removing the route to 10.110.0.21@tcp, and then trying to mount again.

https://wiki.lustre.org/LNet_Router_Config_Guide







On Tue, Dec 1, 2020 at 2:41 PM Mark Lundie <mark.lundie at manchester.ac.uk> wrote:

Hi all,



I've just run into an issue mounting on a newly upgraded client running 2.12.5 with 2.10.3 servers. Just to give some background, we're about to replace our existing Lustre storage, but will run it concurrently with the replacement for a couple of months. We'll be running 2.12.5 server on the new MDS and OSSs and I plan to update all clients to the same version. I would like to avoid updating the existing servers, though.



The problem is this. The servers have two tcp LNET networks, tcp and tcp1, on separate subnets and VLANs. The clients only see tcp1 (a small number are also on tcp3, routed via two LNet routers), which has been fine until now. The 2.12.5 client, however, is trying to mount over tcp. 2.10.3 to 2.12.5 is obviously a bit of a jump, but does anyone have any ideas on what has changed and what I could do here, please?



meta# lnetctl net show
net:
    - net type: lo
      local NI(s):
        - nid: 0@lo
          status: up
    - net type: tcp
      local NI(s):
        - nid: 10.110.0.21@tcp
          status: up
          interfaces:
              0: bond0.22
    - net type: tcp1
      local NI(s):
        - nid: 10.10.0.91@tcp1
          status: up
          interfaces:
              0: bond0



meta# lnetctl route show
route:
    - net: tcp2
      gateway: 10.10.0.254@tcp1
    - net: tcp3
      gateway: 10.10.0.254@tcp1



client# lnetctl net show
net:
    - net type: lo
      local NI(s):
        - nid: 0@lo
          status: up
    - net type: o2ib
      local NI(s):
        - nid: 10.12.170.47@o2ib
          status: up
          interfaces:
              0: ib0
    - net type: tcp1
      local NI(s):
        - nid: 10.10.170.47@tcp1
          status: up
          interfaces:
              0: em1



[Tue Dec  1 11:07:55 2020] LNetError: 2127:0:(lib-move.c:1999:lnet_handle_find_routed_path()) no route to 10.110.0.21@tcp from <?>
[Tue Dec  1 11:08:01 2020] LustreError: 1792:0:(mgc_request.c:249:do_config_log_add()) MGC10.10.0.91@tcp1: failed processing log, type 1: rc = -5
[Tue Dec  1 11:08:08 2020] LustreError: 2169:0:(mgc_request.c:599:do_requeue()) failed processing log: -5
[Tue Dec  1 11:08:19 2020] LNetError: 2127:0:(lib-move.c:1999:lnet_handle_find_routed_path()) no route to 10.110.0.22@tcp from <?>
[Tue Dec  1 11:08:30 2020] LustreError: 15c-8: MGC10.10.0.91@tcp1: The configuration from log 'lustre-client' failed (-5). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
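
For anyone triaging similar logs, the NIDs the client failed to reach can be pulled out of the "no route to" lines mechanically. The following is only a minimal Python sketch: the function name no_route_targets is hypothetical (not part of any Lustre tool), and the regular expression assumes the message format seen in the log lines above.

```python
import re

# Matches "no route to <addr>@<net> from"; also accepts the " at " form
# that some list archives substitute for "@".
NO_ROUTE = re.compile(r"no route to (\S+?)(?: at |@)(\S+) from")

def no_route_targets(log_text):
    """Return the distinct NIDs named in 'no route to' errors, in first-seen order."""
    seen = []
    for match in NO_ROUTE.finditer(log_text):
        nid = f"{match.group(1)}@{match.group(2)}"
        if nid not in seen:
            seen.append(nid)
    return seen

# Two of the LNetError lines from the client log, shortened to the relevant part.
log_text = """
LNetError: 2127:0:(lib-move.c:1999:lnet_handle_find_routed_path()) no route to 10.110.0.21@tcp from <?>
LNetError: 2127:0:(lib-move.c:1999:lnet_handle_find_routed_path()) no route to 10.110.0.22@tcp from <?>
"""

print(no_route_targets(log_text))
```

Both addresses reported here are on the tcp network, which the client has no interface on.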



client# lctl ping 10.10.0.91@tcp1
12345-0@lo
12345-10.110.0.21@tcp
12345-10.10.0.91@tcp1
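
The ping reply lists every NID the server advertises, including one on tcp, where the client has no interface. As a rough illustration, the Python sketch below compares the NIDs in an lctl ping reply against the client's local networks. The function name unreachable_nids is hypothetical, and the parsing assumes the "12345-<addr>@<net>" line format shown above.

```python
def unreachable_nids(ping_output, local_nets):
    """Return peer NIDs that sit on networks this node has no interface on."""
    missing = []
    for line in ping_output.splitlines():
        # Some list archives render "@" as " at "; normalise before parsing.
        line = line.strip().replace(" at ", "@")
        if "@" not in line:
            continue
        # Drop the leading numeric prefix ("12345-") if present.
        nid = line.split("-", 1)[1] if "-" in line else line
        net = nid.rsplit("@", 1)[1]
        # The loopback NID is always local, so it is never reported.
        if net != "lo" and net not in local_nets:
            missing.append(nid)
    return missing

# Reply from the server, and the networks from the client's "lnetctl net show".
ping_output = """
12345-0@lo
12345-10.110.0.21@tcp
12345-10.10.0.91@tcp1
"""
client_nets = {"o2ib", "tcp1"}

print(unreachable_nids(ping_output, client_nets))
```

With the values from this thread, the only NID flagged is the one on tcp, which matches the address in the "no route to" errors.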



Any suggestions will be greatly appreciated!



Many thanks,



Mark



Dr Mark Lundie | Research IT Systems Administrator | Research IT | Directorate of IT Services | B39, Sackville Street Building | The University of Manchester | Manchester | M1 3WE | 0161 275 8403 | ri.itservices.manchester.ac.uk



Working Hours: Tues - Thurs 0730-1730; Fri 0730-1630
