[lustre-discuss] Lustre 2.12 routing with MR and discovery off

Andreas Dilger adilger at dilger.ca
Sun Aug 30 19:09:19 PDT 2020


On Aug 26, 2020, at 4:37 PM, Faaland, Olaf P. <faaland1 at llnl.gov> wrote:
> 
> Does Lustre 2.12 require that routes for every intermediate network are defined, on every node on a path?
> 
> For example, given this Lustre network, where:
>  A-D are nodes and 1-6 are addresses
>  network tcp2 has only routers, no clients and no servers
> 
> A(1) -tcp1- (2)B(3) -tcp2- (4)C(5) -tcp3- (6)D
> 
> And configured routes:
> 
> A: options lnet routes="tcp3 2@tcp1"
> B: options lnet routes="tcp3 4@tcp2"
> C: options lnet routes="tcp1 3@tcp2"
> D: options lnet routes="tcp1 5@tcp3"
> 
> With Lustre <= 2.10 we configured only these routes.  The only nodes that need to know tcp2 exists are those attached to it, so no routes to tcp2 are defined anywhere.
> 
> It looks to me like Lustre 2.12 attempts to send error notifications back to the original sender, so nodes A and D may end up receiving messages from NIDs on tcp2.  Nodes A and D would then need routes to tcp2 defined so that they can reply to those messages.

This is interesting.  I'm not an LNet expert, but it seems strange to me that
nodes other than "B" and "C" should care about the state of connections within
@tcp2 if they are not endpoints.  They should never be sending messages directly
to those nodes, and the LNet routers B/C knowing which connections/peers are
working should be enough for them to make routing decisions for A and D.
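
To make the configuration in question concrete, here is a toy Python model of
the example topology.  It is only a sketch and not LNet code: the node names,
networks and gateway NIDs come from the example above, while the function
names and the extra tcp2 routes (ROUTES_WITH_TCP2) are assumptions made up for
illustration.  It just checks, per node, which networks have neither a local
interface nor a configured route.

```python
# Toy model of the example topology; hypothetical names, not LNet code.

# Networks each node has a local interface on.
LOCAL_NETS = {
    "A": {"tcp1"},
    "B": {"tcp1", "tcp2"},
    "C": {"tcp2", "tcp3"},
    "D": {"tcp3"},
}

# Configured routes from the example: remote network -> gateway NID.
ROUTES = {
    "A": {"tcp3": "2@tcp1"},
    "B": {"tcp3": "4@tcp2"},
    "C": {"tcp1": "3@tcp2"},
    "D": {"tcp1": "5@tcp3"},
}

# Assumed variant: A and D also get a route to the intermediate network tcp2,
# using the gateway they already use for the far-side network.
ROUTES_WITH_TCP2 = {
    **ROUTES,
    "A": {**ROUTES["A"], "tcp2": "2@tcp1"},
    "D": {**ROUTES["D"], "tcp2": "5@tcp3"},
}


def can_send(node, dest_net, routes=ROUTES):
    """True if node is attached to dest_net or has a route configured for it."""
    return dest_net in LOCAL_NETS[node] or dest_net in routes[node]


def unreachable_nets(node, routes=ROUTES):
    """Sorted list of networks the node can neither reach locally nor route to."""
    all_nets = set().union(*LOCAL_NETS.values())
    return sorted(net for net in all_nets if not can_send(node, net, routes))


if __name__ == "__main__":
    for node in sorted(LOCAL_NETS):
        print(node, "original:", unreachable_nets(node),
              "with tcp2 routes:", unreachable_nets(node, ROUTES_WITH_TCP2))
```

With the original routes, only A and D have a network they cannot address
(tcp2), which matches the situation described: if either of them ever received
a message from a NID on tcp2, it would have no way to reply.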

Cheers, Andreas




