[lustre-discuss] LNET ports and connections

Degremont, Aurelien degremoa at amazon.com
Thu Feb 20 09:18:26 PST 2020


Thanks. It feels like the theory is valid.
Ideally, to confirm it, I would need a way to manually force-close the socklnd socket so that the other peer has to re-establish it.
I could not find a way to do that for sockets opened by kernel threads.

On 19/02/2020 23:12, "NeilBrown" <neilb at suse.com> wrote:

    
    When LNet wants to send a message over a SOCKLND interface,
    ksocknal_launch_packet() is called.
    
    This calls ksocknal_launch_all_connections_locked(), which loops over
    all "routes" to the "peer" to make sure they all have "connections".
    If it finds a route without a connection (returned by
    ksocknal_find_connectable_route_locked()) it calls
    ksocknal_launch_connection_locked() which adds the connection request to
    ksnd_connd_routes, and wakes up the connd.  The connd thread will then
    make the connection.
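
The flow described above can be modelled in user space roughly as follows. This is only a sketch, not the kernel code: the `Route` and `Peer` classes, the `connecting` flag, and the Python function names are hypothetical simplifications, and `connd_routes` merely stands in for ksnd_connd_routes.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Route:
    """One way of reaching a peer (hypothetical, heavily simplified)."""
    addr: str
    connected: bool = False    # a connection already exists for this route
    connecting: bool = False   # assumption: marks a request already queued


@dataclass
class Peer:
    routes: list = field(default_factory=list)


# Stand-in for ksnd_connd_routes: routes waiting for the connd thread.
connd_routes = deque()


def find_connectable_route(peer):
    """Return a route with no connection and no pending request, or None."""
    for route in peer.routes:
        if not route.connected and not route.connecting:
            return route
    return None


def launch_all_connections(peer):
    """Queue a connection request for each route that lacks a connection.

    Returns the number of requests queued by this call.
    """
    queued = 0
    while (route := find_connectable_route(peer)) is not None:
        route.connecting = True
        connd_routes.append(route)  # the real code also wakes up the connd
        queued += 1
    return queued


def connd_run_once():
    """Model of the connd thread: establish every queued connection."""
    while connd_routes:
        route = connd_routes.popleft()
        route.connected = True      # the real code opens a TCP socket here
        route.connecting = False
```

In this model, a send to a peer whose only connection has been closed queues a new request, regardless of which side opened the original connection.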
    
    Hope that helps.
    
    NeilBrown
    
    
    
    On Wed, Feb 19 2020, Degremont, Aurelien wrote:
    
    > Thanks! That's really interesting.
    > Do you have a code pointer that could show where the code will establish this connection if missing?
    >
    > On 18/02/2020 23:34, "NeilBrown" <neilb at suse.com> wrote:
    >
    >     
    >     It is not true that:
    >        LNET will establish connections only if asked for by upper layers.
    >     
    >     or at least, not in the sense that the upper layers ask for a
    >     connection.
    >     Lustre knows nothing about connections.  Even LNet doesn't really know
    >     about connections. It is only at the socklnd level that connections mean
    >     much.
    >     
    >     Lustre and LNet are message-passing protocols.
    >     Lustre asks LNet to send a message to a given peer, and gives some
    >     details of the sort of reply to expect.
    >     LNet chooses a route and thus a network interface, and asks the LND to
    >     send the message.
    >     The socklnd LND will see if it already has a TCP connection.  If it
    >     does, it will use it.  If not, it will create one.
    >     
    >     So yes, it is exactly:
    >       possible that the server in this case opens the connection itself
    >       without waiting for the client to reconnect?
    >     
    >     NeilBrown
    >     
    >     
    >     On Tue, Feb 18 2020, Aurelien Degremont wrote:
    >     
    >     > Thanks for your reply.
    >     > I think I have a good enough understanding of LNET itself. My question was more about how LNET is being used by Lustre itself.
    >     >
    >     > LNET will establish connections only if asked for by upper layers.
    >     > When I was talking about client and server, I was talking about how Lustre was using it.
    >     >
    >     > As far as I understand, Lustre servers only contact clients when they need to send LDLM callbacks.
    >     > They do so through the socket already opened by the client (reverse import).
    >     > What I'm not sure about is what happens if that socket is closed. I thought the server would rather wait for the client to reconnect and, if it does not, more or less evict it.
    >     > Could it be possible that the server in this case opens the connection itself without waiting for the client to reconnect?
    >     >
    >     >
    >     > Aurélien
    >     >
    >     > On 18/02/2020 05:42, "NeilBrown" <neilb at suse.com> wrote:
    >     >
    >     >     
    >     >     LNet is a peer-to-peer protocol; it has no concept of client and server.
    >     >     If one host needs to send a message to another but doesn't already have
    >     >     a connection, it creates a new connection.
    >     >     I don't yet know enough specifics of the lustre protocol to be certain
    >     >     of the circumstances when a lustre server will need to initiate a message
    >     >     to a client, but I imagine that recalling a lock might be one.
    >     >     
    >     >     I think you should assume that any LNet node might receive a connection
    >     >     from any other LNet node (for which they share an LNet network), and
    >     >     that the connection could come from any port between 512 and 1023
    >     >     (LNET_ACCEPTOR_MIN_PORT to LNET_ACCEPTOR_MAX_PORT).
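
As a quick way to reason about firewall rules, the port range quoted above can be turned into a small check. This is a sketch under assumptions: the two LNET_ACCEPTOR_* values are the ones quoted in this message, 988 is the default listening port mentioned in the original question further down the thread, and the function name is hypothetical.

```python
# Values as quoted in this thread; adjust if your site configures others.
LNET_ACCEPTOR_MIN_PORT = 512
LNET_ACCEPTOR_MAX_PORT = 1023
LNET_LISTEN_PORT = 988  # default listening port per the original question


def is_plausible_lnet_connection(src_port, dst_port,
                                 listen_port=LNET_LISTEN_PORT):
    """Return True if the port pair matches what an LNet peer could use.

    The destination must be the listening port and the source must fall
    in the acceptor's privileged range, whichever host initiates.
    """
    return (dst_port == listen_port
            and LNET_ACCEPTOR_MIN_PORT <= src_port <= LNET_ACCEPTOR_MAX_PORT)
```

Because LNet is peer-to-peer, a rule built from this check would need to apply on clients as well as servers, not only to traffic arriving at servers.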
    >     >     
    >     >     NeilBrown
    >     >     
    >     >     
    >     >     
    >     >     On Mon, Feb 17 2020, Degremont, Aurelien wrote:
    >     >     
    >     >     > Hi all,
    >     >     >
    >     >     > From what I've understood so far, LNET listens on port 988 by default and peers connect to it using 1021-1023 TCP ports as source ports.
    >     >     > At Lustre level, servers listen on 988 and clients connect to them using the same source ports 1021-1023.
    >     >     > So only accepting connections to port 988 on server side sounded pretty safe to me. However, I've seen connections from 1021-1023 to 988, from server hosts to client hosts sometimes.
    >     >     > I can't understand what mechanism could trigger these connections. Did I miss something?
    >     >     >
    >     >     > Thanks
    >     >     >
    >     >     > Aurélien
    >     >     >
    >     >     
    >     
    


