[lustre-discuss] LNET ports and connections

Degremont, Aurelien degremoa at amazon.com
Wed Feb 19 09:31:22 PST 2020


Hi Cory

I'm running 2.10 there, and idle disconnect is not available in 2.10.
The steps I'm thinking of so far are:

1. The client connects and opens a TCP socket to the server.
2. The client acquires an LDLM lock.
3. The TCP connection gets broken.
4. Soon after, a conflicting lock is enqueued, so the server needs to cancel the client's lock.
5. The server tries to send an LDLM callback through LNET.
6. LNET initiates a TCP connection in the other direction, since the existing socket is gone.

If the time between steps 2 and 4 is long enough, either the client will have time to reconnect (next ping?) or the server will evict the client because it has not received any message from it for a while.

This is difficult to reproduce, as I don't know how to force-close the socket that socklnd has opened.
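
Since this is hard to trigger on a real system, here is a toy model of the six steps above. It is only a sketch, assuming (as Neil describes below) that the LND reuses an existing connection and otherwise creates one itself. ToyLnd, send, drop and run_scenario are hypothetical names for illustration, not Lustre or socklnd code.

```python
class ToyLnd:
    """Toy stand-in for a socket LND: one connection table per node."""

    def __init__(self, name, network):
        self.name = name
        self.network = network  # shared dict: node name -> ToyLnd
        self.conns = {}         # peer name -> name of the node that initiated
        network[name] = self

    def send(self, peer_name, msg):
        """Send msg to a peer, creating a connection first if none exists."""
        if peer_name not in self.conns:
            # No usable connection: the sender initiates a new one.
            self.conns[peer_name] = self.name
            self.network[peer_name].conns[self.name] = self.name
        # Report who initiated the connection that carried the message.
        return (self.conns[peer_name], msg)

    def drop(self, peer_name):
        """Simulate the TCP connection breaking on both ends."""
        self.conns.pop(peer_name, None)
        self.network[peer_name].conns.pop(self.name, None)


def run_scenario():
    network = {}
    client = ToyLnd("client", network)
    server = ToyLnd("server", network)
    events = []
    events.append(client.send("server", "connect"))        # step 1
    events.append(client.send("server", "ldlm_enqueue"))   # step 2
    client.drop("server")                                   # step 3
    events.append(server.send("client", "ldlm_callback"))  # steps 4 to 6
    return events


if __name__ == "__main__":
    for initiator, msg in run_scenario():
        print(f"{msg}: carried on a connection initiated by {initiator}")
```

In this model the callback in the last step travels on a connection initiated by the server, which would match the server-to-client connections I observed.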


Aurélien

On 19/02/2020 18:19, "Spitz, Cory James" <cory.spitz at hpe.com> wrote:

    Hello, Aurélien.  I'm guessing that if you have a modern Lustre, idle clients may disconnect, so you might regularly see Lustre servers initiate the socket connection again.  I'm not sure how to show whether that is the case.  Perhaps someone else can chime in on whether that could be it and, if so, how to prove it.
    
    -Cory
    
    
    On 2/19/20, 2:35 AM, "lustre-discuss on behalf of Degremont, Aurelien" <lustre-discuss-bounces at lists.lustre.org on behalf of degremoa at amazon.com> wrote:
    
        Thanks! That's really interesting.
        Do you have a code pointer that could show where the code will establish this connection if missing?
        
        On 18/02/2020 23:34, "NeilBrown" <neilb at suse.com> wrote:
        
            
            It is not true that:
               LNET will establish connections only if asked for by upper layers.
            
            or at least, not in the sense that the upper layers ask for a
            connection.
            Lustre knows nothing about connections.  Even LNet doesn't really know
            about connections. It is only at the socklnd level that connections mean
            much.
            
            Lustre and LNet are message-passing protocols.
            Lustre asks LNet to send a message to a given peer, and gives some
            details of the sort of reply to expect.
            LNet chooses a route and thus a network interface, and asks the LND to
            send the message.
            The socklnd LND will see if it already has a TCP connection.  If it
            does, it will use it.  If not, it will create one.
            
            So yes, it is exactly:
              possible that the server in this case opens the connection itself
              without waiting for the client to reconnect?
            
            NeilBrown
            
            
            On Tue, Feb 18 2020, Aurelien Degremont wrote:
            
            > Thanks for your reply.
            > I think I have a good enough understanding of LNET itself. My question was more about how LNET is being used by Lustre itself.
            >
            > LNET will establish connections only if asked for by upper layers.
            > When I was talking about client and server, I was talking about how Lustre was using it.
            >
            > As far as I understand, Lustre servers only contact clients when they need to send LDLM callbacks.
            > They do so through the socket already opened by the client (reverse import).
            > What happens if the socket is closed is what I'm not sure about. I thought the server would rather wait for the client to reconnect and, if it does not, more or less evict it.
            > Could it be possible that the server in this case opens the connection itself without waiting for the client to reconnect?
            >
            >
            > Aurélien
            >
            > On 18/02/2020 05:42, "NeilBrown" <neilb at suse.com> wrote:
            >
            >     
            >     LNet is a peer-to-peer protocol; it has no concept of client and server.
            >     If one host needs to send a message to another but doesn't already have
            >     a connection, it creates a new connection.
            >     I don't yet know enough specifics of the Lustre protocol to be certain
            >     of the circumstances when a Lustre server will need to initiate a message
            >     to a client, but I imagine that recalling a lock might be one.
            >     
            >     I think you should assume that any LNet node might receive a connection
            >     from any other LNet node (for which they share an LNet network), and
            >     that the connection could come from any port between 512 and 1023
            >     (LNET_ACCEPTOR_MIN_PORT to LNET_ACCEPTOR_MAX_PORT).
            >     
            >     NeilBrown
            >     
            >     
            >     
            >     On Mon, Feb 17 2020, Degremont, Aurelien wrote:
            >     
            >     > Hi all,
            >     >
            >     > From what I've understood so far, LNET listens on port 988 by default and peers connect to it using TCP ports 1021-1023 as source ports.
            >     > At Lustre level, servers listen on 988 and clients connect to them using the same source ports 1021-1023.
            >     > So only accepting connections to port 988 on server side sounded pretty safe to me. However, I've seen connections from 1021-1023 to 988, from server hosts to client hosts sometimes.
            >     > I can't understand what mechanism could trigger these connections. Did I miss something?
            >     >
            >     > Thanks
            >     >
            >     > Aurélien
            >     >
            >     > _______________________________________________
            >     > lustre-discuss mailing list
            >     > lustre-discuss at lists.lustre.org
            >     > http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org 
            >     
            
        
        
    
    


