[lustre-devel] Why can you set concurrent_sends < peer_credits ?
hornc at cray.com
Wed Aug 19 14:16:29 PDT 2015
To clarify, lnet_compare_routes() does look at peer credits, but only if priority, hops, and queued nob (number of bytes) are all the same. It would probably be better to weigh all of these things together, as was suggested at one of the developer conferences recently.
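The tie-break order described above can be sketched roughly as follows. This is only an illustration, not the actual LNet code: the struct, its field names, and the function name are hypothetical, and the preference directions (lower priority value, fewer hops, fewer queued bytes, more credits) are assumptions made for the sketch.

```c
/* Hypothetical, simplified view of a route candidate; not the real LNet structs. */
struct route_sketch {
    int  priority;    /* assumed: lower value is preferred */
    int  hops;        /* assumed: fewer hops is preferred */
    long queued_nob;  /* bytes already queued toward the gateway */
    int  tx_credits;  /* credits left on the gateway peer; may be negative */
};

/* Returns >0 if a is the better next hop, <0 if b is, 0 on a full tie. */
static int compare_routes_sketch(const struct route_sketch *a,
                                 const struct route_sketch *b)
{
    if (a->priority != b->priority)
        return a->priority < b->priority ? 1 : -1;
    if (a->hops != b->hops)
        return a->hops < b->hops ? 1 : -1;
    if (a->queued_nob != b->queued_nob)
        return a->queued_nob < b->queued_nob ? 1 : -1;
    /* Credits are only consulted when everything above is equal. */
    if (a->tx_credits != b->tx_credits)
        return a->tx_credits > b->tx_credits ? 1 : -1;
    return 0;
}
```

With this ordering, a gateway with negative credits but slightly fewer queued bytes still wins over one with plenty of credits, which is the behavior a combined weighting would change.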
On Aug 19, 2015, at 4:14 PM, Chris Horn <hornc at cray.com> wrote:
We could more easily help that situation by changing lnet_compare_routes() to look at the number of credits available when deciding which router peer to use as a next hop.
On Aug 19, 2015, at 3:54 PM, Alexey Lyashkov <alexey.lyashkov at seagate.com> wrote:
In the case I investigated, I saw a large number of tx in the sending queue with negative credits. This means we are not able to resend these messages via a different gateway until the messages expire. But if we stopped queueing messages once credits reach zero, we would be able to send a message via a different gateway after a peer-dead event, without any notification to the ptlrpc layer. So I think this is likely a bug; from my point of view, we need to avoid ptlrpc reconnects as much as possible.
On Wed, Aug 19, 2015 at 11:48 PM, Christopher J. Morrone <morrone2 at llnl.gov> wrote:
LNet does stop sending LNet messages on a peer connection when that peer's credit count reaches zero. LNet chose to then represent the count of messages awaiting credits by using negative values of the same variable. It is just the convention chosen, and doesn't necessarily mean that there is a design problem there.
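A minimal sketch of that convention, assuming a single signed counter per peer. The names below are hypothetical and this is not the LNet implementation; it only shows how one variable can act both as "credits remaining" and as "messages waiting for a credit".

```c
/* Hypothetical per-peer state: one signed counter serves both purposes. */
struct peer_sketch {
    int credits;  /* >= 0: credits still free; < 0: messages waiting */
};

/* Called for each message to send. Returns 1 if the message may go on the
 * wire now, 0 if it has to wait in the peer's queue. */
static int take_credit(struct peer_sketch *p)
{
    p->credits--;
    return p->credits >= 0;
}

/* Called when a send completes. Returns 1 if a waiting message can now be
 * sent (it already took its decrement when it was queued). */
static int return_credit(struct peer_sketch *p)
{
    p->credits++;
    return p->credits <= 0;
}

/* Number of messages currently waiting for a credit. */
static int waiting_count(const struct peer_sketch *p)
{
    return p->credits < 0 ? -p->credits : 0;
}
```

Starting from 2 credits, four sends leave the counter at -2: two messages are in flight and two are waiting. Nothing is sent past zero; the negative value is only bookkeeping for the queue length.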
Alexey Lyashkov · Technical lead for a Morpheus team
Seagate Technology, LLC