[lustre-discuss] Limit to number of OSS?

Andreas Dilger adilger at whamcloud.com
Thu Oct 10 20:35:47 PDT 2019


On Oct 10, 2019, at 11:20, Michael Di Domenico <mdidomenico4 at gmail.com> wrote:

On Mon, Oct 7, 2019 at 6:33 PM Andreas Dilger <adilger at whamcloud.com> wrote:

With socklnd there are 3 TCP connections per client-server pair.
For IB there is no such connection limit that I'm aware of.

just out of morbid curiosity, can you very briefly explain the
connectivity differences between TCP/IB?  Does IB use the same 3
connections as TCP?  If not, is that why the connectivity limit
doesn't exist with IB or is there some other overriding design
principle in IB that allows lustre to push past TCP?  Not that any of
this has any relevance to anything i do, i'm just curious.

i'd love to have 2000 OSS's and 20k clients, but sadly i do not... :(

This is a fundamental difference between TCP and IB.  TCP needs a persistent
connection between peers (socket) to manage state, and the (very ancient) IP
protocol on which TCP is built has a limit of 65536 connections on a single node.
When computers had 1-2MB of RAM that was more than enough...
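
As a rough illustration of the arithmetic implied by the two figures in this
thread (3 socklnd TCP connections per client-server pair, and the 65536
connection limit per node), here is a minimal Python sketch. The function
names are hypothetical, and the result is only an upper bound under those
stated numbers: it ignores reserved ports and any non-Lustre sockets on the
node.

```python
# Figures taken from the thread; everything else here is an illustrative assumption.
CONNECTION_LIMIT = 65536  # per-node connection limit cited above
CONNS_PER_PEER = 3        # socklnd TCP connections per client-server pair


def max_socklnd_peers(limit=CONNECTION_LIMIT, per_peer=CONNS_PER_PEER):
    """Upper bound on the number of peers one node can hold connections to."""
    return limit // per_peer


def fits(peers, limit=CONNECTION_LIMIT, per_peer=CONNS_PER_PEER):
    """True if `peers` peers stay within the per-node connection limit."""
    return peers * per_peer <= limit


print(max_socklnd_peers())  # 21845
print(fits(20_000))         # True: 20k clients need 60000 connections
```

Under these assumptions a single server tops out at roughly 21k socklnd peers,
so the 20k-client case mentioned above would sit close to that ceiling.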

IB does not have this limitation, though it does consume some memory for each
peer that it is communicating with.  o2iblnd can establish multiple connections
to a single peer to get better bandwidth, and this is important for OPA performance,
but is not critical for IB networks.

Cheers, Andreas
--
Andreas Dilger
Principal Lustre Architect
Whamcloud





