[Lustre-discuss] Lustre with 10GbE or Infiniband?

Kevin Van Maren Kevin.Vanmaren at Sun.COM
Wed Feb 11 17:46:39 PST 2009


Charles Taylor wrote:
> On Feb 11, 2009, at 4:35 PM, Scott Atchley wrote:
>
>   
>> To add to Brian's comments, IB 4X SDR is limited to about 700-750 MB/s
>> by the fabric. O2IBLND cannot go faster than the slower of the fabric
>> and the PCI-E connection allows.
>>     
>
> Hmmm.   I can agree with the second part of that statement but I
> question the first.   We've measured much closer to the 1 GByte/sec
> wire rate of IB using several different tools.  750 MBytes/sec
> corresponds to roughly 6 GBits/sec.   You lose 2 of the 10 GBits to
> encoding (8b/10b), so the line rate is really 8 GBits/sec or 1 GByte/sec.
> Yes, you'll lose some more to protocol and switching overhead, but it
> is not anywhere near an additional 2 GBits/sec, in our experience.
>   

Correct.  Infinipath SDR was getting ~980 MB/s, and DDR HCAs in SDR mode
can also do quite well in an x8 PCIe slot.

The PCI-X HCAs were limited to around 850MB/s by the bus, and PCIe HCAs
_are_ likewise limited to around 700-750MB/s -- but only in a PCIe x4 slot.

DDR IB HCAs (unless using a PCIe gen2 ConnectX card or an x16 Infinipath
card) are also limited to around 1450-1600 MB/s by the PCIe x8 bus,
against a wire speed of 2000 MB/s.

QDR IB HCAs in a Gen2 x8 PCIe slot are also going to be limited to well
below the 4000 MB/s line rate (expect around twice the bandwidth of the
gen1 PCIe slots).
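
As a rough sketch of the arithmetic behind these limits (not a
measurement): assuming a PCIe gen1 lane carries 250 MB/s and a gen2 lane
500 MB/s of data per direction after 8b/10b encoding, the raw slot
bandwidth and the fraction of it implied by the throughput ranges quoted
in this message work out as follows. The function and variable names are
illustrative only.

```python
# Raw per-lane data rate after 8b/10b encoding, MB/s per direction.
# gen1 = 2.5 GT/s per lane, gen2 = 5 GT/s per lane.
PCIE_LANE_MB_S = {"gen1": 250, "gen2": 500}


def pcie_raw_mb_s(gen, lanes):
    """Raw data bandwidth of a PCIe slot, before packet overhead."""
    return PCIE_LANE_MB_S[gen] * lanes


def efficiency(observed_mb_s, raw_mb_s):
    """Fraction of the raw slot bandwidth that was actually delivered."""
    return observed_mb_s / raw_mb_s


# Throughput ranges quoted in this message, keyed by (generation, lanes).
observed = {("gen1", 4): (700, 750), ("gen1", 8): (1450, 1600)}

for (gen, lanes), (low, high) in observed.items():
    raw = pcie_raw_mb_s(gen, lanes)
    print(f"{gen} x{lanes}: raw {raw} MB/s, delivered "
          f"{efficiency(low, raw):.0%} to {efficiency(high, raw):.0%}")

# A gen2 x8 slot has the same raw rate as the QDR line rate, so the
# same kind of overhead would keep QDR well below 4000 MB/s.
print("gen2 x8 raw:", pcie_raw_mb_s("gen2", 8), "MB/s")
```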

The IB headers are very small compared to a 2KB or 4KB packet size, but
the PCIe headers (and e.g. flow-control overhead) are quite large
compared to a typical 256B packet size.
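
To illustrate why the same header cost hurts more on small packets, here
is a minimal sketch. The per-packet overhead figures are assumptions
picked for illustration, not values taken from this thread, and the
sketch ignores flow-control and acknowledgement traffic, so it
understates the real PCIe overhead.

```python
def payload_efficiency(payload_bytes, overhead_bytes):
    """Fraction of the bytes on the link that carry payload."""
    return payload_bytes / (payload_bytes + overhead_bytes)


# ASSUMED per-packet overheads in bytes, for illustration only.
PCIE_OVERHEAD = 24  # hypothetical: framing + header + CRC per PCIe packet
IB_OVERHEAD = 30    # hypothetical: IB headers + CRCs per packet

cases = [
    ("PCIe, 256B payload", 256, PCIE_OVERHEAD),
    ("IB, 2KB payload", 2048, IB_OVERHEAD),
    ("IB, 4KB payload", 4096, IB_OVERHEAD),
]

for name, payload, overhead in cases:
    print(f"{name}: {payload_efficiency(payload, overhead):.1%} payload")
```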

To clarify one point: IB advertises the "signaling" rate, so the 10Gb
includes the overhead bits, as 8 bits are encoded in a 10 bit
representation for transmission.  So 10Gb/s = 1GB/s, with 10-bit bytes.
Ethernet, on the other hand, always advertises the "data" rate, so 10Gb
Ethernet is 1.25GB/s (12.5Gb/s signaling rate), as there are 8 bits in a
byte.  Ethernet packet headers are also effectively a bit larger than
for IB (with IFG, preamble, etc).
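
The signaling-rate versus data-rate arithmetic can be sketched as
follows. This is only a worked example of the 8b/10b conversion described
in this message; the function names are made up for illustration, and the
12.5 Gb/s Ethernet figure assumes 8b/10b coding as stated here.

```python
def data_rate_gbps(signaling_gbps, data_bits=8, coded_bits=10):
    """Data rate left after line coding (8b/10b by default)."""
    return signaling_gbps * data_bits / coded_bits


def signaling_rate_gbps(data_gbps, data_bits=8, coded_bits=10):
    """Signaling rate needed to carry a given data rate."""
    return data_gbps * coded_bits / data_bits


def gbits_to_gbytes(gbps):
    """Convert Gb/s to GB/s using 8-bit bytes."""
    return gbps / 8


# IB 4X advertises the signaling rate: SDR = 10, DDR = 20, QDR = 40 Gb/s.
for name, signaling in [("SDR", 10), ("DDR", 20), ("QDR", 40)]:
    data = data_rate_gbps(signaling)
    print(f"IB 4X {name}: {signaling} Gb/s signaling, "
          f"{data} Gb/s data, {gbits_to_gbytes(data)} GB/s")

# 10Gb Ethernet advertises the data rate instead.
eth_data = 10
print(f"10GbE: {eth_data} Gb/s data, {gbits_to_gbytes(eth_data)} GB/s, "
      f"{signaling_rate_gbps(eth_data)} Gb/s signaling (8b/10b assumed)")
```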

Kevin

> Just ran a quick IMB (formerly Pallas) between a couple of our SDR
> nodes and got 860 MBytes/sec (ping-pong, 4MB).   So I don't think
> there is anything inherent in SDR IB that limits you to 750
> MBytes/sec.   However, running IPoIB will probably limit you to
> something even less than that, which is why you should use the
> O2IBLND if you want the real benefit of IB.
>
> Just our experience,
>
> Charlie Taylor
> UF HPC Center