[Lustre-discuss] Multiple IB ports

Andreas Dilger adilger at whamcloud.com
Tue Mar 22 08:30:19 PDT 2011


On 2011-03-22, at 3:30 PM, Mike Hanby wrote:
> I'm curious about the checksums.
> 
> The manual tells you how to turn both types of checksum on or off (client in-memory, and wire/network):
> $ echo 0 > /proc/fs/lustre/llite/<fsname>/checksum_pages

This enables/disables the in-memory page checksums as well as the network RPC checksums.  The assumption is that there is no value in doing the in-memory checksums without the RPC checksums.  It is possible to enable/disable the RPC checksums independently, as shown below.
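
For example, to check or toggle just the RPC (wire) checksums on a client, something like the following should work (set_param takes the same parameter name that get_param reports, so treat this as a sketch rather than exact output):

$ lctl get_param osc.*.checksums        # current wire checksum state, per OSC
$ lctl set_param osc.*.checksums=0      # disable the wire checksums only
$ lctl set_param osc.*.checksums=1      # re-enable them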

> Then it tells you how to check the status of wire checksums:
> $ /usr/sbin/lctl get_param osc.*.checksums
> 
> It's not clear if 0 in the checksum_pages file overrides the osc.*.checksums setting,

Yes, it does.

> or the opposite (assuming the results of the get_param show all OSTs with "...checksums=1").
> 
> Also, what's the typical recommendation for 1.8 sites? in-memory off and wire on?

The default is in-memory checksums off and RPC checksums on, which is the recommended configuration.  The only time I suggest disabling the RPC checksums is if single-threaded IO performance is a bottleneck for specific applications, and eliminating the checksum CPU overhead gives a significant performance boost.
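
If you suspect the checksums are the bottleneck for such an application, a quick sanity check is to time the same single-threaded write with the wire checksums off and then on again.  Something along these lines (the mount point and file name here are only placeholders):

$ lctl set_param osc.*.checksums=0
$ dd if=/dev/zero of=/mnt/lustre/csum_test bs=1M count=4096 conv=fsync
$ lctl set_param osc.*.checksums=1
$ dd if=/dev/zero of=/mnt/lustre/csum_test bs=1M count=4096 conv=fsync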

> -----Original Message-----
> From: lustre-discuss-bounces at lists.lustre.org [mailto:lustre-discuss-bounces at lists.lustre.org] On Behalf Of Peter Kjellström
> Sent: Tuesday, March 22, 2011 7:24 AM
> To: lustre-discuss at lists.lustre.org
> Subject: Re: [Lustre-discuss] Multiple IB ports
> 
> On Tuesday, March 22, 2011 06:15:35 am Atul Vidwansa wrote:
>> Hi Brian,
>> 
>> With one 4x QDR IB port, you can achieve 2 GB/sec on a single client with a
>> multi-threaded workload, provided that you have the right storage (with
>> enough bandwidth) at the other end.  We have tested this multiple times at DDN.
>> 
>> I have seen sites that do IB bonding across 2 ports, but mostly in a failover
>> configuration. Getting 10 GB/sec to a single node requires aggregating 5 QDR
>> IB ports. You will need to confirm with your IB vendor (Mellanox?), OS
>> vendor (SGI/RedHat/Novell) and Lustre vendor whether they support
>> aggregating so many links.  I think the challenge you will have is to find
>> a Lustre client node with enough x8 PCIe slots to sustain 3 dual-port
>> InfiniBand adapters at full rate
> 
> Just adding a small detail: a single port of QDR consumes all of the HCA's
> PCIe bandwidth, so you would need 5 x8 IB HCAs for a total of 40 lanes of
> PCI Express. This will of course change with the introduction of future
> PCI Express generations...
> 
> /Peter
> 
>> (think multiple such nodes in a typical
>> Lustre filesystem, not so economical). The other alternative is to find a
>> server that supports an 8x or 12x QDR IB port on the motherboard to get
>> more bandwidth.
>> 
>> With a typical Lustre client memory of 24-64GB and memory-to-CPU bandwidth
>> of 10 GB/sec (with standard DDR3-1333MHz DIMMs), it is not possible to fit a
>> dataset larger than two-thirds of memory. If you still want to achieve
>> 10 GB/sec of bandwidth between storage and memory, there are clever
>> alternatives. You will have to stage your data into memory beforehand,
>> keep the memory pages locked, and continue feeding data as those pages are
>> consumed. It is a lot harder than it seems on paper.
>> 
>> Cheers,
>> -Atul
> _______________________________________________
> Lustre-discuss mailing list
> Lustre-discuss at lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-discuss
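
For what it's worth, the rough math behind Peter's point, assuming PCIe 2.0 slots: a 4x QDR port signals at 40Gbit/s, which is 32Gbit/s (4GB/s) of data after 8b/10b encoding, and a PCIe 2.0 x8 slot is likewise 8 lanes x 5GT/s = 32Gbit/s of data after encoding, so a single QDR port can already saturate its x8 slot.  At the ~2GB/s of real Lustre throughput per port that Atul quotes, 10GB/s to one client really does mean about 5 ports, each in its own x8 slot.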


Cheers, Andreas
--
Andreas Dilger 
Principal Engineer
Whamcloud, Inc.





