[Lustre-discuss] high IOPS

Craig Tierney Craig.Tierney at noaa.gov
Wed Dec 2 11:15:56 PST 2009

Andreas Dilger wrote:
> On 2009-12-02, at 09:20, Francois Chassaing wrote:
>> I have a big fundamental question:
>> if the load that I'll put on the FS is more IOPS-intensive than
>> throughput-intensive (because I'll access lots of medium-sized files,
>> ~5 MB, from a small number of clients), would I be better off with
>> Lustre or PVFS2?
> I don't think PVFS2 is necessarily better at IOPS than Lustre.  This  
> is mostly dependent upon the storage configuration.
>> Also, if the main load is IOPS, shouldn't I oversize the MDS/MDT in
>> terms of CPU/RAM and storage performance (i.e., as many 15K SAS
>> RAID10 spindles as possible)?
> The Lustre MDS/MDT is used only at file lookup/open/close, but is not  
> involved during actual IO operations.  Still, this means in your case  
> that the MDS is getting 2 RPCs (open + close, which can be done  
> asynchronously in memory) for every 5 OST RPCs (5MB read/write, which  
> happen synchronously), so the MDS will definitely need to scale but  
> not necessarily at 2/5 of the total OST size.
> Typical numbers for a high-end MDS node (16-core, 64GB of RAM, DDR IB)
> are about 8-10k creates/sec, and up to 20k lookups/sec from many clients.
> Depending on the number of files you are planning to have in the  
> filesystem, I would suggest SSDs for the MDT filesystem, especially if  
> you have a large working set and are doing read-mostly access.
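[Editor's note: the 2-RPCs-per-file vs. 5-RPCs-per-file ratio Andreas describes can be sketched as a quick back-of-envelope calculation. This is not from the original thread; it assumes the default 1 MB Lustre bulk RPC size, and the numbers are illustrative only.]

```python
# Back-of-envelope estimate of MDS vs. OST RPC counts per file for a
# workload of medium-sized files, as described in the thread above.

FILE_SIZE_MB = 5        # average file size from the thread
BULK_RPC_MB = 1         # assumed default max bulk RPC size (1 MB)
MDS_RPCS_PER_FILE = 2   # open + close (handled asynchronously in memory)

def rpc_ratio(file_size_mb, bulk_rpc_mb=BULK_RPC_MB):
    """Return (mds_rpcs, ost_rpcs) issued per file accessed."""
    # Each bulk read/write RPC moves at most bulk_rpc_mb of data,
    # so a file needs ceil(file_size / bulk_rpc_size) OST RPCs.
    ost_rpcs = -(-file_size_mb // bulk_rpc_mb)  # ceiling division
    return MDS_RPCS_PER_FILE, ost_rpcs

mds, ost = rpc_ratio(FILE_SIZE_MB)
print(f"MDS RPCs per file: {mds}, OST RPCs per file: {ost}")
print(f"MDS share of total RPCs: {mds / (mds + ost):.0%}")
```

For 5 MB files this gives 2 MDS RPCs against 5 OST RPCs per file, which is why the MDS needs to scale with the workload but not at the full 2/5 of aggregate OST capacity: the MDS RPCs are lightweight in-memory operations while the OST RPCs each move a megabyte of data synchronously.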


Has anyone reported results of an SSD based MDT?


>> on the budget side, may I use asynchronous DRBD to mirror the MDT
>> (internal storage), or should I just get good shared storage
>> (direct-attached or iSCSI)?
> Some people on this list have used DRBD, but we haven't tested it  
> ourselves.  I _suspect_ (though have not necessarily tested this) that  
> if you are using DRBD it would be possible to have lower-performance  
> storage on the backup server without significantly impacting the  
> primary server performance, if you are willing to run slower in the  
> rare case when you are failed-over to the backup.
>> Today I'm leaning towards Lustre, because I've tested it against
>> glusterfs: gluster performed slightly worse than lustre, and it
>> failed badly on the bonnie++ create/delete tests. Also, I haven't
>> given PVFS2 a shot yet...
> Cheers, Andreas
> --
> Andreas Dilger
> Sr. Staff Engineer, Lustre Group
> Sun Microsystems of Canada, Inc.
> _______________________________________________
> Lustre-discuss mailing list
> Lustre-discuss at lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-discuss

Craig Tierney (craig.tierney at noaa.gov)
