[Lustre-discuss] high IOPS

Andreas Dilger adilger at sun.com
Wed Dec 2 11:09:49 PST 2009

On 2009-12-02, at 09:20, Francois Chassaing wrote:
> I have a big fundamental question:
> if the load that I'll put on the FS is more IOPS-intensive than
> throughput-intensive (because I'll access lots of medium-sized files,
> ~5 MB, from a small number of clients), would I be better off with
> Lustre or PVFS2?

I don't think PVFS2 is necessarily better at IOPS than Lustre.  This  
is mostly dependent upon the storage configuration.

> Also, if the main load is IOPS, shouldn't I oversize the MDS/MDT in
> terms of CPU/RAM and storage performance (i.e., as many 15K SAS
> RAID10 spindles as possible)?

The Lustre MDS/MDT is used only at file lookup/open/close, and is not
involved in the actual IO operations.  Still, this means in your case
that the MDS is getting 2 RPCs (open + close, which can be done
asynchronously in memory) for every 5 OST RPCs (5MB read/write, which
happen synchronously), so the MDS will definitely need to scale, but
not necessarily to 2/5 of the total OST size.
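
As a rough illustration of that ratio, here is a minimal Python sketch.
It assumes 1MB per bulk OST RPC (which is what "5 OST RPCs" for a 5MB
file implies) and exactly one open plus one close per file.  The
function names are made up for this example and are not part of any
Lustre tool.

```python
import math

def rpcs_per_file(file_size_mb, rpc_size_mb=1.0, mds_rpcs=2):
    """Return (MDS RPCs, OST RPCs) to open, transfer, and close one file.

    Assumptions: one open + one close on the MDS, and bulk IO split into
    rpc_size_mb chunks on the OSTs (1MB here).
    """
    ost_rpcs = math.ceil(file_size_mb / rpc_size_mb)
    return mds_rpcs, ost_rpcs

def mds_rpc_rate(files_per_sec, mds_rpcs=2):
    """MDS RPCs per second for a given rate of whole-file accesses."""
    return files_per_sec * mds_rpcs

mds, ost = rpcs_per_file(5)
print(mds, ost)           # 2 5
print(mds_rpc_rate(100))  # 200
```

Smaller files shift the ratio toward the MDS: at 1MB per file the same
sketch gives 2 MDS RPCs for every single OST RPC.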

Typical numbers for a high-end MDT node (16-core, 64GB of RAM, DDR IB)
are about 8-10k creates/sec, and up to 20k lookups/sec from many clients.
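
To put those rates next to a workload, a back-of-the-envelope Python
sketch follows.  The 10,000 MB/s aggregate throughput is a made-up
figure for illustration, the helper names are hypothetical, and
counting one create (or one lookup) per file is a simplification, so
treat the result as an estimate only.

```python
def files_per_sec(throughput_mb_s, file_size_mb):
    """Whole files moved per second at a given aggregate throughput."""
    return throughput_mb_s / file_size_mb

def mds_load_fraction(throughput_mb_s, file_size_mb, mds_ops_per_sec):
    """Fraction of the MDS rate consumed, assuming one MDS op per file."""
    return files_per_sec(throughput_mb_s, file_size_mb) / mds_ops_per_sec

# Hypothetical workload: 10,000 MB/s aggregate, 5MB files.
# Writing new files, against the conservative 8k creates/sec figure:
print(mds_load_fraction(10_000, 5, 8_000))   # 0.25
# Reading existing files, against the 20k lookups/sec figure:
print(mds_load_fraction(10_000, 5, 20_000))  # 0.1
```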

Depending on the number of files you are planning to have in the  
filesystem, I would suggest SSDs for the MDT filesystem, especially if  
you have a large working set and are doing read-mostly access.

> On the budget side, may I use asynchronous DRBD to mirror the MDT
> (internal storage), or should I just get good shared storage
> (direct or iSCSI)?

Some people on this list have used DRBD, but we haven't tested it
ourselves.  I _suspect_ (though I have not tested this) that if you
are using DRBD it would be possible to have lower-performance storage
on the backup server without significantly impacting the primary
server's performance, if you are willing to run slower in the rare
case when you have failed over to the backup.

> Today I'm leaning towards Lustre, because I've tested it against
> glusterfs, and gluster performed slightly worse than Lustre and
> failed the bonnie++ create/delete tests badly. Also, I haven't given
> PVFS2 a try yet...

Cheers, Andreas
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
