[Lustre-discuss] Performance Expectations of Lustre

Tue Jan 27 10:29:57 PST 2009

On Mon, 2009-01-26 at 19:54 +0100, Nick Jennings wrote:

>   Aha, OK well then that's good to know. There's also some kind of 
> read-ahead and client side caching right?

Indeed.  Both of those exist.

> So files which are accessed a 
> lot will be faster to access.

Yes, unless locks get revoked and the cache has to be flushed and/or
invalidated.  i.e. one client cannot cache (a portion of a file) that
another client updates, for obvious reasons.
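
To make the caching caveat above concrete, here is a toy model in Python.
It is only a sketch of the general idea (a write by one client invalidates
other clients' cached copies), not Lustre's actual lock manager. The names
ToyServer and ToyClient are made up for illustration, and it tracks whole
files rather than portions of a file.

```python
class ToyClient:
    """Hypothetical client holding a private cache of file contents."""
    def __init__(self):
        self.cache = {}
        self.hits = 0
        self.misses = 0


class ToyServer:
    """Hypothetical server that revokes cached copies when a file is updated."""
    def __init__(self):
        self.contents = {}
        self.cachers = {}  # file name -> set of clients caching it

    def read(self, client, name):
        if name in client.cache:
            client.hits += 1          # served from the client-side cache
            return client.cache[name]
        client.misses += 1            # must go to the server
        data = self.contents.get(name, b"")
        client.cache[name] = data
        self.cachers.setdefault(name, set()).add(client)
        return data

    def write(self, client, name, data):
        # Invalidate every other client's cached copy before updating.
        for other in self.cachers.get(name, set()) - {client}:
            other.cache.pop(name, None)
        self.cachers[name] = {client}
        client.cache[name] = data
        self.contents[name] = data


server = ToyServer()
a, b = ToyClient(), ToyClient()
server.write(a, "f", b"v1")
server.read(b, "f")                   # miss: first access
server.read(b, "f")                   # hit: cached locally
server.write(a, "f", b"v2")           # revokes b's cached copy
latest = server.read(b, "f")          # miss again, but sees the new data
```

The second read by b is fast, but the read after a's update has to go back
to the server, which is the slowdown described above.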

> HA is definitely critical, if the storage pool becomes inaccessible we 
> lose clients (and all fingers point at me!).

Usual case.

> So for starters, what can I get away with here? 1 OSS, 1 MDS & 1 Client 
> node? Is it a smart thing to do to have the MDS and OSS share the same 
> storage target (just a separate partition for the MDS)?

It's less than ideal.  You will have the MDS and OSS competing for
resources in the failover case.

> What kind of 
> system specs are advisable for each type (MDS, OSS & Client node) as far 
> as RAM, CPU, disk configuration etc?

That depends entirely on the performance requirements you have.
Lots of RAM is good on the MDS for caching and soon, lots of RAM will be
good for caching on the OSS too.  And lots of RAM on the clients is
good also.  Lots of RAM everywhere.  :-)  OSS CPU requirements are
usually quite modest.  The MDS is helped by some CPU though.

> Also, is it possible to add more 
> OSSes to take over existing OSTs that another OSS was previously 
> managing?

Sure.

>  ie. if I have the MD3000i split into 5x1TB volumes (5xOSTs), 
> and the OSS is getting hammered, I set another OSS up and hand off 2 or 
> 3 OSTs from the old OSS to the new one, and set it up as failover for 
> the remaining OSTs. Do-able?

Most definitely.  You will just need to regenerate the config so that
the clients know where the OSTs have been moved to.

> I see, so from the get-go I'm going to need an internal gigE network for 
> OSS/Client communication.

Yeah.

> Is it safe to say my bottleneck is going to be the OSS & not the 
> network?

I guess that depends on the quality of your GigE.  If you assume, say
80% of the GigE bandwidth, that's 100MB/s, yes?  Depending on how many
disks you give the OSS and what kind of interconnect you use to the
disk, and what kind of bus you put the HBA and GigE cards into, you
could certainly wind up with a network bottleneck.
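
The arithmetic behind that 100MB/s figure, and the "slowest stage wins"
reasoning, can be sketched in a few lines of Python.  The 80% efficiency is
the assumption from above. The disk and HBA/bus numbers are made-up
placeholders, not measurements, so substitute your own.

```python
def usable_mb_per_s(link_gbit_per_s, efficiency):
    """Usable bandwidth in MB/s (decimal megabytes) for a given link speed."""
    bits_per_s = link_gbit_per_s * 1e9
    return bits_per_s * efficiency / 8 / 1e6


def bottleneck(stages):
    """Return (name, MB/s) of the slowest stage in the data path."""
    name = min(stages, key=stages.get)
    return name, stages[name]


network = usable_mb_per_s(1.0, 0.80)      # 1 Gbit/s at an assumed 80%

stages = {
    "network (GigE at 80%)": network,
    "disk array (placeholder)": 300.0,    # hypothetical figure
    "HBA/bus (placeholder)": 250.0,       # hypothetical figure
}

name, rate = bottleneck(stages)
print(f"bottleneck: {name} at about {rate:.0f} MB/s")
```

With those placeholder numbers the network is the limiting stage. With a
slow enough disk back end or bus, the answer flips to the OSS.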

> Is there some documentation I can read about typical setups, 
> usage cases & methods for optimal performance?

Well, the ops manual is probably a good place to start.
manual.lustre.org.

b.
