[Lustre-discuss] external journal raid1 vs. single disk ext journal + hot spare on raid6

Ralf Utermann ralf.utermann at physik.uni-augsburg.de
Fri May 15 05:33:48 PDT 2009


Stuart Marshall wrote:
> Hi All,
> 
> With the upgrade from 1.6.x to 1.8.x we are planning to reconfigure our
> RAID systems.
> 
> The OST RAID hardware are Sun 6140 arrays with 16x500GB SATA disks. 
> Each 6140 tray has one OSS node (Sun X2200 M2).  We have redundant paths
> and ultimately plan a failover strategy.  The MDT will be a RAID 1+0 Sun
> 2540 with 12x73GB SAS disks.
> 
> Each 6140 tray will be configured either as 1 or 2 RAID6 volumes.  The
> lustre manual recommends more smaller OST's over large and other docs
> I've seen seem to indicate that the optimal number of drives is ~(6+2). 
> For these 16 disk trays, the choice would be one (12+2R6) + external
> journal and/or hot spares or two (5+2R6)'s + ext. jrnl and/or hot spares.
> 

We have a similar hardware setup: 2 OSS nodes attached to a Sun 6140 plus
one CSM200 extension tray, which means 32x500GB SATA disks. Because I assumed,
as Robin says in his post, 2^n+parity to be optimal for this hardware, I went
back to RAID5 for the OSTs and configured 2 x 4+1 and 2 x 8+1. Then there
is one RAID1 for external journals and 2 disks left as hot spares. So the OSTs
are not of the same size, but each OSS then serves one 4+1 and one 8+1 OST.
I hope Lustre will spread the data in a reasonable way. The chunk sizes used
are 256k (4+1) and 128k (8+1), so a full stripe always adds up to 1M.
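
The arithmetic behind this layout can be checked with a small Python sketch.
It is only an illustration: the class and function names are made up for this
example, and pairing 256k chunks with the 4+1 sets and 128k chunks with the
8+1 sets is inferred from the 1M total rather than read from the array
configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Raid5Set:
    """One RAID5 volume: N data disks plus one parity disk (hypothetical helper)."""
    data_disks: int
    chunk_kib: int          # per-disk chunk (segment) size in KiB
    parity_disks: int = 1   # RAID5 uses a single parity disk per stripe

    @property
    def total_disks(self) -> int:
        return self.data_disks + self.parity_disks

    @property
    def full_stripe_kib(self) -> int:
        # Data written per full stripe; parity is not counted.
        return self.data_disks * self.chunk_kib


def layout():
    """The four OST volumes from the post: 2 x (4+1) and 2 x (8+1)."""
    # Assumption: 256k chunks on the 4+1 sets, 128k chunks on the 8+1 sets.
    return [Raid5Set(4, 256), Raid5Set(4, 256),
            Raid5Set(8, 128), Raid5Set(8, 128)]


def disks_used(osts, journal_mirror_disks=2, hot_spares=2):
    """OST disks plus the RAID1 journal pair plus the hot spares."""
    return sum(o.total_disks for o in osts) + journal_mirror_disks + hot_spares


if __name__ == "__main__":
    osts = layout()
    for o in osts:
        print(f"{o.data_disks}+{o.parity_disks}: "
              f"full stripe = {o.full_stripe_kib} KiB")
    print("disks used:", disks_used(osts), "of 32")
```

Under these assumptions every volume has a 1024 KiB full stripe, and the four
OSTs, the journal mirror and the two hot spares account for all 32 disks.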


> So my questions are:
> 
> 1.) What are the trade-offs of RAID1 external journal with no hot spare
> vs. single disk ext journal with a hot spare (spare is for R6 volume)?
> Specifically:
> 
> - If a single disk external journal is lost, can we run fsck and only
> lose the transactions that have not been committed to disk?  If so, then
> the loss of the disk hosting the external journal would not be
> catastrophic for the file system as a whole.
> 
> - How comfortable are RAID6 users with no hot spares? (We'll have cold
> spares handy, but prefer to get through weekends w/out service)
> 
> 2.) The external journal only takes up ~400MB.  If we create 2 RAID6
> volumes, can we put 2 external journals on one disk or RAID1 set
> (suitably partitioned), or do we need to blow an entire disk for one
> external journal?

We have the 4 journal volumes on one RAID1 virtual disk, but I did not
compare to other setups with performance tests.

I did some performance tests with iozone in our dual-Gigabit environment,
and I see performance drop significantly at smaller block sizes for
patchless Lustre clients. This is seen for some OSTs, but not for others.
I don't know whether this has something to do with the 6140 and its setup
here; the patched clients don't see this problem and I did not look further
into it.

Best regards, Ralf
-- 
        Ralf Utermann
_____________________________________________________________________
        Universität Augsburg, Institut für Physik   --   EDV-Betreuer
        Universitätsstr.1             
        D-86135 Augsburg                     Phone:  +49-821-598-3231
        SMTP: Ralf.Utermann at Physik.Uni-Augsburg.DE         Fax: -3411


