[Lustre-discuss] Problems with failover

Fri Jan 4 02:04:50 PST 2008

Data protection and business continuity come at a price, in both money and
performance.

The architecture and components of a Lustre system should be determined by
deciding which component failures the system must be able to tolerate.

If you want to tolerate an OSS server failure, you will need:

1)      2x OSS servers, clustered

2)      A more complex installation

3)      FC/SAS non-caching HBAs

4)      Shared disk with NVRAM capability

5)      To accept less performance per dollar

If you want to tolerate disk failures:

1)      Software or hardware RAID on the LUNs.

2)      RAID costs some performance.

If you want to tolerate a tray failure:

1)      Vertical RAID sets.

2)      However, if a tray fails in a vertical RAID 5 setup, every RAID set
loses a disk at once, which causes a lot of trouble during rebuilding.
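To make the tray trade-off concrete, here is a minimal sketch of the two layouts. It assumes a simplified model in which every tray holds the same number of disks; the function names and the (tray, slot) addressing are made up for illustration and do not correspond to any RAID tool.

```python
def vertical_sets(trays, slots_per_tray):
    """One RAID set per slot index: each set takes one disk from every tray."""
    return [[(tray, slot) for tray in range(trays)]
            for slot in range(slots_per_tray)]


def horizontal_sets(trays, slots_per_tray):
    """One RAID set per tray: all disks of a set sit in the same tray."""
    return [[(tray, slot) for slot in range(slots_per_tray)]
            for tray in range(trays)]


def disks_lost(sets, failed_tray):
    """For each RAID set, count the member disks that sat in the failed tray."""
    return [sum(1 for tray, _slot in members if tray == failed_tray)
            for members in sets]


if __name__ == "__main__":
    # Hypothetical enclosure: 5 trays with 4 disks each.
    print("vertical:  ", disks_lost(vertical_sets(5, 4), failed_tray=2))
    print("horizontal:", disks_lost(horizontal_sets(5, 4), failed_tray=2))
```

With the vertical layout each set loses exactly one disk, which RAID 5 survives, but all sets are degraded and need rebuilding at the same time. With the horizontal layout one set loses every disk and is gone.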

If you want to tolerate the failure of an entire storage system:

1)      Multiple storage systems behind the OSS (pair). You might use software
RAID across the individual disk systems.

2)      Rebuilds will take a lot of time (ZFS will offer dirty time logging,
which should cure this).

If you want to tolerate the failure of an entire OSS system:

1)      You need to wait for Lustre network RAID; however, this will put some
pressure on the clients.

Today, distributed disk systems do not offer much advantage when used with
Lustre. Replication across limited bandwidth and coherency problems limit
the performance, and the cost is high.

Lustre systems are normally tuned for performance, not HA, and Lustre can
tolerate an OSS failure without much trouble. So it might not be wise to spend
a lot of money on Lustre HA when all you are going to gain is 1-2 hours
more availability per year.
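As a rough back-of-the-envelope check of that last point, the sketch below converts hours of downtime into an availability percentage. The HA budget used in the example is a purely hypothetical figure chosen for illustration, not a number from this thread.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours in a non-leap year


def availability_pct(downtime_hours):
    """Availability over one year, in percent, for the given hours of downtime."""
    return 100.0 * (1.0 - downtime_hours / HOURS_PER_YEAR)


def cost_per_hour_gained(ha_cost_per_year, hours_gained):
    """Yearly HA spend divided by the hours of availability it buys."""
    return ha_cost_per_year / hours_gained


if __name__ == "__main__":
    for hours in (1, 2):
        gain = 100.0 - availability_pct(hours)
        print(f"{hours} h/year of downtime avoided = {gain:.4f} percentage points")
    # Hypothetical yearly HA budget, for illustration only.
    print("cost per hour gained:", cost_per_hour_gained(50_000, 2))
```

Two hours per year is roughly 0.02 percentage points of availability, so the question is whether that margin justifies the extra hardware and complexity.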

Best regards

Mertol

From: lustre-discuss-bounces at clusterfs.com
[mailto:lustre-discuss-bounces at clusterfs.com] On Behalf Of Joe Kraska
Sent: Friday, 04 January 2008 05:59
To: Jeremy Mann; lustre-discuss at clusterfs.com
Subject: Re: [Lustre-discuss] Problems with failover

You currently need another mechanism (hardware or software RAID) to 
provide data redundancy in case of disk failure.  We are working to
provide data replication at the Lustre level, but that is not yet
available.

I should say, that technology has me pretty excited. Right now, unless I
bend over backwards and do something like "vertical" RAID stripe/mirrors
across multiple disk trays in a storage cluster, I can end up with a very
bad situation if I lose an entire tray. This can have a potentially
devastating impact on my entire storage tier.

A few companies here and there (XIV, Isilon) are starting to abandon
hardware RAID and are doing block replication across the entire storage
cluster. With that, I can forget worrying about specific disks (except to
replace them), and don't even have to worry about whole trays (insofar as
I have spare capacity).

This is a pretty neat capability. If you add to it the ability to
"rebalance" your cluster on the fly as new nodes are added, what you end
up with is a self-healing storage cluster. Pretty compelling for those
availability figures, and can help with the disk-service pattern as well.
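For readers who want a feel for how cluster-wide block replication with rebalancing can work, here is a minimal sketch. It uses rendezvous (highest-random-weight) hashing as a stand-in placement rule; that choice is an assumption made for illustration, not a description of how XIV or Isilon place blocks, and the node and block names are hypothetical.

```python
import hashlib


def _score(block_id, node):
    """Deterministic pseudo-random weight for a (block, node) pair."""
    digest = hashlib.sha256(f"{block_id}:{node}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def place(block_id, nodes, copies=2):
    """Pick the `copies` highest-scoring nodes for this block (distinct nodes)."""
    ranked = sorted(nodes, key=lambda n: _score(block_id, n), reverse=True)
    return ranked[:copies]


def placement_map(block_ids, nodes, copies=2):
    """Map every block to the list of nodes holding its replicas."""
    return {b: place(b, nodes, copies) for b in block_ids}


if __name__ == "__main__":
    nodes = [f"node{i}" for i in range(5)]      # hypothetical node names
    blocks = [f"blk{i}" for i in range(1000)]   # hypothetical block ids
    before = placement_map(blocks, nodes)
    # Simulate losing one node and recomputing the placement.
    after = placement_map(blocks, [n for n in nodes if n != "node2"])
    moved = sum(1 for b in blocks if before[b] != after[b])
    print(f"{moved} of {len(blocks)} blocks need a new replica")
```

Under this rule, only blocks that had a replica on the lost node change placement, and each of them keeps its surviving copy where it was, which is the "self-healing" behaviour in miniature. Adding a node works the same way in reverse: only blocks for which the new node scores among the top placements move.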

Joe Kraska
San Diego CA
USA
