[Lustre-discuss] Future of LusterFS?

Andreas Dilger andreas.dilger at oracle.com
Mon Apr 26 12:52:10 PDT 2010


On 2010-04-26, at 05:29, Mag Gam wrote:
> Speaking of the future. Is there any more news about SNS? I think
> thats the only thing Lustre is missing to make it "production" ready
> and not just for research labs.

I agree, and this is one of the features I will be advocating for in our next round of development.  That said, a lot of features were promised in the past with unrealistic expectations, so rather than discussing/promoting "future" features that could possibly be implemented, we will be promoting features whose implementation is nearly finished.

Having input from users is definitely still useful in determining what features we will work on, so thanks for bringing this up.

> On Fri, Apr 23, 2010 at 12:07 PM, Stuart Midgley <sdm900 at gmail.com> wrote:
>> Yes, we suffer hardware failures.  All the time.  That is sort of the point of Lustre and a clustered file system :)
>> 
>> We have had double-disk failures with raid5 (recovered everything except ~1MB of data), server failures, MDS failures etc.  We successfully recovered from them all.  Sure, it can be a little stressful... but it all works.
>> 
>> If server hardware fails, our file system basically hangs until we fix it.  Our most common failure is obviously disks... and they are all covered by RAID.  Since we have mostly direct-attached disks, there are a few minutes of downtime on a server while you replace the disk... everything continues as normal when the server comes back.
>> 
>> --
>> Dr Stuart Midgley
>> sdm900 at gmail.com
>> 
>> 
>> 
>> On 23/04/2010, at 18:41 , Janne Aho wrote:
>> 
>>> On 23/04/10 11:42, Stu Midgley wrote:
>>> 
>>>>> Would lustre have issues if using cheap off-the-shelf components or
>>>>> would people here think you need to have high end machines with built in
>>>>> redundancy for everything?
>>>> 
>>>> We run lustre on cheap off-the-shelf gear.  We have 4 generations of
>>>> cheapish gear in a single 300TB lustre config (40 OSSs).
>>>> 
>>>> It has been running very very well for about 3.5 years now.
>>> 
>>> This sounds promising.
>>> 
>>> Have you had any hardware failures?
>>> If yes, how well has the cluster coped with the loss of the machine(s)?
>>> 
>>> 
>>> Any advice you can share from your initial setup of lustre?
>> 
>> _______________________________________________
>> Lustre-discuss mailing list
>> Lustre-discuss at lists.lustre.org
>> http://lists.lustre.org/mailman/listinfo/lustre-discuss
>> 


Cheers, Andreas
--
Andreas Dilger
Lustre Technical Lead
Oracle Corporation Canada Inc.



