[Lustre-discuss] OST crash with group descriptors

Mag Gam magawake at gmail.com
Fri Mar 13 04:22:35 PDT 2009


This helps a lot. A real-world scenario with real answers!

Thanks, Megan.


On Thu, Mar 12, 2009 at 11:22 PM, Ms. Megan Larko <dobsonunit at gmail.com> wrote:
> Yay!  I believe I can answer this one.
>
> On Thu, Mar 12, 2009 at 9:08 PM, Mag Gam <magawake at gmail.com> wrote:
>> This was a very interesting thread to read. I too have been in the
>> same situation and it really stunk! I just went ahead and restored the
>> 10T filesystem :-(
>>
>> Seeing Andreas at work is art :-)
>
> Very true.
>>
>> I have a question about this:
>>
>> Would the OP get 5/6 of his DATA or FILES? 5/6 of DATA is useless!
>> However, 5/6 of FILES is amazing.  I was under the impression that every
>> file would be striped across OSTs (even if you don't enable striping).
>
> If one uses the Lustre default stripe count of 1, then one may retrieve
> 5/6 of the files.
>
> In our case, we set up Lustre with its default stripe count of one, so
> when the files were written out, each file went to one array of disks
> seen by the RAID controller (the disks were in essentially dumb JBOD
> enclosures).  We had two such enclosures fail (well, one failed and
> the second was an "Oops", mistaken for the failed unit; JBOD
> hardware really is not that bad).  The damaged OSTs were deactivated
> per the Lustre Manual (with lctl: get the NID and deactivate the
> specific NID).  The remaining OSTs were mounted and, if I remember
> correctly, the array was mounted on a Lustre client.  The NID
> deactivation causes a quick "EIO" (or some such combination of
> letters), so any access to the deactivated NIDs is skipped and
> operations (be that search or copy) continue on the remaining parts
> of the system.  The value stripe=1 causes Lustre to put an entire
> file onto one OST.   I understand that this is both a little slower
> and can use disk space less efficiently than striping.   As we did
> not have a good data backup strategy (we're improving that now), we
> felt a stripe count of one to be our safest approach to preserving
> file integrity.
>
> I hope this helps, Mag.   Anyone on the list, please correct me where I
> have made inaccurate statements.
>>
>> TIA
>
> megan
>>
>>
>>
>>
>> On Tue, Mar 10, 2009 at 11:57 AM, Ms. Megan Larko <dobsonunit at gmail.com> wrote:
>>> Hi T.H.,
>>>
>>> I do not envy your situation.   I have been in a very similar
>>> scenario.   Andreas Dilger gave me some very good information on
>>> deactivating the bad OST and then copying the remaining good files.
>>> It worked for me.
>>>
>>> The thread is archived in cyber-space under:
>>> http://osdir.com/ml/file-systems.lustre.user/2008-06/msg00249.html
>>>
>>> Good Luck,
>>> megan
>>> _______________________________________________
>>> Lustre-discuss mailing list
>>> Lustre-discuss at lists.lustre.org
>>> http://lists.lustre.org/mailman/listinfo/lustre-discuss
>>>
>>
>
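
To make the "5/6 of the files versus 5/6 of the data" point in Megan's
reply concrete, here is a minimal sketch of a toy model. It is not
Lustre code and does not reproduce Lustre's real object allocator: the
round-robin placement and the names osts_for_file and surviving_files
are assumptions made purely for illustration. It counts how many files
remain fully readable when one of six OSTs is lost, for different
stripe counts.

```python
def osts_for_file(file_index, stripe_count, num_osts):
    """Toy model (assumption): stripes of a file are placed round-robin,
    starting at OST number (file_index % num_osts)."""
    return {(file_index + k) % num_osts for k in range(stripe_count)}


def surviving_files(num_files, stripe_count, num_osts, failed_osts):
    """Count files that have no stripe on any failed OST."""
    failed = set(failed_osts)
    return sum(
        1
        for i in range(num_files)
        if not (osts_for_file(i, stripe_count, num_osts) & failed)
    )


if __name__ == "__main__":
    # Hypothetical setup: 600 files, 6 OSTs, OST 0 is lost.
    for stripe_count in (1, 2, 6):
        intact = surviving_files(600, stripe_count, 6, [0])
        print(f"stripe_count={stripe_count}: {intact} of 600 files intact")
```

In this toy model a stripe count of 1 leaves 500 of 600 files whole
(5/6 of the files), while striping every file across all six OSTs
leaves no file whole, even though 5/6 of the raw data is still on disk.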


