[Lustre-discuss] OST crash with group descriptors
Mag Gam
magawake at gmail.com
Fri Mar 13 04:22:35 PDT 2009
This helps a lot. A real-world scenario with real answers!
Thanks, Megan.
On Thu, Mar 12, 2009 at 11:22 PM, Ms. Megan Larko <dobsonunit at gmail.com> wrote:
> Yay! I believe I can answer this one.
>
> On Thu, Mar 12, 2009 at 9:08 PM, Mag Gam <magawake at gmail.com> wrote:
>> This was a very interesting thread to read. I too have been in the
>> same situation and it really stunk! I just went ahead and restored the
>> whole 10T filesystem :-(
>>
>> Seeing Andreas at work is art :-)
>
> Very true.
>>
>> I have a question about this:
>>
>> Would the OP get 5/6 of his DATA or 5/6 of his FILES? 5/6 of the DATA
>> in every file is useless! However, 5/6 of the FILES, intact, is
>> amazing. I was under the impression that every file would be striped
>> across OSTs (even if you don't enable striping).
>
> If one uses the Lustre default stripe count of 1, then one may
> retrieve 5/6 of the files.
>
> In our case, we set up Lustre with its default stripe count of one, so
> when the files were written out each file went to one array of disks
> seen by the RAID controller (disks were in essentially dumb JBOD
> enclosures). We had two such enclosures fail (well, one failed and
> the second was an "Ooops", mistaken for the failed unit; JBOD
> hardware really is not that bad). The damaged OSTs were de-activated
> per the Lustre Manual (with lctl: find the device for each failed OST
> and deactivate that specific device). The remaining OSTs were mounted
> and, if I remember correctly, the array was mounted on a Lustre
> client. The de-activation causes a quick "EIO"--or such combination
> of letters--instead of attempting any access on the de-activated
> OSTs, so operations (be that search or copy) continue on the
> remaining parts of the system. A stripe count of 1 causes Lustre to
> put an entire file onto one OST. I understand that this can be both a
> little slower and less efficient in its use of disk space than
> striping. As we did not have a good data back-up strategy (we're
> improving that now), we felt a stripe count of one to be our safest
> approach to preserve file integrity.
>
> I hope this helps, Mag. Anyone on the list, please correct me where I
> have made inaccurate statements.
>>
>> TIA
>
> megan
>>
>>
>>
>>
>> On Tue, Mar 10, 2009 at 11:57 AM, Ms. Megan Larko <dobsonunit at gmail.com> wrote:
>>> Hi T.H.,
>>>
>>> I do not envy your situation. I have been in a very similar
>>> scenario. Andreas Dilger gave me some very good information on
>>> deactivating the bad OST and then copying the remaining good files.
>>> It worked for me.
>>>
>>> The thread is archived in cyber-space under:
>>> http://osdir.com/ml/file-systems.lustre.user/2008-06/msg00249.html
>>>
>>> Good Luck,
>>> megan
>>> _______________________________________________
>>> Lustre-discuss mailing list
>>> Lustre-discuss at lists.lustre.org
>>> http://lists.lustre.org/mailman/listinfo/lustre-discuss
>>>
>>
>
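The "5/6 of the files" versus "5/6 of the data" point in the quoted
discussion can be illustrated with a small sketch. It is only a toy model:
it assumes six OSTs, one failed OST, and a simple round-robin placement of
each file's objects. The real Lustre allocator may place objects
differently, and the function names below are made up for this example;
they are not part of any Lustre tool.

```python
from fractions import Fraction


def osts_for_file(file_index, stripe_count, num_osts):
    """Return the set of OST indices holding objects of one file.

    Assumption: plain round-robin placement starting at
    file_index % num_osts. This is a simplification for illustration.
    """
    start = file_index % num_osts
    return {(start + k) % num_osts for k in range(stripe_count)}


def intact_fraction(num_files, stripe_count, num_osts, failed_osts):
    """Fraction of files with no object on any failed OST."""
    failed = set(failed_osts)
    intact = sum(
        1
        for i in range(num_files)
        if not (osts_for_file(i, stripe_count, num_osts) & failed)
    )
    return Fraction(intact, num_files)


if __name__ == "__main__":
    # Hypothetical setup: 600 files, 6 OSTs, OST index 2 has failed.
    for stripe_count in (1, 2, 6):
        frac = intact_fraction(600, stripe_count, 6, [2])
        print(f"stripe count {stripe_count}: {frac} of files fully intact")
```

Under these assumptions a stripe count of 1 leaves 5/6 of the files
completely intact, while striping every file across all six OSTs leaves
no file untouched by the failure, which matches the reasoning in the
thread.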