[Lustre-discuss] non-consecutive OST ordering

Christopher Walker cwalker at fas.harvard.edu
Fri Nov 12 19:20:15 PST 2010


Thanks again for this patch. I have one quick question about it --
1.41.12.2.ora1 seems to require lustre_user.h from 1.8.x. Is it OK to
use a version of lfsck compiled against 1.8.x on a 1.6.6 filesystem, and
with {mds,ost}db files that were created with 1.41.6?

Best,
Chris

On 11/12/10 3:17 AM, Wang Yibin wrote:
> This is a bug in llapi_lov_get_uuids(), which assigns a UUID to the wrong OST index when the OST indices are sparse.
> Please file a bug for this.
>
> Until this bug is fixed, you can apply the following patch to lfsck.c in e2fsprogs (version 1.41.12.2.ora1) as a workaround (not verified, though).
>
> --- e2fsprogs/e2fsck/lfsck.c   2010-11-12 11:43:42.000000000 +0800
> +++ lfsck.c 2010-11-12 12:14:38.000000000 +0800
> @@ -1226,6 +1226,12 @@
>     __u64 last_id;
>     int i, rc;
>
> +   /* skip empty UUID OST */
> +   if(!strlen(lfsck_uuid[ost_idx].uuid)) {
> +       log_write("index %d UUID is empty(sparse OST index?). Skipping.\n", ost_idx);
> +       return(0);
> +   }
> +
>     sprintf(dbname, "%s.%d", MDS_OSTDB, ost_idx);
>
>     VERBOSE(2, "testing ost_idx %d\n", ost_idx);
> @@ -1279,11 +1284,20 @@
>             ost_hdr->ost_uuid.uuid);
>
>         if (obd_uuid_equals(&lfsck_uuid[ost_idx], &ost_hdr->ost_uuid)) {
> +                        /* must be sparse ost index */
>             if (ost_hdr->ost_index != ost_idx) {
>                 log_write("Requested ost_idx %u doesn't match "
>                       "index %u found in %s\n", ost_idx,
>                       ost_hdr->ost_index, ost_files[i]);
> -               continue;
> +
> +               log_write("Moving the index/uuid to the right place...\n");
> +                /* zero the original uuid entry */
> +               memset(&lfsck_uuid[ost_idx], 0, sizeof(struct obd_uuid));
> +                /* copy it to the right place */
> +                ost_idx = ost_hdr->ost_index;
> +                strcpy(lfsck_uuid[ost_hdr->ost_index].uuid,ost_hdr->ost_uuid.uuid);
> +               /* skip this round */
> +               goto out;
>             }
>
>             break;
>
>
> On 2010-11-12, at 10:53 AM, Christopher Walker wrote:
>
>> Thanks very much for your reply. I've tried remaking the mdsdb and all
>> of the ostdb's, but I still get the same error -- it checks the first 34
>> osts without a problem, but can't find the ostdb file for the 35th
>> (which has ost_idx 42):
>>
>> ...
>> lfsck: ost_idx 34: pass3 OK (676803 files total)
>> lfsck: can't find file for ost_idx 35
>> Files affected by missing ost info are : -
>> lfsck: can't find file for ost_idx 36
>> Files affected by missing ost info are : -
>> lfsck: can't find file for ost_idx 37
>> Files affected by missing ost info are : -
>> lfsck: can't find file for ost_idx 38
>> Files affected by missing ost info are : -
>> lfsck: can't find file for ost_idx 39
>> Files affected by missing ost info are : -
>> lfsck: can't find file for ost_idx 40
>> Files affected by missing ost info are : -
>> lfsck: can't find file for ost_idx 41
>> Files affected by missing ost info are : -
>> lfsck: can't find file for ost_idx 42
>> Files affected by missing ost info are : -
>> ...
>>
>> e2fsck claims to be making the ostdb without a problem:
>>
>> Pass 6: Acquiring information for lfsck
>> OST: 'aegalfs-OST002a_UUID' ost idx 42: compat 0x2 rocomp 0 incomp 0x2
>> OST: num files = 676803
>> OST: last_id = 858163
>>
>> and with the filesystem up I can see files on this OST:
>>
>> [cwalker@iliadaccess04 P-Gadget3.3.1]$ lfs getstripe predict.c
>> OBDS:
>> 0: aegalfs-OST0000_UUID ACTIVE
>> ...
>> 33: aegalfs-OST0021_UUID ACTIVE
>> 42: aegalfs-OST002a_UUID ACTIVE
>> predict.c
>>      obdidx       objid      objid      group
>>          42          10        0xa          0
>>
>>
>> lfsck identifies several hundred GB of orphan data that we'd like to
>> recover, so we'd really like to run lfsck on this array. We're willing
>> to forgo the recovery on the 35th ost, but I want to make sure that
>> running lfsck -l with the current configuration won't make things worse.
>>
>> Thanks again for your reply; any further advice is very much appreciated!
>>
>> Best,
>> Chris
>>
>> On 11/10/10 12:10 AM, Wang Yibin wrote:
>>> The error message indicates that the UUID of OST #35 does not match between the live filesystem and the ostdb file.
>>> Is this ostdb obsolete?
>>>
>>> On 2010-11-9, at 11:45 PM, Christopher Walker wrote:
>>>
>>>> For reasons that I can't recall, our OSTs are not in consecutive order 
>>>> -- we have 35 OSTs, which are numbered consecutively from
>>>> 0000-0021
>>>> and then there's one last OST at
>>>> 002a
>>>>
>>>> When I try to run lfsck on this array, it works fine for the first 34 
>>>> OSTs, but it can't seem to find the last OST db file:
>>>>
>>>> lfsck: ost_idx 34: pass3 OK (680045 files total)
>>>> lfsck: can't find file for ost_idx 35
>>>> Files affected by missing ost info are : -
>>>> lfsck: can't find file for ost_idx 36
>>>> Files affected by missing ost info are : -
>>>> lfsck: can't find file for ost_idx 37
>>>> Files affected by missing ost info are : -
>>>> lfsck: can't find file for ost_idx 38
>>>> Files affected by missing ost info are : -
>>>> lfsck: can't find file for ost_idx 39
>>>> Files affected by missing ost info are : -
>>>> lfsck: can't find file for ost_idx 40
>>>> Files affected by missing ost info are : -
>>>> lfsck: can't find file for ost_idx 41
>>>> Files affected by missing ost info are : -
>>>> lfsck: can't find file for ost_idx 42
>>>> Files affected by missing ost info are : -
>>>> /n/scratch/hernquist_lab/tcox/tests/SbSbhs_e_8/P-Gadget3.3.1/IdlSubfind/.svn/text-base/ReadSubhaloFromReshuffledSnapshot.pro.svn-base
>>>>
>>>> and then lists all of the files that live on OST 002a.  This db file 
>>>> definitely does exist -- it lives in the same directory as all of the 
>>>> other db files, and e2fsck for this OST ran without problems.
>>>>
>>>> Is there some way of forcing lfsck to recognize this OST db?  Or, 
>>>> failing that, is it dangerous to run lfsck on the first 34 OSTs only?
>>>>
>>>> We're using e2fsck 1.41.6.sun1 (30-May-2009)
>>>>
>>>> Thanks very much!
>>>>
>>>> Chris
>>>>
>>>>
>>>> _______________________________________________
>>>> Lustre-discuss mailing list
>>>> Lustre-discuss at lists.lustre.org
>>>> http://lists.lustre.org/mailman/listinfo/lustre-discuss
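
The index mismatch discussed in this thread can be illustrated with a small
standalone C sketch. It is not the lfsck or liblustreapi code. It assumes the
UUID list comes back packed without gaps, and that each UUID follows the
<fsname>-OST<hex index>_UUID pattern seen in the e2fsck output quoted above.
The names uuid_slot, parse_ost_index and place_by_index, and the buffer size,
are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define UUID_MAX 40   /* assumed buffer size for this sketch */

/* One table entry per possible OST index (hypothetical type). */
struct uuid_slot {
    char uuid[UUID_MAX];
};

/* Parse the OST index out of a name such as "aegalfs-OST002a_UUID".
 * Returns -1 if the name does not follow that pattern. */
static int parse_ost_index(const char *uuid)
{
    const char *p = strstr(uuid, "-OST");
    char *end;
    long idx;

    if (p == NULL)
        return -1;
    p += 4;                              /* skip "-OST" */
    idx = strtol(p, &end, 16);           /* the index is written in hex */
    if (end == p || idx < 0 || strncmp(end, "_UUID", 5) != 0)
        return -1;
    return (int)idx;
}

/* Copy each densely packed UUID into the slot that matches its real index.
 * Slots for unused (sparse) indices are left as empty strings.
 * Returns the number of entries placed. */
static int place_by_index(const char *const *packed, int count,
                          struct uuid_slot *table, int table_len)
{
    int i, placed = 0;

    assert(packed != NULL && table != NULL);
    memset(table, 0, sizeof(*table) * (size_t)table_len);

    for (i = 0; i < count; i++) {
        int idx = parse_ost_index(packed[i]);

        if (idx < 0 || idx >= table_len)
            continue;                    /* unparseable or out of range */
        snprintf(table[idx].uuid, sizeof(table[idx].uuid), "%s", packed[i]);
        placed++;
    }
    return placed;
}
```

Given the three names aegalfs-OST0000_UUID, aegalfs-OST0021_UUID and
aegalfs-OST002a_UUID and a 64-slot table, place_by_index fills slots 0, 33
and 42 and leaves slot 34 empty, so a lookup by ost_idx 42 finds the right
UUID and the empty slots can be skipped.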



