[Lustre-discuss] non-consecutive OST ordering

Wang Yibin wang.yibin at oracle.com
Fri Nov 12 00:17:35 PST 2010


This is a bug in llapi_lov_get_uuids(), which assigns UUIDs to the wrong OST indices when the OST index space is sparse (i.e. there are gaps in the numbering).
Please file a bug for this.
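
To make the failure mode concrete, here is a minimal sketch of the difference between packing UUIDs densely and placing them by their real index. It is not the llapi code: the function names, MAX_OST and UUID_LEN are made up for illustration, and it assumes UUIDs of the form "<fsname>-OST<4 hex digits>_UUID" as they appear elsewhere in this thread.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_OST  64   /* illustrative table size, not a Lustre constant */
#define UUID_LEN 40   /* illustrative buffer size */

/* Parse the OST index out of a UUID such as "aegalfs-OST002a_UUID".
 * Returns the index, or -1 if the name does not match that pattern. */
static int ost_index_from_uuid(const char *uuid)
{
    const char *p = strstr(uuid, "-OST");
    char *end;
    long idx;

    if (p == NULL)
        return -1;
    p += 4;                         /* skip "-OST" */
    idx = strtol(p, &end, 16);      /* the index is written in hex */
    if (end != p + 4 || strcmp(end, "_UUID") != 0)
        return -1;
    return (int)idx;
}

/* Model of the bug: UUIDs are packed in arrival order, so slot i holds
 * the i-th UUID regardless of which OST index it belongs to. */
static void fill_dense(char table[][UUID_LEN], const char *const *uuids, int n)
{
    assert(table != NULL && uuids != NULL);
    for (int i = 0; i < n; i++)
        snprintf(table[i], UUID_LEN, "%s", uuids[i]);
}

/* Index-aware alternative: each UUID goes into the slot named by its own
 * index, and unused slots stay empty. Returns 0 on success, -1 if a UUID
 * cannot be parsed or its index does not fit in the table. */
static int fill_sparse(char table[][UUID_LEN], int slots,
                       const char *const *uuids, int n)
{
    assert(table != NULL && uuids != NULL);
    for (int i = 0; i < n; i++) {
        int idx = ost_index_from_uuid(uuids[i]);

        if (idx < 0 || idx >= slots)
            return -1;
        snprintf(table[idx], UUID_LEN, "%s", uuids[i]);
    }
    return 0;
}
```

With 34 consecutive OSTs followed by OST002a, fill_dense() puts the last UUID in slot 34, while fill_sparse() puts it in slot 42 and leaves slots 34 to 41 empty.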

Until this bug is fixed, you can apply the following patch to lfsck.c in e2fsprogs (version 1.41.12.2.ora1) as a workaround (not verified, though).

--- e2fsprogs/e2fsck/lfsck.c   2010-11-12 11:43:42.000000000 +0800
+++ lfsck.c 2010-11-12 12:14:38.000000000 +0800
@@ -1226,6 +1226,12 @@
    __u64 last_id;
    int i, rc;

+   /* skip empty UUID OST */
+   if(!strlen(lfsck_uuid[ost_idx].uuid)) {
+       log_write("index %d UUID is empty(sparse OST index?). Skipping.\n", ost_idx);
+       return(0);
+   }
+
    sprintf(dbname, "%s.%d", MDS_OSTDB, ost_idx);

    VERBOSE(2, "testing ost_idx %d\n", ost_idx);
@@ -1279,11 +1284,20 @@
            ost_hdr->ost_uuid.uuid);

        if (obd_uuid_equals(&lfsck_uuid[ost_idx], &ost_hdr->ost_uuid)) {
+                        /* must be sparse ost index */
            if (ost_hdr->ost_index != ost_idx) {
                log_write("Requested ost_idx %u doesn't match "
                      "index %u found in %s\n", ost_idx,
                      ost_hdr->ost_index, ost_files[i]);
-               continue;
+
+               log_write("Moving the index/uuid to the right place...\n");
+                /* zero the original uuid entry */
+               memset(&lfsck_uuid[ost_idx], 0, sizeof(struct obd_uuid));
+                /* copy it to the right place */
+                ost_idx = ost_hdr->ost_index;
+                strcpy(lfsck_uuid[ost_hdr->ost_index].uuid,ost_hdr->ost_uuid.uuid);
+               /* skip this round */
+               goto out;
            }

            break;
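
For anyone who wants to try the idea outside lfsck, the two checks in the patch can be sketched in isolation as below. The struct and enum names are simplified stand-ins invented for this example, not the real Lustre headers, and the bounds check on the header's index is an addition that the patch itself does not have.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Simplified stand-ins for the Lustre types (hypothetical, for illustration). */
struct obd_uuid_s { char uuid[40]; };
struct ost_hdr_s  { struct obd_uuid_s ost_uuid; unsigned int ost_index; };

enum slot_action {
    SLOT_SKIP,      /* slot is empty: probably a gap in the OST numbering */
    SLOT_NO_MATCH,  /* this ostdb header belongs to some other slot */
    SLOT_MATCH,     /* UUID and index both agree */
    SLOT_MOVED      /* UUID matched but index differed: entry relocated */
};

/* Check one slot of the UUID table against one ostdb header. */
static enum slot_action check_slot(struct obd_uuid_s *table, int slots,
                                   int ost_idx, const struct ost_hdr_s *hdr)
{
    assert(ost_idx >= 0 && ost_idx < slots);

    /* First hunk of the patch: skip slots with an empty UUID. */
    if (table[ost_idx].uuid[0] == '\0')
        return SLOT_SKIP;

    if (strcmp(table[ost_idx].uuid, hdr->ost_uuid.uuid) != 0)
        return SLOT_NO_MATCH;

    if (hdr->ost_index == (unsigned int)ost_idx)
        return SLOT_MATCH;

    /* Extra safety check, not in the patch: refuse out-of-range indices. */
    if (hdr->ost_index >= (unsigned int)slots)
        return SLOT_NO_MATCH;

    /* Second hunk of the patch: clear the wrong slot and store the UUID
     * at the index recorded in the ostdb header. */
    memset(&table[ost_idx], 0, sizeof(table[ost_idx]));
    table[hdr->ost_index] = hdr->ost_uuid;
    return SLOT_MOVED;
}
```

Assuming, purely for illustration, that the UUID of OST002a ended up in slot 34, calling check_slot() for slot 34 with that OST's header would move the entry to slot 42, and a later call for slot 42 would then report a match.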


On 2010-11-12, at 10:53 AM, Christopher Walker wrote:

> Thanks very much for your reply. I've tried remaking the mdsdb and all
> of the ostdb's, but I still get the same error -- it checks the first 34
> osts without a problem, but can't find the ostdb file for the 35th
> (which has ost_idx 42):
> 
> ...
> lfsck: ost_idx 34: pass3 OK (676803 files total)
> lfsck: can't find file for ost_idx 35
> Files affected by missing ost info are : -
> lfsck: can't find file for ost_idx 36
> Files affected by missing ost info are : -
> lfsck: can't find file for ost_idx 37
> Files affected by missing ost info are : -
> lfsck: can't find file for ost_idx 38
> Files affected by missing ost info are : -
> lfsck: can't find file for ost_idx 39
> Files affected by missing ost info are : -
> lfsck: can't find file for ost_idx 40
> Files affected by missing ost info are : -
> lfsck: can't find file for ost_idx 41
> Files affected by missing ost info are : -
> lfsck: can't find file for ost_idx 42
> Files affected by missing ost info are : -
> ...
> 
> e2fsck claims to be making the ostdb without a problem:
> 
> Pass 6: Acquiring information for lfsck
> OST: 'aegalfs-OST002a_UUID' ost idx 42: compat 0x2 rocomp 0 incomp 0x2
> OST: num files = 676803
> OST: last_id = 858163
> 
> and with the filesystem up I can see files on this OST:
> 
> [cwalker@iliadaccess04 P-Gadget3.3.1]$ lfs getstripe predict.c
> OBDS:
> 0: aegalfs-OST0000_UUID ACTIVE
> ...
> 33: aegalfs-OST0021_UUID ACTIVE
> 42: aegalfs-OST002a_UUID ACTIVE
> predict.c
> obdidx   objid   objid   group
>     42      10     0xa       0
> 
> 
> lfsck identifies several hundred GB of orphan data that we'd like to
> recover, so we'd really like to run lfsck on this array. We're willing
> to forgo the recovery on the 35th ost, but I want to make sure that
> running lfsck -l with the current configuration won't make things worse.
> 
> Thanks again for your reply; any further advice is very much appreciated!
> 
> Best,
> Chris
> 
> On 11/10/10 12:10 AM, Wang Yibin wrote:
>> The error message indicates that the UUID of OST #35 does not match between the live filesystem and the ostdb file.
>> Is this ostdb obsolete?
>> 
>> On 2010-11-9, at 11:45 PM, Christopher Walker wrote:
>> 
>>> 
>>> For reasons that I can't recall, our OSTs are not in consecutive order 
>>> -- we have 35 OSTs, which are numbered consecutively from
>>> 0000-0021
>>> and then there's one last OST at
>>> 002a
>>> 
>>> When I try to run lfsck on this array, it works fine for the first 34 
>>> OSTs, but it can't seem to find the last OST db file:
>>> 
>>> lfsck: ost_idx 34: pass3 OK (680045 files total)
>>> lfsck: can't find file for ost_idx 35
>>> Files affected by missing ost info are : -
>>> lfsck: can't find file for ost_idx 36
>>> Files affected by missing ost info are : -
>>> lfsck: can't find file for ost_idx 37
>>> Files affected by missing ost info are : -
>>> lfsck: can't find file for ost_idx 38
>>> Files affected by missing ost info are : -
>>> lfsck: can't find file for ost_idx 39
>>> Files affected by missing ost info are : -
>>> lfsck: can't find file for ost_idx 40
>>> Files affected by missing ost info are : -
>>> lfsck: can't find file for ost_idx 41
>>> Files affected by missing ost info are : -
>>> lfsck: can't find file for ost_idx 42
>>> Files affected by missing ost info are : -
>>> /n/scratch/hernquist_lab/tcox/tests/SbSbhs_e_8/P-Gadget3.3.1/IdlSubfind/.svn/text-base/ReadSubhaloFromReshuffledSnapshot.pro.svn-base
>>> 
>>> and then lists all of the files that live on OST 002a.  This db file 
>>> definitely does exist -- it lives in the same directory as all of the 
>>> other db files, and e2fsck for this OST ran without problems.
>>> 
>>> Is there some way of forcing lfsck to recognize this OST db?  Or, 
>>> failing that, is it dangerous to run lfsck on the first 34 OSTs only?
>>> 
>>> We're using e2fsck 1.41.6.sun1 (30-May-2009)
>>> 
>>> Thanks very much!
>>> 
>>> Chris
>>> 
>>> 
>>> _______________________________________________
>>> Lustre-discuss mailing list
>>> Lustre-discuss at lists.lustre.org
>>> http://lists.lustre.org/mailman/listinfo/lustre-discuss
> 



