[Lustre-discuss] non-consecutive OST ordering

Christopher Walker cwalker at fas.harvard.edu
Sat Nov 13 20:21:01 PST 2010


Thanks again for the advice, and for putting this patch together so
quickly. I put your patch into 1.41.6.sun1 and it works perfectly, at
least in read-only mode.

Unfortunately, when I run lfsck -l on this array, it gets to ost_idx 30,
runs through part of the check, but then hangs while checking for orphan
objects:

lfsck: ost_idx 30: pass2 ERROR: 46700 dangling inodes found (682467
files total)
lfsck: ost_idx 30: pass3: check for orphan objects

It hung at about the same time that an LBUG appeared on the MDS:

Nov 13 21:58:39 aegamds1 kernel: LustreError:
21864:0:(lov_pack.c:179:lov_packmd()) ASSERTION(loi->loi_id)
failed:lmm_oid 78132063 stripe 0/1 idx 30
Nov 13 21:58:39 aegamds1 kernel: LustreError:
21864:0:(lov_pack.c:179:lov_packmd()) LBUG

Judging from the number of dangling inodes, there looks to be
something quite wrong with this OST (although it produces no errors when
run through e2fsck -fn). The last few lines of the lustre-log are:

00000004:00080000:1:1289703502.347853:0:21853:0:(obd.h:1191:obd_transno_commit_cb())
aegalfs-MDT0000: transno 4267034979 committed
02000000:00080000:0:1289703519.748478:0:21877:0:(upcall_cache.c:185:refresh_entry())
aegalfs-MDT0000: invoked upcall /usr/sbin/l_getgroups aegalfs-MDT0000 0
00020000:00040000:4:1289703519.876199:0:21864:0:(lov_pack.c:179:lov_packmd())
ASSERTION(loi->loi_id) failed:lmm_oid 78132063 stripe 0/1 idx 30
00000000:00040000:4:1289703519.876652:0:21864:0:(lov_pack.c:179:lov_packmd())
LBUG
00000400:00000400:4:1289703519.876877:0:21864:0:(linux-debug.c:185:libcfs_debug_dumpstack())
showing stack for process 21864
Debug log: 212116 lines, 212116 kept, 0 dropped.
[root at aegamds1 ~]#

Is 78132063 an objid that lfsck doesn't like? Is deleting the associated
file (assuming that I can find it) the best path forward?

Thanks again,
Chris

On 11/12/10 11:02 PM, Wang Yibin wrote:
> For the moment, without investigation, I am not sure about this. There may or may not be a compatibility issue.
> Please check out the version of e2fsprogs that is identical to the one on your system and apply the patch to its lfsck.c accordingly.
> Then you can compile against 1.6.6.
>
> On 2010-11-13, at 11:20 AM, Christopher Walker wrote:
>
>> Thanks again for this patch. I just have one quick question about this
>> -- 1.41.12.2.ora1 seems to require lustre_user.h from 1.8.x -- is it OK to
>> use a version of lfsck compiled against 1.8.x on a 1.6.6 filesystem, and
>> with {mds,ost}db that were created with 1.41.6?
>>
>> Best,
>> Chris
>>
>> On 11/12/10 3:17 AM, Wang Yibin wrote:
>>> This is a bug in llapi_lov_get_uuids(), which assigns UUIDs to the wrong OST index when the OST indices are sparse.
>>> Please file a bug for this.
>>>
>>> Until this bug is fixed, you can apply the following patch to lfsck.c in e2fsprogs (version 1.41.12.2.ora1) as a workaround (not verified, though).
>>>
>>> --- e2fsprogs/e2fsck/lfsck.c   2010-11-12 11:43:42.000000000 +0800
>>> +++ lfsck.c 2010-11-12 12:14:38.000000000 +0800
>>> @@ -1226,6 +1226,12 @@
>>>    __u64 last_id;
>>>    int i, rc;
>>>
>>> +   /* skip empty UUID OST */
>>> +   if(!strlen(lfsck_uuid[ost_idx].uuid)) {
>>> +       log_write("index %d UUID is empty(sparse OST index?). Skipping.\n", ost_idx);
>>> +       return(0);
>>> +   }
>>> +
>>>    sprintf(dbname, "%s.%d", MDS_OSTDB, ost_idx);
>>>
>>>    VERBOSE(2, "testing ost_idx %d\n", ost_idx);
>>> @@ -1279,11 +1284,20 @@
>>>            ost_hdr->ost_uuid.uuid);
>>>
>>>        if (obd_uuid_equals(&lfsck_uuid[ost_idx], &ost_hdr->ost_uuid)) {
>>> +                        /* must be sparse ost index */
>>>            if (ost_hdr->ost_index != ost_idx) {
>>>                log_write("Requested ost_idx %u doesn't match "
>>>                      "index %u found in %s\n", ost_idx,
>>>                      ost_hdr->ost_index, ost_files[i]);
>>> -               continue;
>>> +
>>> +               log_write("Moving the index/uuid to the right place...\n");
>>> +                /* zero the original uuid entry */
>>> +               memset(&lfsck_uuid[ost_idx], 0, sizeof(struct obd_uuid));
>>> +                /* copy it to the right place */
>>> +                ost_idx = ost_hdr->ost_index;
>>> +                strcpy(lfsck_uuid[ost_hdr->ost_index].uuid,ost_hdr->ost_uuid.uuid);
>>> +               /* skip this round */
>>> +               goto out;
>>>            }
>>>
>>>            break;
>>>
>>>
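As a rough illustration of the sparse-index problem described above (not the actual llapi or lfsck code), the sketch below keys a UUID table by each OST's real index, so that unused indices stay empty. An empty slot is the condition the workaround patch tests for before skipping an index. The names ost_entry, fill_by_index, slot_is_empty, MAX_OSTS and UUID_LEN are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_OSTS 64   /* hypothetical table size for this sketch */
#define UUID_LEN 40   /* assumed buffer size for a UUID string */

/* Hypothetical record: an OST's real index and its UUID string. */
struct ost_entry {
    unsigned int index;   /* e.g. 42 for aegalfs-OST002a_UUID */
    const char *uuid;
};

/* Store each UUID in the slot named by its own index; gaps stay empty. */
void fill_by_index(char table[][UUID_LEN], size_t slots,
                   const struct ost_entry *osts, size_t count)
{
    memset(table, 0, slots * UUID_LEN);
    for (size_t i = 0; i < count; i++) {
        assert(osts[i].index < slots);
        snprintf(table[osts[i].index], UUID_LEN, "%s", osts[i].uuid);
    }
}

/* An empty UUID string marks an unused (sparse) OST index. */
int slot_is_empty(char table[][UUID_LEN], unsigned int idx)
{
    return table[idx][0] == '\0';
}
```

With entries for indices 0, 33 and 42, slots 34 through 41 stay empty and can be skipped, while slot 42 holds aegalfs-OST002a_UUID.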
>>> On 2010-11-12, at 10:53 AM, Christopher Walker wrote:
>>>
>>>> Thanks very much for your reply. I've tried remaking the mdsdb and all
>>>> of the ostdb's, but I still get the same error -- it checks the first 34
>>>> osts without a problem, but can't find the ostdb file for the 35th
>>>> (which has ost_idx 42):
>>>>
>>>> ...
>>>> lfsck: ost_idx 34: pass3 OK (676803 files total)
>>>> lfsck: can't find file for ost_idx 35
>>>> Files affected by missing ost info are : -
>>>> lfsck: can't find file for ost_idx 36
>>>> Files affected by missing ost info are : -
>>>> lfsck: can't find file for ost_idx 37
>>>> Files affected by missing ost info are : -
>>>> lfsck: can't find file for ost_idx 38
>>>> Files affected by missing ost info are : -
>>>> lfsck: can't find file for ost_idx 39
>>>> Files affected by missing ost info are : -
>>>> lfsck: can't find file for ost_idx 40
>>>> Files affected by missing ost info are : -
>>>> lfsck: can't find file for ost_idx 41
>>>> Files affected by missing ost info are : -
>>>> lfsck: can't find file for ost_idx 42
>>>> Files affected by missing ost info are : -
>>>> ...
>>>>
>>>> e2fsck claims to be making the ostdb without a problem:
>>>>
>>>> Pass 6: Acquiring information for lfsck
>>>> OST: 'aegalfs-OST002a_UUID' ost idx 42: compat 0x2 rocomp 0 incomp 0x2
>>>> OST: num files = 676803
>>>> OST: last_id = 858163
>>>>
>>>> and with the filesystem up I can see files on this OST:
>>>>
>>>> [cwalker at iliadaccess04 P-Gadget3.3.1]$ lfs getstripe predict.c
>>>> OBDS:
>>>> 0: aegalfs-OST0000_UUID ACTIVE
>>>> ...
>>>> 33: aegalfs-OST0021_UUID ACTIVE
>>>> 42: aegalfs-OST002a_UUID ACTIVE
>>>> predict.c
>>>> obdidx objid objid group
>>>> 42 10 0xa 0
>>>>
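The OST names in the output above carry the index in hexadecimal, which is why aegalfs-OST002a_UUID is reported as index 42 and aegalfs-OST0021_UUID as index 33. A minimal sketch of that conversion follows. The helper name ost_index_from_name is hypothetical, and the parsing assumes names of the form fsname-OSTxxxx_UUID as seen in this thread.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return the decimal OST index encoded in a name such as
 * "aegalfs-OST002a_UUID", or -1 if no "-OST" marker followed by
 * hexadecimal digits is found. Hypothetical helper for illustration.
 */
long ost_index_from_name(const char *name)
{
    const char *p;
    char *end;
    long idx;

    assert(name != NULL);

    p = strstr(name, "-OST");
    if (p == NULL)
        return -1;
    p += 4;                      /* skip over "-OST" */

    idx = strtol(p, &end, 16);   /* parsing stops at the '_' before "UUID" */
    if (end == p || idx < 0)
        return -1;
    return idx;
}
```

Passing "aegalfs-OST002a_UUID" yields 42, which matches the "ost idx 42" line printed by e2fsck earlier in this message.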
>>>>
>>>> lfsck identifies several hundred GB of orphan data that we'd like to
>>>> recover, so we'd really like to run lfsck on this array. We're willing
>>>> to forgo the recovery on the 35th ost, but I want to make sure that
>>>> running lfsck -l with the current configuration won't make things worse.
>>>>
>>>> Thanks again for your reply; any further advice is very much appreciated!
>>>>
>>>> Best,
>>>> Chris
>>>>
>>>> On 11/10/10 12:10 AM, Wang Yibin wrote:
>>>>> The error message indicates that the UUID of OST #35 does not match between the live filesystem and the ostdb file.
>>>>> Is this ostdb obsolete?
>>>>>
>>>>> On 2010-11-9, at 11:45 PM, Christopher Walker wrote:
>>>>>
>>>>>> For reasons that I can't recall, our OSTs are not in consecutive order 
>>>>>> -- we have 35 OSTs, which are numbered consecutively from
>>>>>> 0000-0021
>>>>>> and then there's one last OST at
>>>>>> 002a
>>>>>>
>>>>>> When I try to run lfsck on this array, it works fine for the first 34 
>>>>>> OSTs, but it can't seem to find the last OST db file:
>>>>>>
>>>>>> lfsck: ost_idx 34: pass3 OK (680045 files total)
>>>>>> lfsck: can't find file for ost_idx 35
>>>>>> Files affected by missing ost info are : -
>>>>>> lfsck: can't find file for ost_idx 36
>>>>>> Files affected by missing ost info are : -
>>>>>> lfsck: can't find file for ost_idx 37
>>>>>> Files affected by missing ost info are : -
>>>>>> lfsck: can't find file for ost_idx 38
>>>>>> Files affected by missing ost info are : -
>>>>>> lfsck: can't find file for ost_idx 39
>>>>>> Files affected by missing ost info are : -
>>>>>> lfsck: can't find file for ost_idx 40
>>>>>> Files affected by missing ost info are : -
>>>>>> lfsck: can't find file for ost_idx 41
>>>>>> Files affected by missing ost info are : -
>>>>>> lfsck: can't find file for ost_idx 42
>>>>>> Files affected by missing ost info are : -
>>>>>> /n/scratch/hernquist_lab/tcox/tests/SbSbhs_e_8/P-Gadget3.3.1/IdlSubfind/.svn/text-base/ReadSubhaloFromReshuffledSnapshot.pro.svn-base
>>>>>>
>>>>>> and then lists all of the files that live on OST 002a.  This db file 
>>>>>> definitely does exist -- it lives in the same directory as all of the 
>>>>>> other db files, and e2fsck for this OST ran without problems.
>>>>>>
>>>>>> Is there some way of forcing lfsck to recognize this OST db?  Or, 
>>>>>> failing that, is it dangerous to run lfsck on the first 34 OSTs only?
>>>>>>
>>>>>> We're using e2fsck 1.41.6.sun1 (30-May-2009)
>>>>>>
>>>>>> Thanks very much!
>>>>>>
>>>>>> Chris
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> Lustre-discuss mailing list
>>>>>> Lustre-discuss at lists.lustre.org
>>>>>> http://lists.lustre.org/mailman/listinfo/lustre-discuss



