[Lustre-discuss] non-consecutive OST ordering

Christopher Walker cwalker at fas.harvard.edu
Fri Nov 12 06:48:00 PST 2010


Thanks, Andreas.  The orphan data is scattered throughout the array, 
although it's primarily on one OST (30), which seems to have been hit 
particularly hard by this outage:

[root@iliadaccess04 lfsck2]# grep ERROR lfsck2.out
lfsck: ost_idx 5: pass2 ERROR: 3817 dangling inodes found (654297 files total)
lfsck: ost_idx 6: pass2 ERROR: 2 dangling inodes found (670416 files total)
lfsck: ost_idx 11: pass2 ERROR: 942 dangling inodes found (673425 files total)
lfsck: ost_idx 12: pass2 ERROR: 64 dangling inodes found (678878 files total)
lfsck: ost_idx 13: pass3 ERROR: 24.2109MB of orphan data (6 of 776532 files total)
lfsck: ost_idx 14: pass3 ERROR: 5.34375MB of orphan data (2 of 725085 files total)
lfsck: ost_idx 15: pass2 ERROR: 1 dangling inodes found (672942 files total)
lfsck: ost_idx 15: pass3 ERROR: 58.4688MB of orphan data (6 of 739995 files total)
lfsck: ost_idx 16: pass2 ERROR: 1 dangling inodes found (671379 files total)
lfsck: ost_idx 18: pass2 ERROR: 3371 dangling inodes found (692018 files total)
lfsck: ost_idx 18: pass3 ERROR: 5499.86MB of orphan data (620 of 688965 files total)
lfsck: ost_idx 19: pass3 ERROR: 21.375MB of orphan data (8 of 775964 files total)
lfsck: ost_idx 20: pass3 ERROR: 3433.61MB of orphan data (16 of 843328 files total)
lfsck: ost_idx 22: pass3 ERROR: 1.21094MB of orphan data (16 of 859527 files total)
lfsck: ost_idx 23: pass2 ERROR: 1 dangling inodes found (663492 files total)
lfsck: ost_idx 23: pass3 ERROR: 8571.68MB of orphan data (20 of 838490 files total)
lfsck: ost_idx 24: pass3 ERROR: 4367.45MB of orphan data (16 of 837371 files total)
lfsck: ost_idx 25: pass3 ERROR: 121.996MB of orphan data (16 of 858679 files total)
lfsck: ost_idx 30: pass2 ERROR: 46700 dangling inodes found (682467 files total)
lfsck: ost_idx 30: pass3 ERROR: 45313.4MB of orphan data (7648 of 668343 files total)
[root@iliadaccess04 lfsck2]#
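
To see at a glance how the orphan data is distributed, the pass3 lines above could be totalled per OST. What follows is only a rough sketch, not part of lfsck: the function name `orphan_mb_by_ost` and the regular expression are my own, and it assumes the message wording is exactly as shown in the grep output above. It collapses whitespace first, so mail-wrapped lines are handled too.

```python
import re
import sys
from collections import defaultdict

# Matches pass3 lines such as:
#   lfsck: ost_idx 30: pass3 ERROR: 45313.4MB of orphan data (7648 of 668343 files total)
# ASSUMPTION: the wording is taken from the grep output in this message only.
ORPHAN_RE = re.compile(
    r"ost_idx (\d+): pass3 ERROR: ([\d.]+)MB of orphan data \((\d+) of (\d+)"
)


def orphan_mb_by_ost(text):
    """Return {ost_idx: orphan MB} summed over all pass3 ERROR lines in text."""
    # Collapse every run of whitespace (including line breaks) to one space
    # so that entries wrapped across two lines match as a single record.
    flat = " ".join(text.split())
    totals = defaultdict(float)
    for idx, mb, _orphans, _files in ORPHAN_RE.findall(flat):
        totals[int(idx)] += float(mb)
    return dict(totals)


if __name__ == "__main__":
    totals = orphan_mb_by_ost(sys.stdin.read())
    grand = sum(totals.values())
    # Largest contributors first.
    for idx, mb in sorted(totals.items(), key=lambda kv: -kv[1]):
        share = 100 * mb / grand if grand else 0.0
        print(f"ost_idx {idx:3d}: {mb:10.1f} MB ({share:5.1f}%)")
```

Saved under a hypothetical name such as orphan_summary.py, it would be fed the same file: `python orphan_summary.py < lfsck2.out`.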

Thanks again,
Chris

On 11/12/10 3:05 AM, Andreas Dilger wrote:
> On 2010-11-11, at 19:53, Christopher Walker wrote:
>> Thanks very much for your reply. I've tried remaking the mdsdb and all
>> of the ostdb's, but I still get the same error -- it checks the first 34
>> osts without a problem, but can't find the ostdb file for the 35th
>> (which has ost_idx 42):
>>
>> With the filesystem up, I can see files on this OST:
>>
>> [cwalker@iliadaccess04 P-Gadget3.3.1]$ lfs getstripe predict.c
>> OBDS:
>> 0: aegalfs-OST0000_UUID ACTIVE
>> ...
>> 33: aegalfs-OST0021_UUID ACTIVE
>> 42: aegalfs-OST002a_UUID ACTIVE
>> predict.c
>>     obdidx     objid     objid     group
>>         42        10       0xa         0
>>
>>
>> lfsck identifies several hundred GB of orphan data that we'd like to
>> recover, so we'd really like to run lfsck on this array. We're willing
>> to forgo the recovery on the 35th ost, but I want to make sure that
>> running lfsck -l with the current configuration won't make things worse.
> I'm not sure that what lfsck is reporting in this case is correct.  Is the orphan data all on the same OST, or spread around separate OSTs?  My concern is that if lfsck thinks the in-use objects on your estranged OST are actually orphan data, it will destroy that data.
>
> If there are a small number of very large objects on other OSTs that are making up the bulk of the orphan space usage, you could mount those OSTs as type ldiskfs and delete the objects by hand to free up the space.
>
> Cheers, Andreas
> --
> Andreas Dilger
> Lustre Technical Lead
> Oracle Corporation Canada Inc.
>
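
For the by-hand route suggested above, it helps to know where a given object would sit once an OST is mounted as ldiskfs. The sketch below is illustrative only. It rests on two assumptions: that target names carry the OST index as four hex digits (consistent with index 42 showing up as OST002a in the getstripe output), and that objects are stored under O/<group>/d<objid % 32>/<objid>, which is the layout I understand ldiskfs OSTs of this vintage to use. Please verify that on the actual OST before removing anything. The mount point /mnt/ost is hypothetical.

```python
def ost_target_name(fsname, ost_idx):
    """Build the OST target name for an index, e.g. index 42 -> 'aegalfs-OST002a'."""
    # The index is written as four lowercase hex digits in the target name.
    return f"{fsname}-OST{ost_idx:04x}"


def ldiskfs_object_path(mountpoint, objid, group=0):
    """Guess the on-disk path of an object on an OST mounted as type ldiskfs.

    ASSUMPTION: objects live under O/<group>/d<objid % 32>/<objid>.
    Check this against the real directory tree before deleting anything.
    """
    return f"{mountpoint}/O/{group}/d{objid % 32}/{objid}"


if __name__ == "__main__":
    # predict.c from the getstripe output: obdidx 42, objid 10, group 0.
    print(ost_target_name("aegalfs", 42))
    print(ldiskfs_object_path("/mnt/ost", 10))
```

This only computes names and paths; it touches nothing on disk, so it is safe to run anywhere.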
