[Lustre-discuss] Object index

Thu Jul 26 16:38:31 PDT 2012

On 2012-07-25, at 3:14, DEGREMONT Aurelien <aurelien.degremont at cea.fr> wrote:
> On 24/07/2012 20:10, Daniel Kobras wrote:
>> 
>> Is this the troglodyte type of OST that started its life in times of prehistoric versions of Lustre? We see this on old files that were created in the early ages of Lustre 1.6, before the trusted.fid EA was introduced.
> No, this filesystem was formatted with Lustre 2.0.
> By the way, does anyone remember the incompatibility between 2.0 and 2.1 that prevents a target formatted with Lustre 2.1 from
> being downgraded to Lustre 2.0?

We never allow filesystems formatted with a new version of Lustre to be "downgraded" to a version earlier than the one with which they were originally formatted. This allows us to add new features without having to retroactively support them in older versions of Lustre.

>> Other than that, these objects could have been preallocated, but never actually used. Do these objects contain any data at all (blockcount != 0)?
> I was rather thinking of that. But I'm surprised that so many objects are preallocated.

As previously mentioned, they might also be allocated but never accessed. 

>>> -Some of them have good results, and the man page says that
>>> "For objects with MDT parent sequence numbers above 0x200000000, this indicates that the FID needs to be mapped via the
>>> MDT Object Index (OI) file on the MDT".
>>> How do I do this mapping? I found some IAM utilities, but they do not seem to work, and I'm afraid the IAM userspace code
>>> has been deactivated.
>> lfs fid2path (on any client) should do what you're looking for.
> It does not. Moreover, lfs does not support this kind of fid
> [0x20a5df05f:0x4874:0x0]
> [0x20a6e8d8c:0x27b4:0x1]
> 
> The Lustre Manual says: "The idx field shows the stripe number of this OST object in the Lustre RAID-0 striped file."

Try setting the last field of the FID (fid_ver) to 0. On OST objects this field actually holds the LOV stripe index and is not part of the FID at all. It just happens to be stored in this location on disk to save space.
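To make that concrete, here is a minimal sketch of the normalization, assuming the FID is available in the bracketed text form shown in the quoted message. The type `fid_parts` and the helpers `parse_fid`, `split_stripe_index`, and `format_fid` are hypothetical names for illustration only; they are not part of Lustre or of lfs.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper type: the three fields of a FID printed as
 * "[0xSEQ:0xOID:0xVER]". Not a Lustre structure. */
struct fid_parts {
    uint64_t seq;  /* sequence number */
    uint32_t oid;  /* object id within the sequence */
    uint32_t ver;  /* on an OST object: the LOV stripe index */
};

/* Parse "[0xSEQ:0xOID:0xVER]". Returns 0 on success, -1 on malformed input. */
static int parse_fid(const char *s, struct fid_parts *out)
{
    uint64_t seq;
    uint32_t oid, ver;
    char close = 0;

    if (sscanf(s, "[%" SCNx64 ":%" SCNx32 ":%" SCNx32 "%c",
               &seq, &oid, &ver, &close) != 4 || close != ']')
        return -1;
    out->seq = seq;
    out->oid = oid;
    out->ver = ver;
    return 0;
}

/* Take the stripe index out of the version slot and zero that slot,
 * leaving the FID of the parent file. Returns the stripe index. */
static uint32_t split_stripe_index(struct fid_parts *fid)
{
    uint32_t stripe = fid->ver;

    fid->ver = 0;
    return stripe;
}

/* Format the FID back into bracketed text. Returns 0 on success,
 * -1 if the buffer is too small. */
static int format_fid(const struct fid_parts *fid, char *buf, size_t len)
{
    int n = snprintf(buf, len, "[0x%" PRIx64 ":0x%" PRIx32 ":0x%" PRIx32 "]",
                     fid->seq, fid->oid, fid->ver);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

With the second FID from the quoted message, this yields stripe index 1 and the string [0x20a6e8d8c:0x27b4:0x0], which is the form to hand to lfs fid2path on a client.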

> Which seems true as I've got several files where idx > 0.
> But the lfs FID sanity check is:
> static inline int fid_is_sane(const struct lu_fid *fid)
> {
>         return
>                 fid != NULL &&
>                 ((fid_seq(fid) >= FID_SEQ_START && fid_oid(fid) != 0 &&
>                   fid_ver(fid) == 0) ||
>                 fid_is_igif(fid));
> }
> 
> And so it complains when fid_ver != 0.
> 
> I'm not at all sure lfs fid2path expects FIDs coming from an OST.

They are the same FID; it just depends on how you decode the FID from the OST xattr.
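As a rough sketch of what "how you decode" means: assuming the first 16 bytes of the OST object's trusted.fid xattr are the parent FID stored little-endian as a 64-bit sequence, a 32-bit object id, and a 32-bit slot that carries the stripe index, the decoding could look like the code below. That layout is an assumption to check against the headers of your Lustre version, and `ost_parent`, `read_le`, and `decode_parent` are hypothetical names, not Lustre APIs.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type: the parent FID with the stripe index kept
 * in its own field rather than in the version slot. */
struct ost_parent {
    uint64_t seq;         /* parent MDT sequence number */
    uint32_t oid;         /* parent object id */
    uint32_t stripe_idx;  /* what a naive decode would report as fid_ver */
};

/* Read an n-byte little-endian unsigned integer (n <= 8). */
static uint64_t read_le(const unsigned char *p, size_t n)
{
    uint64_t v = 0;

    for (size_t i = n; i > 0; i--)
        v = (v << 8) | p[i - 1];
    return v;
}

/* Decode the leading 16 bytes of the xattr value (ASSUMED layout:
 * u64 seq, u32 oid, u32 stripe index, all little-endian).
 * Returns 0 on success, -1 if the buffer is too short. */
static int decode_parent(const unsigned char *xattr, size_t len,
                         struct ost_parent *out)
{
    if (len < 16)
        return -1;
    out->seq = read_le(xattr, 8);
    out->oid = (uint32_t)read_le(xattr + 8, 4);
    out->stripe_idx = (uint32_t)read_le(xattr + 12, 4);
    return 0;
}
```

Decoded this way, the parent FID is (seq, oid, 0) and the stripe index is reported separately, so the result passes the fid_ver == 0 check quoted above.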

Cheers, Andreas

>> In my experience, a small amount of object leakage is not too uncommon on real-world systems, so if lfs find no longer shows any objects, most likely you're good to take this OST down.
> I agree on that, but I consider that more than 4k objects is not "a small amount" :)
> 
>> (Hey, and you can double-check with rbh-report --dump-ost, of course! ;-)
> Sure, but I did not have an rbh DB available for that FS (a pity, as "rbh-report" takes a few minutes in the worst case, while "lfs 
> find" took 15 hours :))
> By the way, using a Lustre tool helps me to be sure the remaining objects were not related to a possible robinhood bug :)
> 
> 
> Aurélien