[Lustre-discuss] "obdidx" ordering in "lfs getstripe"

Andreas Dilger adilger at whamcloud.com
Thu Feb 9 06:48:48 PST 2012


On 2012-02-09, at 6:20 AM, Jack David wrote:
> In the output of "lfs getstripe <filename> | <dirname>", the obdidx
> denotes the OST index (I assume).
> 
> Consider the following output:
> 
> lmm_stripe_count:   2
> lmm_stripe_size:    1048576
> lmm_stripe_offset:  1
> 	obdidx		 objid		objid		 group
> 	     1	             2	          0x2	             0
> 	     0	             3	          0x3	             0
> 
> where I have a setup consisting of two OSTs. If I have more than two
> OSTs, is it possible that I get the obdidx values out of order? Or will
> the obdidx values always be linear?
> 
> For example, in the above output, the values are linear (like 1, 0 - and
> this pattern will be repeated while storing the data I assume). If I
> have 4 OSTs, can the values be non-linear? Something like 2,0,1,3 or
> 2,1,3,0 (or any pattern for that matter)??

Typically the ordering will be linear, but this depends on a number of
different factors:
- what order the OSTs were created in:  without --index=N the OST order
  depends on the order in which they were first mounted, so using --index
  is always recommended, and will be mandatory in the future
- the distribution of OSTs among OSS nodes:  the MDS object allocator
  will normally select one OST from each OSS before allocating another
  object from a different OST on the same OSS
- the space available on each OST:  when OST free space is imbalanced
  the OSTs will be selected in part based on how full they are

> My assumption on how the data is stored on OSTs:
> Based upon the values of obdidx, each OST will store a stripe_size
> worth data into the objid (a file under ldiskfs volume of that OST) in
> rotation. So if I get the obdidx like 2,1,3,0 and stripe_size is 1MB,
> then the data will be stored in following order:
> 
> 1st MB: 2nd OST
> 2nd MB: 1st OST
> 3rd MB: 3rd OST
> 4th MB: 0th OST
> 5th MB: 2nd OST (Again - repeating the pattern)
> 6th MB: 1st OST
> 
> Is this understanding correct?? I hope I am clear on my question.

Correct.  The data is strictly round-robin on the objects once they
are allocated to a file.
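
To make the round-robin mapping concrete, here is a minimal sketch in
Python. It assumes plain RAID-0 style striping as described above; the
function name stripe_location and its parameters are hypothetical and
are not part of any Lustre tool or API. It takes a file offset, the
stripe size, and the obdidx list in the order "lfs getstripe" prints it,
and returns the OST index plus the offset inside that OST's object.

```python
def stripe_location(offset, stripe_size, obdidx_order):
    """Map a file byte offset to (OST index, offset within that OST's object).

    Illustrative only: assumes strict round-robin (RAID-0) striping over
    the objects in the order they are listed for the file.
    """
    stripe_count = len(obdidx_order)
    stripe_number = offset // stripe_size        # which stripe-sized chunk of the file
    slot = stripe_number % stripe_count          # position in the round-robin order
    full_rounds = stripe_number // stripe_count  # complete passes before this chunk
    object_offset = full_rounds * stripe_size + offset % stripe_size
    return obdidx_order[slot], object_offset


MiB = 1048576
order = [2, 1, 3, 0]  # obdidx values as listed in the question's example

for mb in range(6):
    ost, obj_off = stripe_location(mb * MiB, MiB, order)
    print(f"MB {mb + 1}: OST index {ost}, object offset {obj_off}")
```

With the 2,1,3,0 ordering from the question, the first four megabytes
land on OST indices 2, 1, 3, 0 at object offset 0, and the fifth and
sixth go back to indices 2 and 1 at object offset 1048576.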

Cheers, Andreas
--
Andreas Dilger                       Whamcloud, Inc.
Principal Engineer                   http://www.whamcloud.com/