[Lustre-discuss] how the lustre distribute data among disks within one OST

Christopher J. Morrone morrone2 at llnl.gov
Thu Jun 13 18:01:49 PDT 2013


Well, that is really more of a question for the backend filesystem in 
that case.  From Lustre's perspective there is very little difference, 
especially since our RPC size is capped at 1MB currently (although that 
may change in future versions).  It would probably make more difference 
to the backend filesystem and storage devices than to Lustre itself.
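
For what it's worth, the modulo translation mentioned further down in
this thread can be sketched roughly as follows.  This is only an
illustration of plain round-robin (RAID-0 style) striping, not Lustre's
actual code: the function name map_offset and its parameters are made up
for the example, and it assumes the first stripe lands on OST index 0.

```python
MIB = 1024 * 1024


def map_offset(file_offset, stripe_size, stripe_count):
    """Map a file offset to (ost_index, object_offset).

    Hypothetical helper: assumes simple round-robin striping where
    stripe 0 goes to OST index 0, stripe 1 to OST index 1, and so on.
    """
    stripe_index = file_offset // stripe_size      # which stripe-sized chunk of the file
    ost_index = stripe_index % stripe_count        # which OST holds that chunk
    row = stripe_index // stripe_count             # how many chunks precede it on that OST
    object_offset = row * stripe_size + file_offset % stripe_size
    return ost_index, object_offset


if __name__ == "__main__":
    # The 6 MB file, 1 MB stripe size, 2 OST example from the thread.
    for stripe in range(6):
        ost, obj_off = map_offset(stripe * MIB, MIB, 2)
        print(f"stripe {stripe}: OST index {ost}, object offset {obj_off // MIB} MiB")
```

Under those assumptions, stripes 0, 2, and 4 all map to OST index 0, at
object offsets 0, 1MB, and 2MB.  So in this simplified model stripes 0
and 2 are adjacent inside the OST's object while 0 and 4 are separated
by 1MB; what that costs in practice is up to the backend filesystem and
the storage underneath.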

But, of course, the devil is always in the details.

Chris

On 06/13/2013 05:36 PM, Jaln wrote:
> Thank you Chris, I think it is clearer to me now.
> In my question, "stripe 0,4" means one process wants to access stripes 0
> and 4 at the same time.
> Another process wants to access both stripes 0 and 2.
> Even though stripes 0, 2, and 4 are in the same place (one file),
> their offsets are different, i.e., 0 and 2 are contiguous, while
> from 0 to 4 there is a gap.
> So my concern is, will the two processes have different I/O costs?
> In other words, would accessing 0 and 4 take longer than accessing
> 0 and 2?
>
> Jaln
>
> On Thu, Jun 13, 2013 at 5:23 PM, Christopher J. Morrone
> <morrone2 at llnl.gov> wrote:
>
>     In that case, it is the question part that I do not understand. :)
>     What is "stripe 0,4", and why could it be "closer" than "stripe 0,2"?
>     In your example, 0, 2, and 4 are all in the same place.
>
>     If your file is striped over 2 OSTs, then essentially what happens
>     behind the scenes is that there are two files, one on each OST.  But
>     Lustre hides that from you, as a user.  Lustre basically does modulo
>     operations to translate a file offset in the file that it presents
>     to the user into an OST and an offset within that OST's file.
>
>     Does that help at all?
>
>     Chris
>
>
>     On 06/13/2013 02:58 PM, Jaln wrote:
>
>         Oh, I mean there is one file, for example 6 MB, the stripe size
>         is 1 MB, and there are only 2 OSTs.
>         Then the file will be divided into 6 stripes, denoted as stripes
>         0,1,2,3,4,5.
>         The distribution on the 2 OSTs would be stripes 0,2,4 on OST0 and
>         stripes 1,3,5 on OST1.
>
>         Jaln
>
>
>         On Thu, Jun 13, 2013 at 2:54 PM, Christopher J. Morrone
>         <morrone2 at llnl.gov> wrote:
>
>              I think you may be confused about what a stripe is in
>              Lustre.  If there are only 2 OSTs, then you can only stripe
>              a file across 2.
>
>              Or maybe I don't understand your terminology.  I don't know
>              what you mean by "0,4" and "0,2".
>
>
>              On 06/13/2013 02:38 PM, Jaln wrote:
>
>                  If I have 6 stripes and 2 OSTs, using round-robin striping,
>                  stripes 0,2,4 will be on OST0 and
>                  stripes 1,3,5 will be on OST1.
>                  Do you guys have any idea what the difference would be
>                  between accessing stripes 0,4 and stripes 0,2?
>                  Stripes 0,2 seem to be closer than 0,4, or will Lustre
>                  do some intelligent work?
>
>                  Jaln
>
>
>                  On Thu, Jun 13, 2013 at 10:22 AM, Christopher J. Morrone
>                  <morrone2 at llnl.gov> wrote:
>
>                       On 06/13/2013 05:19 AM, E.S. Rosenberg wrote:
>                        > On Thu, Jun 13, 2013 at 3:09 AM, Christopher J. Morrone
>                        > <morrone2 at llnl.gov> wrote:
>                        >> Lustre does not manage the individual disks.  It sits
>                        >> on top of a filesystem, either ldiskfs (basically ext4)
>                        >> or zfs (as of Lustre 2.4).
>                        > Is ZFS the recommended fs, or just an option?
>                        > Doesn't ZFS suffer major performance drawbacks on Linux
>                        > due to it living in userspace?
>                        > Thanks,
>                        > Eli
>
>                       LLNL (Brian Behlendorf) ported ZFS natively to Linux.
>                       We are not using the FUSE (userspace) version.  You can
>                       find it at:
>
>                       http://zfsonlinux.org
>
>                       ZFS is one of the two backend filesystem options for
>                       Lustre, as of Lustre 2.4.  2.4 is the first Lustre
>                       release that fully supports using ZFS.  Here at LLNL we
>                       are using it on our newest, and largest at 55PB,
>                       filesystem.
>
>                       Chris
>
>                  --
>
>                  Genius only means hard-working all one's life



