[Lustre-discuss] how the lustre distribute data among disks within one OST

Jaln valiantljk at gmail.com
Thu Jun 13 17:36:00 PDT 2013


Thank you Chris, it is clearer to me now.
In my question, "stripe 0,4" means one process wants to access stripes 0 and 4
at the same time.
Another process wants to access both stripes 0 and 2.
Even though stripes 0, 2, and 4 are in the same place (one file),
their offsets are different, i.e., 0 and 2 are contiguous, while between 0
and 4 there is a gap.
So my concern is, will the two processes have different I/O costs?
In other words, would accessing 0 and 4 take longer than accessing 0
and 2?
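
To make the offsets concrete, here is a minimal sketch of the modulo
translation Chris describes below, assuming plain round-robin (RAID-0 style)
striping. The function name locate and the StripeLocation type are
hypothetical names used only for illustration, not Lustre APIs.

```python
from typing import NamedTuple

MiB = 1024 * 1024


class StripeLocation(NamedTuple):
    stripe: int         # stripe number within the user-visible file
    ost_index: int      # which of the file's OST objects holds the stripe
    object_offset: int  # byte offset inside that OST object


def locate(file_offset: int, stripe_size: int, stripe_count: int) -> StripeLocation:
    """Translate a file offset into (stripe, OST index, offset in OST object).

    Assumes simple round-robin striping over stripe_count OSTs.
    """
    stripe = file_offset // stripe_size
    ost_index = stripe % stripe_count
    # Each full round over all OSTs adds one stripe_size to every object.
    object_offset = (stripe // stripe_count) * stripe_size + file_offset % stripe_size
    return StripeLocation(stripe, ost_index, object_offset)


# The 6 MB file from this thread: 1 MB stripes over 2 OSTs.
for s in range(6):
    loc = locate(s * MiB, stripe_size=MiB, stripe_count=2)
    print(f"stripe {loc.stripe}: OST{loc.ost_index}, "
          f"object offset {loc.object_offset // MiB} MiB")
```

Under these assumptions, stripes 0, 2, and 4 land at offsets 0, 1 MiB, and
2 MiB of the OST0 object, so 0 and 2 are adjacent there and 0 and 4 are
separated by stripe 2. Whether that turns into a different I/O cost depends
on the backend filesystem and the disks under it, which Lustre itself does
not manage.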

Jaln

On Thu, Jun 13, 2013 at 5:23 PM, Christopher J. Morrone
<morrone2 at llnl.gov> wrote:

> In that case, it is the question part that I do not understand. :)  What
> is "stripe 0,4", and why could it be "closer" than "stripe 0,2"?  In your
> example, 0, 2, and 4 are all in the same place.
>
> If your file is striped over 2 OSTs, then essentially what happens behind
> the scenes is that there are two files, one on each OST.  But Lustre hides
> that from you, as a user.  Lustre basically does modulo operations to
> translate a file offset in the file that it presents to the user into
> the OST to use and the offset within that OST's file.
>
> Does that help at all?
>
> Chris
>
>
> On 06/13/2013 02:58 PM, Jaln wrote:
>
>> Oh, I mean there is one file, for example 6 MB, the stripe size is 1 MB,
>> and there are only 2 OSTs.
>> The file will then be divided into 6 stripes, denoted as stripes
>> 0,1,2,3,4,5.
>> The distribution over the 2 OSTs would be stripes 0,2,4 on OST0 and stripes
>> 1,3,5 on OST1.
>>
>> Jaln
>>
>>
>> On Thu, Jun 13, 2013 at 2:54 PM, Christopher J. Morrone
>> <morrone2 at llnl.gov> wrote:
>>
>>     I think you may be confused about what a stripe is in Lustre.  If
>>     there are only 2 OSTs, then you can only stripe a file across 2.
>>
>>     Or maybe I don't understand your terminology.  I don't know what you
>>     mean by "0,4" and "0,2".
>>
>>
>>     On 06/13/2013 02:38 PM, Jaln wrote:
>>
>>         If I have 6 stripes and 2 OSTs, using round-robin striping,
>>         stripes 0,2,4 will be on OST0,
>>         stripes 1,3,5 will be on OST1.
>>         Do you guys have any idea what the difference will be between
>>         accessing stripe 0,4 vs stripe 0,2?
>>         Stripe 0,2 seems to be closer than 0,4, or will Lustre do
>>         some intelligent work?
>>
>>         Jaln
>>
>>
>>         On Thu, Jun 13, 2013 at 10:22 AM, Christopher J. Morrone
>>         <morrone2 at llnl.gov> wrote:
>>
>>              On 06/13/2013 05:19 AM, E.S. Rosenberg wrote:
>>               > On Thu, Jun 13, 2013 at 3:09 AM, Christopher J. Morrone
>>               > <morrone2 at llnl.gov> wrote:
>>               >> Lustre does not manage the individual disks.  It sits on
>>               >> top of a filesystem, either ldiskfs (basically ext4) or
>>               >> zfs (as of Lustre 2.4).
>>               > Is ZFS the recommended fs, or just an option?
>>               > Doesn't ZFS suffer major performance drawbacks on linux
>>               > due to it living in userspace?
>>               > Thanks,
>>               > Eli
>>
>>              LLNL (Brian Behlendorf) ported ZFS natively to Linux.  We
>>              are not using the FUSE (userspace) version.  You can find
>>              it at:
>>
>>              http://zfsonlinux.org
>>
>>              ZFS is one of the two backend filesystem options for
>>              Lustre, as of Lustre 2.4.  2.4 is the first Lustre release
>>              that fully supports using ZFS.  Here at LLNL we are using
>>              it on our newest, and largest at 55PB, filesystem.
>>
>>              Chris
>>
>>              _______________________________________________
>>              Lustre-discuss mailing list
>>              Lustre-discuss at lists.lustre.org
>>              http://lists.lustre.org/mailman/listinfo/lustre-discuss
>


-- 

Genius only means hard-working all one's life