[Lustre-discuss] Optimal strategy for OST distribution

Wojciech Turek wjt27 at cam.ac.uk
Thu Mar 31 08:04:07 PDT 2011


I agree with Michael: keep it simple so it won't become unmanageable when
you grow your system to tens or hundreds of OSTs.
From Lustre's point of view it does not matter which OSS mounts which OST, as
long as the OSTs are evenly distributed across the OSSs.
Lustre objects are placed on the OSTs by a load-balancing algorithm that
is based on the OSTs' available space.
You can change that default behaviour using OST pools.
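
To illustrate why the OST-to-OSS mapping does not matter as long as it is
balanced, here is a minimal Python sketch. It is a toy model under stated
assumptions, not Lustre's actual allocator: every object has the same size,
and each new object simply goes to the OST with the most free space. All
function names are made up for this illustration. The two layouts correspond
to -a- and -b- in Frank's question below.

```python
from collections import Counter

def round_robin_layout(n_ost, n_oss):
    """Layout -a-: OST 1 -> OSS 1, OST 2 -> OSS 2, ..., wrapping around."""
    return {ost: (ost - 1) % n_oss + 1 for ost in range(1, n_ost + 1)}

def consecutive_layout(n_ost, n_oss):
    """Layout -b-: each OSS serves one consecutive block of OSTs.
    Assumes n_ost is an exact multiple of n_oss."""
    per_oss = n_ost // n_oss
    return {ost: (ost - 1) // per_oss + 1 for ost in range(1, n_ost + 1)}

def place_objects(free_space, n_objects, object_size=1):
    """Toy allocator (an assumption, not Lustre's real algorithm):
    put each object on the OST with the most free space,
    breaking ties in favour of the lowest OST index."""
    free = dict(free_space)
    placed = Counter()
    for _ in range(n_objects):
        ost = max(sorted(free), key=lambda o: free[o])
        free[ost] -= object_size
        placed[ost] += 1
    return placed

def objects_per_oss(placed, layout):
    """Sum the per-OST object counts over the OSS that serves each OST."""
    per_oss = Counter()
    for ost, count in placed.items():
        per_oss[layout[ost]] += count
    return dict(per_oss)

if __name__ == "__main__":
    free_space = {ost: 100 for ost in range(1, 10)}  # 9 equally sized OSTs
    placed = place_objects(free_space, 90)
    print("layout -a-:", objects_per_oss(placed, round_robin_layout(9, 3)))
    print("layout -b-:", objects_per_oss(placed, consecutive_layout(9, 3)))
```

In this model the allocator only ever looks at the OSTs, so the per-OSS load
is determined by how many OSTs each OSS serves, not by which ones.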

Cheers

Wojciech

On 31 March 2011 15:54, Michael Barnes <Michael.Barnes at jlab.org> wrote:

>
> Frank,
>
> File striping and allocation are essentially randomized across OSTs,
> so from Lustre's point of view there is no difference between
> a and b.  AFAIK, Lustre does try to do some balancing based on available
> space and possibly other simple heuristics, but the ordering of the OSTs
> does not affect this decision-making process.
>
> From a management point of view, b is much simpler, and if you later
> add more storage to your system, you just keep adding the
> OSTs in sequence.
>
> -mb
>
> On Mar 31, 2011, at 10:06 AM, Heckes, Frank wrote:
>
> > Hi all,
> >
> > sorry if this question has been answered before.
> >
> > What is the optimal 'strategy' for assigning OSTs to OSS nodes:
> >
> > -a- Assign OSTs to the OSSs round-robin
> > -b- Assign in consecutive order (as long as the backend storage provides
> >    enough capacity for IOPS and bandwidth)
> > -c- Something 'in-between' the 'extremes' of -a- and -b-
> >
> > E.g.:
> >
> > -a-     OSS_1           OSS_2           OSS_3
> >          |_              |_              |_
> >            OST_1           OST_2           OST_3
> >            OST_4           OST_5           OST_6
> >            OST_7           OST_8           OST_9
> >
> > -b-     OSS_1           OSS_2           OSS_3
> >          |_              |_              |_
> >            OST_1           OST_4           OST_7
> >            OST_2           OST_5           OST_8
> >            OST_3           OST_6           OST_9
> >
> > I thought -a- would be best for both task-local I/O (each task writes to
> > its own file) and single-file I/O (all tasks write to a single file),
> > since it's like the RAID-0 approach used for disk I/O (and Sun created our
> > first FS this way).
> > Has anyone made any systematic investigation of which approach is best,
> > or does anyone have an educated opinion?
> > Many thanks in advance.
> > BR
> >
> > -Frank Heckes
> >
> >
> > ------------------------------------------------------------------------------------------------
> > Forschungszentrum Juelich GmbH
> > 52425 Juelich
> > Registered office: Juelich
> > Registered in the commercial register of the Dueren district court, No. HR B 3498
> > Chairman of the Supervisory Board: MinDirig Dr. Karl Eugen Huthmacher
> > Board of Directors: Prof. Dr. Achim Bachem (Chairman),
> > Dr. Ulrich Krafft (Deputy Chairman), Prof. Dr.-Ing. Harald Bolt,
> > Prof. Dr. Sebastian M. Schmidt
> > ------------------------------------------------------------------------------------------------
> > Visit us on our new website at www.fz-juelich.de
> > _______________________________________________
> > Lustre-discuss mailing list
> > Lustre-discuss at lists.lustre.org
> > http://lists.lustre.org/mailman/listinfo/lustre-discuss
>
> --
> +-----------------------------------------------
> | Michael Barnes
> |
> | Thomas Jefferson National Accelerator Facility
> | Scientific Computing Group
> | 12000 Jefferson Ave.
> | Newport News, VA 23606
> | (757) 269-7634
> +-----------------------------------------------
>



-- 
Wojciech Turek

Senior System Architect

High Performance Computing Service
University of Cambridge
Email: wjt27 at cam.ac.uk
Tel: (+)44 1223 763517