[Lustre-discuss] 16T LUNs

David Simas dgs at slac.stanford.edu
Wed Feb 10 16:29:04 PST 2010


On Wed, Feb 10, 2010 at 02:41:55PM -0700, Andreas Dilger wrote:
> On 2010-02-10, at 07:39, Roger Spellman wrote:
> > Thank you.  Based on the kernel version string, we had assumed
> > that SLES was closer to the latest kernel.org release than RHEL.
> > That appears not to be the case.
> >
> > Just curious: why is the limit now 16T?  This works nicely for 2T
> > drives in an 8+2 RAID 6.  But is there a reason that the limit
> > couldn't be much higher, say 64T or 256T?
> 
> Two reasons for this:
> - primarily, the upstream e2fsprogs does not yet have full support
>   for >16TB filesystems, and while experimental patches exist,
>   bugs are still being found occasionally in that code
> - there is a certain amount of testing that we need to do before
>   we can say that Lustre supports that configuration

We are interested in testing OSTs larger than 16 TB.  I found a
public repository for 64-bit e2fsprogs at

	git://git.kernel.org/pub/scm/fs/ext2/val/e2fsprogs.git

Is this where I'd find the patches you refer to?  Would I apply
them against the 1.8.2 distribution's e2fsprogs sources?
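
In case it is useful to others, here is a minimal sketch of how I'd
expect to fetch and build that tree (the install prefix is just a
choice to keep it clear of the distribution's own e2fsprogs, and any
branch or tag to build from would need to be confirmed against the
repository itself):

    # Clone the experimental 64-bit e2fsprogs tree (URL from above).
    git clone git://git.kernel.org/pub/scm/fs/ext2/val/e2fsprogs.git
    cd e2fsprogs

    # Standard e2fsprogs autoconf build; install under /opt so the
    # system e2fsprogs is left untouched.
    ./configure --prefix=/opt/e2fsprogs-64bit
    make && make install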

David Simas
SLAC

> 
> That said, with 1.8.2 it is still possible to format the filesystem
> with the experimental 64-bit e2fsprogs, mount the OSTs with "-o
> force_over_16tb", and test this out yourselves.  Feedback is of
> course welcome.  I would suggest running "llverfs" on the mounted
> Lustre filesystem (or some other tool that can verify the data
> after it has been written) to completely fill an OST, then
> unmounting/remounting it to clear any cache, reading the data back,
> and ensuring that you get the correct data.  Running an "e2fsck -f"
> on the OST would also help verify the on-disk filesystem structure.
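
Concretely, that sequence might look something like the following
sketch (the device names, NIDs, and mount points here are made up,
and the llverfs mode flags should be double-checked against the
1.8.2 man page):

    # Format and mount an OST larger than 16TB (hypothetical device).
    mkfs.lustre --fsname=testfs --ost --mgsnode=10.0.0.1@tcp /dev/sdb
    mount -t lustre -o force_over_16tb /dev/sdb /mnt/ost0

    # On a client: fill the filesystem in write mode, remount to
    # drop any cached data, then verify in read mode.
    llverfs -l -w /mnt/testfs
    umount /mnt/testfs
    mount -t lustre 10.0.0.1@tcp:/testfs /mnt/testfs
    llverfs -l -r /mnt/testfs

    # Finally, verify the on-disk structure with the 64-bit e2fsck
    # on the unmounted OST device.
    umount /mnt/ost0
    /opt/e2fsprogs-64bit/sbin/e2fsck -f /dev/sdb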
> 
> At some point we will likely conduct this same testing and
> "officially" support this configuration, but it wasn't done for
> 1.8.2.  Beyond a certain size, the e2fsck overhead of a large
> ext4/ldiskfs filesystem becomes too high to support huge
> configurations (e.g. larger than, say, 128TB, if even that).  While
> ext4 and e2fsprogs have gotten a lot of improvements to speed up
> e2fsck time, there is a limit to what can be done with this.
> 
> >> -----Original Message-----
> >> From: Andreas.Dilger at sun.com [mailto:Andreas.Dilger at sun.com]
> >> On Behalf Of Andreas Dilger
> >> Sent: Tuesday, February 09, 2010 7:13 PM
> >> To: Roger Spellman
> >> Cc: lustre-discuss at lists.lustre.org
> >> Subject: Re: [Lustre-discuss] 16T LUNs
> >>
> >> On 2010-02-09, at 15:02, Roger Spellman wrote:
> >>> I see that 1.8.2 supports 16T OSTs for RHEL.
> >>>
> >>> Does anyone know when this will be supported for SLES?
> >>
> >> No, it will not, because SLES doesn't provide very up-to-date
> >> ext4 code, and a number of 16TB fixes went into ext4 late in the
> >> game.  RHEL5.4, on the other hand, has very up-to-date ext4 code,
> >> and the RHEL ext4 maintainer is himself one of the upstream ext4
> >> maintainers.
> >>
> >>> Is anyone currently using a 16T OST, who could share their
> >>> experiences?  Is it stable?
> >>
> >>
> >> I believe a few large customers are already testing/using this.  I'll
> >> let them speak for themselves.
> >>
> >> Cheers, Andreas
> >> --
> >> Andreas Dilger
> >> Sr. Staff Engineer, Lustre Group
> >> Sun Microsystems of Canada, Inc.
> >
> 
> 
> Cheers, Andreas
> --
> Andreas Dilger
> Sr. Staff Engineer, Lustre Group
> Sun Microsystems of Canada, Inc.
> 


