[Lustre-devel] storing SOM epoch in EA

Kalpak Shah Kalpak.Shah at Sun.COM
Tue Feb 19 07:11:31 PST 2008


On Tue, 2008-02-19 at 17:59 +0300, Mikhail Pershin wrote:
> On Tue, 19 Feb 2008 15:02:02 +0300, Yuriy Umanets <Yury.Umanets at Sun.COM>  
> wrote:
> 
> > Alex Zhuravlev wrote:
> >> Yuriy Umanets wrote:
> >>
> >>> An EA in a separate block is evil. It makes things slow.
> >>>
> >>
> >> we have had fast EAs (stored in the inode, which is why we make inodes
> >> large) for years.
> >>
> > Well, people used horses for ages but this did not stop them from
> > building cars :) Guys, I gave you an idea, no worse than using EAs. I will
> > not insist it is great. If you can't estimate its value yourself, well,
> > let it be. We have such a nice thing as IAM and you keep talking about
> > EAs...
> >
> > Seriously, IMHO what is bad about EAs:
> >
> > 1. You need to control their size, you need to bother;
> > 2. Large fast inodes make create/lookup slow, since the whole inode has
> > to be loaded into memory. I think this cost is comparable to the
> > additional seeks caused by IAM;
> 
> but this is still better than an extra block for EA or IAM. Btw, IAM data is
> also held in memory and possibly takes no less of it than the extra inode size
> 
> > 3. Storing epoch in EA makes you use this chain to access epoch:
> > fid->inode->epoch (in EA), IAM makes it shorter: fid->epoch (in IAM);
> 
> Not true, actually. The inode will be read anyway unless you are proposing to
> put the whole inode body in IAM, so there is no benefit. Moreover, inode->EA
> is a direct mapping, while fid->epoch needs an index lookup that may require
> reading several blocks if the IAM is large, and it will be large in this case,
> so the IO will be no better than even an EA in an extra block.
> 
> > 4. Large inodes consume more RAM;
> 
> this is the same as 2.
> 
> Guys, don't forget about DMU as well.
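
The block-read argument in the reply to point 3 can be sketched with a toy
cost model. This is only an illustration under stated assumptions: a cold
cache, one block read for the inode, a balanced index with a fixed fanout,
and hypothetical function names and numbers. It does not reflect the actual
IAM or ldiskfs code.

```python
def index_depth(entries: int, fanout: int) -> int:
    """Levels in a balanced index holding `entries` keys, `fanout` keys per block.

    Hypothetical model: every level costs one block read on a cold cache.
    """
    depth = 1
    capacity = fanout
    while capacity < entries:
        capacity *= fanout
        depth += 1
    return depth


def reads_epoch_in_inode_ea() -> int:
    # The EA lives inside the inode, so reading the inode is enough.
    return 1


def reads_epoch_in_external_ea() -> int:
    # One read for the inode plus one for the separate EA block.
    return 2


def reads_epoch_in_index(entries: int, fanout: int) -> int:
    # Assumes the inode is read anyway, plus one read per index level.
    return 1 + index_depth(entries, fanout)


if __name__ == "__main__":
    # Assumed example: one million objects, 100 keys per index block.
    print(reads_epoch_in_inode_ea())
    print(reads_epoch_in_external_ea())
    print(reads_epoch_in_index(10**6, 100))
```

Under these assumptions the index path never needs fewer reads than the
external EA block, and it needs more once the index grows beyond one level.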

For the DMU, we will be using 1024-byte dnodes by default to store the
striping information. So the epoch can be stored in the in-dnode system
attributes. The epoch will need to be stored in an external block or
FatZap (depending on the implementation of in-dnode EAs) only if the
file is striped across more than 10-15 OSTs. (The exact number of
stripes will again depend on the design of in-dnode EAs.)
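
A minimal sketch of how such a stripe limit could be estimated. Every number
here is an assumption for illustration: the space left in the dnode for
attributes, the layout header size, the per-stripe entry size and the epoch
size are hypothetical parameters, since the in-dnode EA design is not fixed.

```python
def max_stripes_in_dnode(attr_space: int,
                         layout_header: int,
                         per_stripe: int,
                         epoch_size: int = 8) -> int:
    """Largest stripe count whose layout, plus the epoch, fits in the dnode.

    attr_space    -- bytes available for in-dnode attributes (assumed)
    layout_header -- fixed bytes of the striping EA header (assumed)
    per_stripe    -- bytes per stripe entry (assumed)
    epoch_size    -- bytes needed for the SOM epoch (assumed)
    """
    usable = attr_space - layout_header - epoch_size
    if usable < 0:
        return 0
    return usable // per_stripe


if __name__ == "__main__":
    # Hypothetical sizes: 32-byte header, 24 bytes per stripe, 8-byte epoch.
    for attr_space in (300, 400):
        print(attr_space, max_stripes_in_dnode(attr_space, 32, 24))
```

With these assumed sizes, 300 to 400 bytes of attribute space would hold
roughly 10 to 15 stripes; a file with more stripes would spill to an
external block or FatZap.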

Thanks,
Kalpak.
