[Lustre-discuss] e2scan for cleaning scratch space

Daire Byrne Daire.Byrne at framestore.com
Thu Mar 19 02:29:19 PDT 2009


Thomas,

----- "LEIBOVICI Thomas" <thomas.leibovici at cea.fr> wrote:
> I'm very interested in this discussion, for Lustre-HSM purposes.
> The Lustre-HSM Policy Engine will mostly process ChangeLogs, but an
> initial scan may be needed when upgrading a non-empty Lustre file
> system to a Lustre-HSM system.
> Looking at those results, e2scan seems a very efficient way to
> retrieve metadata for all entries, so it could be used to provide an
> initial list to the PolicyEngine, as a flat file or DB.
> Does it provide common POSIX attributes and striping information?
> I also guess it will not provide file size until the 'Size On MDS'
> feature has landed.

e2scan provides owner, group, ctime and mtime but no atime (I'm sure that
would be a trivial addition). It does not read the extended attributes
(EAs), so there is no striping information. I would think the overhead of
reading the EAs would make e2scan far slower. At the moment it is quick
and simple.

> > The other big application for scanning the filesystem is "indexing"
> > (which we are always trying to improve). We also use e2scan for this
> > by dumping a sqlite DB and then only stat'ing the new/modified files.
> > Finally we update a mysql DB which users can quickly query through a
> > GUI. It is always an incremental scan update to avoid stat'ing
> > unchanged files. We all eagerly await changelogs.....
> >
> Don't you have performance issues with SQLite? It seemed to me that it
> was not very efficient for managing huge sets of data with millions of
> entries.

Admittedly it is not great, but it is good enough for our purposes. I'm
sure the code could be altered to write directly to MySQL over the network.
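
For illustration only, the incremental step (finding the new/modified
files that still need a stat) could be sketched roughly as below. This is
not our actual code: the table names scan_old and scan_new, their columns,
and the helper names are all hypothetical, and it assumes each scan has
already been loaded into SQLite with one row per path.

```python
import sqlite3

# Hypothetical schema: one table per scan, one row per file.
# Column names are assumptions, not e2scan's real output format.
SCHEMA = "(path TEXT PRIMARY KEY, uid INTEGER, gid INTEGER, ctime INTEGER, mtime INTEGER)"


def changed_paths(conn):
    """Return paths that are new, or whose ctime/mtime differ from the old scan."""
    cur = conn.execute(
        """
        SELECT n.path
        FROM scan_new AS n
        LEFT JOIN scan_old AS o ON o.path = n.path
        WHERE o.path IS NULL          -- not seen in the previous scan
           OR o.mtime <> n.mtime      -- contents modified
           OR o.ctime <> n.ctime      -- inode changed (owner, mode, ...)
        ORDER BY n.path
        """
    )
    return [row[0] for row in cur]


def make_demo_db():
    """Build a tiny in-memory database with made-up rows for demonstration."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE scan_old " + SCHEMA)
    conn.execute("CREATE TABLE scan_new " + SCHEMA)
    conn.executemany(
        "INSERT INTO scan_old VALUES (?, ?, ?, ?, ?)",
        [("/scratch/a", 1000, 100, 100, 100),
         ("/scratch/b", 1000, 100, 100, 100)],
    )
    conn.executemany(
        "INSERT INTO scan_new VALUES (?, ?, ?, ?, ?)",
        [("/scratch/a", 1000, 100, 100, 100),   # unchanged
         ("/scratch/b", 1000, 100, 100, 200),   # mtime changed
         ("/scratch/c", 1001, 100, 300, 300)],  # new file
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    for path in changed_paths(make_demo_db()):
        print(path)  # only these paths would need a fresh stat
```

Only the paths returned need to be stat'ed and pushed on to the MySQL
side; everything else is carried over from the previous run.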

Regards,

Daire


