[lustre-discuss] Lustre optimize for sparse data files ?

Tung-Han Hsieh thhsieh at twcp1.phys.ntu.edu.tw
Sat Sep 12 10:52:34 PDT 2020


Dear Andreas,

Sorry for the late reply. For the past few days we were caught in the
nightmare of upgrading a large Lustre system from version 1.8.8 directly
to version 2.10.7. Fortunately everything now seems to be done.

Our scenario for storing large sparse data files is simple. We have
standard C code for some computation. After the computation it produces
a large sparse matrix of around 3GB, with more than 90% of the entries
being zero. It simply uses fwrite() to write the whole matrix into a
binary file.
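
To make this concrete, here is a minimal sketch of the write pattern
(the matrix dimension and file name are made up for illustration):

	/* sketch: write a mostly-zero ~3GB matrix with one fwrite() */
	#include <stdio.h>
	#include <stdlib.h>

	#define N 20000UL	/* 20000 x 20000 doubles is about 3GB */

	int main(void)
	{
		double *m = calloc(N * N, sizeof(double));	/* all zeros */
		if (m == NULL)
			return 1;
		/* ... computation fills in fewer than 10% of the entries ... */
		FILE *fp = fopen("matrix.bin", "wb");
		if (fp == NULL)
			return 1;
		/* one dense write; every zero byte is written out */
		fwrite(m, sizeof(double), N * N, fp);
		fclose(fp);
		free(m);
		return 0;
	}

Since the buffer is fully allocated and written in one pass, the
application itself never creates holes in the file.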

We run this code on many different clusters. We expect every generated
data file to be 3GB in size. But occasionally we found that in one
cluster (which was built by another lab, not our group) the actual
occupied size of each file is less than 10% of that. We use

	du -s <filename>

to observe this phenomenon. But we do not know which backend file
system that cluster uses, because we only see that the storage is
mounted as nfs4.
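
The same check can be done from C with stat(), comparing the apparent
file size against the blocks actually allocated on disk, which is what
du reports. A small sketch, with a hypothetical file name:

	/* sketch: apparent size vs. space actually allocated on disk */
	#include <stdio.h>
	#include <sys/stat.h>

	int main(void)
	{
		struct stat st;
		if (stat("matrix.bin", &st) != 0)
			return 1;
		printf("apparent size: %lld bytes\n", (long long)st.st_size);
		/* st_blocks is counted in 512-byte units */
		printf("allocated:     %lld bytes\n", (long long)st.st_blocks * 512);
		return 0;
	}

When the allocated size is far below the apparent size, the backend
file system is storing the file sparsely or compressed.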

Note that our C code only uses fwrite() to write the whole large
sparse matrix; there is no trick played in our code. So the
"compression" of the large sparse file must be done by the backend
file system itself, and I am curious whether Lustre has such an
implementation or not.

It seems that Lustre with a ZFS backend can do this, because the file
compression is actually done by ZFS. Hence another question: if we
have a lot of OSTs, some with ZFS and some with ldiskfs, is it possible
to enable file compression on the ZFS backends only, without any side
effects on the whole Lustre file system?

Any comments are very welcome. Thanks in advance.

Best Regards,

T.H.Hsieh

On Wed, Sep 09, 2020 at 01:45:10PM -0600, Andreas Dilger wrote:
> On Sep 8, 2020, at 9:13 PM, Tung-Han Hsieh <thhsieh at twcp1.phys.ntu.edu.tw> wrote:
> > 
> > I would like to ask whether the Lustre file system has implemented
> > a function to optimize for large sparse data files?
> > 
> > For example, for a 3GB data file with more than 80% of its bytes
> > zero, can the Lustre file system optimize the storage so that it
> > does not actually take the whole 3GB of disk space?
> 
> Could you please explain your usage further?  Lustre definitely has
> support for sparse files - if they are written by an application with
> "seek" or by multiple threads in parallel, then only the blocks that
> are written will use space on the OST.
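
For illustration, a minimal sketch of such a seek-based sparse write,
using plain POSIX open()/lseek()/write() (the file name is
hypothetical):

	/* sketch: create a 3GB file that allocates almost no blocks */
	#include <fcntl.h>
	#include <unistd.h>

	int main(void)
	{
		int fd = open("sparse.bin", O_WRONLY | O_CREAT | O_TRUNC, 0644);
		if (fd < 0)
			return 1;
		/* jump 3GB ahead without writing the bytes in between
		 * (assumes a 64-bit off_t) */
		lseek(fd, (off_t)3 * 1024 * 1024 * 1024 - 1, SEEK_SET);
		/* apparent size: 3GB; allocated: roughly one block */
		write(fd, "", 1);
		close(fd);
		return 0;
	}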
> 
> For ldiskfs the block size is 4KB.  For ZFS the OST block size is up
> to 1MB, if the file size is 1MB or larger.  That is why compression
> on ZFS can help reduce space usage on the OST, because it can effectively
> compress the 1MB blocks that are nearly full of zeroes, if your sparse
> writes are smaller than the blocksize.
> 
> If you are *copying* a sparse file, that depends on the tool that is
> doing the copy.  For example, "cp --sparse=always" will generate a
> sparse file.  We are also working on adding SEEK_HOLE and SEEK_DATA,
> which will help tools to copy sparse files.
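
For illustration, a sketch of how a copy tool can walk the data extents
with SEEK_DATA/SEEK_HOLE on Linux (these flags need _GNU_SOURCE; the
file name is hypothetical):

	/* sketch: list the data extents of a sparse file, skipping holes */
	#define _GNU_SOURCE
	#include <stdio.h>
	#include <fcntl.h>
	#include <unistd.h>

	int main(void)
	{
		int fd = open("sparse.bin", O_RDONLY);
		if (fd < 0)
			return 1;
		off_t end = lseek(fd, 0, SEEK_END);
		off_t data = lseek(fd, 0, SEEK_DATA);
		while (data >= 0 && data < end) {
			off_t hole = lseek(fd, data, SEEK_HOLE);
			printf("data: %lld..%lld\n", (long long)data, (long long)hole);
			data = lseek(fd, hole, SEEK_DATA);
		}
		close(fd);
		return 0;
	}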
> 
> Cheers, Andreas
> 
> > I know that some file systems (e.g., ZFS) have this function. If
> > Lustre does not have it, is there a roadmap to implement it in the
> > future?
> > 
> > Thanks for your reply in advance.
> > 
> > Best Regards,
> > 
> > T.H.Hsieh
> > _______________________________________________
> > lustre-discuss mailing list
> > lustre-discuss at lists.lustre.org
> > http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org