[lustre-discuss] Removing large directory tree

Bob Ball ball at umich.edu
Fri Jul 10 18:16:49 PDT 2015


FWIW, last September, in the context of reducing memory usage on the 
MGS, Andreas Dilger had this to say:

"ls" is itself fairly inefficient at directory traversal, because all of
the GNU file utilities are bloated and do much more work than necessary
(e.g. "rm" will stat() every file before unlinking it).  Using something
like:

     find $dir -type f -print0 | xargs -0 munlink
     find $dir -type d -print0 | xargs -0 rmdir

will probably be most efficient.  "munlink" is a Lustre-specific tool that
just unlinks the filenames passed without stat() on each one.  If that is
not available, you can also use "xargs -0 -n 1 unlink", but it will fork a
new unlink process for every file, which also adds some overhead.
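
Two caveats worth spelling out, as a sketch rather than a recipe: rmdir
only removes empty directories, so the directory pass should walk the
tree deepest-first, and the munlink-less fallback mentioned above looks
like this ($dir again stands for the tree being removed):

     # fallback when munlink is not installed; -n 1 forks one unlink
     # process per file, so it is slower than munlink
     find $dir -type f -print0 | xargs -0 -n 1 unlink
     # -depth prints children before their parents, so every directory
     # is already empty by the time rmdir reaches it
     find $dir -depth -type d -print0 | xargs -0 rmdir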

bob

On 7/10/2015 7:24 PM, Sreedhar wrote:
> Hi,
>
> Just like Malcolm suggested, before I implemented Robinhood, I built 
> the list and removed files (that were not accessed in more than 31 
> days). I found it much faster than all other methods.
>
> /usr/bin/lfs find /scratch/${folder} -A +31 -p -t f | xargs -0 -n 10 -P 8 /bin/rm
>
> 10 and 8 for flags n and p worked well for me on our system.
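>
> If you also want to drop the now-empty directories, something along
> these lines should work (just a sketch: it assumes GNU sort for the -z
> flag, and reverse-sorts the NUL-separated list so that children are
> removed before their parents):
>
> /usr/bin/lfs find /scratch/${folder} -t d -p | sort -rz | xargs -0 rmdir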
>
> Sreedhar.
> New York University.
>
>
> On Friday, July 10, 2015, Cowe, Malcolm J <malcolm.j.cowe at intel.com> wrote:
>
>     There’s something in the rm command that makes recursive deletes
>     really expensive, although I don’t know why. I’ve found in the
>     past that even running a regular find ... -exec rm {} \; has been
>     quicker. Running lfs find to build the file list would presumably
>     be quicker still.
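>
>     (One mitigation: terminating -exec with + instead of \; batches
>     many names into each rm invocation, xargs-style, e.g.
>
>     find ./dir -type f -exec rm {} +
>
>     though rm will still stat() each file before unlinking it.)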
>
>     Malcolm.
>
>     From: lustre-discuss [mailto:lustre-discuss-bounces at lists.lustre.org] On Behalf Of Andrus, Brian Contractor
>     Sent: Saturday, July 11, 2015 8:05 AM
>     To: lustre-discuss at lists.lustre.org
>     Subject: [lustre-discuss] Removing large directory tree
>
>     All,
>
>     I understand that doing recursive file operations can be taxing on
>     Lustre.
>
>     So, I wonder if there is a preferred performance-minded way to
>     remove an entire directory tree that is several TB in size.
>
>     The standard rm -rf ./dir seems to spike the CPU usage on my OSSes;
>     it sits there for a long time and sometimes causes clients to be evicted.
>
>     Brian Andrus
>
>     ITACS/Research Computing
>
>     Naval Postgraduate School
>
>     Monterey, California
>
>     voice: 831-656-6238
>
>
>
>
>
> -- 
> Sreedhar Manchu
> HPC Systems Administrator
> eSystems & Research Services
> New York University, New York 10012
> http://unixoperator.blogspot.com
> https://wikis.nyu.edu/display/~sm4082
>
>
>


