[lustre-discuss] Why certain commands should not be used on Lustre file system

Cowe, Malcolm J malcolm.j.cowe at intel.com
Wed Feb 10 01:26:25 PST 2016


Recursive delete with rm -r is generally the slowest way to clear out a directory tree (irrespective of file system). I've run tests where even "find <path> -depth -delete" will complete more quickly than "rm -rf <path>". There's also an rsync hack that some people like, and there's a funky perl option:

perl -e 'for(<*>){((stat)[9]<(unlink))}'

which I dunno, seems like it is trying too hard. Found it on stackoverflow, I think, so I'm not sure I quite trust it.

Stu's "find ... | xargs ... rm -f" looks like a winner though.
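
For anyone who wants to try the find variant on a scratch tree before pointing it at real data, here is a minimal sketch. The temporary directory and file names are made up for illustration, and -mindepth and -delete are GNU/BSD find extensions rather than POSIX. The rsync approach is only described in a comment, and that description is my assumption about which hack is meant.

```shell
#!/bin/sh
# Build a small throwaway tree (names are hypothetical).
scratch=$(mktemp -d)
mkdir -p "$scratch/a/b" "$scratch/c"
touch "$scratch/a/f1" "$scratch/a/b/f2" "$scratch/c/f3"

# -depth makes find visit children before their parent directory, so each
# directory is already empty when it is deleted. -mindepth 1 keeps the
# top-level directory itself.
find "$scratch" -mindepth 1 -depth -delete

# Assumption: the "rsync hack" usually refers to syncing an empty directory
# over the target, e.g. rsync -a --delete empty_dir/ target_dir/ (not run here).

remaining=$(find "$scratch" -mindepth 1 | wc -l | tr -d ' ')
echo "entries left: $remaining"
rmdir "$scratch"
```

Timing this against rm -rf on a realistically large tree (with caches dropped between runs) is the only way to know which is faster on a given file system.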
 
Malcolm.

-----Original Message-----
From: lustre-discuss [mailto:lustre-discuss-bounces at lists.lustre.org] On Behalf Of Stu Midgley
Sent: Wednesday, February 10, 2016 6:50 PM
To: Prakrati.Agrawal at shell.com
Cc: lustrefs
Subject: Re: [lustre-discuss] Why certain commands should not be used on Lustre file system

We actually use

    find <dir> -type f -print0 | xargs -n 100 -P 32 -0 -- rm -f

which will parallelise the rm... which runs a fair bit faster.
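
A small self-contained sketch of the same pipeline on a scratch directory, in case it helps to see what it does and does not remove. The directory layout is made up, and -P 4 is used instead of 32 only to keep the demo light. Note that -type f means only regular files are removed, so the empty directories need a second pass.

```shell
#!/bin/sh
# Create 1000 empty files spread over 10 subdirectories (hypothetical layout).
scratch=$(mktemp -d)
for d in 0 1 2 3 4 5 6 7 8 9; do
    mkdir "$scratch/d$d"
    for i in $(seq 1 100); do : > "$scratch/d$d/f$i"; done
done

# Pass 1: NUL-separated names, at most 100 per rm, up to 4 rm processes at once.
find "$scratch" -type f -print0 | xargs -0 -n 100 -P 4 -- rm -f

# The files are gone, but the directories are still there.
files_left=$(find "$scratch" -type f | wc -l | tr -d ' ')
dirs_left=$(find "$scratch" -mindepth 1 -type d | wc -l | tr -d ' ')
echo "files left: $files_left, directories left: $dirs_left"

# Pass 2: remove the now-empty directories, deepest first (including $scratch).
find "$scratch" -depth -type d -delete
```

The -print0 / -0 pair keeps file names with spaces or newlines intact, and the right value for -P depends on how much load the MDS and the client can take.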


On Wed, Feb 10, 2016 at 3:33 PM,  <Prakrati.Agrawal at shell.com> wrote:
> Hi,
>
> Then rm -rf * should not be used on any kind of file system. Why do only the Lustre file system's best practices list this as a pointer?
>
> Thanks and Regards,
> Prakrati
>
> -----Original Message-----
> From: Dilger, Andreas [mailto:andreas.dilger at intel.com]
> Sent: Wednesday, February 10, 2016 11:22 AM
> To: Agrawal, Prakrati PTIN-PTT/ICOE; lustre-discuss at lists.lustre.org
> Subject: Re: [lustre-discuss] Why certain commands should not be used on Lustre file system
>
> On 2016/02/09, 21:16, "lustre-discuss on behalf of Prakrati.Agrawal at shell.com" <lustre-discuss-bounces at lists.lustre.org on behalf of Prakrati.Agrawal at shell.com> wrote:
>
> I read in the Lustre best practices that ls -U should be used instead of ls -l. I understand that ls -l makes the MDS contact all OSSes to get all information about all files and hence loads it. But what does ls -U do to avoid it?
>
>        -U     do not sort; list entries in directory order
>
> This is more important for very large directories, since "ls" will read all of the entries and stat them before printing anything.  That said, GNU ls will still read all of the entries before printing them, so for very large directories "find <directory> -ls" is a lot faster to start printing entries.
>
> Also, it is said that rm -rf * should not be used. Can someone please explain the reason for that?
>
> It is also said that lfs find <directory path> --type f -print0 | xargs -0 rm -f should be used instead. Please explain the reason for this as well.
>
> "rm -rf *" will expand "*" onto the command line (done by bash), and if the expanded list of file names is too large (more than about 8MB IIRC) then bash will fail to execute the command.  Running "lfs find" (or just plain "find") will only print the filenames onto the output, and xargs will process them in chunks that fit onto a command line.
>
> Cheers, Andreas
> --
> Andreas Dilger
> Lustre Principal Architect
> Intel High Performance Data Division
>
> _______________________________________________
> lustre-discuss mailing list
> lustre-discuss at lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
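
On the command-line length point above: the actual limit depends on the kernel and its settings, so rather than rely on a figure from memory it can be queried directly. A minimal sketch follows, in which seq and echo stand in for find and rm and the numbers are arbitrary.

```shell
#!/bin/sh
# Limit on the combined size of arguments plus environment for one exec call.
arg_max=$(getconf ARG_MAX)
echo "ARG_MAX on this system: $arg_max bytes"

# xargs splits a long input into several command lines that each fit.
# Here 5000 names are handed to echo at most 1000 at a time, so echo
# runs once per batch and prints one line per run.
batches=$(seq 1 5000 | xargs -n 1000 echo | wc -l | tr -d ' ')
echo "number of echo invocations: $batches"
```

Without -n, xargs picks the batch size itself so that each command line stays under the limit, which is why the find | xargs form keeps working no matter how many files are in the directory.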



-- 
Dr Stuart Midgley
sdm900 at sdm900.com