[lustre-discuss] find xdev?

Andreas Dilger adilger at whamcloud.com
Wed Sep 11 12:16:10 PDT 2019

On Sep 11, 2019, at 10:06, Michael Di Domenico <mdidomenico4 at gmail.com> wrote:

On Tue, Sep 10, 2019 at 5:48 PM Andreas Dilger <adilger at whamcloud.com> wrote:

I don't think "lfs find -xdev" has ever been a priority for Lustre, since it is rare for Lustre filesystems to be
mounted in a nested manner.  Since people already run multiple "lfs find" tasks in parallel on different
clients to get better performance, it isn't hard to run separate tasks from the top-level mountpoint of
different filesystems.  What is the use case for this?

doesn't xdev keep find from crossing mount points, not necessarily
only in a nested manner but also if there's a link to a directory in a
different filesystem.  i believe 'find' without -xdev will follow and
descend.  but this predicates that my understanding is sound (which it
probably isn't).  i generally add -xdev to my finds as a habit to keep
from scanning nfs volumes.

Yes, -xdev is to avoid crossing mountpoints, but as I wrote it is rare to have nested Lustre
mountpoints, so this wouldn't really be useful in most cases.  The find tree walking does
*not* follow symlinks into the target directory; it only crosses mountpoints.
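
To illustrate the -xdev behaviour described above, here is a minimal Python sketch. It is not
how find or "lfs find" is implemented; the helper name walk_xdev is hypothetical. It assumes
that a directory whose st_dev differs from the starting directory is a mountpoint, and like
find it lists such a directory but does not descend into it. Symlinks are never followed.

```python
import os

def walk_xdev(top):
    """Yield every path under top without leaving the filesystem that holds top."""
    top_dev = os.lstat(top).st_dev      # device ID of the starting filesystem
    stack = [top]
    while stack:
        current = stack.pop()
        try:
            with os.scandir(current) as it:
                entries = list(it)
        except OSError:
            continue                    # unreadable directory: skip it
        for entry in entries:
            yield entry.path
            # Symlinks to directories are reported above but never descended into.
            if not entry.is_dir(follow_symlinks=False):
                continue
            try:
                same_fs = os.lstat(entry.path).st_dev == top_dev
            except OSError:
                continue
            if same_fs:                 # a different st_dev means a mountpoint
                stack.append(entry.path)
```

Calling list(walk_xdev("/some/dir")) returns the paths in no particular order; the path
"/some/dir" is only a placeholder.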

along the same vein, can anyone state whether there's any actual
performance gain walking the filesystem using find vs lfs find?

For "find" vs. "lfs find" performance, this depends heavily on what the search parameters are.  If just
the filename, they will be the same.  If it includes some MDT-specific attributes (e.g. uid, gid) then
"lfs find" can be significantly faster (e.g. 3-5x).  If it uses file size, then they will be about the same
unless other MDT-only parameters are also given, at least until LSOM support lands (hopefully in 2.13).

okay, that's what i thought or recalled correctly from hearing
somewhere else.  in my particular instance i was just using 'find
-type f' and didn't see any appreciable difference in scanning speed
between the two

For Lustre, ext4, and most other filesystems, the file type is also stored in the directory entry, so that
"find" can determine the type without a "stat".  That is safe since the file type cannot be changed after
the file is created.
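
As a rough illustration of the point above, the Python sketch below counts regular files in
one directory using the type carried in each directory entry. The function name
count_regular_files is hypothetical. On Linux, os.scandir() exposes the d_type value
returned by readdir(), so is_file(follow_symlinks=False) normally needs no stat call;
Python falls back to a stat only when the filesystem reports the type as unknown.

```python
import os

def count_regular_files(directory):
    """Count regular files in one directory, relying on directory-entry types."""
    count = 0
    with os.scandir(directory) as entries:
        for entry in entries:
            # Answered from the directory entry's type field when it is available,
            # so symlinks and subdirectories are excluded without a per-file stat.
            if entry.is_file(follow_symlinks=False):
                count += 1
    return count
```

This is the same information that a "-type f" test needs, which is why there is nothing extra
for an MDS-only attribute fetch to save in that case.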

In a scan like "(lfs) find -name '*foo*' -type f" it only needs to read the directory entries (including the
file type) and process each entry.  There is nothing that "lfs find" can optimize.  With mode, uid, gid, and
*some* timestamp queries, "lfs find" can fetch only the MDS attributes and skip any OST RPCs for
that file if the (non)match can be decided without the OST attributes.

Once the Lazy Size-on-MDT (LSOM) integration is finished (https://review.whamcloud.com/35167) it
will be possible for "lfs find --lazy" to use *only* attributes from the MDS (size, timestamps) to speed
up scanning and avoid OST RPC overhead.

Cheers, Andreas
Andreas Dilger
Principal Lustre Architect
