[lustre-discuss] lfs_migrate question

Dilger, Andreas andreas.dilger at intel.com
Wed Feb 17 16:09:18 PST 2016

On 2016/02/17, 12:01, "lustre-discuss on behalf of Ms. Megan Larko" <lustre-discuss-bounces at lists.lustre.org on behalf of dobsonunit at gmail.com> wrote:

Greetings to One and All!

I am looking at the lfs_migrate command to move files among OSTs in a file system.
In "Lustre Software Release 2.x Operations Manual" Section 33.2.2 Description of lfs_migrate it indicates that "Because lfs_migrate is not closely integrated with the MDS, it cannot determine whether a file is currently open and/or in-use by other applications or nodes.  This makes it UNSAFE (capitalized in Manual) for use on files that might be modified by other applications, since the migrated file is only a copy of the current file.  This results in the old file becoming an open-unlinked file and any modifications to that file are lost."

This entry is somewhat out-of-date.  IIRC, with Lustre 2.5+ the migrate command with the "--block" option will prevent other threads from accessing the file during migration.  Without the "--block" option, if the file is modified before migration completes then the migration will be aborted.  Patches to update the manual would be welcome.

All of the lfs_migrate examples show the command being run on an active/mounted Lustre file system.  Is there any way in which one knows whether a rebalanced/migrated file was in-use at the time of migration (or that it was not in-use at the time of migration)?  On a mounted Lustre FS, is it necessary to make the file system or directories therein read-only for the migration activity?  Would this trait of lfs_migrate being unable to determine whether the file scheduled to be migrated is or is not in-use pose an issue if new OSTs are added to the file system and the lfs_migrate command is issued (rather than waiting for Lustre to re-balance the load over the new OSTs by attrition, as it were)?

If "lfs_migrate" reports "falling back to rsync-based migration" then the client/server do not support the atomic layout swap needed to handle migration of open files, and lfs_migrate is essentially just doing a copy+rename.  It will still check (unless you disable checksums) whether the file was modified during migration, but this will only detect active writers, and cannot handle open file handles (which may later write to the file).
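
To make the copy+rename fallback concrete, here is a minimal Python sketch of the general pattern.  It is not the actual lfs_migrate code: the function name copy_rename_migrate is hypothetical, and it compares mtime and size as a simple stand-in for the checksum comparison mentioned above.  Like the real fallback, it can only notice a change that happens while the copy is running; a process that still holds the old file open and writes later would go undetected.

```python
import os
import shutil
import tempfile


def copy_rename_migrate(path):
    """Copy `path` to a temp file in the same directory, then rename it over
    the original.  Returns True if the file was replaced, or False if the
    source appeared to change while it was being copied.

    Hypothetical helper for illustration only; not part of Lustre.
    """
    before = os.stat(path)
    directory = os.path.dirname(os.path.abspath(path))

    # The temp file lives in the same directory so the final rename stays
    # within one file system and is atomic.
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".migrate.")
    os.close(fd)
    try:
        shutil.copy2(path, tmp)

        # Assumption: mtime + size stand in for a real checksum comparison.
        after = os.stat(path)
        if (before.st_mtime_ns, before.st_size) != (after.st_mtime_ns, after.st_size):
            return False  # an active writer touched the file; abort

        os.replace(tmp, path)  # the old inode becomes open-unlinked if still open
        tmp = None
        return True
    finally:
        # Clean up the temp copy if the rename did not happen.
        if tmp is not None and os.path.exists(tmp):
            os.unlink(tmp)
```

The window between the second stat and the rename is still unprotected, which is the gap that the atomic layout swap closes.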

In most cases this is fine, especially if you limit file migration to older files that are no longer in use, unless you have a workload that opens and modifies existing files.
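
As a rough illustration of limiting migration to older files, the following Python sketch walks a directory tree and yields only regular files whose modification time is older than a given number of days.  The function name files_older_than and the 30-day threshold in the usage note are assumptions for the example, not anything Lustre provides.

```python
import os
import stat
import time


def files_older_than(root, days, now=None):
    """Yield regular files under `root` last modified more than `days` days ago.

    Hypothetical helper for illustration only; not part of Lustre.
    """
    now = time.time() if now is None else now
    cutoff = now - days * 86400  # seconds per day

    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                st = os.lstat(full)  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(st.st_mode) and st.st_mtime < cutoff:
                yield full
```

For example, files_older_than("/mnt/lustre/project", 30) would produce candidate paths that could then be handed to lfs_migrate (the mount point here is made up).  An old mtime only says the file has not been written recently; it does not prove that no process has it open.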

Cheers, Andreas
Andreas Dilger
Lustre Principal Architect
Intel High Performance Data Division
