[lustre-discuss] lfs_migrate rsync vs. lfs migrate and layout swap

Brian Andrus toomuchit at gmail.com
Sun Nov 19 11:36:45 PST 2017

I may be off, but I think the big gain with lfs_migrate is with live files.
With rsync, you may end up with a copy that differs from the original if
the file changed during the sync, whereas lfs_migrate takes those changes
into account, so there is no deviation.
Such a "belt and suspenders" approach will incur overhead, but it is safer.
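
To make the live-file concern concrete, here is a minimal Python sketch of a copy that notices when the source changed while it was being read. It only illustrates the race; it is not how lfs_migrate or "lfs migrate" work internally, and the helper name copy_if_stable is hypothetical. It compares size and modification time, which is a weaker check than a checksum or the Lustre data version.

```python
import os
import shutil


def copy_if_stable(src, dst):
    """Copy src to dst; return True only if src looks unchanged afterwards.

    Hypothetical helper for illustration. Size and mtime are sampled
    before and after the copy, so a write that lands mid-copy is
    detected as long as it changes the size or the mtime.
    """
    before = os.stat(src)
    shutil.copyfile(src, dst)
    after = os.stat(src)
    unchanged = (before.st_size == after.st_size
                 and before.st_mtime_ns == after.st_mtime_ns)
    if not unchanged:
        os.remove(dst)  # discard a copy that may be inconsistent
    return unchanged
```

A caller would retry or skip the file when this returns False, rather than replacing the original with a copy that may be stale.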

Brian Andrus

On 11/19/2017 10:31 AM, Dauchy, Nathan (ARC-TNC)[CSRA, LLC] wrote:
> Greetings,
> I'm trying to clarify and confirm the differences between lfs_migrate's use of rsync vs. "lfs migrate", with regard to performance, checksumming, and interrupts.  Relevant code changes that introduced the two methods are here:
> https://jira.hpdd.intel.com/browse/LU-2445
> https://review.whamcloud.com/#/c/5620/
> The quick testing I have done is with an 8GB file with a stripe count of 4, and included the patch to lfs_migrate from:
> https://review.whamcloud.com/#/c/20621/
> (and client cache was dropped between each test)
> $ time ./lfs_migrate -y bigfile
> real    1m13.643s
> $ time ./lfs_migrate -y -s bigfile
> real    1m13.194s
> $ time ./lfs_migrate -y -f bigfile
> real    0m31.791s
> $ time ./lfs_migrate -y -f -s bigfile
> real    0m28.020s
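
The timings above can be turned into approximate throughput figures with a short Python sketch. It assumes that "8GB" means 8 GiB (the message does not say which unit was meant), and the function names are illustrative only.

```python
import re

# Wall-clock times copied from the runs quoted above.
TIMINGS = {
    "lfs_migrate -y": "1m13.643s",
    "lfs_migrate -y -s": "1m13.194s",
    "lfs_migrate -y -f": "0m31.791s",
    "lfs_migrate -y -f -s": "0m28.020s",
}

FILE_BYTES = 8 * 1024**3  # assumption: "8GB" means 8 GiB


def parse_real(text):
    """Convert a bash `time` value such as '1m13.643s' to seconds."""
    m = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text)
    if m is None:
        raise ValueError(f"unrecognized time format: {text!r}")
    return int(m.group(1)) * 60 + float(m.group(2))


def throughput_mib_s(text, nbytes=FILE_BYTES):
    """Average throughput in MiB/s for a copy of nbytes taking `text`."""
    return nbytes / parse_real(text) / 1024**2


if __name__ == "__main__":
    for cmd, real in TIMINGS.items():
        print(f"{cmd:22s} {parse_real(real):8.3f} s "
              f"{throughput_mib_s(real):7.1f} MiB/s")
```

Under that assumption the runs with -f come out roughly 2.3 times faster than the runs without it, which is the gap the performance question below is about.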
> * Performance:  The migrate runs faster when forcing rsync (assuming multiple stripes).  There is also minimal performance benefit to skipping the checksum with the rsync method.  Interestingly, performance with "lfs migrate" as the backend is barely affected (and within the noise when I ran multiple tests) by the choice of checksumming or not.  So, my question is whether there is some serialization going on with the layout swap method which causes it to be slower?
> * Checksums:  In reading the migrate code in lfs.c, it is not obvious to me that there is any checksumming done at all for "lfs migrate".  That would explain why there is minimal performance difference.  How is data integrity ensured with this method?  Does the file data version somehow capture the checksum too?
> * Interrupts:  If the rsync method is interrupted (kill -9, or client reboot) then a ".tmp.XXXXXX" file is left.  This is reasonably easy to search for and clean up.  With the lfs migrate layout swap method, what happens to the "volatile file" and its objects?  Is an lfsck required in order to clean up the objects?
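
For the rsync case, the search for leftover ".tmp.XXXXXX" files mentioned above could look something like this Python sketch. The exact naming is an assumption here: it takes the leftovers to be named "<original>.tmp." followed by six alphanumeric characters, as a mktemp-style template would produce. The function name find_leftovers is hypothetical.

```python
import os
import re

# Assumption: leftovers are named "<original>.tmp.XXXXXX", where XXXXXX
# is six alphanumeric characters filled in by a mktemp-style template.
LEFTOVER = re.compile(r".+\.tmp\.[A-Za-z0-9]{6}")


def find_leftovers(root):
    """Yield paths under root whose names look like interrupted-copy temp files."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if LEFTOVER.fullmatch(name):
                yield os.path.join(dirpath, name)
```

The pattern can also match unrelated files that happen to end the same way, so the resulting list is best reviewed by hand before anything is deleted.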
> At this point, the "old" method seems preferable.  Are there other benefits to using the lfs migrate layout swap method that I'm missing?
> Thanks for any clarifications or other suggestions!
> -Nathan
