[lustre-discuss] lustre_rsync with growing statuslog

Robert Redl robert.redl at lmu.de
Mon May 23 04:54:52 PDT 2022


Dear Andreas,

thanks a lot for the explanation! Using lustre_rsync is only a temporary 
solution for us. We created a full copy of a Lustre system in a 
different location using ZFS snapshots. This system has since been 
renamed, and the idea was to keep it synchronized with the original 
system for a limited time. Afterwards, the new system should be used in 
production.

With your explanation and a look into the source code, I have now taken 
a closer look at the created statuslog. The entries point almost 
exclusively to temporary files: files that are created and then deleted 
or moved shortly afterwards. We now use the statuslog to identify 
directories that need synchronization using standard rsync. That works 
as long as the parent of the element to synchronize is still available; 
for other cases, a periodic traditional rsync will be necessary.

Cheers,
Robert

On 21.05.22 at 01:32, Andreas Dilger wrote:
> On May 20, 2022, at 06:33, Robert Redl <robert.redl at lmu.de> wrote:
>>
>> Dear Lustre Experts,
>>
>> for a few weeks now we have been keeping two Lustre systems 
>> synchronized using lustre_rsync. That works fine, but the statuslog 
>> file is growing. It is currently about 500MB in size. Updating it is 
>> apparently slowing down the whole process.
>>
>> Is it only important to keep the statuslog in cases where 
>> lustre_rsync has been interrupted? Or is it necessary to keep it 
>> forever in order not to miss any changes?
>
> It should be noted that lustre_rsync is not commonly used and only 
> tested in the context of an automated regression test that runs and 
> largely passes.  It was developed originally as a proof of concept for 
> Lustre Changelogs, so may be missing support for newer features (e.g. 
> explicit file layouts, project IDs, ACLs, etc, though that *may* all 
> be handled by rsync).  There may be unknown bugs lurking in this code, 
> so use with some caution (i.e. don't sync your bank transaction 
> records with it).
>
> I would recommend at least using some other tool (e.g. MPIFileUtils) 
> to periodically do a full scan to verify that the files are being 
> copied over properly to the target filesystem.
>
> Taking a quick look into the lustre_rsync.c, I see that "statuslog" 
> appears to be a log file of pending rename actions, or something like 
> that?  It is backed up when lustre_rsync is started, but only to a 
> file <statuslog>.old.  It looks like an entry is added into "parents" 
> if it is renamed to/from a directory that doesn't exist in the target, 
> but I don't know enough detail to say why that isn't working properly 
> in your case.
>
> Feedback, patches, and status updates are welcome.  Maybe you can 
> present about your usage of it at LAD this year?  I don't want to 
> totally discourage your usage of lustre_rsync since it has potential 
> for further improvements (e.g. parallel copying, bug fixing, etc), 
> otherwise it will never get better, but just wanted to make sure you 
> know the current state of this tool.
>
> Cheers, Andreas
> --
> Andreas Dilger
> Lustre Principal Architect
> Whamcloud