[lustre-discuss] lustre_rsync with growing statuslog
Robert Redl
robert.redl at lmu.de
Mon May 23 04:54:52 PDT 2022
Dear Andreas,
thanks a lot for the explanation! Using lustre_rsync is only a temporary
solution for us. We created a full copy of a Lustre system in a
different location using ZFS snapshots. This copy was then renamed,
and the idea was to keep it synchronized with the original system for
a limited time. Afterwards, the new system is to be used in
production.
With your explanation and a look into the source code, I have now taken
a closer look at the generated statuslog. The entries point almost
exclusively to temporary files, i.e. files that are created and then
deleted or moved shortly afterwards. We now use the statuslog to
identify directories that need to be synchronized with standard rsync.
That works as long as the parent of the element to synchronize still
exists; for the other cases, a periodic traditional rsync will be
necessary.
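
The approach described above could be sketched roughly as follows. This
is an illustration only: it assumes the affected paths have already been
extracted from the statuslog into a list of paths relative to the
filesystem root (the extraction itself is not shown, since the statuslog
format is not described here). The function names and mount points are
hypothetical. For each path, the sketch walks up to the nearest
directory that still exists on the source and builds one rsync call per
directory.

```python
from pathlib import Path


def nearest_existing_dir(path: Path, root: Path) -> Path:
    """Walk upwards from `path` until an existing directory is found.

    Stops at `root` at the latest, so a vanished temporary file (or a
    vanished parent) falls back to the closest surviving ancestor.
    """
    current = path
    while current != root and not current.is_dir():
        current = current.parent
    return current


def directories_to_sync(rel_paths, root):
    """Map statuslog-derived relative paths to directories worth re-syncing."""
    root = Path(root)
    found = {nearest_existing_dir(root / p.lstrip("/"), root) for p in rel_paths}
    # Drop directories that are already covered by an ancestor in the set.
    top_level = [d for d in found if not any(a in found for a in d.parents)]
    return sorted(top_level)


def rsync_commands(dirs, src_root, dst_root, delete=False):
    """Build (but do not run) one rsync command line per directory."""
    commands = []
    for d in dirs:
        target = Path(dst_root) / d.relative_to(src_root)
        cmd = ["rsync", "-a"]
        if delete:
            # Assumption: the target should mirror the source exactly.
            cmd.append("--delete")
        cmd += [f"{d}/", f"{target}/"]
        commands.append(cmd)
    return commands


if __name__ == "__main__":
    # Hypothetical mount points and example paths, for illustration only.
    src, dst = "/mnt/lustre_old", "/mnt/lustre_new"
    paths = ["project/run1/tmp.1234", "project/run2/removed_dir/out.nc"]
    for cmd in rsync_commands(directories_to_sync(paths, src), src, dst):
        print(" ".join(cmd))  # pass cmd to subprocess.run() to execute
```

The sketch only prints the commands; if even the filesystem root is the
nearest surviving ancestor, this degenerates into a full rsync of the
tree, which matches the periodic fallback mentioned above.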
Cheers,
Robert
On 21.05.22 at 01:32, Andreas Dilger wrote:
> On May 20, 2022, at 06:33, Robert Redl <robert.redl at lmu.de> wrote:
>>
>> Dear Lustre Experts,
>>
>> for a few weeks now, we have been keeping two Lustre systems
>> synchronized using lustre_rsync. That works fine, but the statuslog
>> file keeps growing. It is currently about 500MB in size. Updating it
>> is apparently slowing down the whole process.
>>
>> Is it only important to keep the statuslog in case lustre_rsync
>> has been interrupted? Or is it necessary to keep it forever in
>> order not to miss any changes?
>
> It should be noted that lustre_rsync is not commonly used and only
> tested in the context of an automated regression test that runs and
> largely passes. It was developed originally as a proof of concept for
> Lustre Changelogs, so may be missing support for newer features (e.g.
> explicit file layouts, project IDs, ACLs, etc, though that *may* all
> be handled by rsync). There may be unknown bugs lurking in this code,
> so use with some caution (i.e. don't sync your bank transaction
> records with it).
>
> I would recommend at least using some other tool (e.g. mpiFileUtils)
> to periodically do a full scan to verify that the files are being
> copied over properly to the target filesystem.
>
> Taking a quick look into the lustre_rsync.c, I see that "statuslog"
> appears to be a log file of pending rename actions, or something like
> that? It is backed up when lustre_rsync is started, but only to a
> file <statuslog>.old. It looks like an entry is added into "parents"
> if it is renamed to/from a directory that doesn't exist in the target,
> but I don't know enough detail to say why that isn't working properly
> in your case.
>
> Feedback, patches, and status updates are welcome. Maybe you can
> give a presentation on your usage of it at LAD this year? I don't
> want to totally discourage your usage of lustre_rsync, since it has
> potential for further improvements (e.g. parallel copying, bug fixing,
> etc.) and otherwise it will never get better, but I just wanted to
> make sure you know the current state of this tool.
>
> Cheers, Andreas
> --
> Andreas Dilger
> Lustre Principal Architect
> Whamcloud