[lustre-discuss] Data migration software

Stephane Thiell sthiell at stanford.edu
Wed Mar 22 15:48:36 PDT 2023


Hi Anna,

We’re about to deploy Lustre/HSM with Phobos for a new large research data archival system at Stanford (200PB).

https://github.com/phobos-storage

Phobos is open source and a Lustre copytool is available. Archiving policies can be set up via Robinhood like with other HSMs. Robinhood is also open source and supports project IDs if you take the patches from GerritHub (like this one: https://review.gerrithub.io/c/cea-hpc/robinhood/+/541104 but more are needed, I can give you the list if needed). Data restore concurrency should be well handled with Lustre/HSM.
A Lustre userspace coordinator named “coordinatool” is required for using Phobos in multi-server mode, but it is also freely available on GitHub. We plan to have a dedicated Lustre client for the coordinatool.
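
To make the archive/release/restore cycle a bit more concrete, here is a minimal Python sketch around the standard `lfs hsm_*` commands. It only builds the command lines and parses one line of `lfs hsm_state` output; it does not run anything. The helper names (`hsm_command`, `parse_hsm_state`, `needs_restore`) are made up for illustration, and the state line format is assumed from typical `lfs hsm_state` output, so check it against your Lustre version. In practice Robinhood would issue the archive and release requests for you, and reading a released file normally triggers the restore implicitly.

```python
import re

# Actions that map onto `lfs hsm_archive`, `lfs hsm_release`,
# `lfs hsm_restore` and `lfs hsm_state`.
_ACTIONS = {"archive", "release", "restore", "state"}

# Assumed shape of one line of `lfs hsm_state` output, e.g.:
#   /mnt/lustre/f: (0x0000000d) released exists archived, archive_id:1
_STATE_RE = re.compile(r"^(?P<path>.+): \((?P<mask>0x[0-9a-fA-F]+)\)(?P<rest>.*)$")


def hsm_command(action, path, archive_id=None):
    """Build (but do not run) the argv for an `lfs hsm_<action>` call."""
    if action not in _ACTIONS:
        raise ValueError(f"unsupported HSM action: {action!r}")
    argv = ["lfs", f"hsm_{action}"]
    # Only archive requests take a target archive backend ID.
    if action == "archive" and archive_id is not None:
        argv += ["--archive", str(archive_id)]
    argv.append(path)
    return argv


def parse_hsm_state(line):
    """Parse one `lfs hsm_state` output line into path, mask, flags, archive ID."""
    match = _STATE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized hsm_state line: {line!r}")
    flags_part, _, tail = match.group("rest").partition(",")
    id_match = re.search(r"archive_id:(\d+)", tail)
    return {
        "path": match.group("path"),
        "mask": int(match.group("mask"), 16),
        "flags": set(flags_part.split()),
        "archive_id": int(id_match.group(1)) if id_match else None,
    }


def needs_restore(state):
    """A released file has no data on Lustre until it is restored."""
    return "released" in state["flags"]


if __name__ == "__main__":
    sample = "/mnt/lustre/f: (0x0000000d) released exists archived, archive_id:1"
    state = parse_hsm_state(sample)
    if needs_restore(state):
        print(" ".join(hsm_command("restore", state["path"])))
```

The resulting argv lists can be passed to `subprocess.run` on a Lustre client if you want to drive this by hand instead of through a policy engine.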

Hope that helps.

Stéphane


> On Mar 22, 2023, at 7:47 AM, Anna Fuchs via lustre-discuss <lustre-discuss at lists.lustre.org> wrote:
> 
> Dear all,
> 
> if you have a large Lustre storage and a large tape archive and maybe even additionally some in-house cloud storage, which software do you use for more or less automatic data migration, that has good scaling?
> Ideally it somehow supports Lustre project quota and, more importantly, a synchronized catalog to find the data.
> E.g. if the data to be read is on tape, it somehow transparently moves it to the main faster storage (like Lustre) without the user explicitly knowing (at least not required to) where the data has been initially stored.
> If another user wants to access the same shared file, the software would know it is already "buffered" on Lustre and wouldn't read it again from tape.
> Or if the data has not been touched (really processed, not just touch) for a certain period of time, or the user runs out of Lustre quota but has free archive space, it would be automatically archived on tape, and so on.
> Ideally, the software should not cost a billion for a year license or even be open source :)
> 
> Thank you
> Anna Fuchs
> --
> Universität Hamburg
> https://wr.informatik.uni-hamburg.de/people/anna_fuchs
> 
> 
> 
> _______________________________________________
> lustre-discuss mailing list
> lustre-discuss at lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


