[lustre-discuss] [HPDD-discuss] Tiered storage
andreas.dilger at intel.com
Thu Jul 13 14:33:35 PDT 2017
> On Jul 7, 2017, at 16:06, Abe Asraoui <AbeA at supermicro.com> wrote:
> Hi All,
> Does anyone know of a configuration guide for Lustre tiered storage?
I don't think there is an existing guide for this, but it is definitely something we are looking into.
Currently, the best way to manage different storage tiers in Lustre is via OST pools. As of Lustre 2.9 it is possible to set a default OST pool for the whole filesystem (via "lfs setstripe" on the root directory), which is inherited by new files and directories created in any directory that does not already have its own default layout. Lustre 2.9 also fixed some issues with OST pools, so that the pool is inherited from the parent directory or the filesystem default even when other striping parameters are specified on the command line (e.g. set a pool on the parent directory, then use "lfs setstripe -c 3" to create a new file). Together, these make it much easier to manage different classes of storage within a single filesystem.
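As a rough sketch of the pool setup described above: the filesystem name "testfs", the mount point, the pool names "flash" and "disk", the OST index ranges, and the "scratch" directory are all made up for illustration, and the RUN dry-run wrapper is just a convenience of this example. The "lctl pool_*" commands would be run on the MGS and the "lfs" commands on a client. By default the script only prints the commands; run it with RUN set to an empty string to execute them.

```shell
#!/bin/sh
# Sketch only: FSNAME, MNT, the pool names and the OST ranges are hypothetical.
FSNAME=${FSNAME:-testfs}
MNT=${MNT:-/mnt/testfs}
# Dry run by default: every command is printed instead of executed.
# Start the script with RUN="" in the environment to really run them.
RUN=${RUN-echo}

# On the MGS: define one pool per storage class and add OSTs to each.
make_pools() {
    $RUN lctl pool_new "$FSNAME.flash"
    $RUN lctl pool_add "$FSNAME.flash" "OST[0-3]"
    $RUN lctl pool_new "$FSNAME.disk"
    $RUN lctl pool_add "$FSNAME.disk" "OST[4-7]"
}

# On a client: make "disk" the filesystem-wide default pool (Lustre 2.9+)
# and point one directory at the flash pool instead.
set_defaults() {
    $RUN lfs setstripe -p disk "$MNT"
    $RUN lfs setstripe -p flash "$MNT/scratch"
}

# Create a file giving only a stripe count, then show which pool it landed in;
# with 2.9 or later it should inherit the pool of its parent directory.
create_striped() {
    $RUN lfs setstripe -c 3 "$1"
    $RUN lfs getstripe -p "$1"
}

make_pools
set_defaults
create_striped "$MNT/scratch/output.dat"
```

Keeping the pool on the directory rather than on each "lfs setstripe" call means users only have to pick a directory to pick a storage class.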
Secondly, "lfs migrate" (and the helper script lfs_migrate) allows files to be moved between OSTs (relatively) transparently to the applications. The "lfs migrate" functionality (added in Lustre 2.5 I think) keeps the same inode while moving the data from one set of OSTs to another, and takes the same options as "lfs setstripe" to specify the new file layout. It is possible to migrate files that are open for read, but it is not currently possible to migrate files that are being modified: either the migration fails, or user access to the file can be blocked while it is being migrated.
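A minimal sketch of moving files between pools this way, assuming a pool named "flash" already exists. The mount point, the paths, the helper function names and the RUN dry-run wrapper are hypothetical, and whether blocking is the default behaviour of "lfs migrate" depends on the release, so check the man page of the installed version. By default the commands are only printed.

```shell
#!/bin/sh
# Sketch only: the mount point, pool name and file paths are hypothetical.
MNT=${MNT:-/mnt/testfs}
# Dry run by default: print each command; set RUN="" to execute it.
RUN=${RUN-echo}

# Move one file onto the given pool, keeping its inode. The layout options
# are the same ones "lfs setstripe" accepts (here: pool and stripe count).
migrate_to_pool() {
    m_pool=$1
    m_file=$2
    m_count=${3:-1}
    $RUN lfs migrate -p "$m_pool" -c "$m_count" "$m_file"
}

# Same move, but ask for application access to be blocked while the data is
# copied, instead of letting a concurrent write make the migration fail.
migrate_blocking() {
    $RUN lfs migrate --block -p "$1" "$2"
}

# Migrate every path read from stdin, reporting files that could not be
# moved (for example because they were being modified) and carrying on.
migrate_list() {
    l_pool=$1
    while IFS= read -r l_file; do
        migrate_to_pool "$l_pool" "$l_file" ||
            echo "not migrated: $l_file" >&2
    done
}

migrate_to_pool flash "$MNT/project/input.dat"
```

The list variant could be fed from something like "lfs find $MNT/project -type f --pool disk | migrate_list flash" to move a whole directory tree from one tier to the other.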
The File Level Redundancy (FLR) feature currently under development (targeted for 2.11) will improve tiered storage with Lustre by allowing a file to be mirrored on multiple sets of OSTs, rather than having to be migrated so that its only copy lives on a single set of OSTs. With FLR it would be possible to mirror input files into e.g. a flash-based OST pool before a job starts, and drop the flash mirror after the job has completed, without affecting the original files on the disk-based OSTs. It would also be possible to write new files onto the flash OST pool, mirror them to the disk OST pool after they have finished being written, and remove the flash mirror of the output files once the job is finished.
There is still work to be done to integrate this FLR functionality into job schedulers and application workflows, and/or have a policy engine that manages storage tiers directly, but depending on what aspects you are looking at, some of the functionality is already available.
Lustre Principal Architect