Before I forget - can someone point me to the HPSS API?

In SAM Shared QFS, we have a file system type called "mat" that is
essentially an object file system - basically no namespace. A handful
of system control files do have names in root. All other user files
are accessed via the SAMQFS FID - a 32-bit inode number and a 32-bit
generation number.

This file system is almost POSIX compliant - we cannot do a normal
create, because there is no pathname. However, all other calls can be
supported. Even so, an "ls *" type operation is troublesome because
the file system is flat and it attempts to display every single file
in the FS. Minor nit.

You can actually create files with normal pathnames in this file
system - but we should restrict this capability to only what is
required by SAMQFS.

Object files can be created using an ioctl. It returns an Object
SAMQFS identifier, which you can then use for normal file system
calls.

This file system can be NFS mounted. It can also be shared via SAMQFS.
You give it the ASCII representation of the SAMQFS file ID - name
lookup does the magic.

It can also be supported by SAM (archiver) - keep in mind that
policies based on pathnames are not supported. This feature can be
added, but I do not see why we would need it for archiving Lustre
files. Keep these kinds of policies on the Lustre side and not the
HSM side.

It is supported on both Linux and SPARC clients. Requires further
testing - as usual.

Object SAMQFS - HSM for Lustre
------------------------------

0. We are basically looking at the HSM as a repository, right?

1. Use Object SAMQFS.

2. Object SAMQFS metadata (inodes) is used as a database for files
   that are archived etc.

3. This database can be dumped and restored really quickly using the
   normal metadata backup of the HSM. The inodes are kept in 1 file.
   This is not a Lustre dump but rather a dump of Object SAMQFS. No
   file data dump is required. Files not archived yet are irrelevant.
   Incrementals can be obtained by comparing 2 full dumps and keeping
   just the diffs. The persistent Object SAMQFS file ID can be
   preserved if we restore a complete version of the dump. Otherwise,
   it can be different. We can then update Lustre with the new file ID
   for the given Lustre file ID. Consider this an error recovery path.

4. Object SAMQFS should have very simple policies - archive
   immediately, number of copies, when copies are to be made, etc.
   These can actually be passed by Lustre and executed by Object
   SAMQFS. The last thing we want is to have to configure 2 policy
   engines.

5. Lustre will store a 16-byte Object SAMQFS identifier: an 8-byte
   unique file system ID and an 8-byte Object SAMQFS file ID. A single
   Object SAMQFS can only support a 32-bit number of files. This will
   be less if we use inodes for extended attributes etc. The file
   system ID will allow us to create multiple Object SAMQFS "mat"
   file systems - so the number of files that can be supported is
   effectively unlimited.

6. No namespace. Lustre pathnames can be stored as extended
   attributes.

7. Files to be archived and staged in together (associative archiving)
   are to be given in a list by Lustre. Object SAMQFS will figure out
   a way to link these files together and put them in the same
   tarball - this is not for free.

8. Lustre is to provide sufficient metadata to create a standard tar
   file header, plus any other Lustre metadata needed to allow files
   to be restored into Lustre, both in a normal restore and in an
   Ultimate Disaster restore from just the set of tapes.

9. Standard tar format. This is a good policy.

10. In general, Object SAMQFS should not even know that it is the HSM
    for Lustre. It can serve any other file system that can make use
    of the APIs. It is a repository - hopefully with some very
    differentiating functionality compared to other repositories out
    in the market. I actually have no idea.

11. Lustre can query Object SAMQFS status on Object SAMQFS clients by
    using the standard extended ls command "sls".
    This is not supported via NFS. You will have to give it the ASCII
    representation of the Object SAMQFS identifier. Details will
    follow later.

12. Lustre can use the normal POSIX API (write/read etc.) to transfer
    file data using the given Object SAMQFS identifier. Details on how
    an identifier gets created during an archive event - which may
    take several steps - will be discussed later.

13. Users must not need to log into the Object SAMQFS node to see file
    state. Lustre must provide this capability on Lustre nodes.

Basic Object SAMQFS - HSM for Lustre Archive Events
---------------------------------------------------

Lustre calls with the following information:

1. Lustre FID
2. Lustre opaque metadata
3. Data required for the tar file, e.g. pathname
4. Lustre archiving policy for this file - must be simple.

Lustre gets back:

1. Object SAMQFS identifier.

Depending on asynchronous or synchronous archiving:

1. Lustre can query status with the given "Object SAMQFS Identifier".

Basic Object SAMQFS - HSM for Lustre Stage In Events (bring data back)
----------------------------------------------------------------------

1. Lustre just reads the file with the given "Object SAMQFS
   Identifier".

Basic Object SAMQFS - HSM for Lustre Status Events (check state)
----------------------------------------------------------------

1. Lustre performs the "sls" command on an Object SAMQFS client.

PS - We can have both user-level command and API capabilities.

Basic Object SAMQFS - HSM for Lustre Delete Event
-------------------------------------------------

1. Lustre can effectively do an "rm" on the Object SAMQFS identifier,
   or call an API.

Object SAMQFS Dump and Restore
------------------------------

Independent administrative event.

Lustre Dump and Restore
-----------------------

Can be an independent Lustre event. However, this does have an impact
on when we can actually delete a file from tape if a Lustre dump has a
reference to this file, e.g.:

1. Archive file.
2. Dump Lustre.
3. Delete file.

Now you want to restore the deleted file.
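
All of the events above pass around the 16-byte Object SAMQFS
identifier from item 5, so here is a rough sketch of how it could be
packed and turned into a name for lookup. This is only an illustration
under assumptions: the byte order, the field order, the helper names
(ObjectSamqfsId, pack_id, unpack_id, to_ascii, from_ascii) and
especially the "inode.generation" ASCII form are all made up for the
example. The real ASCII representation has not been specified yet.

```python
import struct
from typing import NamedTuple


class ObjectSamqfsId(NamedTuple):
    """Hypothetical view of the 16-byte identifier Lustre would store."""
    fs_id: int       # 8-byte unique file system ID (one per "mat" file system)
    inode: int       # 32-bit inode number
    generation: int  # 32-bit generation number


# Assumption: big-endian, fs_id first, then inode, then generation.
_LAYOUT = struct.Struct(">QII")


def pack_id(ident: ObjectSamqfsId) -> bytes:
    """Return the 16-byte form; struct.error is raised if a field overflows."""
    return _LAYOUT.pack(ident.fs_id, ident.inode, ident.generation)


def unpack_id(raw: bytes) -> ObjectSamqfsId:
    """Rebuild the identifier from its 16-byte form."""
    if len(raw) != _LAYOUT.size:
        raise ValueError(f"expected {_LAYOUT.size} bytes, got {len(raw)}")
    return ObjectSamqfsId(*_LAYOUT.unpack(raw))


def to_ascii(ident: ObjectSamqfsId) -> str:
    """Hypothetical ASCII name handed to name lookup on one file system."""
    return f"{ident.inode}.{ident.generation}"


def from_ascii(fs_id: int, name: str) -> ObjectSamqfsId:
    """Parse the hypothetical ASCII name back into an identifier."""
    inode, generation = name.split(".")
    return ObjectSamqfsId(fs_id, int(inode), int(generation))


if __name__ == "__main__":
    ident = ObjectSamqfsId(fs_id=1, inode=42, generation=7)
    print(len(pack_id(ident)))  # 16
```

Keeping the file system ID separate from the inode/generation pair is
what lets one Lustre file system spread its archives over several
"mat" file systems.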

Ultimate Disaster Recovery - Directly from Tapes
------------------------------------------------

Requires the tar file to be complete with Lustre metadata. Since this
is a recreation of both the Lustre FS and the Object SAMQFS "mat" FS,
I would be inclined to believe that, at a minimum, we will not require
the Object SAMQFS identifier to be persistent from the previous
incarnation. I am also inclined to believe that if you take regular
Object SAMQFS dumps, both full and incremental, and store them safely
on tape, you may not need this procedure .. but then, that's why we
call it Ultimate Recovery.

Syncing Object SAMQFS with Lustre
---------------------------------

The Lustre file identifier and the Object SAMQFS identifier can get
out of sync - shit happens. We need syncing capabilities.

Object SAMQFS - Freeing space on tapes
--------------------------------------

We will need a way to determine with Lustre - conclusively - that an
archive is no longer needed.

Keep in mind that some of the functionality is already there and
tested, some is there and barely tested, some is there and not tested,
while the rest needs to be developed.

How do we manage source control and match up the Object SAMQFS source
base with Lustre with respect to releases?

What kind of testing resources do we require?

Please comment.