[Lustre-devel] Global generic database

Thu Feb 14 19:32:16 PST 2008

Nathaniel Rutman wrote:
> Peter J Braam wrote:
>> Hmm ... here are my thoughts.
>>
>> 1. The word scalable is missing below.
> That is implicit in any Lustre design :)
>>
>> 2. Any database that relates to file system policies and file system 
>> objects (HSM?) should be a separate mechanism coupled to the file 
>> system, so that you can pick up the server disks and the policies.
> What I am trying to avoid is having multiple mechanisms, so as to 
> reduce the number of database implementations we have to write/maintain.
>>
>> 3. I think all updates to the database should be made on the server, 
>> and the use cases should be restricted (e.g. this is for relatively 
>> small databases).
> Maybe updates can only be made on the server, but the data needs to be 
> readable from anywhere.
>>
>> 4. IMHO pools belong in the configuration log.
> Pool definitions can easily be put in the configuration logs - but 
> pool policies can be complex ("all .mov files greater than 10GB go
> to pool 7") and malleable - configuration logs are not easily 
> accessible, not random access (config log records are arbitrary size, 
> so we must walk the file from the beginning to find a record).  If 
> they grow too big, performance will suffer.
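
To illustrate the random-access point Nathaniel makes above, here is a minimal sketch. It is not code from this thread or from Lustre: the 4-byte length prefix and all function names are assumptions chosen for illustration, not the actual llog record format. It shows why variable-size records force a walk from the start of the file, and how a separately kept offset table would turn the lookup into a single seek.

```python
import io
import struct

# Hypothetical record layout: 4-byte little-endian length, then the payload.
HEADER = struct.Struct("<I")


def write_records(buf, records):
    """Append variable-size records, each preceded by its length."""
    for payload in records:
        buf.write(HEADER.pack(len(payload)))
        buf.write(payload)


def read_by_scan(buf, index):
    """Return record number `index` by walking the log from the beginning."""
    buf.seek(0)
    for _ in range(index):
        header = buf.read(HEADER.size)
        if len(header) < HEADER.size:
            raise IndexError(index)
        (length,) = HEADER.unpack(header)
        buf.seek(length, io.SEEK_CUR)  # skip over the payload
    header = buf.read(HEADER.size)
    if len(header) < HEADER.size:
        raise IndexError(index)
    (length,) = HEADER.unpack(header)
    return buf.read(length)


def build_offset_index(buf):
    """Scan the log once and remember where each record starts."""
    offsets = []
    buf.seek(0)
    while True:
        pos = buf.tell()
        header = buf.read(HEADER.size)
        if len(header) < HEADER.size:
            break
        (length,) = HEADER.unpack(header)
        offsets.append(pos)
        buf.seek(length, io.SEEK_CUR)
    return offsets


def read_by_offset(buf, offsets, index):
    """Return record number `index` with one seek, using the offset table."""
    buf.seek(offsets[index])
    (length,) = HEADER.unpack(buf.read(HEADER.size))
    return buf.read(length)


log = io.BytesIO()
write_records(log, [b"pool 1: OSTs 1-10", b"x" * 300, b"pool 7: *.mov > 10GB"])
offsets = build_offset_index(log)
```

With the offset table, the cost of a lookup no longer grows with the record's position in the log, but the table itself is one more structure that has to be kept consistent with the log.
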
>> 5. Fileset attributes belong with the file system (see 2) - either 
>> these are implemented as special directory files and/or EAs (does 
>> the design specify the purpose and items that need to be stored in 
>> databases?).
> Fileset membership is stored with the filesystem (EAs), but fileset 
> policies may again be larger, complex entities that should probably be 
> stored once in a central database, and looked up as needed.  For the 
> 10,000 fileset case, clearly we don't want to read in 10,000 fileset 
> policies from the config log at startup; they should be loaded 
> on-demand as needed.

They need to be in the filesystem, not on the management server.

- Peter -

>>
>> Hmm, so can we revisit why we need a new database mechanism?
>>
>> - Peter -
>>
>>
>>
>> Nathaniel Rutman wrote:
>>> The design of various new features in Lustre calls for global 
>>> (filesystem-wide) databases, accessible from
>>> clients or other servers:
>>> A. pools - pool descriptions (pool #1 = OSTs 1-10,30-60), pool 
>>> policies (all .jpg files to pool #1)
>>> B. filesets - fileset policies (log creates on fileset #1 to feed 
>>> "foo")
>>> C. HSM - (Aurelien - what was the use case here?)
> Space manager policies
>>>
>>> We've already implemented at least 2 of these:
>>> D. Fid Location Database - (is this done?)
>>> E. configuration parameters - stored in MGS llogs
>>>
>>> Rather than continue with one-off implementations, I think it's time 
>>> we came up with a consistent,
>>> global, generic database mechanism for A-C as well as other future 
>>> uses.
>>> Needs to be:
>>> 1. Fast. We need to cache database entries locally, which also means 
>>> having them under locks.
>>>     a. local caching
>>>     b. locks
>>> 2. Generic.  Store any kind of data, not limited to 8k page 
>>> boundaries, etc.
>>> 3. Transactional.  Power loss doesn't lead to inconsistent state.
>>> 4. Recoverable. Client changes are replayed if need be.
>>> 5. Remotely accessible, from a client or other servers.
>
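
As a rough illustration of two ideas discussed in the thread (entries loaded on demand and cached locally, with updates made only on the server), here is a minimal in-process sketch. It is not a Lustre design or API: `PolicyServer`, `ClientCache`, and the invalidation callback are hypothetical names, and the callback only stands in for the revocation of a real distributed lock. The transactional and recovery requirements are not modelled at all.

```python
class PolicyServer:
    """Holds the authoritative copy; all updates go through here."""

    def __init__(self):
        self._data = {}
        self._holders = {}  # key -> set of caches currently holding the entry
        self.reads = 0      # counts round trips, for demonstration only

    def read(self, key, cache):
        # Remember which cache holds the entry so it can be invalidated later.
        self.reads += 1
        self._holders.setdefault(key, set()).add(cache)
        return self._data[key]

    def update(self, key, value):
        # Change the entry, then revoke every cached copy of it.
        self._data[key] = value
        for cache in self._holders.pop(key, set()):
            cache.invalidate(key)


class ClientCache:
    """Loads entries on first use and keeps them until invalidated."""

    def __init__(self, server):
        self._server = server
        self._entries = {}

    def get(self, key):
        if key not in self._entries:
            self._entries[key] = self._server.read(key, self)
        return self._entries[key]

    def invalidate(self, key):
        self._entries.pop(key, None)


server = PolicyServer()
server.update("fileset:1", 'log creates to feed "foo"')
client = ClientCache(server)
policy = client.get("fileset:1")  # first access goes to the server
policy = client.get("fileset:1")  # second access is served from the cache
```

In this sketch a client with 10,000 filesets defined on the server pays only for the policies it actually looks up, and a server-side update costs one invalidation per cache that holds the entry.
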