[Lustre-discuss] Metadata storage in test script files

Roman Grigoryev Roman_Grigoryev at xyratex.com
Tue May 1 20:23:20 PDT 2012

Hi Chris,

On 05/01/2012 08:17 PM, Chris wrote:
> On 30/04/2012 19:15, Roman Grigoryev wrote:
>> Hi Chris,
>> I'm glad to read further emails in this direction.
>> Please don't take this as criticism; I would just like more clarity:
>> what is the goal of adding this metadata? Do you have plans to use the
>> metadata in other scripts? How? Does this metadata go into the
>> results?
>> Also please see more my comments inline:
> The metadata can be used in a multitude of ways, for example we can
> create dynamic test sets based on
> the changes made or target area of testing. What we are doing here is
> creating an understanding of the
> tests that we have so that we can improve our processes and testing
> capabilities in the future.

I think that when we define a tool we should state its purpose. For
example, a good description and summary are not needed for creating
dynamic test sets. I think it is very important to say how we will use
this metadata. I continue this idea below.

> The metadata does not go to the results. The metadata is a database in
> its own right, and should metadata about a test be required it would
> be accessed from the source (database) itself.

I think fields like title, summary, and possibly description should be
present in the results too. They can be very helpful for quickly
understanding test results.

>> On 04/30/2012 08:50 PM, Chris wrote:
> ... snip ...
>>> Prerequisites:    Pre-requisite tests that must be run before this test
>>> can be run. This is again an array which presumes a test may have
>>> multiple pre-requisites, but the data should not contain a chain of
>>> prerequisites, i.e. if A requires B and B requires C, the pre-requisites
>>> of A is B not B&  C.
>> At which step do you want to check chains? And what is the logical
>> basis for these prerequisites, excluding the case where current tests
>> have hidden dependencies?
>>   I don't see any difference between one test whose body is composed
>> of tests a, b, c and this prerequisites definition.
>> Could you please explain further why we need this field?
> As I said we can mine this data any time and in any way that we want,
> and the purpose of this discussion is the data, not how we use it. But
> as an example, something that dynamically builds test sets would need
> to know prerequisites.
> The suffix of a,b,c could be used to generate prerequisite
> information, but it is firstly inflexible, for example I bet 'b', 'c'
> and 'd' are often dependent on 'a' but not on each other; secondly,
> and more importantly, we want a standard form for storing metadata
> because we want to introduce order and knowledge into the test
> scripts that we have today.

Here is why I asked about usage: if we want to use this information in
scripts and in other automated ways, we must strictly specify the logic
of the items and provide a tool to check it.

For example, suppose we use it when building the test execution queue,
and we have a chain like this: test C has prerequisite B, test B has
prerequisite A, and test A has no prerequisites. One fine day test A
becomes excluded. Is it still possible to execute test C?
If we do not use this field in scripting, however, there is no big
logical problem.

(My opinion: I don't like this situation and think that test
dependencies should be used only in very specific and rare cases.)
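To make the chain problem above concrete, here is a minimal sketch in
Python. The function name, the exclusion policy (an excluded
prerequisite blocks the whole chain), and the test names are my own
assumptions for illustration, not part of any existing Lustre framework:

```python
# Hypothetical sketch: test C requires B, B requires A.
# If A is excluded, can C still run? Here we assume it cannot.

def runnable(test, prereqs, excluded, seen=None):
    """Return True if `test` and its whole prerequisite chain can run."""
    if seen is None:
        seen = set()
    if test in excluded:
        return False
    if test in seen:  # guard against prerequisite cycles
        return False
    seen.add(test)
    return all(runnable(p, prereqs, excluded, seen)
               for p in prereqs.get(test, []))

prereqs = {"C": ["B"], "B": ["A"], "A": []}

print(runnable("C", prereqs, excluded=set()))   # True
print(runnable("C", prereqs, excluded={"A"}))   # False: the chain is broken
```

A checking tool along these lines could also reject the hidden problem
of cyclic prerequisites before any test set is built.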

>>> TicketIDs:             This is an array of ticket numbers that this test
>>> explicitly tests. In theory we should aim for the state where every
>>> ticket has a test associated with it, and in future we should be able to
>>> carry out a gap analysis.
>> I suggest adding keywords (Components could be treated as keywords
>> too) and a test type (stress, benchmark, load, functional, negative,
>> etc.) for quick filtering. For example, SLOW could be transformed
>> into a keyword.
> This seems like a reasonable idea, although we need a name that
> describes what it is; we will need to define the set of possible
> words, as we need to with the Components elements.

I mean that 'keywords' should be separate from Components but could
logically include them. I think 'Components' is a special type of
keyword.
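The "Components as a special type of keyword" idea can be sketched as a
small filter. All test names and field values below are invented for
the example; the only point is that both fields can drive the same
quick-filtering logic:

```python
# Illustrative only: Components act as implicit keywords when filtering.

tests = [
    {"name": "sanity-1", "components": {"mdt"}, "keywords": {"functional"}},
    {"name": "stress-7", "components": {"ost"}, "keywords": {"stress", "SLOW"}},
]

def matches(test, wanted):
    # A test matches if any wanted tag appears among its keywords
    # or its components.
    return bool((test["keywords"] | test["components"]) & wanted)

slow = [t["name"] for t in tests if matches(t, {"SLOW"})]
print(slow)  # ['stress-7']
```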

> What should this field be called - we should not reduce the value of
> this data by genericizing it into 'keywords'.
>> Also, I would like to mention that we have 3 different logical types
>> of data:
>> 1) purely human-readable descriptions
>> 2) filtering and targeting fields (Components, and keywords if you
>> agree with my suggestion)
>> 3) framework directives (Prerequisites)
>>> As time goes on we may well expand this compulsory list, but this is I
>>> believe a sensible starting place.
>>> Being part of the source this data will be subject to the same review
>>> process as any other change and so we cannot store dynamic data here,
>>> such as pass rates etc.
>> What do you think, maybe it is a good idea to keep the metadata
>> separately? This could simplify changing the data via script for mass
>> modification, as well as adding tickets, pass rates, and execution
>> times on 'gold' configurations.
> It would be easier to store the data separately and we could use
> Maloo, but it's very important that this data becomes part of the
> Lustre 'source' so that everybody can benefit from it. Adding tickets
> is not a problem, as part of resolving an issue is to ensure that at
> least one test exercises the problem and proves it has been fixed; the
> fact that this assurance process requires active interaction by an
> engineer with the scripts is a positive.
> As for pass rate, execution time and gold configurations, this
> information is just not one-dimensional enough to store in the source.

It was not by accident that I spoke about groups of fields in my
previous letter. All metadata can be separated into rarely changed and
often changed fields. For example, Summary will not change very often.
But the test timeout in the golden configuration (I mean that this
timeout would be set as the default based on the 'gold' configuration
and could be overridden in a specific configuration) could be more
variable (and possibly more important for testing).
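The "gold default, overridden per configuration" idea amounts to a
simple merge, sketched below. The field names and values are invented
for illustration; nothing here reflects an actual Lustre metadata
schema:

```python
# Sketch: gold-configuration values act as defaults; a specific
# configuration overrides only the fields it cares about.

gold = {"timeout": 300, "summary": "mount/unmount sanity checks"}
site_specific = {"timeout": 900}  # e.g. slower hardware needs more time

effective = {**gold, **site_specific}  # specific values win over gold defaults
print(effective["timeout"])  # 900
print(effective["summary"])  # unchanged: inherited from the gold configuration
```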

Using separate files provides more flexibility, and nothing stops us
from committing them to the Lustre repo so that they become part of the
Lustre 'source'. In separate files we can use whatever format we want,
and all the information will be available without parsing the shell
script and without running it. Moreover, in the future, it gives us a
very simple migration path away from shell.

A few words on how we have done this task in our wrapper test framework
(see the attached sample YAML):

The file contains a set of tags. The main entity is a test; in this
sample, each <id> element in the <Tests> array defines the logical
entity 'test'. Every test inherits values from the common description
(the fields described outside the <Tests> array). A test can override
any field or add new fields.

<groupname>, <executor>, <description>, <reference>, <roles>, <tags>
are common fields. All others are executor-specific and are used by the
executors.
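Since the attachment itself is scrubbed, here is a rough Python model
of the inheritance rule described above. The field names mirror the
ones listed in the mail, but the values and the extra `timeout` field
are invented; the real file is YAML, and a loader would apply the same
merge after parsing:

```python
# Assumed model: every test inherits the common (top-level) fields and
# may override them or add executor-specific ones.

common = {
    "groupname": "conf-sanity",
    "executor": "shell",
    "roles": ["client"],
}

tests = [
    {"id": "1"},                              # inherits everything
    {"id": "5", "roles": ["client", "mds"]},  # overrides 'roles'
    {"id": "9", "timeout": 600},              # adds an executor-specific field
]

resolved = [{**common, **t} for t in tests]
print(resolved[1]["roles"])     # ['client', 'mds'] -- overridden
print(resolved[2]["executor"])  # 'shell' -- inherited
```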

-------------- next part --------------
A non-text attachment was scrubbed...
Name: conf-sanity_tests.yaml
Type: application/x-yaml
Size: 1546 bytes
Desc: not available
URL: <http://lists.lustre.org/pipermail/lustre-discuss-lustre.org/attachments/20120502/ad6182fe/attachment.bin>
