[Lustre-discuss] Metadata storage in test script files

Tue May 1 21:14:56 PDT 2012

On 2012-05-01, at 9:23 PM, Roman Grigoryev wrote:
> On 05/01/2012 08:17 PM, Chris wrote:
>> The metadata can be used in a multitude of ways, for example we can
>> create dynamic test sets based on
>> the changes made or target area of testing. What we are doing here is
>> creating an understanding of the
>> tests that we have so that we can improve our processes and testing
>> capabilities in the future.
> 
> I think that when we are defining a tool we should state its purpose.
> For example, a good description and summary are not needed for creating
> dynamic test sets. I think it is very important to say how we will use
> it. This idea is continued below.
> 
>> The metadata does not go to the results. The metadata is a database in
>> its own right, and should metadata about a test be required it would
>> be accessed from the source (database) itself.
> 
> I think fields like title, summary, and possibly description should be
> present in results too. It can be very helpful for quickly understanding
> test results.

I think what Chris was suggesting is the opposite of what you state
here.  He was writing that the "test metadata" under discussion here is
the static description of the test to be stored with the test itself.
Chris is specifically excluding any runtime data from being stored with
the test, not (as you suggest) excluding the display of this description
in the test results.

>>> On 04/30/2012 08:50 PM, Chris wrote:
>>>> Prerequisites:    Prerequisite tests that must be run before this
>>>> test can be run. This is again an array, which presumes a test may
>>>> have multiple prerequisites, but the data should not contain a
>>>> chain of prerequisites, i.e. if A requires B and B requires C, the
>>>> prerequisite of A is B, not B & C.
>>> At which step do you want to check chains? And what is the logical
>>> basis for these prerequisites, other than the case where current
>>> tests have hidden dependencies?
>>> I don't see any difference between one test whose body is made up of
>>> tests a, b, c and this prerequisites definition.
>>> Could you please explain more why we need this field?
>> As I said we can mine this data any time and any way that we want, and
>> the purpose of this discussion is the data not how we use it. But as
>> an example something that dynamically built
>> test sets would need to know prerequisites.
>> 
>> The suffix of a,b,c could be used to generate prerequisite information
>> but it is firstly inflexible, for example I bet 'b','c' and 'd' are
>> often dependent on 'a' but not each other, secondly and more
>> importantly we want a standard form for storing metadata because we
>> want to introduce order and knowledge into the test
>> scripts that we have today.
> 
> Why I asked about usage: if we want to use this information in
> scripts and in other automated ways, we must strictly specify the logic
> of the items and provide a tool to check it.

I think it is sufficient to have a well-structured repository of test
metadata, and then multiple uses can be found for this data.  Even for
human use, a good description of what the test is supposed to check,
and why this test exists would be a good start.

The test metadata format is extensible, so should we need more fields
in the future it will be possible to add them.  I think the hardest
work will be to get good text descriptions of the tests, not mechanical
issues like dependencies and such.

> For example, we will use it when building the test execution queue. We
> have a chain like this: test C has prerequisite B, test B has
> prerequisite A. Test A has no prerequisite. One fine day test A becomes
> excluded. Is it possible to execute test C?
> But if we will not use it in scripting there is no big logical problem.
> 
> (My opinion: I don't like this situation and think that test
> dependencies should be used only in very specific and rare cases.)
> 
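The excluded-prerequisite question can be made concrete with a small
sketch.  This is only an illustration under assumptions: the table and
the function names (PREREQS, full_chain, runnable) are hypothetical and
not part of any existing Lustre test framework.  Each test stores only
its direct prerequisites, as proposed, and the full chain is derived
when it is needed.

```python
# Hypothetical direct-prerequisite table: each test lists only its
# immediate prerequisites, as in the proposal (C needs B, B needs A).
PREREQS = {"A": [], "B": ["A"], "C": ["B"]}


def full_chain(test, prereqs):
    """Return every test that must run before `test`, in run order."""
    order, seen = [], set()

    def visit(name, path):
        if name in path:
            raise ValueError("prerequisite cycle at " + name)
        if name in seen:
            return
        for dep in prereqs.get(name, []):
            visit(dep, path | {name})
        seen.add(name)
        order.append(name)

    visit(test, frozenset())
    return order[:-1]  # drop `test` itself


def runnable(test, prereqs, excluded):
    """A test can run only if neither it nor its chain is excluded."""
    chain = set(full_chain(test, prereqs))
    return test not in excluded and not (chain & excluded)
```

Under this particular policy, runnable("C", PREREQS, {"A"}) is False:
excluding A silently removes C as well, so a tool using the data would
need to report that rather than just skip the test.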
>> 
>>>> TicketIDs:             This is an array of ticket numbers that this test
>>>> explicitly tests. In theory we should aim for the state where
>>>> every ticket has a test associated with it, and in future we
>>>> should be able to carry out a gap analysis.
>>>> 
>>> I suggest adding keywords (Components could be expressed as keywords
>>> too) and a test type (stress, benchmark, load, functional, negative,
>>> etc.) for quick filtering. For example, SLOW could become a
>>> keyword.
>> This seems like a reasonable idea, although we need a name that
>> describes what it is. We will need to define the set of possible
>> words, as we need to with the Components elements.
> 
> I mean that 'keywords' should be separate from components but could
> logically include them. I think 'Components' is a special type of keyword.
> 
>> What should this field be called - we should not reduce the value of
>> this data by genericizing it into 'keywords'.
>> 
>>> Also,  I would like to mention, we have 3 different logical types of
>>> data:
>>> 1) just human-readable descriptions
>>> 2) filtering and targeting fields (Components, keywords if you agree
>>> with my suggestion)
>>> 3) framework directives (Prerequisites)
>>> 
>>>> As time goes on we may well expand this compulsory list, but this is I
>>>> believe a sensible starting place.
>>>> 
>>>> Being part of the source this data will be subject to the same review
>>>> process as any other change and so we cannot store dynamic data here,
>>>> such as pass rates etc.
>>> What do you think, maybe it is a good idea to keep the metadata
>>> separately? This could simplify changing data via script for mass
>>> modification, as well as adding tickets, pass rate and execution time
>>> on 'gold' configurations.
>> It would be easier to store the data separately and we could use Maloo
>> but it's very important that this data becomes part of the Lustre
>> 'source' so that everybody can benefit from it. Adding tickets is
>> not a problem as part of the resolution issue is to ensure that at
>> least one test exercises the problem and proves it has been fixed,
>> the fact that this assurance process requires active
>> interaction by an engineer with the scripts is a positive.
>> 
>> As for pass rate, execution time and gold configurations this
>> information is just not 1 dimensional enough to store in the source.
> 
> It was not by accident that I mentioned groups of fields in my previous
> letter. All metadata may be separated into rarely and frequently
> changed fields. For example, Summary will not change often. But the
> test timeout in the golden configuration (I mean that this timeout will
> be set as the default based on the 'gold' configuration and can be
> overridden in a specific configuration) could be more variable (and
> possibly more important for testing).

I think this is something that needs to live outside the test metadata
being described here.  A "golden configuration" is hard to define, and
depends heavily on factors that change from one environment to the
next.

Ideally, tests will be written so that they can run under a wide range
of configurations (number of clients, servers, virtual and real nodes).
A further goal might be to allow many non-destructive functional subtests
to be run in parallel, which would further skew the time taken, but
would allow much more efficient use of test resources.

> Using separate files provides more flexibility, and nothing stops us
> from committing them to the Lustre repo so that they become "Lustre
> 'source'". In separate files we can use whatever format we want, and all
> information will be available without parsing the shell script or
> running it. Moreover, in the future, it gives us a very simple
> migration path from shell to another language.

I think the metadata format should be chosen so that it is trivial to
extract the test metadata without having to execute or parse the shell
(or other) test language itself.  Simple filtering and regexp should
be enough.
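
To illustrate what "simple filtering and regexp" could look like, here
is a minimal sketch.  The "# Key: value" comment layout placed directly
above each test function is purely an assumption for this example (no
format has been agreed in this thread), and the summary text and ticket
number are placeholders.

```python
import re

# Hypothetical layout: metadata lives in comment lines of the form
# "# Key: value" directly above each test_<id>() function.
SCRIPT = """\
# Summary: placeholder description of what the test checks
# Components: mgs, mount
# Prerequisites: 1a
# TicketIDs: LU-0000
test_1b() {
    true
}
"""

FIELD = re.compile(r"^#\s*(\w+):\s*(.*)$")
TEST = re.compile(r"^test_(\w+)\s*\(\)")
LIST_FIELDS = {"Components", "Prerequisites", "TicketIDs"}


def extract(text):
    """Map test id -> metadata dict using line filtering only."""
    tests, pending = {}, {}
    for line in text.splitlines():
        field = FIELD.match(line)
        start = TEST.match(line)
        if field:
            key, value = field.groups()
            if key in LIST_FIELDS:
                value = [v.strip() for v in value.split(",") if v.strip()]
            pending[key] = value
        elif start:
            # The function header closes the metadata block above it.
            tests[start.group(1)] = pending
            pending = {}
        elif not line.startswith("#"):
            # Any other non-comment line ends a metadata block.
            pending = {}
    return tests
```

Nothing here executes or parses the shell code; the test body is never
inspected, so the same approach would work if the tests were later
written in another language with "#" comments.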

> A few words on how we did this in our wrapper test framework (see
> attached sample yaml):
> 
> The file contains a set of tags. The main entity is the test; in this
> sample each <id> element in the <Tests> array defines the logical
> entity 'test'. Every test inherits values from the common description
> (fields described outside the <Tests> array). A test can override any
> field or add new fields.
> 
> <groupname>, <executor>, <description>, <reference>, <roles>, <tags> -
> are common fields. All others are executor-specific and used by executors.
> 
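The inherit-and-override behaviour described above can be sketched as
follows.  This is an assumption-laden illustration, not the attached
framework's code: the field values, the "timeout" field and the
resolve() helper are hypothetical, and a plain dict stands in for the
parsed YAML file.

```python
# Hypothetical suite description: top-level fields are shared defaults,
# and each entry in "Tests" may override them or add new fields.
SUITE = {
    "groupname": "conf-sanity",
    "executor": "ShellExecutor",  # placeholder value
    "tags": ["functional"],
    "timeout": 300,               # assumed executor-specific field
    "Tests": [
        {"id": "1a"},
        {"id": "1b", "timeout": 900, "tags": ["functional", "slow"]},
    ],
}


def resolve(suite):
    """Return test id -> effective fields (common values, then overrides)."""
    common = {k: v for k, v in suite.items() if k != "Tests"}
    return {t["id"]: {**common, **t} for t in suite["Tests"]}
```

With this data, test 1a keeps the common timeout and tags, while 1b
replaces both and still inherits groupname and executor.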
> -- 
> Thanks,
> 	Roman
> <conf-sanity_tests.yaml>

Cheers, Andreas
--
Andreas Dilger                       Whamcloud, Inc.
Principal Lustre Engineer            http://www.whamcloud.com/