[Lustre-devel] Lustre-devel Digest, Vol 72, Issue 17

Chris Gearing chris at whamcloud.com
Mon Apr 2 17:26:57 PDT 2012


Hi Roman,

> Problem 1
>
> Currently Lustre and its tests live in one code space and are built at the
> same time, and there are often specific dependencies between the tests and
> the code.
>
> This situation directly affects:
>
> 1) Interoperability testing between different versions. Testing is started
> from the client, which has a different test framework than the server, and
> the client remotely executes the test framework as its own. Simply copying
> the tests across to equalize them cannot work when there is a big
> difference between the versions.
>
> 2) It is not simple to execute (especially in automation) a test of the
> tests themselves. For example, a bug is fixed and a test for it is added.
> Executing that test on an old revision (probably a previous release) should
> show a failing result. But with a big difference between the version where
> the bug was fixed and the version where the test is executed, the test
> framework can fail to start at all.
>
> Possible solution: split Lustre and the Lustre tests at both the code and
> the build level. Lustre and the tests would then not be tied to the same
> code revision, only connected logically, e.g. via keywords. At the same
> time, an abstraction layer should be added to the test framework that
> allows it to execute the Lustre utilities from different versions of
> Lustre.
The situation here is exactly the same as with the source code. When we run 
interop tests the test system runs the test scripts belonging to the server 
version against those belonging to the client version, so we might use 
1.8.7 client scripts against 2.2 server scripts. These scripts need to 
interoperate in exactly the same way that the Lustre source code itself 
needs to interoperate.
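
To illustrate, the usual way a script copes with a feature gap between 
versions is an explicit version guard. A hedged sketch of the pattern 
follows; the helper names are as I recall them from test-framework.sh, so 
treat this as illustrative rather than definitive:

    # Skip a subtest when the server is too old to support the
    # feature under test; version_code/lustre_version_code convert
    # dotted version strings into numbers that can be compared.
    if [ $(lustre_version_code $SINGLEMDS) -lt $(version_code 2.2.0) ]; then
            skip "server older than 2.2.0 lacks this feature" && return
    fi

Guards like this are exactly the sort of thing that gets added when an 
interop bug is found and fixed.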

If people find cases where this does not happen then they raise bugs 
against Lustre and these are fixed using the same processes as any other 
Lustre bug. This means people across the community investing effort to 
rectify the incompatibilities and then posting patches for review, test 
and inclusion in the product. Of course in a perfect world we would be 
back at the point where 1.8.7 and 2.x forked and never allow the 
interoperability issues in the test scripts to creep in, but the reality 
is that the issues do exist and effort does need to be spent resolving them. 
I am 100% sure that the resolution of the relatively few 
incompatibilities is far easier than attempting to take the 1.8, 2.1 and 
2.2 test scripts and produce an independent, one-size-fits-all solution.

We currently track each failing test in Jira; see 
http://jira.whamcloud.com/browse/LU-1193 for an example. If others find 
issues in their testing then they should create Jira issues to track 
them and if possible post patches to resolve them.

>
>
> Problem 2
>
> (To avoid terminology problems, here I use: sanity = test suite, 130 =
> test, 130a and 130c = test cases.)
>
> Different test cases ending with a letter (e.g. 130c) have different ideas
> about dependencies. Some test cases depend on previous test cases, some do
> not.
>
> All of them can currently be executed with the "ONLY" parameter, and each
> gets its own entry in the result YAML file, just like standalone tests that
> have no lettered test cases (e.g. sanity 129). Also, tests that consist
> only of lettered test cases and have no body of their own can be executed
> with the ONLY parameter (but do not get a result entry of their own).
>
> So, logically, every test that can be executed via ONLY should not depend
> on other tests. But we have tests that do depend on others. Moreover, some
> developers prefer to consider test cases as steps of one full test.
>
> What are the entities I call "test cases" and "tests" from your point of
> view?
>
> The answer to this question affects automated test execution and test
> development, and may require some test-framework changes.
>
I think you highlight a very good point here that we don't really know 
enough about the test contents, their prerequisites or other 
dependencies. I would suggest that many attempts have been made over the 
years to use naming conventions, numeric ordering or other similar 
mechanisms to track such behaviour.
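
To make the ambiguity concrete, consider how ONLY behaves today; the test 
numbers are just examples, and my reading of the matching rules should be 
verified against test-framework.sh:

    # Run the whole 130 family in order; any implicit dependencies
    # between 130a, 130b, ... are satisfied as a side effect.
    ONLY=130 sh sanity.sh

    # Run 130c on its own; whether this is valid depends on whether
    # 130c silently assumes state set up by 130a/130b, and nothing
    # in the scripts currently records which case applies.
    ONLY=130c sh sanity.sh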

What we need to make sense of the 1000+ test cases we have is an extensible 
knowledge base that can grow over time into a rich source of information, 
one that allows automated systems as well as developers to use the tests 
confidently and in the most flexible way possible.

Because of the nature of Lustre we need an approach that keeps this 
knowledge in the public domain, lets us expand the range of things we store 
about each test, and allows both people and machines to access it with 
equal ease.

One reasonable proposal is to add a comment block at the start of each test 
script, and of each subtest within that script, listing the test name, a 
short and a long description of what the test is supposed to be doing, the 
bug (if any) it was originally added for, the part of the code it is 
intended to cover, and its prerequisites (filesystem initialization, min/max 
number of clients, OSTs and MDTs it can test with, etc.), all in a 
machine-readable format that not only documents the test today but can be 
expanded in the future.
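
Purely as a straw man, such a block might look like the following; the 
field names and YAML-ish layout here are only an illustration, not a 
proposed standard:

    #
    # Test-Documentation:   (straw-man format, for discussion)
    #   name:        sanity/130a
    #   summary:     one-line statement of what the subtest checks
    #   description: longer text explaining what the test is supposed
    #                to be doing and how it goes about it
    #   added-for:   LU-NNNN        # originating bug, if any
    #   covers:      <code area>    # subsystem or module exercised
    #   requires:
    #     setup:     formatted filesystem
    #     clients:   min 1
    #     osts:      min 2
    #     mdts:      min 1
    #
    test_130a() {
            echo "test body goes here"
    }
    run_test 130a "short description"

Because the block is nothing but structured comments, the scripts keep 
working unchanged while a harvesting tool can parse the fields into 
whatever knowledge base we settle on.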

Once we have an agreement on an initial format for this comment block, 
the development community can work to populate it for each subtest and 
improve the understanding and usefulness of all existing tests.

Thanks

Chris Gearing
Sr. Software Engineer
Quality Engineering
Whamcloud Inc



