[Lustre-devel] LustreFS performance (update)
Vitaly Fertman
Vitaly.Fertman at Sun.COM
Fri Mar 20 06:15:56 PDT 2009
Hi Andrew,
thanks for your feedback. Indeed, this still looks more like a raw
test list than a document ready for publication, but it is a work in
progress and I am still working on it, so I will try to address your
suggestions.
On Mar 19, 2009, at 11:16 PM, Andrew C. Uselton wrote:
> Howdy Vitaly,
> I like this. It is quite comprehensive and detailed. I'd like to
> offer a few constructive criticisms in the hope that they will help
> you better achieve your goals. Mostly I'll stick them in-line where
> they seem relevant, but I'll start with:
> 1) Your write-up is quite dense and terse. I could follow the
> overall structure, but found it pretty tough going to understand any
> specific detail. It really helps to work with someone who will
> write up the same information, but in a form with whole sentences
> and a minimum of acronyms or special symbols. Define the acronyms
> you do use in a clear way in one place that I can refer back to.
>
>
> Vitaly Fertman wrote:
>> ****************************************************
>> LustreFS benchmarking methodology.
>> ****************************************************
>> The document aims to describe a benchmarking methodology which
>> helps to understand LustreFS performance and to reveal LustreFS
>> bottlenecks in different configurations on different hardware, and
>> to ensure that the next LustreFS release does not regress compared
>> with the previous one. In other words:
>> Goal1. Understand the HEAD performance.
>> Goal2. Compare HEAD and b1_6 (b1_8) performance.
>> To achieve Goal1, the methodology suggests testing the different
>> layers of software in the bottom-up direction, i.e. the underlying
>> back-end, the target server sitting on this back-end, the network
>> connected to this target and how the target performs through this
>> network, and so on up to the whole cluster.
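>>
>> As an illustration (a sketch only: the tools below are the standard
>> lustre-iokit surveys, LNET selftest and IOR, and every parameter
>> value is a placeholder rather than a recommendation), one bottom-up
>> pass could look like:
>>
>>   # layer 1: raw back-end bandwidth, no Lustre involved
>>   size=8192 crghi=16 thrhi=16 scsidevs=/dev/sdb sgpdd-survey
>>
>>   # layer 2: the OST stack sitting on the same back-end
>>   nobjhi=2 thrhi=16 size=1024 case=disk \
>>       targets="lustre-OST0000" obdfilter-survey
>>
>>   # layer 3: the network alone, via LNET selftest
>>   export LST_SESSION=$$
>>   lst new_session rw_test
>>   lst add_group clients 192.168.0.[10-17]@tcp
>>   lst add_group servers 192.168.0.2@tcp
>>   lst add_batch bulk
>>   lst add_test --batch bulk --from clients --to servers \
>>       brw write size=1M
>>   lst run bulk; lst stat clients servers
>>
>>   # layer 4: the whole stack from the clients, e.g. IOR over MPI
>>   mpirun -np 8 IOR -a POSIX -w -r -t 1m -b 4g \
>>       -o /mnt/lustre/iorfile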
>
> I like this approach. My own efforts tend to be at-scale testing at
> the whole-cluster end of the range, often in the presence of other
> cluster activity. It is good to have the details of the underlying
> components documented.
>
> ...
>> Obviously, it is not possible to perform all the thousands of
>> tests in all the configurations, running all the special-purpose
>> tests, etc., so the document tries to prepare:
>> 1) all the essential and sufficient tests to see how the system
>> performs in general;
>> 2) a minimal set of essential tests to see how the system scales
>> under different conditions.
>
> In some cases it's obvious, but in many it is not clear what exactly
> you mean to be testing. It is a good extension to your methodology
> to state clearly not only the mechanics of the test itself, but what
> you think you are testing with the given experiment. Spend a little
> time and describe what the system under examination is, how it
> responds or should respond to the proposed test, and what tunables
> and parameters you think might be relevant. For instance, if the
> test is supposed to saturate the target server, then how much I/O do
> you expect will be required and why? What timeout or other tunable
> may determine the observed saturation point? Your goal should be to
> have, not only a test, but a real expectation about its results even
> before you run the test. Once you have that expectation then you
> can evaluate the results. The bottom-up approach helps with this,
> since you can use the performance of the individual pieces to help
> establish your expectation about the larger assemblies.
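>
> For example (numbers purely illustrative): if an OSS has a single
> GigE link, good for roughly 110 MB/s on the wire in practice, while
> its RAID-6 back-end measured ~400 MB/s in the raw-disk survey, then
> streaming writes through that network should be network-bound at
> about 110 MB/s, and a result well below that points at the
> intermediate layers rather than the disk.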
>
> ...
>> **** Hardware Requirements. ****
>> The test plan implies that we change only 1 parameter (cpu, disk
>> or network) at each step. The HW requirements are:
>> -- at least 1 node with:
>>      CPU: 32;
>>      RAM: enough to have a ramdisk for MDS;
>>      DISK: enough disks for raid6 or raid1+0 (as this node could
>>            be mds or ost); an extra disk for an external journal;
>>      NET: both GigE and IB installed.
>> -- at least 1 other node with:
>>      DISK: enough disks for raid6 or raid1+0 (as this node could
>>            be mds or ost); an extra disk for an external journal;
>> -- besides that: 8 clients and 3 other servers.
>> -- the other servers include:
>>      DISK: raid6;
>>      NET: IB installed.
>> -- each client includes:
>>      NET: both GigE and IB installed.
>> **** Software requirements ****
> You might provide links to these tests for those not familiar with
> them.
>> 1. Short term.
>> 1.1. mdsrate
>>      to be completed to test all the operations listed in MDST3
>>      (see below); an example invocation is sketched after this
>>      list.
>> 1.2. mdsrate-*.sh
>>      to be fixed/written to run mdsrate properly and test all the
>>      operations listed in MDST3 (see below).
>> 1.3. fake disk
>>      implement a FAIL flag in obdfilter so that it reports 'done'
>>      without doing anything, to emulate a low-latency disk.
>> 1.4. MT.
>>      add more tests here and implement them.
>> 2. Long term.
>> 2.1. mdtstack-survey
>>      - an echo client-server is to be written for mds, similar to
>>        the ost one.
>>      - a test script similar to obdfilter-survey.sh is to be
>>        written (the existing ost network case is sketched after
>>        this list).
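>>
>> For reference, below is a sketch of how these might be driven (the
>> mdsrate option names are from the current utility and may differ
>> between versions -- check mdsrate --help on your build; NIDs and
>> paths are placeholders):
>>
>>   # metadata rates with mdsrate, run under MPI
>>   mpirun -np 8 mdsrate --create --dir /mnt/lustre/mdsrate \
>>       --nfiles 100000 --filefmt 'f%%d'
>>
>>   # the existing OST echo client/server over the network, i.e. the
>>   # model the mds echo stack in 2.1 would follow
>>   nobjhi=2 thrhi=16 size=1024 case=network \
>>       targets="192.168.0.2@tcp" obdfilter-survey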
>> **** Different configurations ****
> ...
>
> I'll cut it short here, but in general, I think you might be
> surprised: if you organize this document so that anyone else could
> come along behind you and perform all the same tests in the same
> way, you might get a lot of others doing these experiments
> alongside you. That would make your job a lot easier and increase
> the likelihood that bugs and regressions would be caught quickly.
>
>> --
>> Vitaly
>
> Cheers,
> Andrew
--
Vitaly