[Lustre-devel] Oleg/Mike Work on Apps Metrics - FW: Mike Booth week ending 2009.03.15
Oleg.Drokin at Sun.COM
Tue Mar 31 13:58:36 PDT 2009
On Mar 31, 2009, at 2:51 PM, Andreas Dilger wrote:
>>> This latter concept is the basis for the "flash cache" concept.
>>> Actually, I think it's worth exploring the economics of it in more
>> This turns out to be a very true assertion. We (I) do see a huge
>> in e.g. MPI barriers done immediately after write.
> While this is true, I still believe that the amount of delay seen by
> the application cannot possibly be worse than waiting for all of the
> IO to complete. Also, the question is whether you are measuring the
> FIRST MPI barrier after the write, vs e.g. the SECOND MPI barrier
> after the write? Since Lustre currently flushes the write cache
> aggressively, the first MPI barrier is essentially waiting for all
> of the IO to complete, which is of course very slow. The real item
> of interest is how long the SECOND MPI barrier takes, which shows
> the overhead that Lustre IO imposes on network performance.
The second MPI barrier takes 1.5 seconds for me.
> It is impossible that Lustre IO completely saturates the entire
> cross-sectional bandwidth of the system OR the client CPUs, so having
> some amount of computation for "free" during IO is still better than
> waiting for the IO to complete.
No argument from me; I have been advocating this same thing from
the very beginning.
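
To make the first-versus-second barrier distinction above concrete, here is a minimal sketch of how the two barriers can be timed separately. It is a simulation only and assumes nothing about the actual benchmark used in this thread: Python threads stand in for MPI ranks, `threading.Barrier` stands in for the MPI barrier, and a `time.sleep` with a per-rank skew stands in for the write. The names `time_barriers`, `num_ranks`, and `write_skew` are hypothetical. It does not model Lustre, cache flushing, or network contention; it only shows that the first barrier absorbs the wait for the slowest writer while the second measures what is left.

```python
import threading
import time


def time_barriers(num_ranks=4, write_skew=0.05):
    """Simulate ranks that 'write', then time two consecutive barriers.

    Returns (first, second): per-rank wall-clock seconds spent in the
    first and second barrier after the simulated write.
    """
    barrier = threading.Barrier(num_ranks)  # reusable across both waits
    first = [0.0] * num_ranks
    second = [0.0] * num_ranks

    def rank_main(rank):
        # Stand-in for the write: higher ranks finish later (assumed skew).
        time.sleep(rank * write_skew)

        t0 = time.perf_counter()
        barrier.wait()  # first barrier: waits for the slowest "writer"
        t1 = time.perf_counter()
        barrier.wait()  # second barrier: synchronization cost only
        t2 = time.perf_counter()

        first[rank] = t1 - t0
        second[rank] = t2 - t1

    threads = [
        threading.Thread(target=rank_main, args=(rank,))
        for rank in range(num_ranks)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return first, second


if __name__ == "__main__":
    first, second = time_barriers()
    for rank, (a, b) in enumerate(zip(first, second)):
        print(f"rank {rank}: first barrier {a:.4f}s, second barrier {b:.4f}s")
```

In a real MPI program the same pattern would bracket two consecutive `MPI_Barrier` calls with `MPI_Wtime`; there the second barrier's time would also reflect any background IO still competing for the network, which this simulation leaves out.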