[Lustre-devel] Oleg/Mike Work on Apps Metrics - FW: Mike Booth week ending 2009.03.15
Oleg Drokin
Oleg.Drokin at Sun.COM
Tue Mar 31 21:34:50 PDT 2009
Hello!
On Mar 31, 2009, at 11:55 PM, Michael Booth wrote:
> (My Opinion) The large size of the I/O request put onto the SeaStar
> by the Lustre client is giving it an artificially high priority.
> Barriers are just a few bytes, the I/Os from the client are in
> megabytes. SeaStar has no priority in its queue, but the amount of
> time it takes to clear a megabyte request gives it an effective
> priority with thousands of times more impact on the hardware than the
> small synchronization requests of many collectives. I am wondering if
> the interference from I/O to computation is more an artifact of
> message size and bursts than of congestion or routing inefficiencies
> in SeaStar.
> If there are hundreds of megabytes of requests queued up on the
> network, and there is no priority mechanism to push a barrier or other
> small MPI request up the queue, it is bound to create a disruption.
> To borrow the elevator metaphor from Eric, if all the elevators are
> queued up from 8:00 to 9:00 delivering office supplies on carts that
> occupy the entire elevator, maybe the carts should be smaller, and
> limited to a few per elevator trip.
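
The queueing argument quoted above can be put into numbers with a small
sketch. It models a strictly first-in, first-out link with no priorities and
asks how long an 8-byte barrier waits behind queued bulk I/O. The link rate,
the message counts, and the function names are all assumptions made for
illustration; none of them are SeaStar or Lustre figures.

```python
from collections import deque

# Hypothetical link rate in bytes per microsecond (about 2 GB/s).
# This is an illustrative assumption, not a measured SeaStar value.
LINK_BYTES_PER_US = 2000


def fifo_finish_times(messages, bytes_per_us=LINK_BYTES_PER_US):
    """Drain (name, size_bytes) messages strictly in arrival order.

    Returns a dict mapping each name to its completion time in microseconds.
    """
    queue = deque(messages)
    clock = 0.0
    finished = {}
    while queue:
        name, size = queue.popleft()
        clock += size / bytes_per_us  # no preemption, no priority
        finished[name] = clock
    return finished


def barrier_delay_us(bulk_count, bulk_size):
    """Completion time of an 8-byte barrier queued behind bulk I/O messages."""
    messages = [(f"io{i}", bulk_size) for i in range(bulk_count)]
    messages.append(("barrier", 8))
    return fifo_finish_times(messages)["barrier"]


# 100 outstanding 1 MiB requests versus 4 outstanding 64 KiB requests.
big = barrier_delay_us(100, 1 << 20)
small = barrier_delay_us(4, 64 << 10)
print(f"barrier behind 100 x 1 MiB : {big:.1f} us")
print(f"barrier behind 4 x 64 KiB  : {small:.1f} us")
```

In this model the barrier's delay is simply the number of bytes queued ahead
of it divided by the link rate, so both the request size and the number of
outstanding requests matter, which is the "smaller carts, fewer per trip"
point.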
As we discussed in the past, just sending small I/O messages is going
to uncover all kinds of slowdowns all the way back to the disk storage,
and the collateral damage would be other tasks that do need fast I/O
and do send big chunks of data.
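
One way such a slowdown could show up is a fixed cost paid per request,
whatever its size. The sketch below is a generic model of that effect, not a
description of Lustre internals: the 500 us per-request overhead, the
1000 MB/s streaming rate, and the function name are hypothetical values
chosen only to show the shape of the trade-off.

```python
def effective_mb_per_s(request_bytes, per_request_overhead_us, stream_mb_per_s):
    """Throughput when every request pays a fixed overhead before streaming.

    1 MB/s equals 1 byte per microsecond, so bytes / (MB/s) gives microseconds.
    """
    transfer_us = request_bytes / stream_mb_per_s
    return request_bytes / (per_request_overhead_us + transfer_us)


# Hypothetical numbers: 500 us fixed cost per request, 1000 MB/s streaming rate.
OVERHEAD_US = 500.0
STREAM_MB_PER_S = 1000.0

large = effective_mb_per_s(1 << 20, OVERHEAD_US, STREAM_MB_PER_S)   # 1 MiB requests
small = effective_mb_per_s(64 << 10, OVERHEAD_US, STREAM_MB_PER_S)  # 64 KiB requests
print(f"1 MiB requests : {large:.0f} MB/s")
print(f"64 KiB requests: {small:.0f} MB/s")
```

Under these assumed numbers the small requests reach only a fraction of the
streaming rate, which is the cost that tasks needing fast bulk I/O would pay.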
Bye,
Oleg