[lustre-discuss] Traffic compression?
esr+lustre at mail.hebrew.edu
Tue Feb 7 09:37:38 PST 2017
On Mon, Feb 6, 2017 at 10:51 PM, Ben Evans <bevans at cray.com> wrote:
> My initial question is what are you measuring and where are you measuring
The tool I'm using is collectl. It calls perfquery once a minute and
reports the difference between the previous and current readings divided
by 256*secondInterval, which gives a number in kB/s.
(perfquery reports the data counters divided by 4, a legacy left over from
the days of 32-bit counters.)
The Lustre stats seem to be gathered in more or less the same way: the
Lustre plugin takes the delta of written/read bytes and divides it by
1024 * secondInterval to get kB/s.
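
To make the two conversions above concrete, here is a minimal sketch of the
arithmetic as I understand it. The function names are made up for
illustration and are not collectl's actual code; it assumes the perfquery
data counters count 4-byte units, which is why the divisor is 1024/4 = 256.

```python
# Sketch only: hypothetical helpers, not taken from collectl's source.
# Assumption: the InfiniBand data counters count 4-byte units (bytes / 4).

def ib_kb_per_sec(prev_counter, curr_counter, interval_s):
    """kB/s from two perfquery data-counter readings.

    delta * 4 bytes / 1024 = delta / 256, then divide by the interval.
    """
    return (curr_counter - prev_counter) / (256 * interval_s)


def lustre_kb_per_sec(prev_bytes, curr_bytes, interval_s):
    """kB/s from two Lustre read/write byte counters (plain bytes)."""
    return (curr_bytes - prev_bytes) / (1024 * interval_s)


if __name__ == "__main__":
    # The same 61,440,000 bytes moved in 60 s, seen through both counters.
    print(ib_kb_per_sec(0, 61_440_000 // 4, 60))
    print(lustre_kb_per_sec(0, 61_440_000, 60))
```

If both counters saw the same bytes over the same interval, the two numbers
should agree, so a gap between the graphs would have to come from somewhere
other than the unit conversion.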
> There are many different layers of caching happening, possibly all at the
> same time. If you're benchmarking it's much better to figure out your max
> sustained read/write speeds than rely on peaks.
I'm not benchmarking; I was mainly trying to understand why my Infiniband
graphs weren't showing at least as much traffic as Lustre...
Most of the time, though, the graphs do more or less coincide, so I guess
there was either a measurement glitch or we are seeing some limited
effects of caching.
From: lustre-discuss <lustre-discuss-bounces at lists.lustre.org> on behalf of
"E.S. Rosenberg" <esr+lustre at mail.hebrew.edu>
Date: Monday, February 6, 2017 at 3:25 PM
To: "lustre-discuss at lists.lustre.org" <lustre-discuss at lists.lustre.org>
Subject: [lustre-discuss] Traffic compression?
We started closer monitoring of resources on our cluster and I noticed that
there is sometimes a big discrepancy between the read traffic reported by
Lustre and the incoming traffic reported by infiniband (which is the
interface carrying the Lustre traffic).
Currently I have a 4.4GB/s peak on Lustre while Infiniband at the same time
is showing just 1.4GB/s of traffic (there is also a 2 minute difference
between the two peaks).
This is the summation of all the nodes (without the servers) in the cluster.
The stats are gathered using collectl at a 1 minute interval.
(There are also lots of stats that match 1:1 which makes me less sure what
to make of this)