[Lustre-discuss] Lustre SNMP module

Mark Seger Mark.Seger at hp.com
Thu Mar 20 13:15:04 PDT 2008


Kilian CAVALOTTI <kilian at ...> writes:

> 
> Hi Brian, 
> 
> On Monday 10 March 2008 03:04:33 pm Brian J. Murrell wrote:
> > I can't disagree with that, especially as Lustre installations get
> > bigger and bigger.  Apart from writing custom monitoring tools,
> > there's not a lot of "pre-emptive" monitoring options available. 
> > There are a few tools out there like collectl (never seen it, just
> > heard about it) 
> 
> collectl is very nice, but as dstat and such, it has to run on each and
> every host. It can provide its results via sockets though, so it could
> be used as a centralized monitoring system for a Lustre installation.

I'm the author of collectl and so have a few opinions of my own  8-)

Nice to see someone realized there's a method to my madness.  I've been very
frustrated by all the tools out there that come close to solving the distributed
management problem but always seem to leave something out, be it handling the
level of detail one needs to get the job done, providing an ability to look
at historical data, or supporting what I consider fine-grained monitoring, that
is taking a sample once a second [or less!  yes, collectl does support that].

My solution was to focus on one thing and do it well - collect local data and
provide a rational methodology for archiving it and displaying it, which also
includes being able to plot it.  OK, I guess that was more than one.  But I had
also hoped that by supplying hooks, others could build on my work rather than
start all over again with yet another tool that stands alone.

> And it provides detailed statistics too:
> 
> # collectl -sL -O R
> waiting for 1 second sample...
> 
> # LUSTRE CLIENT DETAIL: READAHEAD
> #Filsys   Reads ReadKB  Writes WriteKB  Pend  Hits Misses NotCon MisWin LckFal Discrd ZFile ZerWin RA2Eof HitMax
> home        100    192       0       0     0     0    100      0      0      0      0      0    100      0      0
> scratch     100    192       0       0     0     0    100      0      0      0      0      0    100      0      0
> home        102   6294      23     233     0     0     87      0      0      0      0      0     87      0      0
> scratch     102   6294      23     233     0     0     87      0      0      0      0      0     87      0      0
> home         95    158      22     222     0     0     81      0      0      0      0      0     81      0      0
> scratch      95    158      22     222     0     0     81      0      0      0      0      0     81      0      0
> 
> # collectl -sL -O M
> waiting for 1 second sample...
> 
> # LUSTRE CLIENT DETAIL: METADATA
> #Filsys   Reads ReadKB  Writes WriteKB  Open Close GAttr SAttr  Seek Fsync DrtHit DrtMis
> home          0      0       0       0     0     0     0     0     0     0      0      0
> scratch       0      0       0       0     0     0     2     0     0     0      0      0
> home          0      0       0       0     0     0     0     0     0     0      0      0
> scratch       0      0       0       0     0     0     0     0     0     0      0      0
> home          0      0       0       0     0     0     0     0     0     0      0      0
> scratch       0      0       0       0     0     0     1     0     0     0      0      0
> 
> # collectl -sL -O B
> waiting for 1 second sample...
> 
> # LUSTRE FILESYSTEM SINGLE OST STATISTICS
> #Ost              Rds  RdK   1K   2K   4K   8K  16K  32K  64K 128K 256K Wrts WrtK   1K   2K   4K   8K  16K  32K  64K 128K 256K
> home-OST0007        0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
> scratch-OST0007     0    0    9    0    0    0    0    0    0    0    0   12 3075    9    0    0    0    0    0    0    0    3
> home-OST0007        0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
> scratch-OST0007     0    0    1    0    0    0    0    0    0    0    0    1    2    1    0    0    0    0    0    0    0    0
> home-OST0007        0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
> scratch-OST0007     0    0    1    0    0    0    0    0    0    0    0    1    2    1    0    0    0    0    0    0    0    0
> 
> > and LLNL have one on sourceforge, 
> 
> Last time I checked, it only supported 1.4 versions, but it's been a while, 
> so I'm probably a bit behind.

Not sure if you're talking about collectl, but in fact I have only just begun to
look at 1.6.4.3 and the good news is I only found a few minor things I need to
fix.  I do plan on releasing a new version of collectl in a couple of days.

> > but I can certainly  
> > see the attraction at being able to monitor Lustre on your servers
> > with the same tools as you are using to monitor the servers' health
> > themselves.
>
> Yes, that'd be a strong selling point.

Not only is that a strong point, it's the main point!  When you have multiple
tools trying to track multiple resources and something goes wrong, how are you
expected to do any correlation?  To that point, collectl even tracks time to the
msec and aligns its samples to a whole second within a msec or two, and it even
does that across a cluster - assuming of course your clocks are synchronized.
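
Just to illustrate the alignment idea (collectl itself is written in Perl; this
little Python fragment is only a sketch of the technique, not collectl's actual
code): if every host sleeps to the next whole-second boundary before taking its
sample, the readings line up across the cluster to within your clock skew.

    import time

    def wait_for_next_second():
        # Sleep until the next whole-second boundary so samples taken on
        # different hosts line up (assuming clocks are NTP-synchronized).
        time.sleep(1.0 - (time.time() % 1.0))

    if __name__ == "__main__":
        for _ in range(5):
            wait_for_next_second()
            print("sample at %.3f" % time.time())  # stands in for the real collection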

> > This could wind becoming a lustre-devel@ discussion, but for now, it
> > would be interesting to extend the interface(s) we use to
> > introduce /proc (and what will soon be its replacement/augmentation)
> > stats files so that they are automagically provided via SNMP.

Given my comments about focusing on one thing and doing it well, I could see
that if someone really wanted to export lustre data with snmp they could always
use the --sexpr switch to collectl, telling it to write every sample as an
s-expression which an snmp module could then pick up.  That way you only have
one piece of code worrying about collecting/parsing lustre data.
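
To make that concrete, here is a rough sketch of what the snmp side of such a
bridge might look like.  The file name, the field name, and the exact
s-expression layout below are all hypothetical - check the collectl
documentation for what --sexpr really writes - but the point is that the snmp
module only has to read the latest sample and pull out the value it was asked for:

    #!/usr/bin/env python
    # Hypothetical bridge: read the most recent collectl s-expression
    # sample and extract a single counter for an snmp agent to report.
    # The path and the key name are invented for illustration.
    import re

    SAMPLE_FILE = "/tmp/collectl-sample.sexpr"  # assumption, not collectl's documented default

    def parse_sexpr(text):
        # Tiny s-expression reader: returns nested python lists of strings.
        tokens = iter(re.findall(r"\(|\)|[^\s()]+", text))
        def walk():
            out = []
            for tok in tokens:
                if tok == "(":
                    out.append(walk())
                elif tok == ")":
                    return out
                else:
                    out.append(tok)
            return out
        return walk()

    def lookup(tree, key):
        # Find a (key value) pair anywhere in the parsed tree.
        for node in tree:
            if isinstance(node, list):
                if len(node) == 2 and node[0] == key:
                    return node[1]
                found = lookup(node, key)
                if found is not None:
                    return found
        return None

    if __name__ == "__main__":
        with open(SAMPLE_FILE) as f:
            tree = parse_sexpr(f.read())
        # "lusclt-reads" is a placeholder name, not a documented collectl field.
        print(lookup(tree, "lusclt-reads"))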

One other very key point here - at least I think it is - I often see
cluster-based tools collecting data only once a minute or even less often.
Alternatively they might choose to only take a small sample of data, the common
theme being that any large volume of data will overwhelm a management station.
Right!  For that reason, I could see one exporting a subset of data upstream
while keeping more detail, and perhaps a finer time granularity, locally for
debugging purposes.
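
As a sketch of that split, picture a tiny forwarder that appends every
one-second sample to a local file but only pushes one-minute averages upstream.
The endpoint, the file name and the field names below are invented for
illustration - the point is just the shape of the idea:

    #!/usr/bin/env python
    # Hypothetical forwarder: full one-second detail stays on the local
    # disk, only one-minute averages go to the management station.
    import socket
    import time

    UPSTREAM = ("mgmt-station.example.com", 9999)  # assumed collector address
    LOCAL_LOG = "/var/log/lustre-samples.log"      # assumed local archive

    def read_sample():
        # Placeholder for however you actually get a per-second reading,
        # e.g. parsing /proc/fs/lustre or collectl's --sexpr output.
        return {"read_kb": 0.0, "write_kb": 0.0}

    def main():
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        window = []
        with open(LOCAL_LOG, "a") as log:
            while True:
                sample = read_sample()
                log.write("%d %r\n" % (time.time(), sample))  # full detail, kept locally
                window.append(sample)
                if len(window) >= 60:                         # one-minute summary upstream
                    avg = {k: sum(s[k] for s in window) / len(window) for k in window[0]}
                    sock.sendto(repr(avg).encode(), UPSTREAM)
                    window = []
                time.sleep(1)

    if __name__ == "__main__":
        main()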

> That sounds like the way to proceed, indeed.
> 
> > You know, given the discussion in this thread:
> > http://lists.lustre.org/pipermail/lustre-devel/2008-January/001475.html
> > now would be a good time for the community (that perhaps might
> > want to contribute) desiring SNMP access to get their foot in the
> > door. Ideally, you get SNMP into the generic interface and then SNMP
> > access to all current and future variables comes more or less free.
> 
> Oh, thanks for pointing this. It looks like major underlying changes 
> are coming. I think I'll subscribe to the lustre-devel ML to try to 
> follow them.
> 
> > That all said, there are some /proc files which provide a copious
> > amount of information, like brw_stats for instance.  I don't know how
> > well that sort of thing maps to SNMP, but having an SNMP manager
> > watching something as useful as brw_stats for trends over time could
> > be quite interesting.
> 
> Add some RRD graphs to keep historical variations, and you got the 
> all-in-one Lustre monitoring tool we sysadmins are all waiting for. ;)

Be careful here.  You can certainly stick some data into an rrd, but certainly
not all of it, especially if you want to collect a lot of it at a reasonable
frequency.  If you want accurate detail plots, you've gotta go to the data
stored on each separate system.  I just don't see any way around this, at least
not yet...
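
For the subset that does fit, the usual rrdtool recipe keeps the database at a
fixed size no matter how long it runs: a coarse step and a bounded number of
rows.  This assumes rrdtool is installed; the data source name and the values
are only examples:

    #!/usr/bin/env python
    # Push a coarse one-minute subset into an RRD while the full-resolution
    # data stays in collectl's own files on each node.
    import subprocess

    RRD = "lustre-readkb.rrd"

    # One 60-second step, 24 hours' worth of one-minute averages.
    subprocess.run(["rrdtool", "create", RRD, "--step", "60",
                    "DS:readkb:GAUGE:120:0:U",
                    "RRA:AVERAGE:0.5:1:1440"], check=True)

    # Then once a minute feed it the summarized value ("N" means now).
    subprocess.run(["rrdtool", "update", RRD, "N:6294"], check=True)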

As a final note, I've put together a tutorial on using collectl in a lustre
environment and have uploaded a preliminary copy at
http://collectl.sourceforge.net/Tutorial-Lustre.html in case anyone wants to
preview it before I link it into the documentation.  If nothing else, look at my
very last example, where I show what you can see by monitoring lustre at the same
time as your network interface.  Did I also mention that collectl is probably
one of the few tools that can monitor your Infiniband traffic as well?

Sorry for being so long winded...
-mark





