[Lustre-discuss] OSS crashes

Mark Seger Mark.Seger at hp.com
Wed Jul 23 08:49:32 PDT 2008


>> Where else could I look for overloaded hardware capacities?
>>     
>
> Not sure.  That's quite hardware specific.
>   
you could run collectl, and then after you reset the system, log back
in and look at what was happening right before the reset.  this will
let you look at cpu, interrupts, memory, network and a variety of other
things, including lustre-level stats such as I/O rates and even rpc
stats.  you'll also be able to see what processes were running, in a
format similar to ps, or you can just play back the data with the --top
switch.  if you feel 10 second samples aren't frequent enough, you can
always set your interval down to 1 second or even lower...
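
a rough sketch of what that playback could look like.  the file name,
host name and times here are made-up placeholders, and playback_cmd is
just a hypothetical helper that prints the command so you can check it
before running it.  the switches (-p, -s, -i, -f, --from, --thru, --top)
are standard collectl ones as far as I know, but subsystem letters
(especially l for lustre) vary between versions, and --top can only
show processes if process data was actually recorded, so check the man
page for your install:

```shell
#!/bin/sh
# Sketch only: the log path, host name and times are placeholders.

# Print (not run) a collectl playback command for one time window.
#   $1 = raw file recorded by collectl
#   $2 = start time, $3 = end time (HH:MM)
#   $4 = extra switches, e.g. "-scjmnl" or "--top"
playback_cmd() {
    printf 'collectl -p %s --from %s --thru %s %s\n' "$1" "$2" "$3" "$4"
}

RAW=/var/log/collectl/oss1-20080723-000000.raw.gz   # hypothetical file

# cpu (c), interrupts (j), memory (m), network (n), lustre (l)
playback_cmd "$RAW" 08:30 08:49 "-scjmnl"

# same window, shown as a top-style process view
playback_cmd "$RAW" 08:30 08:49 "--top"

# to record 1 second samples instead of 10 (assumed output directory):
#   collectl -i 1 -f /var/log/collectl
```

pipe the printed line to sh once it looks right, with --thru set to
just before the time of the reset.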
-mark



