[Lustre-discuss] OSS crashes
Mark Seger
Mark.Seger at hp.com
Wed Jul 23 08:49:32 PDT 2008
>> Where else could I look for overloaded hardware capacities?
>>
>
> Not sure. That's quite hardware specific.
>
You could run collectl, and then after you reset the system, log back in
and look at what was happening right before the reset. This will let you
look at CPU, interrupts, memory, network and a variety of other things,
including Lustre-level stats such as I/O rates and even RPC stats. You'll
also be able to see what processes were running, in a format similar to
ps, or you can just play back the data with the --top switch. If you feel
10-second samples aren't frequent enough, you can always set your
interval down to 1 second or even lower...
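
As a rough sketch of what that playback could look like: the snippet below
only prints the collectl command lines rather than running them, so you can
check them first. The raw file name, the logging directory and the time
window are all made-up placeholders, and option support varies between
collectl releases, so check collectl --help on your version before relying
on any of this.

```shell
# Hypothetical raw file recorded before the reset; substitute the real
# file from your own collectl logging directory.
RAW=/var/log/collectl/oss1-20080723-000000.raw.gz

# Print a playback command restricted to the minutes before the reset.
# -p plays back a recorded file; --from/--thru limit the time window.
# The 08:40-08:50 window is a placeholder, not the actual crash time.
playback() {
    echo collectl -p "$RAW" --from 08:40 --thru 08:50 "$@"
}

# cpu (c), interrupts (j), memory (m) and network (n) summaries.
# Lustre stats are selected differently depending on the collectl
# version, so they are left out of this sketch.
playback -scjmn

# Process view for the same window.
playback --top

# Recording with 1-second samples instead of the default 10 seconds
# (-i sets the interval, -f the assumed logging directory).
echo collectl -i 1 -scjmn -f /var/log/collectl
```

Once the printed lines look right for your setup, paste them into the shell
(or drop the echo) to actually run them.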
-mark