[Lustre-discuss] OSS load in the roof
Brock Palen
brockp at umich.edu
Fri Jun 27 09:44:28 PDT 2008
Our OSS went crazy today. It is attached to two OSTs.
The load is normally around 2-4. Right now it is 123.
I noticed this to be the cause:
root 6748 0.0 0.0 0 0 ? D May27 8:57 [ll_ost_io_123]
All of the ll_ost_io threads are stuck in uninterruptible sleep (state D).
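Since Linux counts tasks in uninterruptible sleep toward the load average, the number of D-state threads should roughly account for the load figure. A minimal sketch for counting them from `ps aux` output follows; the function name and the second sample row are hypothetical, added only for illustration, and the column layout is assumed to be the standard `ps aux` one.

```python
def count_uninterruptible(ps_output: str) -> int:
    """Count lines of `ps aux` output whose STAT field starts with 'D'."""
    count = 0
    for line in ps_output.splitlines():
        # Assumed columns: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
        fields = line.split(None, 10)
        if len(fields) >= 8 and fields[7].startswith("D"):
            count += 1
    return count


# First data row is from the report above; the second row is made up
# to show a thread that is not in state D.
sample = """USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 6748 0.0 0.0 0 0 ? D May27 8:57 [ll_ost_io_123]
root 6749 0.0 0.0 0 0 ? S May27 0:01 [ll_ost_io_124]
"""

print(count_uninterruptible(sample))
```

On a live OSS the same function could be fed the text captured from `ps aux`, and the result compared against the load average.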
Has anyone seen this happen before? Is this caused by a pending disk
failure?
I ask about disk failure because I also see this message:
mptscsi: ioc1: attempting task abort! (sc=0000010038904c40)
scsi1 : destination target 0, lun 0
command = Read (10) 00 75 94 40 00 00 10 00 00
mptscsi: ioc1: task abort: SUCCESS (sc=0000010038904c40)
and:
Lustre: 6698:0:(lustre_fsfilt.h:306:fsfilt_setattr()) nobackup-OST0001: slow setattr 100s
Lustre: 6698:0:(watchdog.c:312:lcw_update_time()) Expired watchdog for pid 6698 disabled after 103.1261s
Thanks
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
brockp at umich.edu
(734)936-1985