[Lustre-discuss] MDS: lock timed out -- not entering recovery in server code, just going back to sleep
Thomas Roth
t.roth at gsi.de
Thu Nov 27 10:18:34 PST 2008
Hi all,
after some nasty problems with network switches we see repeated hangs of
our Lustre system. On a client, "lfs check mds" will say
> error: check 'lustre-MDT0000-mdc-ffff810419da5800': Resource temporarily unavailable (11)
All our OSSs are doing o.k. The MDS itself seems to be o.k. too: there
are no error messages in the logs directly related to this situation,
but also nothing to indicate that the MDT noticed a client trying to
reconnect, not even when we tried to mount the FS on a new client. The
MDT had simply become unresponsive.
Since nothing works in such a situation anyway, we rebooted the MDS.
After recovery, the clients reconnected and the FS seems to be fine again.
However, the MDT is now dumping logs a few times per minute, and most of
the dumps are empty.
In addition, in the logs I find a lot of
> Nov 27 17:57:41 lustre kernel: LustreError: 3974:0:(ldlm_request.c:64:ldlm_expired_completion_wait())
> ### lock timed out (enqueued at 1227804060, 1001s ago); not entering recovery in server code,
> just going back to sleep ns: mds-lustre-MDT0000_UUID lock: e4f54680/0x2ccbde901a8157f2
> lrc: 3/1,0 mode: --/CR res: 74908813/3524601089 bits 0x2 rrc: 173 type: IBT flags: 4004030
> remote: 0x0 expref: -99 pid 3974
My question is now whether you would interpret this as a result of
ongoing trouble with the network - or is it a sign of MDT-illness?
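One way to approach this is to check when the timed-out locks were enqueued and whether those times fall into the period of the switch problems. The following is only a minimal Python sketch, assuming the wrapped syslog message has been rejoined into a single line and that its wording matches the excerpt above; the function name `parse_lock_timeout` and the regular expression are made up for illustration and are not part of Lustre.

```python
import re
from datetime import datetime, timezone

# Hypothetical pattern, derived only from the message quoted above.
# Assumption: the message has been rejoined into one line.
LOCK_TIMEOUT_RE = re.compile(
    r"lock timed out \(enqueued at (?P<enqueued>\d+), (?P<waited>\d+)s ago\)"
    r".*?ns: (?P<ns>\S+)"
)


def parse_lock_timeout(line):
    """Return enqueue time, wait time and namespace, or None if no match."""
    m = LOCK_TIMEOUT_RE.search(line)
    if m is None:
        return None
    return {
        # The "enqueued at" value is read as seconds since the Unix epoch.
        "enqueued_utc": datetime.fromtimestamp(int(m.group("enqueued")),
                                               tz=timezone.utc),
        "waited_s": int(m.group("waited")),
        "namespace": m.group("ns"),
    }


# The message from the log excerpt, rejoined into a single string.
sample = (
    "Nov 27 17:57:41 lustre kernel: LustreError: "
    "3974:0:(ldlm_request.c:64:ldlm_expired_completion_wait()) ### lock timed "
    "out (enqueued at 1227804060, 1001s ago); not entering recovery in server "
    "code, just going back to sleep ns: mds-lustre-MDT0000_UUID lock: "
    "e4f54680/0x2ccbde901a8157f2"
)

info = parse_lock_timeout(sample)
print(info["enqueued_utc"].isoformat(), info["waited_s"], info["namespace"])
```

Running this over all such lines and sorting by enqueue time would show whether the stuck locks date from the network trouble or are still being created after the reboot.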
There are more disturbing log messages, many of the following type:
> Nov 27 18:17:42 lustre kernel: LustreError: 28521:0:(mds_open.c:1474:mds_close())
> @@@ no handle for file close ino 81208923: cookie 0x2ccbde8fcca85aa7
> req@f3e75800 x1686877/t0 o35->e0c12120-24ea-68c2-0394-712e75354f55@NET_0x200008cb5726e_UUID:-1
> lens 296/3472 ref 0 fl Interpret:/0/0 rc 0/0
What to make of that?
Hm, the MDS is running Lustre 1.6.3, the OSSs 1.6.4.2, and the clients
1.6.5 - that may not be the healthiest mix either?
Thanks,
Thomas