[Lustre-discuss] MDT crash: ll_mdt at 100%

Thomas Roth t.roth at gsi.de
Thu Jul 2 05:22:23 PDT 2009


Hi all,

our MDT gets stuck and becomes unresponsive under very high load (Lustre
1.6.7.1, kernel 2.6.22, 8 cores, 32 GB RAM). The only thing that stands
out is one ll_mdt_?? process running at 100% CPU. Nothing unusual was
happening on the cluster before that.
After a reboot, as well as after moving the service to another server,
this behavior reappears. The initial stages - mounting the MGS, mounting
the MDT, recovery - work fine, but then the load goes up and the system
is rendered unusable.

At the moment I don't know what to do, except shut down all servers and
possibly do a writeconf everywhere.

I see that a similar problem was reported by Mag in March this year, but
no clues or solutions appeared.
Any ideas?

Yours,
Thomas


