[Lustre-discuss] OST refuses to mount after crash

Wolfgang Baudler wbaudler at gb.nrao.edu
Fri Aug 9 12:26:13 PDT 2013


We are running Lustre 2.3.0 on RHEL6, and today one of our OSS nodes had
a kernel crash (within the Lustre code). A screen dump taken after the
crash is available here:

http://www.gb.nrao.edu/~wbaudler/crash.jpg

Nothing was logged on the local or network syslog.

After a hard reset, the node now refuses to mount one of its OSTs (it has
two; the other one is fine).

$ mount /export/pulsar/ost81
mount.lustre: /dev/sdb has not been formatted with mkfs.lustre or the
backend filesystem type is not supported by this tool

The failed mount command produced no syslog messages either, so this
looks pretty severe.

I can still mount it as ldiskfs, but some of the files and directories
that are usually there on an ldiskfs-mounted OST seem to be missing.
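
To make "missing" concrete, here is a minimal sketch of how the
ldiskfs-mounted OST could be checked for the top-level entries an OST
normally carries. The entry list (CONFIGS/mountdata, O, last_rcvd,
lost+found) is my assumption of what to expect, not an exhaustive list
and not taken from this OST; the function name check_ost_layout is
hypothetical. As far as I understand, mount.lustre reads its target
configuration from CONFIGS/mountdata, so that file is the first one I
would look for.

```shell
#!/bin/sh
# Sketch: report which commonly expected top-level OST entries are absent
# from an ldiskfs mount. The entry list is an assumption, not exhaustive.
check_ost_layout() {
    mnt=$1
    missing=""
    for entry in CONFIGS/mountdata O last_rcvd lost+found; do
        # -e is true for regular files and directories alike
        if [ ! -e "$mnt/$entry" ]; then
            missing="$missing $entry"
        fi
    done
    if [ -n "$missing" ]; then
        echo "missing:$missing"
        return 1
    fi
    echo "all expected entries present"
}
```

With the OST mounted read-only, e.g. "mount -t ldiskfs -o ro /dev/sdb
/mnt/ost81" (the mount point is a made-up example), it would be called
as "check_ost_layout /mnt/ost81".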

An e2fsck dry run didn't turn up anything, except that lost+found is
missing:

Script started on Fri 09 Aug 2013 01:53:11 PM EDT
[root@nadrach ~]$ e2fsck -fn /dev/sdb
e2fsck 1.42.3.wc3 (15-Aug-2012)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
/lost+found not found.  Create? no

Pass 4: Checking reference counts
Pass 5: Checking group summary information

vegas-OST0051: ********** WARNING: Filesystem still has errors **********

vegas-OST0051: 84724/40054144 files (30.6% non-contiguous), 9572242905/10253849088 blocks


I am thinking of making a file-level backup of that OST (just in case I
have to create a fresh OST and restore into it), then mounting it as
ldiskfs again and trying to repair the OST configuration. The manual has
some hints on how to do this here:

http://build.whamcloud.com/job/lustre-manual/lastSuccessfulBuild/artifact/lustre_manual.xhtml#idp4155904
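
The backup step I have in mind can be sketched roughly as follows. This
is only a sketch under assumptions: it follows the tar plus getfattr
approach as I understand it from the manual, the helper names run and
backup_ost are hypothetical, and the mount point and destination
directory are placeholders. By default it only prints the commands
(DRY_RUN=1) so the plan can be reviewed before anything touches the
device.

```shell
#!/bin/sh
# Sketch: print (or, with DRY_RUN=0, execute) a file-level OST backup plan.
# Function names and all paths passed in are hypothetical examples.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"      # dry run: show the command only
    else
        "$@"             # real run: execute it
    fi
}

backup_ost() {
    dev=$1; mnt=$2; dest=$3
    # Mount the backing filesystem read-only as ldiskfs
    run mount -t ldiskfs -o ro "$dev" "$mnt"
    # Save extended attributes separately, since tar may not preserve them
    run sh -c "cd '$mnt' && getfattr -R -d -m '.*' -e hex -P . > '$dest/ea.bak'"
    # Archive the file tree, keeping sparse files sparse
    run tar czf "$dest/ost.tgz" --sparse -C "$mnt" .
    run umount "$mnt"
    # Show what target configuration the Lustre tools can still read
    run tunefs.lustre --dryrun "$dev"
}
```

For this OST the call would look like "backup_ost /dev/sdb /mnt/ost81
/backup/ost81", where both directories are made-up examples. The final
tunefs.lustre --dryrun step changes nothing on disk; it is only there to
see whether the configuration is readable at all.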

Am I on the right track here? Any help appreciated.

Wolfgang






