[lustre-discuss] one ost down

Fri Nov 15 03:47:42 PST 2019

Hi Einar,

As for the OST in bad shape: if you have not cleared the bad blocks on the storage system, you’ll keep getting IO errors whenever your server tries to access those blocks. That is a kind of protection mechanism, and a large number of IO errors can cause many other issues. The procedure to clean them up is a bit of storage and filesystem surgery. I would suggest this high-level plan:

  *   Obtain the bad blocks from the storage system (offset, size, etc…)
  *   Map them to filesystem blocks. Watch out: the storage system probably reports 512-byte blocks (especially on older systems), while filesystem blocks are 4 KB, so you need to convert storage blocks to filesystem blocks
  *   Clear the bad blocks on the storage system; each storage system has its own commands for this. After clearing them, you’ll probably no longer get IO errors when accessing these sectors
  *   Optionally, zero the bad storage blocks with dd (and only these bad blocks, of course) to wipe any “trash” that might be left on them
  *   Find out with debugfs which files are affected
  *   Run e2fsck on the device
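
To illustrate the mapping step, here is a minimal Python sketch, not a tested procedure. It assumes the storage system reports bad ranges as (first sector, sector count) in 512-byte units and that the filesystem uses 4096-byte blocks; check both on your system. The function names, the partition_start_sector parameter and the /dev/sdX device name are hypothetical. The second helper only builds a debugfs "icheck" command line (block to inode); the resulting inode numbers can then be passed to debugfs "ncheck" to get path names.

```python
SECTOR_SIZE = 512      # bytes per storage-system block (assumption, verify on the array)
FS_BLOCK_SIZE = 4096   # bytes per filesystem block (assumption, verify with dumpe2fs)


def sectors_to_fs_blocks(bad_ranges, partition_start_sector=0):
    """Map (first_sector, sector_count) ranges to filesystem block numbers.

    partition_start_sector is the sector at which the filesystem starts
    on the LUN (0 if the filesystem sits directly on the device).
    """
    blocks = set()
    for first_sector, sector_count in bad_ranges:
        if sector_count <= 0:
            continue
        if first_sector < partition_start_sector:
            raise ValueError("bad range starts before the filesystem")
        start_byte = (first_sector - partition_start_sector) * SECTOR_SIZE
        end_byte = start_byte + sector_count * SECTOR_SIZE - 1
        first_block = start_byte // FS_BLOCK_SIZE
        last_block = end_byte // FS_BLOCK_SIZE
        blocks.update(range(first_block, last_block + 1))
    return sorted(blocks)


def debugfs_icheck_command(fs_blocks, device):
    """Build (but do not run) a debugfs command that maps blocks to inodes."""
    block_list = " ".join(str(b) for b in fs_blocks)
    return 'debugfs -R "icheck {}" {}'.format(block_list, device)


if __name__ == "__main__":
    # Made-up example ranges, not real data from any array.
    example = [(8, 8), (7, 2)]
    fs_blocks = sectors_to_fs_blocks(example)
    print(fs_blocks)
    print(debugfs_icheck_command(fs_blocks, "/dev/sdX"))
```

Note that one bad 512-byte sector makes the whole 4 KB filesystem block suspect, and a bad range that crosses a 4 KB boundary affects two filesystem blocks, which is why the sketch rounds outward.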

As I said, this is surgery, so if you really care about what is on that device, try to take a block-level backup first. At a minimum, though, you need to clear the bad blocks; otherwise you will keep getting IO errors on the device.

Regards,

Diego

From: lustre-discuss <lustre-discuss-bounces at lists.lustre.org> on behalf of Einar Næss Jensen <einar.nass.jensen at ntnu.no>
Date: Friday, 15 November 2019 at 10:01
To: "lustre-discuss at lists.lustre.org" <lustre-discuss at lists.lustre.org>
Subject: [lustre-discuss] one ost down

Hello dear lustre community.

We have a Lustre file system where one OST is having problems.

The underlying disk array, an old SFA10K from DDN (without support), has one RAID set with ca. 1300 bad blocks. The bad blocks appeared when one disk in that RAID set failed while another drive in a different RAID set was rebuilding.

Now.

The OST is offline, and the file system seems usable for new files, while accesses to old files on that OST generate lots of kernel messages on the OSS.

Quota information is not available, though.

Questions:

May I assume that for new files, everything is fine, since they are not using the inactive device anyway?

I tried to run e2fsck on the unmounted OST while jobs were still running on the filesystem, and for a few minutes it seemed like this had worked, as the filesystem seemed to come back complete afterwards. After a few minutes the OST failed again, though.

Any pointers on how to rebuild/fix the OST and get it back are very much appreciated.

Pointers on how to regenerate the quota information, which is currently unavailable, would also help, with or without the troublesome OST.

Best Regards

Einar Næss Jensen (on flight to Denver)

--
Einar Næss Jensen
NTNU HPC Section
Norwegian University of Science and Technology
Address: Høgskoleringen 7i
         N-7491 Trondheim, NORWAY
tlf:     +47 90990249
email:   einar.nass.jensen at ntnu.no