<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<html>

<head>

  <meta content="text/html;charset=ISO-8859-1" http-equiv="Content-Type">

  <title></title>

</head>

<body bgcolor="#ffffff" text="#000000">

Thank you, Bern.  "df" claims there is some 442MB of data on the

volume, compared to neighbors with 285GB.  That could well be a

fragment of a single, unsuccessful transfer attempt.  I can run

lfs_find on it, though, and see what comes back.  I was having problems

earlier and thought I got files back from that command, but other problems

on our cluster confused that result.  We will recheck.<br>

<br>

bob<br>

<br>

Bernd Schubert wrote:

<blockquote cite="mid:201009032322.25265.bs_lists@aakef.fastmail.fm"

 type="cite">

  <pre wrap="">On Friday, September 03, 2010, Bob Ball wrote:

  </pre>

  <blockquote type="cite">

    <pre wrap="">We added a new OSS to our 1.8.4 Lustre installation.  It has 6 OST of

8.9TB each.  Within a day of having these on-line, one OST stopped

accepting new files.  I cannot get it to activate.  The other 5 seem fine.

On the MDS "lctl dl" shows it IN, but not UP, and files can be read from

it: 33 IN osc umt3-OST001d-osc umt3-mdtlov_UUID 5

However, I cannot get it to re-activate:

lctl --device umt3-OST001d-osc activate

    </pre>

  </blockquote>

  <pre wrap=""><!---->

[...]

  </pre>

  <blockquote type="cite">

    <pre wrap="">LustreError: 4697:0:(filter.c:3172:filter_handle_precreate())

umt3-OST001d: ignoring bogus orphan destroy request: obdid

11309489156331498430 last_id 0

Can anyone tell me what must be done to recover this disk volume?

    </pre>

  </blockquote>

  <pre wrap=""><!---->

Check out section 23.3.9 in the Lustre manual ("How to Fix a Bad LAST_ID on an 

OST").

It is on my TODO list to write a tool to automatically correct the "lov_objid", 

but as of now I don't have it yet. Somehow your lov_objid file has a 

completely wrong value for this OST.

Now, when you say "files can be read from it", are you sure there are already 

files on that OST? Because the error message says that the last_id is zero and 

so you should not have a single file on it. If that is also wrong, you will 

need to correct it as well. You can do that manually, or you can use a patched 

e2fsprogs version that will do that for you.

Patches are here:

<a class="moz-txt-link-freetext" href="https://bugzilla.lustre.org/show_bug.cgi?id=22734">https://bugzilla.lustre.org/show_bug.cgi?id=22734</a>

Packages can be found on my home page:

<a class="moz-txt-link-freetext" href="http://www.pci.uni-heidelberg.de/tc/usr/bernd/downloads/e2fsprogs/">http://www.pci.uni-heidelberg.de/tc/usr/bernd/downloads/e2fsprogs/</a>

If you want to do it automatically, you will need to create an lfsck mdsdb file 

(the hdr file is sufficient, see the lfsck section in the manual) and then you 

will need to run e2fsck for that OST as if you want to create an OSTDB file. 

That will start pass6, and if you then run e2fsck *without* "-n", so in 

correcting mode, it will correct the LAST_ID file to what it finds on disk. 

With "-v" it will also tell you the old and the new value and then you will 

need to put that value properly coded into the MDS lov_objid file.

Be careful and create backups of the lov_objid and LAST_ID files.

Hope it helps,

Bern

  </pre>

</blockquote>
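<br>
For reference, a minimal sketch of what "properly coded" could look like when patching a backup copy of lov_objid.  It assumes the file is a flat array of little-endian 64-bit object IDs, one per OST index, and that the index is the hex suffix of the OST name (001d = 29).  The function names are hypothetical, not part of any Lustre tool; check the layout against the manual before trusting it.<br>
<pre>
```python
import os

ENTRY_SIZE = 8  # assumption: one 64-bit object ID per OST index


def ost_index(ost_name):
    """Parse the hex index from a name like 'umt3-OST001d' or 'umt3-OST001d-osc'."""
    return int(ost_name.rsplit("OST", 1)[1][:4], 16)


def _check_range(path, index):
    # Refuse to touch an index the file does not already contain.
    if (index + 1) * ENTRY_SIZE > os.path.getsize(path):
        raise ValueError("no entry for OST index %d in %s" % (index, path))


def read_objid(path, index):
    """Return the 64-bit value stored for the given OST index."""
    _check_range(path, index)
    with open(path, "rb") as f:
        f.seek(index * ENTRY_SIZE)
        return int.from_bytes(f.read(ENTRY_SIZE), "little")


def write_objid(path, index, value):
    """Overwrite the entry for the given OST index in place (use on a copy)."""
    _check_range(path, index)
    with open(path, "r+b") as f:
        f.seek(index * ENTRY_SIZE)
        f.write(value.to_bytes(ENTRY_SIZE, "little"))
```
</pre>
Usage would be along the lines of read_objid("lov_objid.bak", ost_index("umt3-OST001d")) to see the current entry, then write_objid() on the copy with the value e2fsck reports, keeping the untouched original as a backup.<br>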

</body>

</html>