[Lustre-discuss] Cannot get an OST to activate

Bernd Schubert bs_lists at aakef.fastmail.fm
Fri Sep 3 14:22:24 PDT 2010


On Friday, September 03, 2010, Bob Ball wrote:
> We added a new OSS to our 1.8.4 Lustre installation.  It has 6 OST of
> 8.9TB each.  Within a day of having these on-line, one OST stopped
> accepting new files.  I cannot get it to activate.  The other 5 seem fine.
> 
> On the MDS "lctl dl" shows it IN, but not UP, and files can be read from
> it: 33 IN osc umt3-OST001d-osc umt3-mdtlov_UUID 5
> 
> However, I cannot get it to re-activate:
> lctl --device umt3-OST001d-osc activate
> 

[...]


> LustreError: 4697:0:(filter.c:3172:filter_handle_precreate())
> umt3-OST001d: ignoring bogus orphan destroy request: obdid
> 11309489156331498430 last_id 0
> 
> Can anyone tell me what must be done to recover this disk volume?

Check out section 23.3.9 in the Lustre manual ("How to Fix a Bad LAST_ID on an 
OST").

It is on my TODO list to write a tool that automatically corrects the 
"lov_objid" file, but as of now I don't have it yet. Somehow your lov_objid 
file has a completely wrong value for this OST.
Now, when you say "files can be read from it", are you sure there are already 
files on that OST? The error message says that the last_id is zero, so there 
should not be a single file on it. If that is also wrong, you will need to 
correct it as well. You can do that manually, or you can use a patched 
e2fsprogs version that will do it for you.

Patches are here:
https://bugzilla.lustre.org/show_bug.cgi?id=22734

Packages can be found on my home page:
http://www.pci.uni-heidelberg.de/tc/usr/bernd/downloads/e2fsprogs/


If you want to do it automatically, you will need to create an lfsck mdsdb file 
(the hdr file is sufficient, see the lfsck section in the manual) and then run 
e2fsck on that OST as if you wanted to create an OSTDB file. That will start 
pass6, and if you run e2fsck *without* "-n", i.e. in correcting mode, it will 
correct the LAST_ID file to match what it finds on disk. With "-v" it will also 
print the old and the new value, and you will then need to put the new value, 
properly encoded, into the MDS lov_objid file.
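
As a rough illustration of that last step, here is a minimal Python sketch for 
inspecting and patching a *copy* of lov_objid. It assumes the file is a flat 
array of little-endian unsigned 64-bit object IDs, one slot per OST index (so 
umt3-OST001d would be slot 0x1d = 29). That layout is an assumption to verify 
against the manual section above, and the helper names (ost_index, read_objid, 
write_objid) are made up for this example. Only run it on a backup copy, never 
on the live file.

```python
import struct

# Assumed layout: one little-endian unsigned 64-bit object ID per OST index.
ENTRY = struct.Struct("<Q")


def ost_index(ost_name):
    """Return the OST index from a name like 'umt3-OST001d' (hex suffix)."""
    return int(ost_name.rsplit("OST", 1)[1], 16)


def read_objid(path, index):
    """Read the object ID stored in slot `index` of a lov_objid copy."""
    with open(path, "rb") as f:
        f.seek(index * ENTRY.size)
        data = f.read(ENTRY.size)
    if len(data) != ENTRY.size:
        raise ValueError("file has no slot for OST index %d" % index)
    return ENTRY.unpack(data)[0]


def write_objid(path, index, value):
    """Overwrite slot `index` of a lov_objid copy with `value`."""
    read_objid(path, index)  # refuse to write past the end of the file
    with open(path, "r+b") as f:
        f.seek(index * ENTRY.size)
        f.write(ENTRY.pack(value))
```

Usage would be along the lines of write_objid("lov_objid.copy", 
ost_index("umt3-OST001d"), new_last_id), where new_last_id is the value that 
e2fsck reported and "lov_objid.copy" is a hypothetical file name.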


Be careful and create backups of the lov_objid and LAST_ID files.


Hope it helps,
Bernd



-- 
Bernd Schubert
DataDirect Networks


