[Lustre-discuss] recovering formatted OST

Bernd Schubert bs_lists at aakef.fastmail.fm
Tue Oct 26 08:49:53 PDT 2010


On Tuesday, October 26, 2010, Wojciech Turek wrote:
> Hi,
> 
> There is a LAST_ID file on the OST and indeed it equals the highest object
> number
> 
> [root at oss09 ~]# od -Ax -td8 /tmp/LAST_ID
> 000000              2490599
> 000008
> 
> [root at oss09 ~]# ls -1s /mnt/ost/O/0/d* | grep -v [a-z] | sort -k2 -n | tail -1
>       8 2490599
> 
> However, the MDS seems to think differently.
> 
> [root at mds03 ~]# lctl get_param osc.*.prealloc_last_id | grep OST0010
> osc.scratch2-OST0010-osc.prealloc_last_id=1

Yeah.

> 
> Is this caused by deactivating the OST on the MDS? I deactivated the OST
> on the MDS using this command:
> 
> lctl --device 19 conf_param scratch2-OST0010.osc.active=0
> 
> I looked into lov_objid reported by the MDS but I am not sure how to
> interpret the output correctly
> [root at mds03 ~]# od -Ax -td8 /tmp/lov_objid
> 000000              2073842              2100049
> 000010              2115247              2038471
> 000020              2119821              2190996
> 000030              2029234              2354424
> 000040              2160856              2167105
> 000050              1970351              2059045
> 000060              2706486              2571655
> 000070              2662262              2628346
> 000080              2490688              2668926
> 000090              2631587              2643791
> 0000a0
> 
> So my question is how I can find out if my LAST_ID is fine?

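Above you deactivated OST0010. The index is hexadecimal, so that is OST 16 in
decimal, and since counting starts at zero it is the 17th entry in lov_objid
(byte offset 16 * 8 = 0x80 in your od output). That should be 2490688 then.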
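
For illustration, here is a minimal Python sketch of that lookup. It assumes
lov_objid is a flat array of 8-byte integers in little-endian byte order
(which matches the od -td8 output on x86). The helper names read_objids and
ost_index are made up for this example, and the sample buffer is rebuilt from
the od output quoted above rather than read from a real file.

```python
import re
import struct

def read_objids(raw: bytes) -> list[int]:
    """Decode consecutive 8-byte signed integers (assumed little-endian)."""
    count = len(raw) // 8
    return list(struct.unpack(f"<{count}q", raw[:count * 8]))

def ost_index(name: str) -> int:
    """Return the OST index from a name like 'scratch2-OST0010' (hex digits)."""
    match = re.search(r"OST([0-9a-fA-F]{4})", name)
    if match is None:
        raise ValueError(f"no OST index found in {name!r}")
    return int(match.group(1), 16)

# Values copied from the od output of /tmp/lov_objid quoted above.
SAMPLE_VALUES = [
    2073842, 2100049, 2115247, 2038471, 2119821, 2190996, 2029234, 2354424,
    2160856, 2167105, 1970351, 2059045, 2706486, 2571655, 2662262, 2628346,
    2490688, 2668926, 2631587, 2643791,
]
# Rebuild the raw bytes so the decoding step can be shown end to end.
SAMPLE_RAW = struct.pack(f"<{len(SAMPLE_VALUES)}q", *SAMPLE_VALUES)

index = ost_index("scratch2-OST0010")      # 0x0010 -> 16
mds_value = read_objids(SAMPLE_RAW)[index]
disk_last_id = 2490599                     # from od -Ax -td8 /tmp/LAST_ID above

print(index, hex(index * 8))               # prints: 16 0x80
print(mds_value)                           # prints: 2490688
print(mds_value - disk_last_id)            # prints: 89
```

With a real file you would pass the bytes of a copy of lov_objid to
read_objids instead of SAMPLE_RAW.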

I still wonder if we could convince e2fsck to set that last_id value on the
OST itself. It can already correct a wrong last_id value, but it sets it to
the last_id it finds on disk
(https://bugzilla.lustre.org/show_bug.cgi?id=22734). Setting it to the MDS
value should also work, but there are two problems. Firstly, for sanity
reasons it falls back to the on-disk value if the two values differ too much
(10000). Secondly, while working on those patches I figured out that using
the MDS value is broken (the patches did not break it, they only revealed
it).
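
To make the intended sanity check concrete, here is a rough Python sketch of
the selection rule as described above. This is not the e2fsck code (which is
C); the function name choose_last_id is hypothetical, and treating "differ
too much" as "more than 10000 apart" is an assumption.

```python
def choose_last_id(mds_last_id: int, disk_last_id: int,
                   max_diff: int = 10000) -> int:
    """Pick the last_id to write, preferring the MDS value when it is sane.

    Assumption: the values "differ too much" when they are more than
    max_diff apart; in that case fall back to the value found on disk.
    """
    if abs(mds_last_id - disk_last_id) > max_diff:
        return disk_last_id
    return mds_last_id

# lov_objid entry vs. on-disk LAST_ID from this thread: only 89 apart.
print(choose_last_id(2490688, 2490599))   # prints: 2490688
# prealloc_last_id=1 vs. on-disk LAST_ID: far apart, so use the disk value.
print(choose_last_id(1, 2490599))         # prints: 2490599
```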

Cheers,
Bernd


-- 
Bernd Schubert
DataDirect Networks


