[Lustre-discuss] files missing after writeconf

David Gucker dgucker at choopa.com
Thu Jul 8 18:34:36 PDT 2010


When bringing up the cluster after a full power-down, the MDS/MGS node 
reported the following for each of the OSTs:

Jul  8 17:16:18 ID6317 kernel: LustreError: 13b-9: Test01-OST0000 claims 
to have registered, but this MGS does not know about it, preventing 
registration.
Jul  8 17:16:18 ID6317 kernel: LustreError: 
26184:0:(mgs_handler.c:660:mgs_handle()) MGS handle cmd=253 rc=-2

I have two OSSs. Checking back through my mkfs commands, it looks like 
I forgot to enable failover in the options.  I found that I could 
update that flag using tunefs.lustre, and on reading further I found 
that I should run it with the --writeconf flag as well.

So, I unmounted the OSTs and ran:
tunefs.lustre --param failover.mode=failout /dev/iscsi/ost-1.target0

on each of them.  After doing this (and possibly remounting the MDS/MGS), 
I was able to mount the OSTs and then the client, but all data 
was missing.  The filesystem reports 11% full, which is about right for 
the data that was on there, but it shows no files.

After reading the docs more carefully, I found that I should have done 
things properly (fully shut down and unloaded the filesystem, then 
done the writeconf beginning with the MGS).  So I ran through 
the procedure again more carefully, and the filesystem is in the same 
state (appears to be fine, but shows only used space and no files).
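
For reference, the ordering described above could be sketched roughly as 
follows.  This is only an illustration of the sequence (MGS/MDT first, 
then each OST), not a tested recovery procedure.  The MDT device path is 
a hypothetical placeholder, and the script only prints the commands 
unless DRY_RUN=0 is set explicitly.  It assumes all clients, OSTs and 
the MDT have already been unmounted.

```shell
#!/bin/sh
# Sketch of the writeconf ordering. Assumes every client, OST and the MDT
# is already unmounted. Device paths are placeholders, not real values.
MDT_DEV="${MDT_DEV:-/dev/mdt_device}"              # hypothetical MGS/MDT device
OST_DEVS="${OST_DEVS:-/dev/iscsi/ost-1.target0}"   # space-separated OST devices

run() {
    # Print each command; execute it only when DRY_RUN=0 is set explicitly.
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

writeconf_all() {
    # 1. Regenerate the configuration logs on the combined MGS/MDT first.
    run tunefs.lustre --writeconf "$MDT_DEV"
    # 2. Then do the same on every OST.
    for dev in $OST_DEVS; do
        run tunefs.lustre --writeconf "$dev"
    done
}

writeconf_all
```

Afterwards the targets would be remounted in the same order: the MGS/MDT 
first, then the OSTs, and the clients last.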

I was unable to reproduce this in another test cluster (no data loss 
there).  So I'm wondering whether these files are recoverable at all. 
Can anyone point me in the right direction, if there is one?

Dave


