[Lustre-discuss] Data corruption?
    Nirmal Seenu 
    nirmal at fnal.gov
       
    Fri Mar 27 08:21:09 PDT 2009
    
    
  
 >> I recently moved the MDT to a different partition.
 >
 > Hrm.  How did you do that?  Did you lose some information in the move
 > perhaps?
Both partitions are on the same machine. Working from an LVM snapshot 
of the old MDT, mounted as ldiskfs, I first ran:
(tar --sparse -cf - . | ( cd /mnt/new-mdt; tar -xf -)) &
Then I brought the entire Lustre filesystem down, mounted both the old 
and the new MDT as ldiskfs, and did a final rsync:
rsync -aSv /mnt/mdt/ /mnt/new-mdt
Finally, I did a getfattr and setfattr (to copy the extended 
attributes) and "rm OBJECTS/* CATALOGS".
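A minimal sketch of that attribute-copy step, assuming it follows the 
file-level MDT backup procedure from the Lustre manual. The exact flags 
used in this case are not stated, so the getfattr/setfattr options 
below are an assumption, as are the function name copy_mdt_eas and the 
dump file /tmp/ea.bak. The mount points default to the ones named 
above.

```shell
#!/bin/sh
# Sketch only: copy extended attributes between two ldiskfs-mounted MDTs.
# Run as root with the Lustre filesystem stopped.
# Arguments (all optional): old mount, new mount, absolute path of EA dump.
copy_mdt_eas() {
    old=${1:-/mnt/mdt}
    new=${2:-/mnt/new-mdt}
    ea_dump=${3:-/tmp/ea.bak}   # hypothetical file name; must be absolute

    # Dump every EA (all namespaces, hex-encoded) with paths relative
    # to the root of the old MDT. -P avoids following symlinks.
    ( cd "$old" && getfattr -R -d -m '.*' -e hex -P . > "$ea_dump" ) || return 1

    # Replay the dump against the same relative paths on the new MDT.
    ( cd "$new" && setfattr --restore="$ea_dump" ) || return 1

    # Remove the old recovery logs, as in the message above.
    ( cd "$new" && rm -f OBJECTS/* CATALOGS )
}
```

The function does nothing until it is called, e.g. 
"copy_mdt_eas /mnt/mdt /mnt/new-mdt". Both subshells change directory 
first so that the relative paths in the dump resolve against the right 
MDT.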
 >> I performed a lfsck
 >> on the file system and got these errors from lfsck:
 >>
 >> /aa/bb/cc/dd/xx/cosmology.c object 8024 not created
 > Like just that one or lots of them?  I'd think you should see one for
 > that file you reference above: cart_wengen.odd_even.txt
cart_wengen.odd_even.txt is a new file that a user created after the 
Lustre filesystem was brought back online following the lfsck run, 
i.e. there is no entry for cart_wengen.odd_even.txt in the lfsck 
output, but "ls -l" lists that file with ?.
During lfsck, 6612 files had the error "object not created".
In the "ls -lR" output on a Lustre client, 6774 files show ? in their 
listing.
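Both counts can be reproduced from saved output. A minimal sketch, 
assuming lfsck reports each affected file on a line of the form quoted 
above ("... object N not created") and that ls prints entries it cannot 
stat with a leading "?" in the mode column. The function names, and the 
names lfsck.log and /mnt/lustre in the usage note, are hypothetical.

```shell
#!/bin/sh
# Count lfsck "object N not created" reports; reads lfsck output on stdin.
count_lfsck_missing() {
    grep -c 'object [0-9][0-9]* not created'
}

# Count entries ls could not stat; reads "ls -lR" output on stdin.
# Assumes such entries start with "?" where the file mode normally is.
count_ls_unknown() {
    grep -c '^?'
}
```

For example: "count_lfsck_missing < lfsck.log" and 
"ls -lR /mnt/lustre 2>/dev/null | count_ls_unknown". The counts 
reported here differ by 6774 - 6612 = 162, some of which may be files 
created after the lfsck run, like cart_wengen.odd_even.txt.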
All our OSTs are healthy RAID6 volumes on SATABeasts and there is no 
known hardware problem. e2fsck reports no errors when I run it against 
each individual filesystem.
Not much I/O has happened since the last time I ran lfsck (last night). 
Would another lfsck actually help in fixing this problem?
Thanks
Nirmal