[Lustre-discuss] disappeared data from OST

Herbert Fruchtl herbert.fruchtl at st-andrews.ac.uk
Mon Feb 15 13:08:01 PST 2010


Hi there,

We have 27TB data on 6 OSTs distributed over 3 OSSes. Lustre version 1.6.7.2 on 
CentOS 4.6.

After a power spike this weekend that crashed several machines (not the 
OSSes...) and/or possibly hitting 100% file space usage on one of them (we have 
been dangerously close for a while), the file system hung this morning. After 
restarting, it showed many files as missing. I decided to unmount everything and 
run an fsck.

I unmounted the file system from the MDS, logged in to the OSSes and started 
unmounting the OSTs. This went OK on two of the three OSSes, but on the third 
one the umount command hangs with an error message that contains "_BUG" (I can 
look it up tomorrow, if I still have it on the screen; I'm at home now). 
Worryingly, if I do a "df" on that machine, I get 3% file usage:
[root@oss1 ~]# df
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sda5            236062880   5911252 218160312   3% /
/dev/sda1               101086     10993     84874  12% /boot
none                   1803084         0   1803084   0% /dev/shm
/dev/sdb             236062880   5911252 218160312   3% /mnt/oss1-ost1
/dev/sdc             236062880   5911252 218160312   3% /mnt/oss1-ost2

It should be 98% or thereabouts! Now I am afraid that if I carry on (probably 
just cycling the power, since "reboot" also hangs), it will come back in the 
same state, i.e. 95% of the data gone. Is this already irreparably the case, or 
am I just paranoid?
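
One detail in the listing above may be worth checking before drawing 
conclusions: the /dev/sdb and /dev/sdc rows carry exactly the same block counts 
as the root file system on /dev/sda5, and at roughly 225 GiB each they are far 
smaller than the 4.5 TB per OST that 27TB over 6 OSTs would average. That can 
happen when a mount point is still listed but no longer backed by its own file 
system, so the figures come from the parent file system; whether that is the 
case here is an assumption, not something the output proves. A minimal Python 
sketch of the comparison follows. The function names are hypothetical, and the 
df text is simply pasted in as a string:

```python
def parse_df(text):
    """Parse `df` output (1K-blocks layout) into a list of row dicts."""
    rows = []
    for line in text.strip().splitlines():
        fields = line.split()
        # Skip the header and anything that is not a six-column data row.
        if len(fields) != 6 or not fields[1].isdigit():
            continue
        device, blocks, used, avail, _pct, mount = fields
        rows.append({
            "device": device,
            "mount": mount,
            "stats": (int(blocks), int(used), int(avail)),
        })
    return rows


def mounts_mirroring_root(rows):
    """Return mount points whose (blocks, used, avail) equal those of '/'."""
    root = next((r for r in rows if r["mount"] == "/"), None)
    if root is None:
        return []
    return [r["mount"] for r in rows
            if r["mount"] != "/" and r["stats"] == root["stats"]]


def size_gib(row):
    """Convert a row's 1K-block count to GiB."""
    return row["stats"][0] / 2**20


# The df output from oss1, copied from the message above.
DF_OUTPUT = """\
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sda5            236062880   5911252 218160312   3% /
/dev/sda1               101086     10993     84874  12% /boot
none                   1803084         0   1803084   0% /dev/shm
/dev/sdb             236062880   5911252 218160312   3% /mnt/oss1-ost1
/dev/sdc             236062880   5911252 218160312   3% /mnt/oss1-ost2
"""

rows = parse_df(DF_OUTPUT)
suspect = mounts_mirroring_root(rows)
for row in rows:
    if row["mount"] in suspect:
        print(f"{row['mount']}: same numbers as /, {size_gib(row):.1f} GiB")
```

If both OST mount points are flagged, the 3% figure may describe the root disk 
rather than the OSTs themselves, which would be worth confirming on the machine 
before assuming the data is gone.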

Any suggestions would be appreciated (in other words: HELP!!!!).

Before this, I had tried an "lfsck -c -l -f" on the mounted file system, but the 
sudden drop in disk usage on oss1 definitely only happened after I killed it 
and tried to umount by hand.

Cheers,

   Herbert
-- 
Herbert Fruchtl
Senior Scientific Computing Officer
School of Chemistry, School of Mathematics and Statistics
University of St Andrews
--
The University of St Andrews is a charity registered in Scotland:
No SC013532


