[Lustre-discuss] File Content change without Error log

Lu Wang wanglu at ihep.ac.cn
Wed Apr 1 02:56:53 PDT 2009


Dear list, 
 	When I tried to remount the OST (after mkfs.lustre --reformat), I got these errors:

At the command prompt:
 mount -t lustre /dev/sda /lustre/ost1
mount.lustre: mount /dev/sda at /lustre/ost1 failed: Address already in use

In the log:
Apr  1 17:46:40 boss10 kernel: LustreError: 11-0: an error occurred while communicating with 192.168.50.32 at tcp. The mgs_target_reg operation failed with -98
Apr  1 17:46:40 boss10 kernel: LustreError: 2204:0:(obd_mount.c:1084:server_start_targets()) Required registration failed for besfs-OST0014: -98
Apr  1 17:46:40 boss10 kernel: LustreError: 2204:0:(obd_mount.c:1623:server_fill_super()) Unable to start targets: -98
Apr  1 17:46:40 boss10 kernel: LustreError: 2204:0:(obd_mount.c:1406:server_put_super()) no obd besfs-OST0014
Apr  1 17:46:40 boss10 kernel: LustreError: 2204:0:(obd_mount.c:136:server_deregister_mount()) besfs-OST0014 not registered
Apr  1 17:46:40 boss10 kernel: LDISKFS-fs: mballoc: 0 blocks 0 reqs (0 success)
Apr  1 17:46:40 boss10 kernel: LDISKFS-fs: mballoc: 0 extents scanned, 0 goal hits, 0 2^N hits, 0 breaks, 0 lost
Apr  1 17:46:40 boss10 kernel: LDISKFS-fs: mballoc: 0 generated and it took 0
Apr  1 17:46:40 boss10 kernel: LDISKFS-fs: mballoc: 0 preallocated, 0 discarded
Apr  1 17:46:40 boss10 kernel: Lustre: server umount besfs-OST0014 complete
Apr  1 17:46:40 boss10 kernel: LustreError: 2204:0:(obd_mount.c:1980:lustre_fill_super()) Unable to mount  (-98)
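
The -98 in these log lines is a negated Linux errno value, and it corresponds to the "Address already in use" message that mount.lustre prints. A minimal Python sketch for decoding such return codes, assuming it runs on a Linux host (errno numbering differs on other platforms); the helper name decode_kernel_rc is made up for illustration:

```python
import errno
import os


def decode_kernel_rc(rc):
    """Translate a negative kernel return code (e.g. -98) into (name, message)."""
    code = abs(rc)
    # errno.errorcode maps numeric codes to symbolic names for the
    # platform this script runs on, so run it on Linux for Lustre logs.
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


if __name__ == "__main__":
    # -98 is the code reported by mgs_target_reg in the log above.
    print(decode_kernel_rc(-98))
```

On Linux, code 98 is EADDRINUSE, which matches the mount.lustre message.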

I reused the old index of the OST. The OSTs are still inactive in the system. Do I need to reactivate them first?
------------------				 
Lu Wang
2009-04-01

-------------------------------------------------------------
From: Brian J. Murrell
Sent: 2009-04-01 01:25:46
To: lustre-discuss
Cc:
Subject: Re: [Lustre-discuss] File Content change without Error log

On Wed, 2009-04-01 at 01:24 +0800, Lu Wang wrote:
> I think data on the "good" OST may also be damaged, so I decided to delete all files on these two OSTs.

Probably the safest thing to do.

> By the way, when I unlink a file, I get an "Input/output error", but the file disappears anyway.
>  #unlink run_0005818_Any_file007_SFO-1.rec
> unlink: cannot unlink `run_0005818_Any_file007_SFO-1.rec': Input/output error
> # ll run_0005818_Any_file007_SFO-1.rec
> ls: run_0005818_Any_file007_SFO-1.rec: No such file or directory
> 
> I am not sure whether the file was safely deleted or not. Any suggestions?

This is just speculation, without any evidence one way or the other,
but it is most likely due to the damaged OSTs.

You can use lfs find to find all files that had objects on those two
OSTs and try to clean them up, or you can simply replace the OSTs with
fresh ones and let lfsck sort it all out.  It will correct the
situations where files exist on the MDT but the objects on the OSTs are
missing.
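
As a rough sketch of the first option, the per-OST search could be scripted as below. It assumes that lfs find accepts --obd followed by an OST UUID and a directory, and that the lfs binary is on the PATH of a Lustre client. The function names, the /besfs mount point and the _UUID suffix appended to the OST name from the log are illustrative placeholders, not values confirmed in this thread.

```python
import subprocess


def lfs_find_cmd(ost_uuid, mount_point):
    """Build the lfs find command listing files that have objects on one OST."""
    return ["lfs", "find", "--obd", ost_uuid, mount_point]


def files_on_osts(ost_uuids, mount_point):
    """Run lfs find once per OST and return the sorted union of reported paths."""
    paths = set()
    for uuid in ost_uuids:
        result = subprocess.run(
            lfs_find_cmd(uuid, mount_point),
            capture_output=True, text=True, check=True,
        )
        # One path per output line; skip empty lines.
        paths.update(line for line in result.stdout.splitlines() if line)
    return sorted(paths)


# Example call on a Lustre client (placeholder values):
#   files_on_osts(["besfs-OST0014_UUID"], "/besfs")
```

The resulting list can be reviewed before any files are removed, which is safer than deleting directly from the search output.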

b.


_______________________________________________
Lustre-discuss mailing list
Lustre-discuss at lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


