[lustre-discuss] OST CONFIG/*-client damaged

Tung-Han Hsieh thhsieh at twcp1.phys.ntu.edu.tw
Tue Feb 15 02:31:32 PST 2022


Dear All,

We have run into a problem mounting a damaged OST partition, as described
below.

The disk holding the OST partition suffered serious hardware damage and was
sent to a data rescue company to recover as much of the data as possible.
After that, we ran

	tunefs.lustre --writeconf /dev/<device_name>

to clear the configuration logs on the MGT, MDT, and all OSTs, and then tried
to mount the Lustre file system. But the damaged OST partition cannot be
mounted, and fails with the following error message:

-----------------------
mount.lustre: mount /dev/<dev> at /Lustre/ost failed: Invalid argument
This may have multiple causes.
Are the mount options correct?
Check the syslog for more info.
-----------------------


The dmesg output on the OST server shows the following errors:

-----------------------
LustreError: 157-3: Trying to start OBD lfs2-OST0006-UUID using the wrong disk. Were the /dev/ assignments rearranged?
LustreError: 36047:0:(obd_config.c:559:class_setup()) setup lfs2-OST0006 failed (-22)
LustreError: 36047:0:(obd_config.c:1835:class_config_llog_handler()) MGC172.16.31.231@o2ib: cfg command failed: rc = -22
Lustre:     cmd=cf003 0:lfs2-OST0006  1:dev  2:0  3:f
LustreError: 15b-f: MGC172.16.31.231@o2ib: The configuration from log 'lfs2-OST0006' failed from the MGS (-22). Make sure this client and the MGS are running compatible versions of Lustre.
LustreError: 36034:0:(obd_mount_server.c:1386:server_start_targets()) failed to start server lfs2-OST0006: -22
LustreError: 36034:0:(obd_mount_server.c:1939:server_fill_super()) Unable to start targets: -22
LustreError: 36034:0:(obd_config.c:610:class_cleanup()) Device 12 not setup
Lustre: server umount lfs2-OST0006 complete
LustreError: 36034:0:(obd_mount.c:1608:lustre_fill_super()) Unable to mount <dev> (-22)
-----------------------


The dmesg output on the MGS/MDT server shows the following messages:

-----------------------
Lustre: MGS: Regenerating lfs2-OST0006 log by user request: rc = 0
Lustre: Found index 6 for lfs2-OST0006, updating log
Lustre: Client log for lfs2-OST0006 was not updated; writeconf the MDT first to regenerate it.
-----------------------

We mounted a good OST and the bad OST as ldiskfs and compared the files
found in each partition. We found the following discrepancy:

-rw-r--r-- 1 root root 60656 Aug 31 16:10  /mnt/bad_OST/CONFIGS/lfs2-client
-rw-r--r-- 1 root root 75416 Aug 31 16:10  /mnt/good_OST/CONFIGS/lfs2-client
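
A minimal sketch of this kind of comparison, assuming both OSTs are already
mounted as ldiskfs (ideally read-only) at the mount points shown in the
listing. The helper name compare_configs is hypothetical and is not part of
Lustre; it only reports files whose sizes differ or that are absent on the
damaged side, and says nothing about whether their contents are valid.

```shell
#!/bin/sh
# Hypothetical helper (not a Lustre tool): compare the regular files in two
# CONFIGS directories by size. Assumes both OSTs were mounted beforehand,
# e.g. with: mount -t ldiskfs -o ro /dev/<device_name> /mnt/bad_OST
compare_configs() {
    good="$1"   # e.g. /mnt/good_OST/CONFIGS
    bad="$2"    # e.g. /mnt/bad_OST/CONFIGS
    for f in "$good"/*; do
        # Skip directories and the unexpanded glob of an empty directory.
        [ -f "$f" ] || continue
        name=$(basename "$f")
        if [ ! -f "$bad/$name" ]; then
            echo "MISSING $name"
            continue
        fi
        # wc -c gives the size in bytes; tr strips padding some wc variants add.
        gsize=$(wc -c < "$f" | tr -d ' ')
        bsize=$(wc -c < "$bad/$name" | tr -d ' ')
        if [ "$gsize" != "$bsize" ]; then
            echo "SIZE $name good=$gsize bad=$bsize"
        fi
    done
}
```

It would be called as
compare_configs /mnt/good_OST/CONFIGS /mnt/bad_OST/CONFIGS.
Files named after a specific target (for example a per-OST log) would
naturally show up as MISSING, so only the shared files are meaningful here.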

So we suspect that, despite the hard work of the data rescue company,
the /CONFIGS/lfs2-client file of the bad OST was not successfully
recovered, which leads to the problem.

Here is our question: is it possible to regenerate this file? Also,
are there any other steps we have missed for recovering the system?

Any suggestions would be much appreciated.

Thank you very much.


Best Regards,

T.H.Hsieh

