[Lustre-discuss] [HPDD-discuss] Unable to move MDS using procedure in the manual

Dilger, Andreas andreas.dilger at intel.com
Tue Jun 4 11:02:56 PDT 2013


On 2013/04/06 11:44 AM, "Ken Hornstein" <kenh at cmf.nrl.navy.mil> wrote:

>I tried to move my MDS from one filesystem on the same machine to another,
>using the procedure outlined in the Lustre manuals (I didn't use dd, since
>the underlying disks weren't the same size and also I did not think
>it was required).

What version of Lustre is this?

>Specifically, I used rsync to copy the files, and also used getfattr/setfattr
>to copy over the extended attributes.  Some brief poking around seemed to
>show that the EA information made it into the new filesystem.

File-level backups are not supported with Lustre 2.1 and 2.2.  You either
need to use "dd" for backup/restore, or use Lustre 2.3.0 or later with the
"LFSCK OI Scrub" functionality.  Lustre 2.4.0 is preferred (especially if
this is for testing purposes) since it can do proper rebuilding of the
FID-in-dirent and LinkEA attributes on the MDT.
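
As a rough illustration of the version rule above, here is a minimal sketch
in Python.  The helper names (parse_version, file_level_backup_ok) are made
up for this example and are not part of any Lustre tool; releases older than
2.1 are not covered by this message, so the sketch refuses to answer for them.

```python
def parse_version(text):
    """Turn a version string such as "2.3.0" or "2.2" into a 3-tuple of ints."""
    parts = [int(p) for p in text.split(".")]
    parts += [0] * (3 - len(parts))  # pad "2.2" to (2, 2, 0)
    return tuple(parts[:3])


def file_level_backup_ok(version):
    """Return True if a file-level MDT backup/restore is an option.

    Encodes only what is stated above: 2.1 and 2.2 need "dd", while
    2.3.0 and later can rely on LFSCK OI Scrub after a file-level restore.
    """
    v = parse_version(version)
    if v < (2, 1, 0):
        # Hypothetical guard: earlier releases are outside the scope of this rule.
        raise ValueError("versions before 2.1 are not covered here")
    return v >= (2, 3, 0)


if __name__ == "__main__":
    for release in ("2.1.5", "2.2", "2.3.0", "2.4.0"):
        method = "file-level or dd" if file_level_backup_ok(release) else "dd only"
        print(release, "->", method)
```

This only answers whether file-level backup is possible at all; 2.4.0 is
still the better choice because of the FID-in-dirent and LinkEA rebuilding.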

>However, when I went to mount the "new" MDS partition, it failed with the
>following error:
>
>May 30 23:36:50 mds-foo kernel: [  186.604083] LustreError:
>3082:0:(md_local_object.c:433:llo_local_objects_setup()) creating obj
>[fld] fid = [0x200000001:0x3:0x0] rc = -116
>May 30 23:36:50 mds-foo kernel: [  186.698205] LustreError:
>3082:0:(mdt_handler.c:4576:mdt_init0()) Can't init device stack, rc -116
>May 30 23:36:50 mds-foo kernel: [  186.797206] LustreError:
>3082:0:(obd_config.c:522:class_setup()) setup foo-MDT0000 failed (-116)
>May 30 23:36:50 mds-foo kernel: [  186.806140] LustreError:
>3082:0:(obd_config.c:1363:class_config_llog_handler()) Err -116 on cfg
>command:
>May 30 23:36:50 mds-foo kernel: [  186.815615] Lustre:    cmd=cf003
>0:foo-MDT0000  1:foo-MDT0000_UUID  2:0  3:foo-MDT0000-mdtlov  4:f
>
>There were more errors, but they all pretty much were cascading from these
>errors.  I switched back to the original filesystem and everything worked.
>
>I am willing to believe I did something wrong, but I'm not sure what; I
>did everything the directions said to do.  -116 is ESTALE, and I found
>in the code where I believe that error was returned, but it was a little
>unclear to me what the root cause was.  Can anyone offer any advice?
>
>--Ken
>
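
On the -116 in the quoted log: kernel code returns negated errno values, so
the number can be looked up directly.  A small sketch in Python, assuming it
runs on Linux (where ESTALE is 116; other platforms number errno differently).
The function names and the regular expression are illustrative only and are
fitted to the four message shapes quoted above, not to Lustre logs in general.

```python
import errno
import re

# Matches the negative return codes as they appear in the quoted messages:
# "rc = -116", "rc -116", "failed (-116)" and "Err -116".
RC_PATTERN = re.compile(r"(?:rc\s*=?\s*|failed\s*\(|Err\s+)(-\d+)")


def return_codes(log_text):
    """Extract every negative return code found in the log text."""
    return [int(code) for code in RC_PATTERN.findall(log_text)]


def decode_rc(rc):
    """Map a negative kernel return code to its errno name on this platform."""
    return errno.errorcode.get(-rc, "UNKNOWN")


if __name__ == "__main__":
    sample = (
        "(mdt_handler.c:4576:mdt_init0()) Can't init device stack, rc -116\n"
        "(obd_config.c:522:class_setup()) setup foo-MDT0000 failed (-116)\n"
    )
    for rc in return_codes(sample):
        print(rc, decode_rc(rc))
```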


Cheers, Andreas
-- 
Andreas Dilger

Lustre Software Architect
Intel High Performance Data Division




