[Lustre-discuss] lfs_migrate question

Dilger, Andreas andreas.dilger at intel.com
Sat Oct 20 12:31:59 PDT 2012


On 2012-10-18, at 16:11, Jason Brooks <brookjas at ohsu.edu> wrote:

I suffered an OSS crash where the OSS server had a CPU fault.  I have it running again, but I am trying to decommission it.  I am migrating the data off of it onto other OSTs using the lfs find command with lfs_migrate.

It's been nearly 36 hours and about 2 terabytes have been moved.  This means I am about halfway.  Is this a decent rate?

Depends on how large your files are, and how fast the network is, but I wouldn't call it outstanding...
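
To put a number on the rate being discussed, here is a rough back-of-the-envelope calculation. It is only a sketch: it assumes decimal units (1 TB = 10^12 bytes) and takes the "about 2 terabytes in nearly 36 hours" figures at face value, so the result is an approximate aggregate across both clients.

```shell
# Figures from the thread: about 2 TB moved in about 36 hours.
# Assumption: decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes).
bytes_moved=$(( 2 * 1000 * 1000 * 1000 * 1000 ))
elapsed_s=$(( 36 * 3600 ))

# Integer division, so the result is rounded down to whole MB/s.
rate_mb_s=$(( bytes_moved / elapsed_s / 1000000 ))

echo "average rate: about ${rate_mb_s} MB/s"
```

That works out to roughly 15 MB/s in total, split between the two clients running lfs_migrate.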

Here are the particulars, which basically are snags.  I know they affect things, I just am not certain to what degree:

  1.  I am running lfs_migrate on two systems, migrating different subdirectories of the same mount point.

This increases contention on the MDS, but two clients shouldn't be overloading the server.  Presumably you are only finding and migrating files which are striped over the affected server?
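
For reference, restricting the migration to files with objects on a particular OST usually looks like the sketch below. The mount point, subdirectory, and OST UUID are hypothetical placeholders, not values from this thread; check the lfs and lfs_migrate usage text on your own version before running anything like it.

```shell
# Hypothetical names: replace with your own mount point and OST UUID
# (OST UUIDs are shown by "lfs df").
MOUNT_DIR=/mnt/lustre/projects
OST_UUID=lustre-OST0004_UUID

# List only files that have objects on the given OST and hand them to
# lfs_migrate. The -y flag skips the interactive confirmation prompt.
migrate_off_ost() {
    lfs find --obd "$2" "$1" | lfs_migrate -y
}

# Example invocation (left commented out so nothing runs by accident):
# migrate_off_ost "$MOUNT_DIR" "$OST_UUID"
```

Running one such pipeline per subdirectory on each client matches the setup described above, and keeps files that do not touch the failing server out of the migration.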

  1.  All systems are running using ip over infiniband.

IPoIB is far slower than native IB, both for data and metadata, but the middle of a migration is probably not the time to be messing with your network configuration.

  1.  None of my client-only systems have lfs or lfs_migrate.  I think this is because they are ubuntu and only the lustre kernel modules are installed.  Thus I can't run it there.

lfs_migrate is just a shell script, so you could have copied it from another node.

  1.   Oh, and that also means that the Lustre filesystem is mounted on the OSSes too.

This is not an ideal situation, since the memory usage on the client is competing with the memory of the OSS.

  1.  lfs_migrate and lfs did not seem to operate correctly on the oss's that are 1.8.6.  Works ok on 1.8.8 though.

Can't really comment based on this limited information.

  1.  AND the two systems I am running lfs_migrate on are probably the very systems with free ost space on them.  In other words, file blocks are being written to the very systems that lfs_migrate is being run on and/or there is a lot of block write traffic between the two.


Lustre versions:
Mds/mgs: 1.8.6
5 of 7 OSS's: 1.8.6
2 of 7 oss's: 1.8.8

Clients: 1.8.6, ubuntu.



