[lustre-discuss] File migration from one OST to another for Lustre 2.5

Bob Ball ball at umich.edu
Wed Jun 24 08:56:56 PDT 2015


So, let's say I just want to empty the OST completely, reformat or
remake it or whatever (call that re-X), then add the OST back in.  If I
never let the emptied OST reconnect before the re-X, then the MDS
cannot destroy the objects on the OST.

Is this going to be an issue once the re-X OST is ready and is brought 
back online, and re-enabled for write on the MDS?  Of course, the re-X 
will have destroyed those objects already.  But I have gotten paranoid 
about items such as this in my old age.

Thanks,
bob

On 6/24/2015 3:52 AM, Dilger, Andreas wrote:
> On 2015/06/24, 1:39 PM, "Tung-Han Hsieh" <thhsieh at twcp1.phys.ntu.edu.tw>
> wrote:
>
>> Dear All,
>>
>> I have a question about data file migration from one OST to another
>> for Lustre 2.5.
>>
>> Suppose that OST0000 is going to be removed. In the past (I mean,
>> Lustre-1.8.7), I could do the migration smoothly via the following
>> procedure:
>>
>> 1. On the MDT server, stop writing new files to OST0000:
>>
>> 	lctl set_param osc.foo-OST0000-osc.active=0
>>
>> 2. Find all files located on OST0000 (suppose that the Lustre
>>    filesystem is mounted at /home):
>>
>> 	lfs find --obd foo-OST0000_UUID /home > OST0000.txt
>>
>> 3. For each file listed in "OST0000.txt", I did:
>>
>> 	cp -a file file.tmp
>> 	mv file.tmp file
>>
>>    So each file on OST0000 is replaced by a new copy created on
>>    the other OSTs. If I log in to the OST0000 server and check
>>    with df, I can see the disk usage of the OST0000 partition
>>    getting lower and lower. Eventually it becomes nearly empty.
>>
>> 4. Finally, remove OST0000 permanently:
>>
>> 	lctl conf_param osc.foo-OST0000-osc.active=0
>>
>>
>> But for Lustre 2.5, when I repeated steps 1 to 3 above (I haven't
>> done step 4), the disk usage of OST0000 stayed the same, while
>> the disk usage of the other OSTs kept increasing.
> Correct. This is a change that has been introduced since Lustre 2.4 in the
> way that OST objects are destroyed.  The MDS is now in charge of OST
> object destroys, so if its OST connection is disabled then the OST objects
> will not be destroyed until the MDS connects to the OST again.  Please see
> LU-5931 for more details.
>
>
>> Then I interrupted step 3 and ran step 2 again, and I did see that
>> many files had disappeared from OST0000. So the migration works.
>> But since the disk usage of OST0000 did not change at all, it means
>> that the migrated files still left their "bodies" on OST0000. They
>> have become junk on OST0000.
> Which is fine if you are removing OST0000 from the filesystem permanently.
> If you aren't removing it permanently, it will be fixed up when the MDS
> reconnects to it again.
>
>> Is it the normal behavior of Lustre 2.5?
> Yes, though it could be improved.
>
>> If I want it to behave exactly the same as Lustre 1.8.7, what should I do?
> Sorry, that isn't possible today.
>
> Cheers, Andreas
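
For reference, step 3 of the quoted procedure (copy each listed file, then
rename the copy over the original) could be scripted roughly as below. This
is only a sketch under assumptions: the function name migrate_list is made
up for illustration, and the list file is assumed to hold one path per line,
as produced by the "lfs find --obd" command in step 2. It uses nothing but
cp, mv and rm, so it does not change when the old objects get destroyed on
the OST; that still depends on the MDS connection, as explained above.

```shell
# Sketch only: rewrite every file named in a list so that its data is
# written into newly allocated objects. "migrate_list" is a hypothetical
# helper name; the list file holds one path per line.
migrate_list() {
    list=$1
    failed=0
    while IFS= read -r f; do
        # Skip blank lines, symlinks, and anything that is not a
        # regular file.
        [ -f "$f" ] && [ ! -L "$f" ] || continue
        # Copy first and replace the original only if the copy succeeded,
        # so a failed copy never clobbers the only good version.
        if cp -a -- "$f" "$f.tmp" && mv -f -- "$f.tmp" "$f"; then
            :
        else
            rm -f -- "$f.tmp"
            echo "failed: $f" >&2
            failed=$((failed + 1))
        fi
    done < "$list"
    # Return success only if every listed file was rewritten.
    [ "$failed" -eq 0 ]
}
```

It would be called as "migrate_list OST0000.txt". Note that copy-and-rename
breaks hard links, and a file that is being written to while it is copied
can lose data, so this is best run while the files are idle.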


