[lustre-discuss] Data migration from one OST to another
Patrick Farrell
pfarrell at whamcloud.com
Sun Mar 3 06:09:32 PST 2019
Hsieh,
This sounds similar to a bug with pre-2.5 servers and 2.7 (or newer) clients. The client and server have a disagreement about which does the delete, and the delete doesn’t happen. Since you’re running 2.5, I don’t think you should see this, but the symptoms are the same. You can temporarily fix things by restarting/remounting your OST(s), which will trigger orphan cleanup. But if that works, the only long term fix is to upgrade your servers to a version that is expected to work with your clients. (The 2.10 maintenance release is nice if you are not interested in the newest features, otherwise, 2.12 is also an option.)
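For reference, the remount that triggers orphan cleanup is just an ordinary unmount/mount cycle on the OSS hosting the affected OST. A minimal sketch, assuming the OST backing device is /dev/sdb and its mount point is /mnt/ost0028 (both names are illustrative stand-ins, not from the original message):

```shell
# On the OSS hosting the affected OST. Device and mount point are
# hypothetical -- substitute your own. Remounting re-establishes the
# MDT<->OST connection, and orphan OST objects are destroyed as part
# of the recovery/reconnect that follows.
umount /mnt/ost0028
mount -t lustre /dev/sdb /mnt/ost0028
```

Best done during a quiet period, since clients with open files on that OST will see I/O pause until recovery completes.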
I would also recommend where possible that you keep clients and servers in sync - we do interop testing, but same version on both is much more widely used.
- Patrick
________________________________
From: lustre-discuss <lustre-discuss-bounces at lists.lustre.org> on behalf of Tung-Han Hsieh <thhsieh at twcp1.phys.ntu.edu.tw>
Sent: Sunday, March 3, 2019 4:00:17 AM
To: lustre-discuss at lists.lustre.org
Subject: [lustre-discuss] Data migration from one OST to another
Dear All,
We have a problem with data migration from one OST to another.
We have installed Lustre-2.5.3 on the MDS and OSS servers, and Lustre-2.8
on the clients. We want to migrate some data from one OST to another in
order to re-balance usage among the OSTs. Initially, we followed the old
method (i.e., the one found in the Lustre-1.8.x manuals) for the data
migration. Suppose we have two OSTs:
root at client# /opt/lustre/bin/lfs df
UUID 1K-blocks Used Available Use% Mounted on
chome-OST0028_UUID 7692938224 7246709148 55450156 99% /work[OST:40]
chome-OST002a_UUID 14640306852 7094037956 6813847024 51% /work[OST:42]
and we want to migrate data from chome-OST0028_UUID to chome-OST002a_UUID.
Our procedures are:
1. We deactivate chome-OST0028_UUID:
root at mds# echo 0 > /opt/lustre/fs/osc/chome-OST0028-osc-MDT0000/active
2. We find all files located in chome-OST0028_UUID:
root at client# /opt/lustre/bin/lfs find --obd chome-OST0028_UUID /work > list
3. In each file listed in the file "list", we did:
cp -a <file> <file>.tmp
mv <file>.tmp <file>
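The copy-and-rename loop in step 3 can be sketched as a small script. The scratch directory and file below are illustrative stand-ins so the sketch is self-contained; on the real system the list would come from the `lfs find` in step 2:

```shell
#!/bin/sh
# Demonstration of the copy-and-rename rewrite from step 3, using ordinary
# files in a scratch directory. On Lustre, the cp is what allocates new
# objects on the currently active OSTs; the mv puts the copy back under
# the original path.
set -e
demo=$(mktemp -d)
echo "payload" > "$demo/file1"
printf '%s\n' "$demo/file1" > "$demo/list"   # stand-in for the lfs find output

while IFS= read -r f; do
    cp -a "$f" "$f.tmp"   # copy: new OST objects are allocated here
    mv "$f.tmp" "$f"      # rename back over the original path
done < "$demo/list"

cat "$demo/file1"         # prints "payload"
```

Note that this cp/mv sequence is not atomic and changes the file's inode, so it is unsafe for files that are open or being written. If your client supports it, `lfs migrate` (or the `lfs_migrate` wrapper script) performs the same rewrite without that window.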
During the migration, we saw more and more data being written to
chome-OST002a_UUID, but no disk space was released on chome-OST0028_UUID.
In Lustre-1.8.x, this procedure did work: chome-OST002a_UUID received
more data while chome-OST0028_UUID gained more and more free space.
It looks as though the file data referenced by the MDT has been copied to
chome-OST002a_UUID, but the stale objects remain on chome-OST0028_UUID.
Even after reactivating chome-OST0028_UUID once the migration finished,
the situation stays the same:
root at mds# echo 1 > /opt/lustre/fs/osc/chome-OST0028-osc-MDT0000/active
Is there any way to fix this problem?
Thanks very much.
T.H.Hsieh
_______________________________________________
lustre-discuss mailing list
lustre-discuss at lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org