[lustre-discuss] Migrating files doesn't free space on the OST

Alexander I Kulyavtsev aik at fnal.gov
Thu Jan 17 09:56:08 PST 2019

- you can re-run the command that finds files residing on the OST to see whether the files still there are new or old.

- ZFS may hold snapshots if you ever took any; they take up space.

- removing data or snapshots releases the blocks with some lag (tens of minutes), but I guess that has completed by now.

- there can be orphan objects on the OST if you had crashes. On older Lustre versions, if the OST has been emptied out, you can mount the underlying filesystem as ext4 or ZFS read-only and browse the OST objects to see whether any orphan objects are left. On newer Lustre releases you can probably run lfsck (the Lustre filesystem checker).

- to find which hosts / jobs are currently writing to Lustre, you can enable Lustre jobstats, clear the counters, and parse the stats files in /proc. There was an xltop tool on GitHub for older Lustre versions that did not implement jobstats, but it has not been updated for a while.

- the implementation of lfs migrate differs depending on your Lustre version. The older version copied the file under another name to another OST, renamed the copy over the original, and removed the old file. If a file is migrated while an application has it open for write, the space will not be released until the file is closed (and the data in the new file will be wrong). The recent implementation of migrate swaps the file objects with the file layout lock taken. I cannot tell whether it is safe for active writes.

- not releasing space can be a bug - did you check the Whamcloud Jira? What version of Lustre do you have? Is it ldiskfs or ZFS based? Which ZFS version?
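
For the jobstats suggestion above, here is a minimal Python sketch of the parsing step. It assumes the job_stats file on the OSS has the YAML-like layout described in the Lustre manual (a "- job_id:" line followed by per-operation lines such as "write_bytes: { samples: ..., sum: ... }"). The exact fields and the file location vary by Lustre version, so compare against your own output first. The function name and the sample text are made up for illustration and are not taken from a real system.

```python
import re

# Hypothetical sample in the layout assumed above; real files carry more fields.
SAMPLE = """\
job_stats:
- job_id:          1234
  snapshot_time:   1547747000
  read_bytes:      { samples: 0, unit: bytes, min: 0, max: 0, sum: 0 }
  write_bytes:     { samples: 10, unit: bytes, min: 4096, max: 1048576, sum: 8388608 }
- job_id:          cp.500
  snapshot_time:   1547747001
  read_bytes:      { samples: 5, unit: bytes, min: 4096, max: 4096, sum: 20480 }
  write_bytes:     { samples: 0, unit: bytes, min: 0, max: 0, sum: 0 }
"""

JOB_RE = re.compile(r"^- job_id:\s*(\S+)")
WRITE_RE = re.compile(r"^\s*write_bytes:.*\bsum:\s*(\d+)")


def writers(text):
    """Return (job_id, bytes_written) pairs with nonzero writes, largest first."""
    totals = {}
    job = None
    for line in text.splitlines():
        m = JOB_RE.match(line)
        if m:
            # Start of a new job entry; remember which job the next lines belong to.
            job = m.group(1)
            continue
        m = WRITE_RE.match(line)
        if m and job is not None:
            totals[job] = totals.get(job, 0) + int(m.group(1))
    active = [(j, b) for j, b in totals.items() if b > 0]
    return sorted(active, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for job_id, nbytes in writers(SAMPLE):
        print(job_id, nbytes)
```

If the counters are cleared first and the file is read again a few minutes later, the sums reflect only recent writes, which makes an active writer on the full OST easier to spot.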

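The point above about the older copy-and-rename style of migrate can be reproduced on any local POSIX filesystem. The Python sketch below is only an analogy, not Lustre's lfs migrate code: a "writer" keeps a file open while a "migration" copies it and renames the copy over the original. The function and file names are hypothetical.

```python
import os
import shutil
import tempfile


def copy_and_rename_demo():
    """Mimic copy + rename while a writer still holds the old file open."""
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "data.bin")
    tmp_copy = os.path.join(workdir, "data.bin.migrating")
    try:
        # The "application": opens the file and keeps the descriptor.
        writer = open(path, "wb")
        writer.write(b"before migration\n")
        writer.flush()

        # The "migration": copy to a new name, then rename over the original.
        shutil.copyfile(path, tmp_copy)
        os.replace(tmp_copy, path)

        # The application keeps writing, but into the old, now unlinked, file.
        writer.write(b"after migration\n")
        writer.flush()

        old_stat = os.fstat(writer.fileno())
        with open(path, "rb") as f:
            new_content = f.read()
        writer.close()  # only after this can the old file's blocks be freed
        return old_stat.st_nlink, old_stat.st_size, new_content
    finally:
        shutil.rmtree(workdir)


if __name__ == "__main__":
    links, size, content = copy_and_rename_demo()
    print("links on old file:", links)
    print("bytes still held by old file:", size)
    print("content visible under the original name:", content)
```

While the descriptor is open, the old file has no name left but still occupies its blocks, and the later write never reaches the file that now carries the original name. That is the same pattern as space not being released and the new file having wrong data.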

I am trying to migrate files I know are not in use off of the full OST that I have using lfs migrate.  I have verified up and down that the files I am moving are on that OST and that after the migrate lfs getstripe indeed shows they are no longer on that OST since it's disabled in the MDS.

The problem is, the used space on the OST is not going down.

I see one of at least two issues:

- the OST is just not freeing the space for some reason or another (I don't know why)

- Or someone is writing to existing files just as fast as I am clearing the data (possible, but kind of hard to find)

Is there possibly something else I am missing? Also, does anyone know a good way to see if some client is writing to that OST and determine who it is if it's more probable that that is what is going on?

