[lustre-discuss] problem with orphan files on OSTs

John Dubinski dubinski at cita.utoronto.ca
Wed Mar 16 08:27:03 PDT 2016


We've recently set up a new Lustre filesystem, version 2.5.3.90 based on ldiskfs, but have clients that are still running 1.8.9. We're hoping to upgrade them shortly, but things appeared to be working normally with them.

We've been moving data to the new system as new nodes were being added.  The first few OSTs were filling up so I deactivated them in the usual way with "lctl --device XX deactivate".  
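
As a rough sketch of how the nearly full OSTs might be picked out before deactivating them, the function below filters `lfs df` style output by the Use% column. The column layout (UUID, 1K-blocks, Used, Available, Use%, Mounted on), the function name `full_osts`, and the mount point `/mnt/xi` are assumptions made for illustration.

```shell
#!/bin/sh
# Print the UUID of every OST whose Use% is at or above a threshold.
# Reads `lfs df` style output on stdin. Assumed column layout:
#   UUID  1K-blocks  Used  Available  Use%  Mounted on
full_osts() {
    threshold="$1"
    awk -v t="$threshold" '
        $1 ~ /-OST[0-9a-fA-F]+_UUID$/ {
            pct = $5
            sub(/%$/, "", pct)            # strip the trailing percent sign
            if (pct + 0 >= t + 0) print $1
        }'
}

# Possible use on a client (the mount point is hypothetical):
#   lfs df /mnt/xi | full_osts 90
```

The UUIDs printed this way still have to be matched by hand to the device numbers used with "lctl --device XX deactivate".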

After we completed the migration, I tried to rebalance the distribution of files simply by rsync'ing a directory containing a large number of files to a copy, viz.

 rsync -a ARCHIVE/ ARCHIVE-COPY

and then deleting the original and moving the copied directory back to the old name. The copy worked without a problem.
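
The copy, delete, and rename sequence described above could be sketched roughly as follows. This is only an illustration under assumptions: the function name `rebalance_dir`, the `-COPY` and `-OLD` suffixes, the overridable `COPY` variable, and the `diff -r` verification step are not part of the procedure actually used, and a full content comparison may be too slow for a large archive.

```shell
#!/bin/sh
# Copy a directory, verify the copy, then swap it into the original name.
# COPY defaults to "rsync -a" and can be overridden (e.g. COPY="cp -R")
# to try the sketch on a machine without rsync.
rebalance_dir() {
    src="$1"
    tmp="$1-COPY"
    old="$1-OLD"
    # Left unquoted on purpose so "rsync -a" splits into command and flag.
    ${COPY:-rsync -a} "$src/" "$tmp" || return 1
    # Compare file contents before touching the original.
    diff -r "$src" "$tmp" >/dev/null || return 1
    # Move the original aside first, so it is only deleted after the swap.
    mv "$src" "$old" && mv "$tmp" "$src" || return 1
    rm -rf "$old"
}

# Possible use:
#   rebalance_dir ARCHIVE
```

Renaming the original aside before removing it means a failed swap leaves both trees on disk.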

However, I used a 1.8.9 client to do the deletion, and although the file links were removed from the MDT, it looks like the objects on the OSTs are still there and were not deleted.  The lfs quota system still counts these lost files as well.  (Is this a bug in using a 1.8.9 client with a 2.5.3 system?)
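
One way the unreleased space might be quantified is to total the Used column over all OSTs from `lfs df` output taken before and after a deletion. The sketch below assumes the usual `lfs df` column layout (UUID, 1K-blocks, Used, ...); the function name `ost_used_kb` and the mount point `/mnt/xi` are hypothetical.

```shell
#!/bin/sh
# Sum the "Used" column (in 1K-blocks) over all OST lines of `lfs df`
# style output read from stdin. MDT lines and the header are skipped.
ost_used_kb() {
    awk '$1 ~ /-OST[0-9a-fA-F]+_UUID$/ { sum += $3 }
         END { printf "%.0f\n", sum }'
}

# Possible use (the mount point is hypothetical):
#   before=$(lfs df /mnt/xi | ost_used_kb)
#   rm -rf ARCHIVE-OLD
#   after=$(lfs df /mnt/xi | ost_used_kb)
#   echo "freed: $((before - after)) KB"
```

If the freed figure stays near zero after a large deletion, the objects are probably still present on the OSTs. Object destruction can lag behind the unlink, so the second measurement is best taken after a delay.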

I ran lfsck on the MDT according to the 2.5.x instructions:

 lctl lfsck_start -M xi-MDT0000

and this didn't clean things up.  I also tried:

 lctl lfsck_start -t layout -M xi-MDT0000

which seems to be the right command for deleting unlinked orphans on OSTs according to the docs but the command failed to start with "unknown Error 524".

Is the "-t layout" option only available in version 2.7.0?  Will I have to upgrade in order to clean up the OSTs?  What's the correct course of action to clean things up?  

The system seems to be working normally otherwise, but there is a relatively large amount of orphaned data that I would like to delete from the OSTs.

Thanks,
John


