[Lustre-discuss] orphaned objects on OSTs

Michael Barnes Michael.Barnes at jlab.org
Tue Nov 9 08:06:52 PST 2010


On Nov 8, 2010, at 12:18 AM, Andreas Dilger wrote:

> On 2010-11-05, at 07:30, Michael Barnes wrote:
>> I'm guessing that the MDS simply doesn't know about the dangling object on the OST.  On one OST that I've examined I noticed that most of the orphaned objects are from Oct 27, and somewhere between Oct 26th and 27th the metadata server had a software crash (1.8.1.1 since upgraded to 1.8.4).  I'm guessing that the clients pushed data directly to the OST and could not update the MDT, thus leaving the stray files on the OST.
>> 
>> Is there any way to get more information about the files or possibly clean these files while lustre is active?  All I have are the object ID and regular UNIX metadata information that is stored on the object files.  When these errors occurred, did the write fail on the client side, and are the users not expecting the data to be there?
> 
> You can find out which MDS inode these objects belong(ed) to with the ll_decode_filter_fid tool, included in newer lustre releases.  It will print out the MDS inode and generation numbers that are saved on the OST objects when they are first accessed.  On the MDS you can use debugfs with the "stat <inode_nr>" command to determine if the inode is still in use (the generation number would match the one on the OST; otherwise it is just a re-used inode).

Thanks for the tip.  BTW, what are the arguments to ll_decode_filter_fid?  I've given it an objid and a filename, and neither seems to do anything.

My suspicions were confirmed by a user yesterday.  These orphaned objects are not isolated to MDS/OSS failures; they are reproducible with a user's data summary script.

These users are dealing with a moderate amount of data in terms of size, but they have many, many small files.  I've seen hundreds of thousands of these files go missing.  The user told me the arguments to his script and told me I could run it, and this script definitely generates orphaned files every time it's run.


########################################################################################

Some basic info:

clients are at 1.8.1.1 and 1.8.2 (with a hang patch)
mds/oss are at 1.8.4 (originally at 1.8.1.1)

~1000 clients
~200 TB filesystem

OSSes are on DDR infiniband
MDS is on QDR infiniband
clients are on anything from SDR to QDR infiniband, no TCP clients at this time AFAIK


Some OSTs are turned off/deactivated (conf_param version).
Some OSTs are turned off/deactivated (set_param version).


########################################################################################

Running lfs getstripe on the client against a junk file I just created shows:

...
46: lustre-OST002e_UUID INACTIVE
...
tmp.px0py0pz0_phi_jr_Nsrc3_Ncfg50_20x64_m020m050_P.dat
	obdidx		 objid		objid		 group
	    46	        207061	      0x328d5	             0
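
For locating this object on the OST itself, a minimal sketch follows. It assumes the usual 1.8 ldiskfs layout, in which objects live under O/<group>/d<objid mod 32>/<objid> relative to the root of the ldiskfs-mounted OST; the helper name ost_object_path is hypothetical, and the layout should be checked against the actual OST before relying on it.

```python
def ost_object_path(objid, group=0, subdirs=32):
    """Return the object's path relative to the root of an ldiskfs-mounted OST.

    Assumption: objects are hashed into `subdirs` directories by objid modulo
    `subdirs` (32 is the assumed default for 1.8 OSTs).
    """
    return "O/%d/d%d/%d" % (group, objid % subdirs, objid)


# The two objid columns in the getstripe output are the same number,
# shown first in decimal and then in hex.
objid = int("0x328d5", 16)
print(objid)                   # 207061
print(ost_object_path(objid))  # O/0/d21/207061
```

The resulting path is the kind of file name that can be inspected with ordinary tools (ls, stat) on an OST mounted as ldiskfs.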



*** It does not make sense to me that the client sees the OST as inactive yet is still allocating objects to it. This OST should be active ***

On the MDS, lctl dl shows:

...
 51 UP osc lustre-OST002e-osc lustre-mdtlov_UUID 5
...

The client's syslog has many messages similar to:

Nov  9 10:24:44 qcd10i2.jlab.org kernel: LustreError: 30129:0:(file.c:1001:ll_glimpse_size()) Skipped 3716 previous similar messages 
Nov  9 10:33:01 qcd10i2.jlab.org kernel: LustreError: 6460:0:(namei.c:1160:ll_objects_destroy()) obd destroy objid 0xb02800a at 0x0 error -5 
Nov  9 10:33:01 qcd10i2.jlab.org kernel: LustreError: 6460:0:(namei.c:1160:ll_objects_destroy()) Skipped 3203 previous similar messages 
Nov  9 10:33:27 qcd10i2.jlab.org kernel: LustreError: 6410:0:(file.c:125:ll_close_inode_openhandle()) inode 184713223 ll_objects destroy: rc = -5 
Nov  9 10:41:36 qcd10i2.jlab.org kernel: LustreError: 6626:0:(file.c:1001:ll_glimpse_size()) obd_enqueue returned rc -5, returning -EIO 
Nov  9 10:41:36 qcd10i2.jlab.org kernel: LustreError: 6626:0:(file.c:1001:ll_glimpse_size()) Skipped 21 previous similar messages 
Nov  9 10:43:15 qcd10i2.jlab.org kernel: LustreError: 6897:0:(namei.c:1160:ll_objects_destroy()) obd destroy objid 0x19188a1e at 0x0 error -5 
Nov  9 10:43:15 qcd10i2.jlab.org kernel: LustreError: 6897:0:(namei.c:1160:ll_objects_destroy()) Skipped 3 previous similar messages 
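
To get a rough count of how many destroys failed, the messages above can be tallied with a short script. This is a minimal sketch that assumes the message format stays exactly as in the excerpt; the regular expressions and the summarize helper are illustrative, not part of any Lustre tool. The -5 in these messages is -EIO, as the ll_glimpse_size line spells out.

```python
import re
from collections import Counter

# Patterns written against the excerpt above; other Lustre versions may
# word these messages differently.
DESTROY_RE = re.compile(
    r"ll_objects_destroy\(\)\) obd destroy objid (0x[0-9a-f]+) at \S+ error (-?\d+)"
)
SKIPPED_RE = re.compile(
    r"\(\w+\.c:\d+:(\w+)\(\)\) Skipped (\d+) previous similar messages"
)


def summarize(lines):
    """Return (failed, skipped): failed is a list of (objid, errno) pairs,
    skipped maps a function name to its total rate-limited message count."""
    failed = []
    skipped = Counter()
    for line in lines:
        m = DESTROY_RE.search(line)
        if m:
            failed.append((int(m.group(1), 16), int(m.group(2))))
            continue
        m = SKIPPED_RE.search(line)
        if m:
            skipped[m.group(1)] += int(m.group(2))
    return failed, skipped


SAMPLE = [
    "LustreError: 30129:0:(file.c:1001:ll_glimpse_size()) Skipped 3716 previous similar messages",
    "LustreError: 6460:0:(namei.c:1160:ll_objects_destroy()) obd destroy objid 0xb02800a at 0x0 error -5",
    "LustreError: 6460:0:(namei.c:1160:ll_objects_destroy()) Skipped 3203 previous similar messages",
    "LustreError: 6897:0:(namei.c:1160:ll_objects_destroy()) obd destroy objid 0x19188a1e at 0x0 error -5",
    "LustreError: 6897:0:(namei.c:1160:ll_objects_destroy()) Skipped 3 previous similar messages",
]

failed, skipped = summarize(SAMPLE)
for objid, err in failed:
    print("destroy failed: objid %#x error %d" % (objid, err))
for func, count in sorted(skipped.items()):
    print("%s: %d skipped messages" % (func, count))
```

Because the kernel rate-limits these messages, the "Skipped N" totals matter more than the handful of objids that actually get printed.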


########################################################################################

Some more background info.

We have had many OST and a few MDS failures.  Mostly hardware, which in turn has tickled some software bugs that seem to have lessened since upgrading the servers to 1.8.4.

We are changing our RAID configurations, and migrating data off of OSTs.  This is why we have OSTs offline/deactivated.

Any help would be appreciated.

TIA,

-mb

--
+-----------------------------------------------
| Michael Barnes
|
| Thomas Jefferson National Accelerator Facility
| Scientific Computing Group
| 12000 Jefferson Ave.
| Newport News, VA 23606
| (757) 269-7634
+-----------------------------------------------






