[lustre-discuss] Slow release of inodes on OST

Åke Sandgren ake.sandgren at hpc2n.umu.se
Sat Feb 8 00:47:04 PST 2020


The filesystems are completely idle during this. It's a test setup where
I'm running io500 and doing nothing else.

I set
osp.rsos-OST0000-osc-MDT0000.max_rpcs_in_flight=512
osp.rsos-OST0000-osc-MDT0000.max_rpcs_in_progress=32768
which greatly reduced my waiting time between runs.
max_rpcs_in_progress was the one that actually made the difference.
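
A rough back-of-the-envelope sketch of why the in_progress limit could dominate here. It parses the "name=value" lines quoted further down in this thread and estimates how long the backlog would take to drain. The model is an assumption, not something confirmed from the Lustre code: it supposes that at most one batch of "in progress" changes is handled per interval, and it uses the roughly 10 second interval reported below. The helper names parse_params and drain_hours are made up for this illustration.

```python
# Back-of-the-envelope drain-time estimate from "lctl get_param osp.*.sync*" output.
# ASSUMPTION (not confirmed from the Lustre code): each interval handles at most
# one batch of "in progress" changes, and the interval stays at about 10 seconds.

# Values quoted in this thread.
SAMPLE = """\
osp.rsos-OST0000-osc-MDT0000.sync_changes=14174002
osp.rsos-OST0000-osc-MDT0000.sync_in_flight=0
osp.rsos-OST0000-osc-MDT0000.sync_in_progress=4096
osp.rsos-OST0000-osc-MDT0000.destroys_in_flight=14178098
"""


def parse_params(text):
    """Map the last dotted component of each 'name=value' line to an int."""
    params = {}
    for line in text.splitlines():
        name, sep, value = line.strip().partition("=")
        if sep:
            params[name.rsplit(".", 1)[-1]] = int(value)
    return params


def drain_hours(pending, batch, interval_s):
    """Hours to drain `pending` changes at one `batch` per `interval_s` seconds."""
    batches = -(-pending // batch)  # ceiling division
    return batches * interval_s / 3600


params = parse_params(SAMPLE)
interval_s = 10  # observed time between changes of the sync values

# Batch size as observed (sync_in_progress=4096) versus the raised limit (32768).
before = drain_hours(params["sync_changes"], params["sync_in_progress"], interval_s)
after = drain_hours(params["sync_changes"], 32768, interval_s)

print(f"batch {params['sync_in_progress']}: about {before:.1f} hours")
print(f"batch 32768: about {after:.1f} hours")
```

Under that assumption the estimate scales linearly with the batch size, which would be consistent with max_rpcs_in_progress having the larger effect. The real behaviour may differ if the interval itself changes with load.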

On 2/8/20 4:50 AM, Andreas Dilger wrote:
> I haven't looked at that code recently, but I suspect that it is waiting
> for journal commits to complete
> every 5s before sending another batch of destroys?  Is the filesystem
> otherwise idle or something?
> 
> 
>> On Feb 7, 2020, at 02:34, Åke Sandgren <ake.sandgren at hpc2n.umu.se> wrote:
>>
>> Looking at the osp.*.sync* values I see
>> osp.rsos-OST0000-osc-MDT0000.sync_changes=14174002
>> osp.rsos-OST0000-osc-MDT0000.sync_in_flight=0
>> osp.rsos-OST0000-osc-MDT0000.sync_in_progress=4096
>> osp.rsos-OST0000-osc-MDT0000.destroys_in_flight=14178098
>>
>> And it takes 10 sec between changes of those values.
>>
>> So is there any other tunable I can tweak on either OSS or MDS side?
>>
>> On 2/6/20 6:58 AM, Andreas Dilger wrote:
>>> On Feb 4, 2020, at 07:23, Åke Sandgren <ake.sandgren at hpc2n.umu.se> wrote:
>>>>
>>>> When I create a large number of files on an OST and then remove them,
>>>> the used inode count on the OST decreases very slowly; it takes several
>>>> hours for it to go from 3M to the correct ~10k.
>>>>
>>>> (I'm running the io500 test suite)
>>>>
>>>> Is there something I can do to make it release them faster?
>>>> Right now it has gone from 3M to 1.5M in 6 hours (lfs df -i).
>>>
>>> Is this the object count or the file count?  Are you possibly using a
>>> lot of stripes on the files being deleted that is multiplying the work
>>> needed?
>>>
>>>> These are SSD based OSTs in case it matters.
>>>
>>> The MDS controls the destroy of the OST objects, so there is a rate
>>> limit, but ~700/s seems low to me, especially for SSD OSTs.
>>>
>>> You could check "lctl get_param osp.*.sync*" on the MDS to see how
>>> many destroys are pending.  Also, increasing osp.*.max_rpcs_in_flight
>>> on the MDS might speed this up?  It should default to 32 per OST on
>>> the MDS vs. default 8 for clients
>>>
>>> Cheers, Andreas
>>> --
>>> Andreas Dilger
>>> Principal Lustre Architect
>>> Whamcloud
>>>
>>
>> -- 
>> Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden
>> Internet: ake at hpc2n.umu.se   Phone: +46 90 7866134 Fax: +46 90-580 14
>> Mobile: +46 70 7716134 WWW: http://www.hpc2n.umu.se
> 
> Cheers, Andreas
> --
> Andreas Dilger
> Principal Lustre Architect
> Whamcloud
> 

-- 
Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden
Internet: ake at hpc2n.umu.se   Phone: +46 90 7866134 Fax: +46 90-580 14
Mobile: +46 70 7716134 WWW: http://www.hpc2n.umu.se

