[lustre-discuss] Question about max service threads

Houkun Zhu diskun.zhu at gmail.com
Mon Sep 27 01:59:15 PDT 2021


I ran some experiments over the last week and verified that both methods are equivalent :D Thus, I think the performance difference was not due to increasing the parameter mds.MDS.mdt.threads_max but to something else.

However, by recording the Lustre thread counts during my experiments, I noticed that ost.OSS.ost_io.threads_started reaches its maximum, so ost.OSS.ost_io.threads_max looks like a promising parameter to tune. I will run more experiments to verify this assumption :D
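
For reference, this is roughly how I plan to sample the counters during the fio runs (the 10-second interval and the log file name are just my own choices):

# sample started vs. configured maximum threads for the ost_io service
while true; do
    echo "--- $(date +%T)" >> ost_io_threads.log
    lctl get_param ost.OSS.ost_io.threads_started ost.OSS.ost_io.threads_max >> ost_io_threads.log
    sleep 10
done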

Thank you so much for the help!

Best regards,
Houkun

> On 22. Sep 2021, at 23:43, Andreas Dilger <adilger at whamcloud.com> wrote:
> 
> Both methods should produce equivalent numbers.  On my 2.14.0 system:
> 
> # ps auxww | grep mdt0
> root      594183  0.0  0.0      0     0 ?        I    Sep09   0:41 [mdt00_000]
> root      594184  0.0  0.0      0     0 ?        I    Sep09   0:34 [mdt00_001]
> root      594185  0.0  0.0      0     0 ?        I    Sep09   0:37 [mdt00_002]
> root      594288  0.0  0.0      0     0 ?        I    Sep09   0:36 [mdt00_003]
> root      594664  0.0  0.0      0     0 ?        I    Sep09   0:25 [mdt00_004]
> root      594665  0.0  0.0      0     0 ?        I    Sep09   0:32 [mdt00_005]
> root      594667  0.0  0.0      0     0 ?        I    Sep09   0:41 [mdt00_006]
> root      594668  0.0  0.0      0     0 ?        I    Sep09   0:30 [mdt00_007]
> root      594670  0.0  0.0      0     0 ?        I    Sep09   0:30 [mdt00_008]
> root      594673  0.0  0.0      0     0 ?        I    Sep09   0:37 [mdt00_009]
> root      594680  0.0  0.0      0     0 ?        I    Sep09   0:40 [mdt00_010]
> # lctl get_param mds.MDS.mdt.threads_started
> mds.MDS.mdt.threads_started=11
> 
> This is the service thread pool for most MDT RPC requests.  Increasing the number of "mdt" threads will definitely improve metadata performance, as long as the underlying storage has the IOPS for it.  Having too many threads would use more memory, and in the rare case of an HDD-based MDT this might cause excessive seeking (that is obviously not a problem for SSD/NVMe MDTs).
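> 
> As a rough sketch, tuning it for a test run could look like this (128 is only a placeholder value, not a recommendation):
> 
> # on the MDS, for the current session only:
> lctl set_param mds.MDS.mdt.threads_max=128
> # on the MGS, to make it persistent across remounts (if your release supports set_param -P):
> lctl set_param -P mds.MDS.mdt.threads_max=128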
> 
> # ps auxww | grep mdt_rdpg
> root      594186  0.0  0.0      0     0 ?        I    Sep09   0:02 [mdt_rdpg00_000]
> root      594187  0.0  0.0      0     0 ?        I    Sep09   0:02 [mdt_rdpg00_001]
> root      594663  0.0  0.0      0     0 ?        I    Sep09   0:03 [mdt_rdpg00_002]
> # lctl get_param mds.MDS.mdt_readpage.threads_started
> mds.MDS.mdt_readpage.threads_started=3
> 
> This service is for bulk readdir RPCs.
> 
> Cheers, Andreas
> 
>> On Sep 22, 2021, at 15:16, Houkun Zhu <diskun.zhu at gmail.com> wrote:
>> 
>> I’m running Lustre 2.12.7. The workload was generated by fio, i.e., 6 processes sending I/O requests to the server. Since I could see a non-trivial performance difference when I increased the parameter mds.MDS.mdt.threads_max, I assume it played a role in the performance.
>> 
>> Thanks to the tip from Patrick, I just ran the command ps axu | grep mdt and got the following result:
>> 
>> root     17654  0.0  0.0      0     0 ?        S    Aug07   0:13 [mdt00_006]
>> root     18672  0.0  0.0      0     0 ?        S    Aug07   0:21 [mdt00_007]
>> root     18902  0.0  0.0      0     0 ?        S    Aug07   0:11 [mdt00_008]
>> root     23778  0.0  0.0      0     0 ?        S    Sep21   0:46 [mdt01_003]
>> root     24032  0.0  0.0      0     0 ?        S    Sep21   0:14 [mdt_rdpg01_002]
>> root     25292  0.0  0.0      0     0 ?        S    Sep21   0:34 [mdt01_004]
>> root     25293  0.0  0.0      0     0 ?        S    Sep21   0:36 [mdt01_005]
>> root     25294  0.0  0.0      0     0 ?        S    Sep21   0:35 [mdt01_006]
>> root     25295  0.0  0.0      0     0 ?        S    Sep21   0:37 [mdt01_007]
>> root     25296  0.0  0.0      0     0 ?        S    Sep21   0:36 [mdt01_008]
>> root     25297  0.0  0.0      0     0 ?        S    Sep21   0:11 [mdt_rdpg01_003]
>> root     25298  0.0  0.0      0     0 ?        S    Sep21   0:38 [mdt01_009]
>> root     25299  0.0  0.0      0     0 ?        S    Sep21   0:38 [mdt01_010]
>> root     25301  0.0  0.0      0     0 ?        S    Sep21   0:11 [mdt_rdpg01_004]
>> root     25302  0.0  0.0      0     0 ?        S    Sep21   0:37 [mdt01_011]
>> root     25303  0.0  0.0      0     0 ?        S    Sep21   0:35 [mdt01_012]
>> root     25304  0.0  0.0      0     0 ?        S    Sep21   0:11 [mdt_rdpg01_005]
>> root     25370  0.0  0.0      0     0 ?        S    Sep21   0:11 [mdt_rdpg01_006]
>> root     25375  0.0  0.0      0     0 ?        S    Sep21   0:09 [mdt_rdpg01_007]
>> root     29073  0.0  0.0      0     0 ?        S    Aug26   0:00 [mdt_rdpg00_003]
>> 
>> Could I verify my assumption by counting the number of processes matching mdt\d\d_\d*?
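>> 
>> Concretely, I was thinking of something like this (the regex is only my guess at the thread naming pattern):
>> 
>> # count the normal MDT service threads, excluding mdt_rdpg and friends
>> ps ax | grep -c '\[mdt[0-9][0-9]_[0-9]'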
>> 
>> Best regards,
>> Houkun
>> 
>>> On 22. Sep 2021, at 21:21, Andreas Dilger <adilger at whamcloud.com> wrote:
>>> 
>>> What version of Lustre are you running?  I tested with 2.14.0 and observed that *.*.threads_started increased and (eventually) decreased as the service threads were being used.  Note that the "*.*.threads_max" parameter is the *maximum* number of threads for a particular service (e.g. ost.OSS.ost_io.* is for bulk read/write IO operations, while ost.OSS.ost.* is for most other OST operations).  New threads are only started if the number of incoming requests in the queue exceeds the number of currently running threads, so if requests are processed quickly and/or there are not enough clients generating RPCs, no new threads will be started beyond the number needed to manage the current workload.
>>> 
>>> For example, I had reduced ost_io.threads_max=16 on my home filesystem yesterday to verify that the threads eventually stopped (that needed some ongoing IO workload until each of the higher-numbered threads processed a request and was the last thread running; see the comment at ptlrpc_thread_should_stop() for details):
>>> 
>>> # lctl get_param ost.OSS.ost_io.threads*
>>> ost.OSS.ost_io.threads_max=16
>>> ost.OSS.ost_io.threads_min=3
>>> ost.OSS.ost_io.threads_started=16
>>> 
>>> When I increased threads_max=32 and ran a parallel IO workload on a client, threads_started increased, but the client wasn't able to generate enough RPCs in flight to hit the maximum number of threads:
>>> 
>>> # lctl get_param ost.OSS.ost_io.threads*
>>> ost.OSS.ost_io.threads_max=32
>>> ost.OSS.ost_io.threads_min=3
>>> ost.OSS.ost_io.threads_started=26
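>>> 
>>> To watch it move during a run, something as simple as this is enough (the 2-second interval is arbitrary):
>>> 
>>> watch -n 2 'lctl get_param ost.OSS.ost_io.threads_started'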
>>> 
>>>> On Sep 22, 2021, at 11:37, Houkun Zhu <diskun.zhu at gmail.com> wrote:
>>>> 
>>>> Hi Andreas,
>>>> 
>>>> Thanks a lot for your help. I actually do record the parameter mds.MDS.mdt.threads_started, but its value never changes. However, I can observe a performance difference (i.e., throughput increases tremendously) when I set a higher value of threads_max for the MDS.
>>>> 
>>>> 
>>>> Best regards,
>>>> Houkun
>>>> 
>>>>> On 22. Sep 2021, at 07:21, Andreas Dilger <adilger at whamcloud.com> wrote:
>>>>> 
>>>>> There is actually a parameter for this:
>>>>> 
>>>>> $ lctl get_param ost.OSS.*.thread*
>>>>> ost.OSS.ost.threads_max=16
>>>>> ost.OSS.ost.threads_min=3
>>>>> ost.OSS.ost.threads_started=16
>>>>> ost.OSS.ost_create.threads_max=10
>>>>> ost.OSS.ost_create.threads_min=2
>>>>> ost.OSS.ost_create.threads_started=3
>>>>> ost.OSS.ost_io.threads_max=16
>>>>> ost.OSS.ost_io.threads_min=3
>>>>> ost.OSS.ost_io.threads_started=16
>>>>> ost.OSS.ost_out.threads_max=10
>>>>> ost.OSS.ost_out.threads_min=2
>>>>> ost.OSS.ost_out.threads_started=2
>>>>> ost.OSS.ost_seq.threads_max=10
>>>>> ost.OSS.ost_seq.threads_min=2
>>>>> ost.OSS.ost_seq.threads_started=2
>>>>> 
>>>>> $ lctl get_param mds.MDS.*.thread*
>>>>> mds.MDS.mdt.threads_max=80
>>>>> mds.MDS.mdt.threads_min=3
>>>>> mds.MDS.mdt.threads_started=11
>>>>> mds.MDS.mdt_fld.threads_max=256
>>>>> mds.MDS.mdt_fld.threads_min=2
>>>>> mds.MDS.mdt_fld.threads_started=3
>>>>> mds.MDS.mdt_io.threads_max=80
>>>>> mds.MDS.mdt_io.threads_min=3
>>>>> mds.MDS.mdt_io.threads_started=4
>>>>> mds.MDS.mdt_out.threads_max=80
>>>>> mds.MDS.mdt_out.threads_min=2
>>>>> mds.MDS.mdt_out.threads_started=2
>>>>> mds.MDS.mdt_readpage.threads_max=56
>>>>> mds.MDS.mdt_readpage.threads_min=2
>>>>> mds.MDS.mdt_readpage.threads_started=3
>>>>> mds.MDS.mdt_seqm.threads_max=256
>>>>> mds.MDS.mdt_seqm.threads_min=2
>>>>> mds.MDS.mdt_seqm.threads_started=2
>>>>> mds.MDS.mdt_seqs.threads_max=256
>>>>> mds.MDS.mdt_seqs.threads_min=2
>>>>> mds.MDS.mdt_seqs.threads_started=2
>>>>> mds.MDS.mdt_setattr.threads_max=56
>>>>> mds.MDS.mdt_setattr.threads_min=2
>>>>> mds.MDS.mdt_setattr.threads_started=2
>>>>> 
>>>>> 
>>>>>> On Sep 21, 2021, at 19:21, Patrick Farrell <pfarrell at ddn.com> wrote:
>>>>>> 
>>>>>>> “Though I can wait for the number of threads to automatically decrease, I didn’t find a way to see the number of currently running threads. I’ve tried threads_started (e.g., lctl get_param mds.MDS.mdt.threads_started), but this param doesn’t change.”
>>>>>>> 
>>>>>>> I don’t think Lustre exposes a stat which gives the *current* count of worker threads.  I’ve always used ps, grep, and wc -l to answer that question :)
>>>>>> From: lustre-discuss <lustre-discuss-bounces at lists.lustre.org> on behalf of Andreas Dilger via lustre-discuss <lustre-discuss at lists.lustre.org>
>>>>>> Sent: Tuesday, September 21, 2021 8:03 PM
>>>>>> To: Houkun Zhu <diskun.zhu at gmail.com>
>>>>>> Cc: lustre-discuss at lists.lustre.org
>>>>>> Subject: Re: [lustre-discuss] Question about max service threads
>>>>>> 
>>>>>> Hello Houkun,
>>>>>> The patch https://review.whamcloud.com/34400 "LU-947 ptlrpc: allow stopping threads above threads_max" landed in the 2.13 release. You could apply this patch to your 2.12 release, or test with 2.14.0. Note that this patch only lazily stops threads as they become idle, so there is no guarantee that they will all stop immediately when the parameter is changed; it may take some time and a number of processed RPCs before the higher-numbered threads exit.
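>>>>>> 
>>>>>> A rough way to watch the lazy shutdown in action (16 is only an example value, and this assumes the patch above is applied):
>>>>>> 
>>>>>> # drop the limit below the current threads_started
>>>>>> lctl set_param ost.OSS.ost_io.threads_max=16
>>>>>> # keep some client IO running, then check periodically; the count should drift down over time
>>>>>> lctl get_param ost.OSS.ost_io.threads_started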
>>>>>> 
>>>>>> It might be possible to wake up all of the threads when the threads_max parameter is reduced, to have them check for this condition and exit. However, this is a very unlikely condition under normal usage. 
>>>>>> 
>>>>>> I would recommend testing with an increased thread count, rather than a decreased one...
>>>>>> 
>>>>>> Cheers, Andreas
>>>>>> 
>>>>>>> On Sep 20, 2021, at 02:29, Houkun Zhu via lustre-discuss <lustre-discuss at lists.lustre.org> wrote:
>>>>>>> 
>>>>>>> 
>>>>>>> Hi guys,
>>>>>>> 
>>>>>>> I’m creating an automatic Lustre performance tuning system, but I find it hard to tune the max service thread parameters, because the limit only seems to take effect when we increase the parameter (decreasing it does not reduce the number of already-running threads). I found a similar discussion from 2011; are there any updates?
>>>>>>> 
>>>>>>> Though I can wait for the number of threads to automatically decrease, I didn’t find a way to see the number of currently running threads. I’ve tried threads_started (e.g., lctl get_param mds.MDS.mdt.threads_started), but this param doesn’t change.
>>>>>>> 
>>>>>>> Looking forward to your help! Thank you in advance!
>>>>>>> 
>>>>>>> Best regards,
>>>>>>> Houkun
>>>>>>> 
>>>>>>> _______________________________________________
>>>>>>> lustre-discuss mailing list
>>>>>>> lustre-discuss at lists.lustre.org
>>>>>>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>>>>> 
>>>>> Cheers, Andreas
>>>>> --
>>>>> Andreas Dilger
>>>>> Lustre Principal Architect
>>>>> Whamcloud
>>>>> 
>>>> 
>>> 
>>> Cheers, Andreas
>>> --
>>> Andreas Dilger
>>> Lustre Principal Architect
>>> Whamcloud
>>> 
>> 
> 
> Cheers, Andreas
> --
> Andreas Dilger
> Lustre Principal Architect
> Whamcloud
> 
