[lustre-discuss] Robinhood scan time

Stephane Thiell sthiell at stanford.edu
Mon Dec 7 09:42:50 PST 2020


Hi Amit,

Your number is very low indeed.

At our site, we're seeing ~100 million files/day during a Robinhood scan with nb_threads_scan = 4 on hardware using Intel-based CPUs:

2020/11/16 07:29:46 [126653/2] STATS |      avg. speed  (effective):   1207.06 entries/sec (3.31 ms/entry/thread)

2020/11/16 07:31:44 [126653/29] FS_Scan | Full scan of /oak completed, 1508197871 entries found (65 errors). Duration = 1249490.23s

In that case, our Lustre MDS and Robinhood server are both running on 2 x CPU E5-2643 v3 @ 3.40GHz.
The Robinhood server has 768GB of RAM and 7TB of SSDs in RAID-10 for the DB.

On another filesystem, using AMD Naples-based CPUs and a dedicated Robinhood DB hosted on a different server with AMD Rome CPUs, we’re seeing a rate of 266M files/day during a Robinhood scan with nb_threads_scan = 8:

2020/09/20 21:43:46 [25731/4] FS_Scan | Full scan of /fir completed, 877905438 entries found (744 errors). Duration = 284564.88s
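
For anyone who wants to compare their own numbers, here is a minimal sketch of how the per-day rates can be derived from the FS_Scan summary lines. The regular expression only assumes the line format shown in the two log excerpts above, and the names `SUMMARY_RE` and `scan_rate` are made up for this example; they are not part of Robinhood.

```python
import re

SECONDS_PER_DAY = 86400

# Matches the FS_Scan summary line as it appears in the excerpts above.
# The pattern is an assumption based on those two lines only.
SUMMARY_RE = re.compile(
    r"Full scan of (?P<path>\S+) completed, (?P<entries>\d+) entries found "
    r"\((?P<errors>\d+) errors\)\. Duration = (?P<duration>[\d.]+)s"
)


def scan_rate(line):
    """Return (path, entries per second, entries per day) for a summary line."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError("not an FS_Scan summary line")
    entries = int(match.group("entries"))
    duration = float(match.group("duration"))  # seconds
    per_sec = entries / duration
    return match.group("path"), per_sec, per_sec * SECONDS_PER_DAY


OAK = ("2020/11/16 07:31:44 [126653/29] FS_Scan | Full scan of /oak completed, "
       "1508197871 entries found (65 errors). Duration = 1249490.23s")
FIR = ("2020/09/20 21:43:46 [25731/4] FS_Scan | Full scan of /fir completed, "
       "877905438 entries found (744 errors). Duration = 284564.88s")

for line in (OAK, FIR):
    path, per_sec, per_day = scan_rate(line)
    print(f"{path}: {per_sec:.2f} entries/sec, {per_day / 1e6:.1f}M entries/day")
```

The results should land close to the ~100M/day and 266M/day figures quoted above. For comparison, 369 million files in ~23 days works out to roughly 16M files/day.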


Best,

Stephane

> On Dec 7, 2020, at 4:49 AM, Degremont, Aurelien <degremoa at amazon.com> wrote:
> 
> Hi Amit,
> 
> Thanks for this data point, that's interesting.
> Robinhood prints a scan summary in its logfile at the end of a scan. It would be nice if you could copy/paste it for future reference.
> 
> Aurélien
> 
> On 04/12/2020 23:39, "lustre-discuss on behalf of Kumar, Amit" <lustre-discuss-bounces at lists.lustre.org on behalf of ahkumar at mail.smu.edu> wrote:
> 
> 
>    Dual Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz;
>    256GB RAM
>    System x3650 M5
>    Storage for MDT is from NetApp EF560.
> 
>    Best regards,
>    Amit
> 
>    -----Original Message-----
>    From: Russell Dekema <dekemar at umich.edu>
>    Sent: Friday, December 4, 2020 4:27 PM
>    To: Kumar, Amit <ahkumar at mail.smu.edu>
>    Cc: lustre-discuss at lists.lustre.org
>    Subject: Re: [lustre-discuss] Robinhood scan time
> 
>    Greetings,
> 
>    What kind of hardware are you running on your metadata array?
> 
>    Cheers,
>    Rusty Dekema
> 
>    On Fri, Dec 4, 2020 at 5:12 PM Kumar, Amit <ahkumar at mail.smu.edu> wrote:
>> 
>> Hi All,
>> 
>> 
>> 
>> During LAD’20, Andreas asked whether I could share the Robinhood scan time for the 369 million files we have, so here it is. It took ~23 days to complete the initial scan of all 369 million files on a dedicated Robinhood server that has 384GB of RAM. I had it set up with all the database and client tweaks mentioned in the Robinhood documentation. I used only 2 threads for this scan. I hope this reference helps.
>> 
>> 
>> 
>> Thank you,
>> 
>> Amit
>> 
>> 
>> 
>> _______________________________________________
>> lustre-discuss mailing list
>> lustre-discuss at lists.lustre.org
>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
> 
> 


