[lustre-discuss] Tuning for metadata performance

Vicker, Darby J. (JSC-EG111)[Jacobs Technology, Inc.] darby.vicker-1 at nasa.gov
Tue Jan 12 10:55:08 PST 2021


No, I haven't.  By what means do you suggest analyzing the OP calls?  Just an strace?  Or the server-side debug commands as outlined in https://doc.lustre.org/lustre_manual.xhtml#dbdoclet.50438274_62472 ?

We also have jobstats enabled and are outputting these to a file for later analysis.  So if I submitted this test in a slurm job, I'd get stats like:

$ grep MDT /aerolab/admin/slurm/19435076_qsmoore
	MDT:snapshot_time      : 2021-01-09 05:29:59
	MDT:setattr            : 110
	MDT:getattr            : 20908
	MDT:mkdir              : 11
	MDT:getxattr           : 20424
	MDT:mknod              : 48
	MDT:close              : 19829
	MDT:unlink             : 9
	MDT:open               : 20188
$
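
A minimal sketch of how per-job counters like these could be ranked to see which metadata ops dominate. It assumes only the "MDT:<op> : <count>" line layout shown above; the function name parse_mdt_stats is hypothetical and not part of any Lustre tool.

```python
import re

def parse_mdt_stats(text):
    """Return {op: count} for lines shaped like 'MDT:getattr : 20908'."""
    stats = {}
    for line in text.splitlines():
        m = re.match(r"\s*MDT:(\w+)\s*:\s*(\d+)\s*$", line)
        if m:  # snapshot_time carries a timestamp, not a count, so it is skipped
            stats[m.group(1)] = int(m.group(2))
    return stats

# The counters from the grep output above, pasted in as sample input.
sample = """
    MDT:snapshot_time      : 2021-01-09 05:29:59
    MDT:setattr            : 110
    MDT:getattr            : 20908
    MDT:mkdir              : 11
    MDT:getxattr           : 20424
    MDT:mknod              : 48
    MDT:close              : 19829
    MDT:unlink             : 9
    MDT:open               : 20188
"""

stats = parse_mdt_stats(sample)
total = sum(stats.values())
# Sort ops from most to least frequent.
ranked = sorted(stats.items(), key=lambda kv: kv[1], reverse=True)

for op, count in ranked:
    print(f"{op:10s} {count:7d} {100.0 * count / total:5.1f}%")
```

For this job, getattr, getxattr, open and close each account for roughly a quarter of the MDT ops, while the remaining ops are negligible.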

But you must be referring to an external tool like strace so I could do the same thing on both lustre and NFS.  
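
One way such a client-side comparison might look: capture a syscall summary for the same clone on each filesystem (for example with "strace -f -c -o <file> git clone ...", adding -w if wall-clock time rather than system time is wanted) and diff the two tables. The sketch below assumes the usual "strace -c" column layout; the helper names are hypothetical and the two sample tables are made-up placeholder numbers, not measurements from either system.

```python
def parse_strace_summary(text):
    """Return {syscall: (calls, seconds)} from `strace -c` summary text."""
    table = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5 or fields[-1] == "total":
            continue  # blank lines and the final totals row
        try:
            seconds = float(fields[1])
            calls = int(fields[3])
        except ValueError:
            continue  # header and separator rows
        table[fields[-1]] = (calls, seconds)
    return table

def rank_by_extra_time(lustre, nfs):
    """List syscalls present in both runs, largest time gap first."""
    rows = []
    for name in lustre.keys() & nfs.keys():
        l_calls, l_sec = lustre[name]
        n_calls, n_sec = nfs[name]
        rows.append((name, l_sec - n_sec, l_calls, n_calls))
    return sorted(rows, key=lambda r: r[1], reverse=True)

# Placeholder summaries (illustrative values only).
lustre_txt = """
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 60.00    6.000000         300     20000       100 openat
 40.00    4.000000         200     20000           close
------ ----------- ----------- --------- --------- ----------------
100.00   10.000000                 40000       100 total
"""
nfs_txt = """
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 50.00    1.000000          50     20000       100 openat
 50.00    1.000000          50     20000           close
------ ----------- ----------- --------- --------- ----------------
100.00    2.000000                 40000       100 total
"""

diff = rank_by_extra_time(parse_strace_summary(lustre_txt),
                          parse_strace_summary(nfs_txt))
for name, extra, l_calls, n_calls in diff:
    print(f"{name:10s} extra={extra:8.3f}s calls(lustre)={l_calls} calls(nfs)={n_calls}")
```

If the call counts match but one syscall carries most of the extra time, that points at a specific operation rather than general slowness.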

-----Original Message-----
From: Michael Di Domenico <mdidomenico4 at gmail.com>
Date: Tuesday, January 12, 2021 at 10:48 AM
To: "Vicker, Darby J. (JSC-EG111)[Jacobs Technology, Inc.]" <darby.vicker-1 at nasa.gov>
Cc: "lustre-discuss at lists.lustre.org" <lustre-discuss at lists.lustre.org>
Subject: Re: [EXTERNAL] Re: [lustre-discuss] Tuning for metadata performance

    have you run any analysis on the "A clone of these repo takes 550
    seconds on lustre", where you track the exact OP calls on lustre to
    see if it's a general slowness or if there is a specific OP that git
    is abusing?  i wonder if there's something specific that git is doing
    that lustre is unhappy with versus continuing to poke at the hardware
    or software tuning.

    though less likely, i'd also be curious if you have any
    security/audit controls turned on on the clients.  i have some silly
    ones where i'm at that slow things down on lustre but not nfs because
    of how the kernel treats the filesystem

    i don't have any git repos even close to that size so i can't perform
    the same analysis where i'm at.


    On Mon, Jan 11, 2021 at 1:45 PM Vicker, Darby J. (JSC-EG111)[Jacobs
    Technology, Inc.] <darby.vicker-1 at nasa.gov> wrote:
    >
    > Sure.  It's a custom configuration on commodity hardware, which is quite a bit newer than the lustre servers.  The overall setup is a bit complicated to support HA - two servers with an external JBOD with ZFS to manage the drives and the file system.  PCS to do the failover.  But none of that is too relevant in terms of performance so here are the hardware specs.
    >
    > Servers:
    > 192 GB DDR4 2666 MHz ECC Memory
    > 16 total physical cores (2x Intel Xeon Gold 6144 CPU @ 3.50GHz)
    > LSI SAS Card (can't find exact model but very similar to the cards in the lustre servers)
    >
    > JBOD:
    > Supermicro 3.5"
    > 24x 10TB 7200 RPM Seagate HDD's
    >
    > ZFS is used to configure the drives in a RAID10 with a zfs file system built on the zpool.  This is exported via NFS.  The only NFS tuning we are doing is to increase RPCNFSDCOUNT to 128 and export with async.
    >
    > So the HW configuration is overall fairly similar.  This is another reason I'm hopeful that we'd be able to get our lustre MD performance as good as or better than the NFS server, given that the lustre MDS has SSD's and the NFS server has HDD's.
    >
    >
    > -----Original Message-----
    > From: lustre-discuss <lustre-discuss-bounces at lists.lustre.org> on behalf of Michael Di Domenico <mdidomenico4 at gmail.com>
    > Date: Monday, January 11, 2021 at 8:07 AM
    > Cc: "lustre-discuss at lists.lustre.org" <lustre-discuss at lists.lustre.org>
    > Subject: [EXTERNAL] Re: [lustre-discuss] Tuning for metadata performance
    >
    > perhaps i missed it somewhere, but in order to do a fair comparison
    > can you detail the hardware/software behind the nfs server?
    >
    >
