[lustre-discuss] ldiskfs performance degradation due to kernel swap hogging CPU

Abe Asraoui AbeA at supermicro.com
Fri Dec 28 16:42:50 PST 2018



+ lustre-discuss 


    
     Hi All,
    We are seeing low performance with Lustre 2.11 in an ldiskfs configuration when running obdfilter-survey, and we are not sure whether this is a known issue.
    
    obdfilter-survey performance under ldiskfs is impacted by kernel swap hogging the CPU. The current configuration is as follows:
    2 osts: ost1,ost2
    /dev/sdc on /mnt/mdt type lustre (ro,context=unconfined_u:object_r:user_tmp_t:s0,svname=tempAA-MDT0000,mgs,osd=osd-ldiskfs,user_xattr,errors=remount-ro)
    /dev/sdb on /mnt/ost1 type lustre (ro,context=unconfined_u:object_r:user_tmp_t:s0,svname=tempAA-OST0001,mgsnode=10.10.10.168@o2ib,osd=osd-ldiskfs,errors=remount-ro)
    /dev/sda on /mnt/ost2 type lustre (ro,context=unconfined_u:object_r:user_tmp_t:s0,svname=tempAA-OST0002,mgsnode=10.10.10.168@o2ib,osd=osd-ldiskfs,errors=remount-ro)
    [root@oss100 htop-2.2.0]#
    [root@oss100 htop-2.2.0]# dkms status
    lustre-ldiskfs, 2.11.0, 3.10.0-693.21.1.el7_lustre.x86_64, x86_64: installed
    spl, 0.7.6, 3.10.0-693.21.1.el7_lustre.x86_64, x86_64: installed
    [root@oss100 htop-2.2.0]#
    sh ./obdsurvey-script.sh 
    Mon Dec 10 17:19:52 PST 2018 Obdfilter-survey for case=disk from oss100
    ost 2 sz 512000000K rsz 1024K obj 2 thr 2 write 134.52 [ 49.99, 101.96] rewrite 132.09 [ 49.99, 78.99] read 2566.74 [ 258.96, 2068.71] 
    ost 2 sz 512000000K rsz 1024K obj 2 thr 4 write 195.73 [ 76.99, 128.98] rewrite
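
For anyone comparing runs, here is a minimal Python sketch that pulls the numbers out of result lines like the ones above. It assumes the usual obdfilter-survey layout, where the first figure after each test name is the aggregate bandwidth in MB/s and the bracketed pair is the minimum and maximum bandwidth observed. The function name `parse_survey_line` and the dictionary layout are made up for illustration and are not part of lustre-iokit.

```python
import re

# Header part of a result line: "ost 2 sz 512000000K rsz 1024K obj 2 thr 2"
HEADER_RE = re.compile(
    r"ost\s+(\d+)\s+sz\s+(\d+)K\s+rsz\s+(\d+)K\s+obj\s+(\d+)\s+thr\s+(\d+)"
)
# One test phase: name, aggregate figure, then "[ min, max]"
PHASE_RE = re.compile(
    r"\b(rewrite|write|read)\s+([\d.]+)\s+\[\s*([\d.]+),\s*([\d.]+)\]"
)


def parse_survey_line(line):
    """Parse one result line; return None if it has no recognizable header.

    Hypothetical helper: phases that are cut off or missing are simply
    left out of the result rather than guessed.
    """
    header = HEADER_RE.search(line)
    if header is None:
        return None
    osts, size_kb, rsz_kb, objects, threads = (int(g) for g in header.groups())
    phases = {}
    for name, aggregate, low, high in PHASE_RE.findall(line):
        phases[name] = {
            "aggregate": float(aggregate),  # assumed MB/s over all OSTs
            "min": float(low),
            "max": float(high),
        }
    return {
        "osts": osts,
        "size_kb": size_kb,
        "record_kb": rsz_kb,
        "objects": objects,
        "threads": threads,
        "phases": phases,
    }
```

Feeding it the first result line gives a write aggregate of 134.52 against a read aggregate of 2566.74, which makes the write/read gap easy to tabulate across thread counts. The second line is cut off after "rewrite", so only its write phase is returned.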
    [root@oss100 htop-2.2.0]# lctl dl
    0 UP osd-ldiskfs tempAA-MDT0000-osd tempAA-MDT0000-osd_UUID 9
    1 UP mgs MGS MGS 4
    2 UP mgc MGC10.10.10.168@o2ib 65f231a0-8fd8-001d-6b0f-3e986f914178 4
    3 UP mds MDS MDS_uuid 2
    4 UP lod tempAA-MDT0000-mdtlov tempAA-MDT0000-mdtlov_UUID 3
    5 UP mdt tempAA-MDT0000 tempAA-MDT0000_UUID 8
    6 UP mdd tempAA-MDD0000 tempAA-MDD0000_UUID 3
    7 UP qmt tempAA-QMT0000 tempAA-QMT0000_UUID 3
    8 UP lwp tempAA-MDT0000-lwp-MDT0000 tempAA-MDT0000-lwp-MDT0000_UUID 4
    9 UP osd-ldiskfs tempAA-OST0001-osd tempAA-OST0001-osd_UUID 4
    10 UP ost OSS OSS_uuid 2
    11 UP obdfilter tempAA-OST0001 tempAA-OST0001_UUID 5
    12 UP lwp tempAA-MDT0000-lwp-OST0001 tempAA-MDT0000-lwp-OST0001_UUID 4
    13 UP osp tempAA-OST0001-osc-MDT0000 tempAA-MDT0000-mdtlov_UUID 4
    14 UP echo_client tempAA-OST0001_ecc tempAA-OST0001_ecc_UUID 2
    15 UP osd-ldiskfs tempAA-OST0002-osd tempAA-OST0002-osd_UUID 4
    16 UP obdfilter tempAA-OST0002 tempAA-OST0002_UUID 5
    17 UP lwp tempAA-MDT0000-lwp-OST0002 tempAA-MDT0000-lwp-OST0002_UUID 4
    18 UP osp tempAA-OST0002-osc-MDT0000 tempAA-MDT0000-mdtlov_UUID 4
    19 UP echo_client tempAA-OST0002_ecc tempAA-OST0002_ecc_UUID 2
    [root@oss100 htop-2.2.0]#
    [root@oss100 htop-2.2.0]# lsblk
    NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
    sda 8:0 0 152.8T 0 disk /mnt/ost2
    sdb 8:16 0 152.8T 0 disk /mnt/ost1
    sdc 8:32 0 931.5G 0 disk /mnt/mdt
    sdd 8:48 0 465.8G 0 disk 
    ├─sdd1 8:49 0 200M 0 part /boot/efi
    ├─sdd2 8:50 0 1G 0 part /boot
    └─sdd3 8:51 0 464.6G 0 part 
      ├─centos-root 253:0 0 50G 0 lvm /
      ├─centos-swap 253:1 0 4G 0 lvm [SWAP]
      └─centos-home 253:2 0 410.6G 0 lvm /home
    nvme0n1 259:2 0 372.6G 0 disk 
    └─md124 9:124 0 372.6G 0 raid1 
    nvme1n1 259:0 0 372.6G 0 disk 
    └─md124 9:124 0 372.6G 0 raid1 
    nvme2n1 259:3 0 372.6G 0 disk 
    └─md125 9:125 0 354G 0 raid1 
    nvme3n1 259:1 0 372.6G 0 disk 
    └─md125 9:125 0 354G 0 raid1
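
To put a number on the CPU usage rather than eyeballing htop, something like the following Python sketch could be run before and after a survey pass. It assumes the "kernel swap" threads in question are the kswapd kernel threads (kswapd0, kswapd1, ...), which is a guess from the description, and it reads the standard Linux `/proc/<pid>/stat` fields for user and system time. The function names are made up for illustration.

```python
import os


def parse_stat_cpu_ticks(stat_line):
    """Return (comm, utime + stime in clock ticks) from a /proc/<pid>/stat line."""
    # comm is wrapped in parentheses and may itself contain spaces or
    # parentheses, so split on the last ')' instead of on whitespace.
    head, _, tail = stat_line.rpartition(")")
    comm = head.split("(", 1)[1]
    fields = tail.split()
    # fields[0] is field 3 (state); utime and stime are fields 14 and 15.
    utime, stime = int(fields[11]), int(fields[12])
    return comm, utime + stime


def kswapd_cpu_seconds(proc_root="/proc"):
    """Map each kswapd thread name to its cumulative CPU time in seconds.

    Assumption: the threads of interest are named kswapd*. Returns an
    empty dict if proc_root is not available.
    """
    if not os.path.isdir(proc_root):
        return {}
    ticks_per_sec = os.sysconf("SC_CLK_TCK")
    usage = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                comm, ticks = parse_stat_cpu_ticks(f.read())
        except (OSError, IndexError, ValueError):
            continue  # process exited or stat line was unreadable
        if comm.startswith("kswapd"):
            usage[comm] = ticks / ticks_per_sec
    return usage
```

Calling `kswapd_cpu_seconds()` once before and once after the survey and subtracting the two results gives the CPU seconds consumed during the run, which can be set against the wall-clock time of the test.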
     
    thanks,
    Abe
    
    
    


