[lustre-discuss] Avoiding system cache when using ssd pfl extent

Fri May 20 01:49:49 PDT 2022

On 5/20/22 09:53, Andreas Dilger via lustre-discuss wrote:
> To elaborate a bit on Patrick's answer, there is no mechanism to do this on the *client*, because the performance difference between client RAM and server storage is still fairly significant, especially if the application is doing sub-page read or write operations.
> 
> However, on the *server* the OSS and MDS will *not* put flash storage into the page cache, because using the kernel page cache has a measurable overhead, and (at least in our testing) the performance of NVMe IOPS is actually better *without* the page cache because more CPU is available to handle RPCs. This is controlled on the server with osd-ldiskfs.*.{read_cache_enable,writethrough_cache_enable}, which default to 0 if the block device is non-rotational and to 1 if it is rotational.

Then my question is: what does it check to decide that a device is non-rotational?

On our systems (DDN SFA400NVXE) the NVMe-backed OSTs have both
read_cache_enable and writethrough_cache_enable set to 1, even though
the kernel reports the block device as non-rotational:
===
/dev/sde on /lustre/stor10/ost0000 (NVMe)
cat /sys/block/sde/queue/rotational
0
lctl get_param osd-ldiskfs.*.*cache*enable
osd-ldiskfs.stor10-OST0000.read_cache_enable=1
osd-ldiskfs.stor10-OST0000.writethrough_cache_enable=1

EXAScaler SFA CentOS 5.2.3-r5
kmod-lustre-2.12.6_ddn58-1.el7.x86_64
===
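
For anyone who wants to repeat the comparison on other OSTs, here is a minimal sketch of the check, not a definitive tool. The function names and the SYSFS_ROOT override are hypothetical and exist only for this example; the "expected" value simply encodes the defaults as described in the quoted text (0 for non-rotational, 1 for rotational) and has not been verified against the Lustre source. The lctl call is the same get_param used above, with -n to print only the value.

```shell
#!/bin/sh
# Sketch only: function names and SYSFS_ROOT are assumptions for illustration.

# Print the kernel's rotational flag for a block device (0 = non-rotational).
# SYSFS_ROOT can point at a scratch directory for testing; defaults to /sys.
rotational_flag() {
    cat "${SYSFS_ROOT:-/sys}/block/$1/queue/rotational"
}

# Default the OSD is expected to pick, per the description quoted above:
# 0 for a non-rotational device, 1 for a rotational one.
expected_cache_default() {
    case "$(rotational_flag "$1")" in
        0) echo 0 ;;
        1) echo 1 ;;
        *) echo unknown ;;
    esac
}

# Compare the expectation with the live read_cache_enable value of one OST.
# Usage: check_ost <block device> <OSD name>, e.g. check_ost sde stor10-OST0000
check_ost() {
    want=$(expected_cache_default "$1")
    have=$(lctl get_param -n "osd-ldiskfs.$2.read_cache_enable")
    echo "$2: expected=$want actual=$have"
}
```

On the system shown above, `check_ost sde stor10-OST0000` would report expected=0 and actual=1, which is exactly the mismatch I am asking about.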

-- 
Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden
Internet: ake at hpc2n.umu.se  Mobile: +46 70 7716134  Fax: +46 90-580 14
WWW: http://www.hpc2n.umu.se