[lustre-discuss] ZFS tuning for MDT/MGS

Andreas Dilger adilger at whamcloud.com
Tue Mar 19 13:32:10 PDT 2019


You would need to lose the MDS within a few seconds of losing the client before any filesystem operations are actually lost: clients will replay their operations if the MDS crashes, and ZFS commits the current transaction every 1s, so this setting really only affects "sync" requests from the client.
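
For reference, a minimal sketch of how to check and revert this (assuming the dataset is named mdt0, as in the settings quoted below):

  # Inspect the current sync policy and the live transaction-group timeout
  zfs get sync mdt0
  cat /sys/module/zfs/parameters/zfs_txg_timeout

  # Revert to the default POSIX sync semantics if the risk is unacceptable
  zfs set sync=standard mdt0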

Cheers, Andreas

On Mar 19, 2019, at 12:43, George Melikov <mail at gmelikov.ru> wrote:

Can you explain the reasoning behind 'zfs set sync=disabled mdt0'? Are you prepared to lose the last transaction on that MDT during a power failure? What did I miss?

14.03.2019, 01:00, "Riccardo Veraldi" <Riccardo.Veraldi at cnaf.infn.it>:
These are the ZFS settings I use on my MDSes:

 zfs set mountpoint=none mdt0
 zfs set sync=disabled mdt0

 zfs set atime=off mdt0
 zfs set redundant_metadata=most mdt0
 zfs set xattr=sa mdt0
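
For reference, the resulting properties can be verified in one call (a minimal sketch, again assuming the dataset is named mdt0):

  zfs get mountpoint,sync,atime,redundant_metadata,xattr mdt0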

If your MDT partition is on a disk with 4KB sectors you can set ashift=12 when you create the pool, but ZFS is pretty smart: in my case it recognized the sector size and used ashift=12 automatically.
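
To confirm which ashift ZFS actually chose, something like this works (zdb prints the cached pool configuration, including the ashift of every vdev):

  zdb | grep ashift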

Here are also the ZFS kernel module parameters I use to get better performance; I use them on both the MDS and the OSSes:

options zfs zfs_prefetch_disable=1
options zfs zfs_txg_history=120
options zfs metaslab_debug_unload=1
#
options zfs zfs_vdev_scheduler=deadline
options zfs zfs_vdev_async_write_active_min_dirty_percent=20
#
options zfs zfs_vdev_scrub_min_active=48
options zfs zfs_vdev_scrub_max_active=128
#options zfs zfs_vdev_sync_write_min_active=64
#options zfs zfs_vdev_sync_write_max_active=128
#
options zfs zfs_vdev_sync_write_min_active=8
options zfs zfs_vdev_sync_write_max_active=32
options zfs zfs_vdev_sync_read_min_active=8
options zfs zfs_vdev_sync_read_max_active=32
options zfs zfs_vdev_async_read_min_active=8
options zfs zfs_vdev_async_read_max_active=32
options zfs zfs_top_maxinflight=320
options zfs zfs_txg_timeout=30
options zfs zfs_dirty_data_max_percent=40
options zfs zfs_vdev_async_write_min_active=8
options zfs zfs_vdev_async_write_max_active=32
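
These options normally go in a file such as /etc/modprobe.d/zfs.conf so they are applied when the zfs module loads; the values actually in effect can then be checked under sysfs. A minimal sketch, using a few of the parameters above:

  # Print the live values of selected zfs module parameters
  grep . /sys/module/zfs/parameters/zfs_txg_timeout \
         /sys/module/zfs/parameters/zfs_prefetch_disable \
         /sys/module/zfs/parameters/zfs_dirty_data_max_percent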

Some people may disagree with me, but after years of trying different options this is the stable configuration I settled on.

Then there are a bunch of other important Lustre-level optimizations you can apply if you are looking for further performance gains, for example the client-side tunables sketched below.
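
As an illustration only (these particular tunables and values are not from this post, just common client-side examples), Lustre parameters are set with lctl:

  # Hypothetical values; tune for your own workload
  lctl set_param osc.*.max_rpcs_in_flight=16
  lctl set_param osc.*.max_dirty_mb=1024
  lctl set_param llite.*.max_read_ahead_mb=1024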

Cheers

Rick

On 3/13/19 11:44 AM, Kurt Strosahl wrote:

Good Afternoon,


    I'm reviewing the ZFS parameters for a new metadata system, and I was looking to see if anyone had examples (good or bad) of ZFS parameters. I'm assuming that the MDT won't benefit from a recordsize of 1MB, and I've already set the ashift to 12. I'm using an MDT/MGS made up of a stripe across mirrored SSDs.
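
For concreteness, a hypothetical sketch of that layout (pool and device names are placeholders): a 4K-aligned pool striped across two mirrored pairs of SSDs:

  zpool create -o ashift=12 mdtpool \
      mirror /dev/nvme0n1 /dev/nvme1n1 \
      mirror /dev/nvme2n1 /dev/nvme3n1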


w/r,

Kurt





____________________________________
Sincerely,
George Melikov

_______________________________________________
lustre-discuss mailing list
lustre-discuss at lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org