[lustre-discuss] ZFS tuning for MDT/MGS

Hans Henrik Happe happe at nbi.dk
Tue Apr 2 05:10:27 PDT 2019


Isn't there a possibility that the MDS falsely tells the client that a
transaction has been committed to disk? After that, the client might not
be able to replay if the MDS dies.

Cheers,
Hans Henrik

On 19/03/2019 21.32, Andreas Dilger wrote:
> You would need to lose the MDS within a few seconds after the client to
> lose filesystem operations, since the clients will replay their
> operations if the MDS crashes, and ZFS commits the current transaction
> every 1s, so this setting only really affects "sync" from the client. 
> 
> Cheers, Andreas
> 
> On Mar 19, 2019, at 12:43, George Melikov <mail at gmelikov.ru> wrote:
> 
>> Can you explain the reasoning behind 'zfs set sync=disabled mdt0'? Are you
>> ready to lose the last transaction on that MDT during a power failure? What
>> did I miss?
>>
>> 14.03.2019, 01:00, "Riccardo Veraldi" <Riccardo.Veraldi at cnaf.infn.it>:
>>> these are the zfs settings I use on my MDSes
>>>
>>>  zfs set mountpoint=none mdt0
>>>  zfs set sync=disabled mdt0
>>>
>>>  zfs set atime=off mdt0
>>>  zfs set redundant_metadata=most mdt0
>>>  zfs set xattr=sa mdt0
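
One way to confirm that a dataset actually carries the properties listed above is to read them back with `zfs get`. The following is only a minimal sketch: the function name `check_mdt_props` is made up for illustration, and `mdt0` is simply the dataset name used in this thread, so substitute your own.

```shell
# Report where a dataset's properties differ from the values quoted above.
# check_mdt_props is a hypothetical helper name; "mdt0" is the dataset name
# from the thread and will differ on other systems.
check_mdt_props() {
    dataset=${1:-mdt0}
    for kv in mountpoint=none sync=disabled atime=off \
              redundant_metadata=most xattr=sa; do
        prop=${kv%%=*}   # text before the first "="
        want=${kv#*=}    # text after the first "="
        # -H drops the header line, -o value prints only the property value.
        have=$(zfs get -H -o value "$prop" "$dataset") || return 1
        [ "$have" = "$want" ] || echo "$prop: wanted $want, have $have"
    done
}
```

Calling `check_mdt_props mdt0` prints nothing when all five properties match and one line per property that differs.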
>>>
>>> If your MDT partition is on a 4KB-sector disk then you can use
>>> ashift=12 when you create the filesystem, but ZFS is pretty smart and
>>> in my case it recognized the sector size and used ashift=12
>>> automatically.
>>>
>>> Also, here are the ZFS kernel module parameters I use to get better
>>> performance. I use them on both MDSes and OSSes.
>>>
>>> options zfs zfs_prefetch_disable=1
>>> options zfs zfs_txg_history=120
>>> options zfs metaslab_debug_unload=1
>>> #
>>> options zfs zfs_vdev_scheduler=deadline
>>> options zfs zfs_vdev_async_write_active_min_dirty_percent=20
>>> #
>>> options zfs zfs_vdev_scrub_min_active=48
>>> options zfs zfs_vdev_scrub_max_active=128
>>> #options zfs zfs_vdev_sync_write_min_active=64
>>> #options zfs zfs_vdev_sync_write_max_active=128
>>> #
>>> options zfs zfs_vdev_sync_write_min_active=8
>>> options zfs zfs_vdev_sync_write_max_active=32
>>> options zfs zfs_vdev_sync_read_min_active=8
>>> options zfs zfs_vdev_sync_read_max_active=32
>>> options zfs zfs_vdev_async_read_min_active=8
>>> options zfs zfs_vdev_async_read_max_active=32
>>> options zfs zfs_top_maxinflight=320
>>> options zfs zfs_txg_timeout=30
>>> options zfs zfs_dirty_data_max_percent=40
>>> options zfs zfs_vdev_async_write_min_active=8
>>> options zfs zfs_vdev_async_write_max_active=32
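
Options like these only take effect when the module is loaded, and not every parameter exists in every ZFS release, so it can help to compare the file against what the running module reports. The sketch below assumes the options live in `/etc/modprobe.d/zfs.conf` and that the live values are exposed under `/sys/module/zfs/parameters`; both paths are assumptions and can be passed as arguments, and `zfs_param_drift` is a made-up helper name.

```shell
# Compare "options zfs name=value" lines in a modprobe config file with the
# values exposed by the loaded module. Both default paths are assumptions.
zfs_param_drift() {
    conf=${1:-/etc/modprobe.d/zfs.conf}
    params=${2:-/sys/module/zfs/parameters}
    # Commented-out and blank lines do not match, so they are skipped.
    grep -E '^[[:space:]]*options[[:space:]]+zfs[[:space:]]' "$conf" |
    while read -r kw mod rest; do
        for kv in $rest; do
            name=${kv%%=*}
            want=${kv#*=}
            if [ ! -r "$params/$name" ]; then
                # The loaded module does not expose this parameter.
                echo "MISSING $name (wanted $want)"
            else
                have=$(cat "$params/$name")
                [ "$have" = "$want" ] || echo "DIFF $name wanted=$want have=$have"
            fi
        done
    done
}
```

Running `zfs_param_drift` with no arguments prints one line per parameter that is missing or differs, and nothing when the running module matches the file.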
>>>
>>> Some people may disagree with me, but after years of trying
>>> different options I reached this stable configuration.
>>>
>>> Then there are a bunch of other important Lustre-level optimizations
>>> that you can do if you are looking for a performance increase.
>>>
>>> Cheers
>>>
>>> Rick
>>>
>>> On 3/13/19 11:44 AM, Kurt Strosahl wrote:
>>>>
>>>> Good Afternoon,
>>>>
>>>>
>>>>     I'm reviewing the ZFS parameters for a new metadata system and I
>>>> was looking to see if anyone had examples (good or bad) of ZFS
>>>> parameters. I'm assuming that the MDT won't benefit from a
>>>> recordsize of 1MB, and I've already set the ashift to 12. I'm using
>>>> an MDT/MGS made up of a stripe across mirrored SSDs.
>>>>
>>>>
>>>> w/r,
>>>>
>>>> Kurt
>>>>
>>>>
>>>> _______________________________________________
>>>> lustre-discuss mailing list
>>>> lustre-discuss at lists.lustre.org
>>>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>>>
>>>
>>>
>>
>>
>> ____________________________________
>> Sincerely,
>> George Melikov
>>
> 
> 

