[lustre-discuss] ZFS tuning for MDT/MGS
Hans Henrik Happe
happe at nbi.dk
Tue Apr 2 06:41:44 PDT 2019
AFAIK, that is what sync=disabled does. It pretends syncs are committed.
ZFS will still flush the open transaction group after a few seconds, but
other outstanding I/O might stall that commit longer.
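As a minimal sketch (the pool name "mdt0" is assumed), the two settings in play here can be inspected like this:

```shell
# With sync=disabled, synchronous writes are acknowledged immediately
# and only reach stable storage when the open transaction group commits.
zfs get sync mdt0

# The txg commit interval (in seconds) is the zfs_txg_timeout module
# parameter; this is the window of acknowledged-but-unflushed data.
cat /sys/module/zfs/parameters/zfs_txg_timeout
```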
On 02/04/2019 14.28, Degremont, Aurelien wrote:
> This is very unlikely.
> The only way that could happen is if the hardware acknowledges I/O to Lustre that it did not really commit to disk (e.g. a writeback cache), or a Lustre bug.
>
> On 02/04/2019 14:11, "lustre-discuss on behalf of Hans Henrik Happe" <lustre-discuss-bounces at lists.lustre.org on behalf of happe at nbi.dk> wrote:
>
> Isn't there a possibility that the MDS falsely tells the client that a
> transaction has been committed to disk? After that, the client might not
> be able to replay if the MDS dies.
>
> Cheers,
> Hans Henrik
>
> On 19/03/2019 21.32, Andreas Dilger wrote:
> > You would need to lose the MDS within a few seconds after the client to
> > lose filesystem operations, since the clients will replay their
> > operations if the MDS crashes, and ZFS commits the current transaction
> > every 1s, so this setting only really affects "sync" from the client.
> >
> > Cheers, Andreas
> >
> > On Mar 19, 2019, at 12:43, George Melikov <mail at gmelikov.ru
> > <mailto:mail at gmelikov.ru>> wrote:
> >
> >> Can you explain the reason about 'zfs set sync=disabled mdt0'? Are you
> >> ready to lose last transaction on that mdt during power failure? What
> >> did I miss?
> >>
> >> 14.03.2019, 01:00, "Riccardo Veraldi" <Riccardo.Veraldi at cnaf.infn.it
> >> <mailto:Riccardo.Veraldi at cnaf.infn.it>>:
> >>> these are the zfs settings I use on my MDSes
> >>>
> >>> zfs set mountpoint=none mdt0
> >>> zfs set sync=disabled mdt0
> >>>
> >>> zfs set atime=off mdt0
> >>> zfs set redundant_metadata=most mdt0
> >>> zfs set xattr=sa mdt0
> >>>
> >>> If your MDT partition is on a 4 KB-sector disk, you can use
> >>> ashift=12 when you create the pool, but ZFS is pretty smart and
> >>> in my case it recognized the sector size and used ashift=12
> >>> automatically.
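As a sketch of the above (pool and device names here are hypothetical), forcing 4 KB alignment at pool creation and verifying what the pool actually got might look like:

```shell
# Force 4 KB (2^12) sector alignment at pool creation time;
# ashift cannot be changed on an existing vdev.
zpool create -o ashift=12 mdt0 mirror /dev/sda /dev/sdb

# Verify the ashift the pool ended up with (auto-detected or forced):
zpool get ashift mdt0
```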
> >>>
> >>> Also, here are the ZFS kernel module parameters I use for better
> >>> performance. I use them on both the MDS and the OSSes.
> >>>
> >>> options zfs zfs_prefetch_disable=1
> >>> options zfs zfs_txg_history=120
> >>> options zfs metaslab_debug_unload=1
> >>> #
> >>> options zfs zfs_vdev_scheduler=deadline
> >>> options zfs zfs_vdev_async_write_active_min_dirty_percent=20
> >>> #
> >>> options zfs zfs_vdev_scrub_min_active=48
> >>> options zfs zfs_vdev_scrub_max_active=128
> >>> #options zfs zfs_vdev_sync_write_min_active=64
> >>> #options zfs zfs_vdev_sync_write_max_active=128
> >>> #
> >>> options zfs zfs_vdev_sync_write_min_active=8
> >>> options zfs zfs_vdev_sync_write_max_active=32
> >>> options zfs zfs_vdev_sync_read_min_active=8
> >>> options zfs zfs_vdev_sync_read_max_active=32
> >>> options zfs zfs_vdev_async_read_min_active=8
> >>> options zfs zfs_vdev_async_read_max_active=32
> >>> options zfs zfs_top_maxinflight=320
> >>> options zfs zfs_txg_timeout=30
> >>> options zfs zfs_dirty_data_max_percent=40
> >>> options zfs zfs_vdev_async_write_min_active=8
> >>> options zfs zfs_vdev_async_write_max_active=32
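For reference, `options zfs ...` lines like the ones above only take effect at module load; the conventional way to persist them is a modprobe configuration file (the path below is the usual one, and the runtime sysfs path is how a value can be applied without a reload):

```shell
# Persist ZFS module parameters across reboots (conventional location).
# Only a couple of the parameters from the list are shown as examples.
cat > /etc/modprobe.d/zfs.conf <<'EOF'
options zfs zfs_prefetch_disable=1
options zfs zfs_txg_timeout=30
EOF

# An already-loaded module can have tunables changed at runtime:
echo 30 > /sys/module/zfs/parameters/zfs_txg_timeout
```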
> >>>
> >>> Some people may disagree with me; anyway, after years of trying
> >>> different options, I reached this stable configuration.
> >>>
> >>> Then there are a bunch of other important Lustre-level optimizations
> >>> you can apply if you are looking for a performance increase.
> >>>
> >>> Cheers
> >>>
> >>> Rick
> >>>
> >>> On 3/13/19 11:44 AM, Kurt Strosahl wrote:
> >>>>
> >>>> Good Afternoon,
> >>>>
> >>>>
> >>>> I'm reviewing the ZFS parameters for a new metadata system, and I
> >>>> was looking to see if anyone had examples (good or bad) of ZFS
> >>>> parameters. I'm assuming that the MDT won't benefit from a
> >>>> recordsize of 1 MB, and I've already set the ashift to 12. I'm using
> >>>> an MDT/MGS made up of a stripe across mirrored SSDs.
> >>>>
> >>>>
> >>>> w/r,
> >>>>
> >>>> Kurt
> >>>>
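The layout described in the question (a stripe across mirrored SSDs, i.e. a RAID10-style pool) could be created roughly as follows; pool and device names are hypothetical:

```shell
# Two mirrored SSD pairs striped together: ZFS stripes across all
# top-level vdevs, so listing two mirrors gives a RAID10-style layout.
zpool create -o ashift=12 mdtpool \
    mirror /dev/nvme0n1 /dev/nvme1n1 \
    mirror /dev/nvme2n1 /dev/nvme3n1

# The MDT workload is small, metadata-heavy I/O, so a 1 MB recordsize
# brings no benefit; xattr=sa and atime=off are common MDT settings.
zfs set xattr=sa mdtpool
zfs set atime=off mdtpool
```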
> >>>>
> >>>> _______________________________________________
> >>>> lustre-discuss mailing list
> >>>> lustre-discuss at lists.lustre.org
> >>>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
> >>>
> >>>
> >>>
> >>
> >>
> >> ____________________________________
> >> Sincerely,
> >> George Melikov
> >>
> >