[lustre-discuss] Lustre on ZFS poor direct I/O performance

John Bauer bauerj at iodoctors.com
Fri Oct 14 14:22:14 PDT 2016


Patrick
I thought at one time an inode lock was held for the duration of a direct I/O read or write, so that even if an application had multiple threads writing direct, only one was "in flight" at a time. Has that changed?
John

Sent from my iPhone

> On Oct 14, 2016, at 3:16 PM, Patrick Farrell <paf at cray.com> wrote:
> 
> Sorry, I phrased one thing wrong:
> I said "transferring to the network", but actually I believe the write cannot return until Lustre has received confirmation that the data was received successfully.
> 
> In any case, only one I/O (per thread) can be outstanding at a time with direct I/O.
>  
> From: lustre-discuss <lustre-discuss-bounces at lists.lustre.org> on behalf of Patrick Farrell <paf at cray.com>
> Sent: Friday, October 14, 2016 3:12:22 PM
> To: Riccardo Veraldi; lustre-discuss at lists.lustre.org
> Subject: Re: [lustre-discuss] Lustre on ZFS poor direct I/O performance
>  
> Riccardo,
> 
> While the difference is extreme, direct I/O write performance will always be poor.  Direct I/O writes cannot be asynchronous, since they don't use the page cache.  This means Lustre cannot return from one write (and start the next) until it has finished transferring the data to the network.
> 
> This means you can only have one I/O in flight at a time.  Good write performance from Lustre (or any network filesystem) depends on keeping a lot of data in flight at once.
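> 
> To put rough numbers on it: with only one 1 MB write in flight and about 
> 120 ms per round trip, the ceiling is roughly 1 MB / 0.12 s, i.e. ~8 MB/s, 
> which matches the figure you are seeing.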
> 
> What sort of direct write performance were you hoping for?  It will never match that 800 MB/s from one thread you see with buffered I/O.
> 
> - Patrick
>  
> From: lustre-discuss <lustre-discuss-bounces at lists.lustre.org> on behalf of Riccardo Veraldi <Riccardo.Veraldi at cnaf.infn.it>
> Sent: Friday, October 14, 2016 2:22:32 PM
> To: lustre-discuss at lists.lustre.org
> Subject: [lustre-discuss] Lustre on ZFS poor direct I/O performance
>  
> Hello,
> 
> I would like to know how I may improve the performance of my Lustre cluster.
> 
> I have 1 MDS and 1 OSS with 20 OSTs defined.
> 
> Each OST is an 8-disk RAIDZ2.
> 
> Single-process write performance is around 800 MB/sec.
> 
> However, if I force direct I/O, for example using oflag=direct in dd with 
> a 1 MB block size, write performance drops as low as 8 MB/sec, and each 
> write has about 120 ms of latency.
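> 
> For reference, the direct I/O test was essentially a dd of this form (the 
> target file is just a placeholder path and the count is arbitrary):
> 
>   # direct I/O: bypasses the client page cache, one 1 MB write at a time
>   dd if=/dev/zero of=/mnt/lustre/testfile bs=1M count=1024 oflag=direct
> 
> Dropping oflag=direct (buffered I/O) is what gives the ~800 MB/sec figure.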
> 
> I used these ZFS settings
> 
> options zfs zfs_prefetch_disable=1
> options zfs zfs_txg_history=120
> options zfs metaslab_debug_unload=1
> 
> I am quite worried about the low performance.
> 
> Any hints or suggestions that may help me improve the situation?
> 
> 
> thank you
> 
> 
> Rick
> 
> 
> _______________________________________________
> lustre-discuss mailing list
> lustre-discuss at lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
> 