[lustre-discuss] poor performance on reading small files

Oucharek, Doug S doug.s.oucharek at intel.com
Wed Aug 3 11:42:55 PDT 2016

Also note: if you are using IB, these small reads will go over RDMA.  LNet only uses rdma_writes (for historical reasons), so the client has to send an IB immediate message asking the server to write the 20KB file to the client.  That extra round-trip handshake adds latency to every file read.  That could be why writes, which don’t need this extra handshake, perform better than the reads.

The bigger the files (i.e. the more data moved per rdma_write), the less noticeable the additional handshake overhead becomes.
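
As a rough illustration of that amortization, here is a toy model in Python. It treats each read as a fixed handshake cost plus the time on the wire. The handshake latency and link rate are made-up placeholder values, not measurements from LNet or any IB fabric, and the function names are hypothetical; only the trend matters.

```python
def handshake_fraction(file_bytes, handshake_s, link_bytes_per_s):
    """Fraction of one serial read's time spent on the fixed handshake."""
    transfer_s = file_bytes / link_bytes_per_s
    return handshake_s / (handshake_s + transfer_s)


def effective_throughput(file_bytes, handshake_s, link_bytes_per_s):
    """Bytes per second for one serial read, handshake included."""
    transfer_s = file_bytes / link_bytes_per_s
    return file_bytes / (handshake_s + transfer_s)


# Assumed values, for illustration only.
HANDSHAKE_S = 20e-6       # hypothetical extra round trip: 20 microseconds
LINK_BYTES_PER_S = 5e9    # hypothetical usable link rate: 5 GB/s

if __name__ == "__main__":
    for size in (20_000, 1_000_000, 16_000_000):
        frac = handshake_fraction(size, HANDSHAKE_S, LINK_BYTES_PER_S)
        rate = effective_throughput(size, HANDSHAKE_S, LINK_BYTES_PER_S)
        print(f"{size:>10} bytes: handshake {frac:6.1%} of read time, "
              f"{rate / 1e6:8.1f} MB/s per serial stream")
```

With these assumed numbers the handshake dominates a 20KB read and is a small fraction of a 1MB read. Real clients issue many reads concurrently, so this is a per-stream picture, not a prediction of aggregate throughput.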


> On Aug 3, 2016, at 11:32 AM, Jeff Johnson <jeff.johnson at aeoncomputing.com> wrote:
> On 8/3/16 10:57 AM, Dilger, Andreas wrote:
>> On Jul 29, 2016, at 03:33, Oliver Mangold <Oliver.Mangold at EMEA.NEC.COM> wrote:
>>> On 29.07.2016 04:19, Riccardo Veraldi wrote:
>>>> I am using Lustre on ZFS.
>>>> While write performance is excellent even on smaller files, I find
>>>> there is a drop in performance
>>>> on reading 20KB files. Performance can go as low as 200MB/sec or even
>>>> less.
>>> Getting 200 MB/s with 20kB files means you have to do 10000 metadata
>>> ops/s. Don't want to say it is impossible to get more than that, but at
>>> least with MDT on ZFS this doesn't sound bad either. Did you run an
>>> mdtest on your system? Maybe some serious tuning of MD performance is in
>>> order.
>> I'd agree with Oliver that getting 200MB/s with 20KB files is not too bad.
>> Are you using HDDs or SSDs for the MDT and OST devices?  If using HDDs,
>> are you using SSD L2ARC to allow the metadata and file data to be cached in
>> L2ARC, and allowing enough time for L2ARC to be warmed up?
>> Are you using TCP or IB networking?  If using TCP then there is a lower
>> limit on the number of RPCs that can be handled compared to IB.
>> Cheers, Andreas
>> _______________________________________________
>> lustre-discuss mailing list
>> lustre-discuss at lists.lustre.org
>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
> Also consider that you are moving only 20KB of data per LNet RPC, assuming a 1MB RPC size. To move 20KB files at 200MB/sec into a non-striped LFS directory, are you using EDR for LNet? 100Gb Ethernet?
> --Jeff
> -- 
> ------------------------------
> Jeff Johnson
> Co-Founder
> Aeon Computing
> jeff.johnson at aeoncomputing.com
> www.aeoncomputing.com
> t: 858-412-3810 x1001   f: 858-412-3845
> m: 619-204-9061
> 4170 Morena Boulevard, Suite D - San Diego, CA 92117
> High-performance Computing / Lustre Filesystems / Scale-out Storage

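The 10000 ops/s figure quoted in the thread follows from dividing throughput by file size. A quick check in Python, assuming decimal units (1 MB = 10^6 bytes, 1 kB = 10^3 bytes) and counting one operation per file, which is a simplification since a real read may involve more than one metadata request. The function name is hypothetical.

```python
def files_per_second(throughput_bytes_per_s, file_bytes):
    """Number of whole-file reads per second needed to sustain a throughput."""
    return throughput_bytes_per_s / file_bytes


if __name__ == "__main__":
    # 200 MB/s of 20 kB files, decimal units assumed.
    print(files_per_second(200 * 1000**2, 20 * 1000))  # 10000.0
```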