[lustre-devel] Limitations of kernel read ahead

Latham, Robert J. robl at mcs.anl.gov
Wed Nov 7 12:32:37 PST 2018

On Tue, 2018-10-30 at 02:00 +0000, Li Xi wrote:
> Thank you for summarizing this, James!
> I think everyone agrees that the current readahead algorithm of
> Lustre needs to be improved. And evidence shows that the readahead
> algorithm of the Linux kernel would not be suitable for Lustre either.
> There are several reasons for this. In general, the kernel's readahead
> algorithm is designed for local file systems with a small readahead
> window. It is single-threaded, synchronous readahead, usable only for
> sequential reads. Because a Lustre read operation has longer latency
> than a local file system read, while its bandwidth is typically
> higher, we need a totally different algorithm for Lustre readahead.
> The readahead algorithm needs to 1) be asynchronous, to hide latency
> from the application, 2) be multi-threaded, to utilize the high
> bandwidth, 3) use a big readahead window that aligns with the big RPC
> size, and 4) work for sequential reads, strided reads and potentially
> small & random reads.
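
To make requirements 3) and 4) concrete, here is a rough sketch of how a client might classify an access pattern and size a readahead window that ends on an RPC boundary. This is only an illustration under stated assumptions, not the LU-8709 code: the names `ReadState`, `classify` and `readahead_window`, the 4 MiB `RPC_SIZE`, and the default of four RPCs per window are all hypothetical. It does not cover requirements 1) and 2); a real implementation would hand the returned ranges to asynchronous worker threads.

```python
RPC_SIZE = 4 * 1024 * 1024  # assumed RPC size in bytes (illustrative only)


class ReadState:
    """Per-file history used to recognise the access pattern (hypothetical)."""

    def __init__(self):
        self.last_offset = -1  # offset of the previous read, -1 if none yet
        self.last_length = 0   # length of the previous read
        self.stride = 0        # last observed distance between read starts


def classify(state, offset, length):
    """Return 'sequential', 'stride' or 'random' and update the history."""
    if state.last_offset < 0:
        pattern = "random"  # first read: nothing to compare against
    else:
        gap = offset - state.last_offset
        if offset == state.last_offset + state.last_length:
            pattern = "sequential"  # starts exactly where the last read ended
            state.stride = 0
        elif gap > 0 and gap == state.stride:
            pattern = "stride"  # same forward distance seen twice in a row
        else:
            pattern = "random"
            state.stride = gap if gap > 0 else 0  # remember as a candidate
    state.last_offset = offset
    state.last_length = length
    return pattern


def readahead_window(pattern, offset, length, stride,
                     rpc_size=RPC_SIZE, rpcs=4):
    """Return a list of (start, end) byte ranges worth prefetching."""
    if pattern == "sequential":
        start = offset + length
        # End the window on an RPC boundary so the last RPC is full-sized.
        end = ((start + rpcs * rpc_size) // rpc_size) * rpc_size
        return [(start, end)]
    if pattern == "stride":
        # Prefetch the next few chunks the stride predicts.
        return [(offset + k * stride, offset + k * stride + length)
                for k in range(1, rpcs + 1)]
    return []  # random access: no prediction, so no readahead
```

With 64 KiB reads at offsets 0, 1 MiB and 2 MiB, the third call to `classify` reports a stride pattern, and `readahead_window` then proposes the chunks at 3 MiB, 4 MiB and so on. Small random reads get no window at all in this sketch, which is exactly the case a real design would still need to address.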

Please don't forget that HPC workloads are likely to fall into category


> The work on LU-8709 was started with these targets and got pretty
> good numbers even without detailed tuning. We (the Whamcloud team)
> would like to rework it with the goal of merging it in one of the
> next releases of Lustre.
> Regards,
> Li Xi
> On 2018/10/30 at 2:06 AM, "James Simmons" <jsimmons at infradead.org> wrote:
>     Currently the Lustre client has its own read ahead handling in the
>     CLIO layer. The reason for this is some limitations in the Linux
>     kernel's read ahead code. Some work to use the kernel's read ahead
>     was attempted for the LU-8964 work, but the general work for
>     LU-8964 had other issues. Alternative work to LU-8964 has emerged
>     under ticket
>     https://jira.whamcloud.com/browse/LU-8709
>     with early code at:
>     https://review.whamcloud.com/#/c/23552
>     Also I have included a link to a presentation of this work and it
>     gives insight on how Lustre does its own read ahead.
>     Now that this seems to be the targeted work for read ahead, the
>     discussion has come up again about why this new work doesn't use
>     the kernel read ahead. I wasn't involved in the discussion about
>     the limitations, but I have included the people interested in this
>     work so progress can be made to improve the Linux kernel's version
>     of read ahead.
