[Lustre-devel] proposal on implementing a new readahead in clio

Sun Jan 24 22:55:09 PST 2010

We need to do readahead asynchronously, but the Windows kernel doesn't give us an 
easy solution. Here are the issues for Windows readahead:

1, The Windows kernel (VM) doesn't provide kernel drivers with an equivalent of 
grab_cache_page_nowait_gfp() to allocate an empty/invalid page. So in 
ll_readpage(), it's too late for WNC to grab more pages for readahead.

2, The routines provided by the Windows kernel to allocate page cache are 
synchronous, and they won't return until the requested pages are fetched.

So we plan to start a thread pool and dispatch the readahead requests to 
these threads instead of blocking the user thread.
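
To make the idea concrete, here is a minimal user-space sketch in Python of a fixed-size pool that takes queued readahead requests, so the submitting thread never waits on the fetch. It is not WNC code: the ReadaheadPool class, the fetch callback and the (file_id, start, count) request shape are all hypothetical names chosen for illustration.

```python
import queue
import threading


class ReadaheadPool:
    """Fixed number of worker threads draining one shared request queue.

    Hypothetical sketch: `fetch` stands in for the synchronous routine
    that allocates and fills pages; it is called on a worker thread.
    """

    def __init__(self, nthreads, fetch):
        self._queue = queue.Queue()
        self._fetch = fetch
        self._threads = [
            threading.Thread(target=self._worker, daemon=True)
            for _ in range(nthreads)
        ]
        for t in self._threads:
            t.start()

    def submit(self, file_id, start, count):
        # Only enqueues the request; the caller returns immediately.
        self._queue.put((file_id, start, count))

    def _worker(self):
        while True:
            req = self._queue.get()
            try:
                if req is None:  # shutdown marker
                    return
                # The blocking fetch happens here, off the user thread.
                self._fetch(*req)
            finally:
                self._queue.task_done()

    def drain(self):
        # Wait until every queued request has been processed.
        self._queue.join()

    def shutdown(self):
        # One marker per worker, then wait for all of them to exit.
        for _ in self._threads:
            self._queue.put(None)
        for t in self._threads:
            t.join()
```

Because any idle worker may pick up the next request, this sketch gives no ordering guarantee between requests, which is the trade-off of a single shared queue.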

We can group the threads in several ways:
1, request per random thread, without any specific order. We just start a 
fixed number of threads and queue each readahead request to any thread of 
the thread pool.
    This is the decision we made during the WNC readahead meeting last week.
2, thread per file (file) or thread per open instance (fd)
3, thread per OST. We would need to divide each readahead request into several 
pieces that are stripe-boundary aligned.
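
For option 3, the splitting step could look roughly like the Python sketch below. It assumes a simple round-robin layout in which stripe number n of a file lives on OST index n mod stripe_count; the function name split_by_stripe and its parameters are hypothetical, and offsets and lengths are in whatever unit the caller uses (bytes or pages).

```python
def split_by_stripe(offset, length, stripe_size, stripe_count):
    """Split [offset, offset + length) into stripe-aligned pieces.

    Returns a list of (ost_index, piece_offset, piece_length) tuples.
    Assumes round-robin striping (an assumption of this sketch).
    """
    pieces = []
    end = offset + length
    while offset < end:
        stripe_no = offset // stripe_size
        # A piece never crosses the end of the stripe it starts in.
        stripe_end = (stripe_no + 1) * stripe_size
        piece_end = min(end, stripe_end)
        pieces.append((stripe_no % stripe_count, offset, piece_end - offset))
        offset = piece_end
    return pieces
```

With stripe_size=4 and stripe_count=2, a request for offset 3, length 7 splits into (0, 3, 1), (1, 4, 4) and (0, 8, 2), so each piece can be queued to the thread that serves its OST.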

regards,
matt

On 2010/1/25 12:05, Nicolas Williams wrote:
> On Sun, Jan 24, 2010 at 09:01:46AM +0800, jay wrote:
>> Alexey Lyashkov wrote:
>>> I correctly understand: you suggest a spawn one new thread per open
>>> file?
>>> so if client have 10 processes, and each process is open 100 files, you
>>> need spawn 1000 new threads?
>>>
>> No, per process readahead, or some system readahead thread pool, this is
>> because most of those threads are sleeping, and it consumes little time
>> to issue readahead requests. The idea behind the scheme is to issue
>> readahead rpcs async.
>
> Sleeping threads do consume memory resources, and context switches
> between them do add cache pressure.  The read ahead work should all be
> async, in which case you need no more readahead threads than you have
> CPUs.
>
>> BTW, I'm not going to implement what you mentioned in linux, because I
>> don't think this is a good idea, as what I said in design doc. However,
>> we HAVE to have an async thread pool to implement readahead for windows.
>> Windows doesn't have an interface of issuing async read request, lack of
>> a mechanism to have page lock or similar things - what a pity!
>
> But surely you can still do the readaheads asynchronously.  Say you
> think that block N of some file will be needed soon: so you issue the
> read ahead of time.  You'll need to place the data somewhere, and
> hopefully that will be somewhere that the host OS's VFS sub-system
> (Windows in your case) can either provide or accept -- if not you'll
> need to do a copy later, but you're still able to send the read request,
> and process the reply, asynchronously.
>
> Nico