<div dir="ltr">+ Andreas.<div><br></div><div>A few problems:</div><div>1. The Linux loop device won't work on top of Lustre in direct I/O mode, because Lustre direct I/O has to be page-size aligned and there seems to be no way to change the loop device's sector size to the page size;</div><div>2. 64KB is not an optimal RPC size for Lustre, so yes, eventually we are going to see a throughput issue if the RPC size is limited to 64KB;</div><div>3. It's hard to optimize I/O any further with the Linux loop device. With direct I/O, by default it has to wait for the current I/O to complete before it can send the next one, which is not good. I have revised the llite_lloop driver so that it can do async direct I/O, and performance improves significantly as a result.</div><div><br></div><div>I tried to increase the sector size of the Linux loop device and also max_{hw_}sectors_kb, but it didn't work. Please let me know if there is a way to do that.</div><div><br></div><div>Thanks,</div><div>Jinshan</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Mar 30, 2018 at 12:12 PM, James Simmons <span dir="ltr"><<a href="mailto:jsimmons@infradead.org" target="_blank">jsimmons@infradead.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><br>

> On Fri, Mar 23 2018, James Simmons wrote:<br>

><br>

> > Hi Neil<br>

> ><br>

> >       So once, long ago, Lustre had its own loopback device due to the<br>

> > upstream loopback device not supporting Direct I/O. Once it did we<br>

> > dropped support for our custom driver. Recently there has been interest<br>

> > in using the loopback driver, and Jinshan talked with me about reviving<br>

> > our custom driver, which I'm not thrilled about. He was seeing problems<br>

> > with Direct I/O above 64K. Do you know the details of why that limitation<br>

> > exists? Perhaps it can be resolved, or maybe we are missing something?<br>

> > Thanks for your help.<br>

><br>

> Hi James, and Jinshan,<br>

>  What sort of problems do you see with 64K DIO requests?<br>

>  Is it a throughput problem or are you seeing IO errors?<br>

>  Would it be easy to demonstrate the problem in a cluster<br>

>  comprising a few VMs, or is real hardware needed?  If VMs are OK,<br>

>  can you tell me exactly how to duplicate the problem?<br>

><br>

>  If loop gets a multi-bio request, it will allocate a bvec array<br>

>  to hold all the bio_vecs.  If there are more than 256 pages (1Meg)<br>

>  in a request, this could easily fail. 5 consecutive 64K requests on a<br>

>  machine without much free memory could hit problems here.<br>

>  If that is the problem, it should be easy to fix (request the number<br>

>  given to blk_queue_max_hw_sectors).<br>

<br>

Jinshan, can you post a reproducer so we can see the problem?

</blockquote></div><br></div>