[lustre-devel] lustre and loopback device
Jinshan Xiong
jinshan.xiong at gmail.com
Wed May 23 14:03:44 PDT 2018
It turned out that I was looking at a 4.11.x kernel, which supports only
'--direct-io' and not '--sector-size'.
With the latest kernel, we can do AIO+DIO, so no changes are necessary on
the kernel side. Patch https://review.whamcloud.com/32416 attempts to
accomplish that. Since AIO is used, the I/O size from the loop device is no
longer important because the I/Os will be merged at the OSC layer.
If everything works eventually, we can set up a loop device over lustre like:
losetup --direct-io -f --sector-size=4096 <lustre_reg_file>
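For readers who want to see what that command amounts to at the ioctl level, here is a rough sketch in Python. It is not taken from util-linux or from the patch above; the function name attach_loop is made up for illustration, the request numbers are copied from <linux/loop.h>, and the order of the calls is an assumption. It needs root and a kernel that supports LOOP_SET_BLOCK_SIZE.

```python
import fcntl
import os

# ioctl request numbers as defined in <linux/loop.h>
LOOP_SET_FD = 0x4C00
LOOP_CLR_FD = 0x4C01
LOOP_SET_DIRECT_IO = 0x4C08
LOOP_SET_BLOCK_SIZE = 0x4C09
LOOP_CTL_GET_FREE = 0x4C82


def attach_loop(backing_path, sector_size=4096):
    """Bind backing_path to a free loop device with direct I/O enabled.

    Hypothetical helper: roughly what
    'losetup --direct-io -f --sector-size=4096 <file>' asks the kernel for.
    """
    # Ask the loop control device for the index of an unused loop device.
    ctl = os.open("/dev/loop-control", os.O_RDWR)
    try:
        index = fcntl.ioctl(ctl, LOOP_CTL_GET_FREE)
    finally:
        os.close(ctl)

    loop_path = "/dev/loop%d" % index
    backing = os.open(backing_path, os.O_RDWR)
    loop = os.open(loop_path, os.O_RDWR)
    try:
        # Attach the backing file, then set the logical block size
        # before switching the device to direct I/O.
        fcntl.ioctl(loop, LOOP_SET_FD, backing)
        try:
            fcntl.ioctl(loop, LOOP_SET_BLOCK_SIZE, sector_size)
            fcntl.ioctl(loop, LOOP_SET_DIRECT_IO, 1)
        except OSError:
            # Detach again if the kernel rejects either setting.
            fcntl.ioctl(loop, LOOP_CLR_FD)
            raise
    finally:
        os.close(loop)
        os.close(backing)
    return loop_path
```

On a kernel without LOOP_SET_BLOCK_SIZE (such as the 4.11.x one mentioned above) the second ioctl should fail with an OSError, and the sketch detaches the device again.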
Thanks,
Jinshan
On Tue, May 22, 2018 at 3:55 PM, NeilBrown <neilb at suse.com> wrote:
> On Fri, Mar 30 2018, Jinshan Xiong wrote:
>
> > + Andreas.
> >
> > A few problems:
>
> Sorry that it has been 7 weeks, but I've finally scheduled time to
> have a proper look at this.
>
> > 1. Linux loop device won't work on top of Lustre in direct IO mode because
> > Lustre direct IO has to be pagesize aligned, and there seems to be no way
> > of changing sector size to pagesize for Linux loop device;
>
> The sector size for a loop device can be set with the --sector-size
> argument to losetup (or the LOOP_SET_BLOCK_SIZE ioctl). This is done
> from user-space, not from within the lustre module, of course.
> open(O_DIRECT) is documented as having size/alignment restrictions,
> so I think a good case could be made to change the handling of
> "losetup --raw" to imply a "--sector-size" setting, if we could
> determine an appropriate size automatically.
> The XFS_IOC_DIOINFO ioctl (see man xfsctl) can be used to ask a
> filesystem about alignment requirements, but is currently only
> supported for XFS. If we added support to lustre, and asked util-linux
> to use it to help configure a loop device, I suspect we could get
> success.
>
> There would probably be a request to hoist the ioctl out of xfs and add
> it to VFS. I cannot predict how that would go, but I think it would be
> good to pursue this approach.
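As an illustration of the query described above, the following Python sketch issues XFS_IOC_DIOINFO against a file and decodes the result. It assumes the asm-generic ioctl number encoding (as on x86 and arm64) and the three-field struct dioattr layout documented in xfsctl(3); the helper name query_dio_alignment is made up here. On a filesystem that does not implement the ioctl, which at the time of this thread includes lustre, it simply returns None.

```python
import fcntl
import struct

# struct dioattr from xfsctl(3): d_mem, d_miniosz, d_maxiosz (three __u32)
DIOATTR = struct.Struct("=III")


def _ior(type_char, nr, size):
    """Build a read-direction ioctl number (asm-generic encoding assumed)."""
    return (2 << 30) | (size << 16) | (ord(type_char) << 8) | nr


# Corresponds to _IOR('X', 30, struct dioattr) in the XFS headers
XFS_IOC_DIOINFO = _ior("X", 30, DIOATTR.size)


def query_dio_alignment(path):
    """Return the direct I/O constraints for path, or None if unsupported."""
    buf = bytearray(DIOATTR.size)
    with open(path, "rb") as f:
        try:
            # The kernel fills buf in place when the ioctl is supported.
            fcntl.ioctl(f.fileno(), XFS_IOC_DIOINFO, buf)
        except OSError:
            return None
    d_mem, d_miniosz, d_maxiosz = DIOATTR.unpack(buf)
    return {
        "memory_alignment": d_mem,
        "min_io_size": d_miniosz,
        "max_io_size": d_maxiosz,
    }
```

A tool like losetup could, under this proposal, use min_io_size as the sector size when direct I/O is requested.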
>
> You could try it in your own testing by using
> losetup -r --sector-size=4096 /dev/loopX filename
> to create a loop device.
>
> > 2. 64KB is not an optimal RPC size for Lustre, so yes eventually we are
> > going to see throughput issue if the RPC size is limited to 64KB;
>
> So let's find out where the 64KB limit is imposed, and raise it.
> Maybe it comes from
> lo->tag_set.queue_depth = 128;
> combined with the default sector size of 512.
> If so, then increasing the sector size to 4K should raise the RPC
> size to 512K.
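The arithmetic behind that guess can be written out as a short Python sketch. It only restates the hypothesis above, that the limit is the queue depth multiplied by the sector size; it does not confirm where the 64KB limit actually comes from, and the function name is made up for illustration.

```python
def max_request_bytes(queue_depth, sector_size):
    """Request size under the assumption: limit = queue depth * sector size."""
    return queue_depth * sector_size


QUEUE_DEPTH = 128  # value of lo->tag_set.queue_depth quoted above

for sector_size in (512, 4096):
    kib = max_request_bytes(QUEUE_DEPTH, sector_size) // 1024
    print("sector size %d: %d KiB" % (sector_size, kib))
# sector size 512: 64 KiB
# sector size 4096: 512 KiB
```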
>
> > 3. It's hard to do I/O optimization more with Linux loop device. With
> > direct I/O by default, it has to wait for the current I/O to complete
> > before it can send the next one. This is not good. I have revised the
> > llite_lloop driver so that it can do async direct I/O. Performance
> > improves significantly by doing so.
>
> This surprises me. Looking at the code in loop.c, I see a field
> ->use_aio which is set when direct_io is used (->use_dio), except
> for FLUSH, DISCARD and WRITE_ZEROES.
> ->use_dio is disabled if the filesystem has a block device
> (->i_sb->s_bdev != NULL) and alignment doesn't match, but that
> wouldn't apply to lustre.
> Linux gained aio in loop in Linux 4.4. What kernel version were you
> looking at?
>
> >
> > I tried to increase the sector size of Linux loop device and also
> > max_{hw_}sectors_kb but it didn't work. Please let me know if there
> > exist ways of doing that.
>
> If the --sector-size option to losetup doesn't work, we will have to make
> it work.
>
> Thanks,
> NeilBrown
>