[lustre-discuss] dd oflag=direct error (512 byte Direct I/O)

Patrick Farrell paf at cray.com
Tue Oct 30 08:10:55 PDT 2018


Andreas,

An interesting thought on this, as the same limitation came up recently in discussions with a Cray customer.  Strictly honoring the direct I/O expectations around data copying is apparently optional.  GPFS is a notable example: it allows direct I/O that is not page-aligned or a page-size multiple, but apparently (this is second-hand from a GPFS-knowledgeable person, so take it with a grain of salt) it services such requests through the buffered path (data copy, page cache, etc.) and then flushes the data, O_SYNC style.  My understanding from conversations is that this is the general approach taken by file systems that support unaligned direct I/O: they cheat a little and do buffered I/O in those cases.

So rather than refusing to perform unaligned direct I/O, we could emulate the approach taken by (some) other file systems.  There’s no clear standard here, but this is an option others have taken that might improve the user experience.  (I believe we persuaded our particular user to switch their code away from direct I/O, since they had no real reason to be using it.)
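As a rough illustration of that fallback idea, here is a user-space sketch (the function name, the hard-coded 4096-byte page size, and the fallback policy are my assumptions; GPFS and the other file systems mentioned do this inside the kernel, not in the application):

```python
import mmap
import os

PAGE = 4096  # assumed x86 PAGE_SIZE

def write_direct_or_buffered(path, data, offset=0):
    """Write data at offset. Use O_DIRECT only when the length and
    offset are page-aligned; otherwise fall back to buffered I/O and
    flush, O_SYNC style. Hypothetical sketch of the fallback idea."""
    if len(data) % PAGE == 0 and offset % PAGE == 0:
        try:
            fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_DIRECT, 0o644)
            try:
                buf = mmap.mmap(-1, len(data))  # anonymous mmap: page-aligned
                buf[:] = data                   # O_DIRECT needs an aligned buffer too
                return os.pwrite(fd, buf, offset)
            finally:
                os.close(fd)
        except OSError:
            pass  # O_DIRECT refused here; fall through to the buffered path
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        n = os.pwrite(fd, data, offset)
        os.fdatasync(fd)  # flush the page cache, like O_SYNC
        return n
    finally:
        os.close(fd)
```

With a wrapper like this, an unaligned 512-byte write (such as the one the user's dd command issues) would quietly take the buffered path instead of failing with EINVAL.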


- Patrick

From: lustre-discuss <lustre-discuss-bounces at lists.lustre.org> on behalf of 김형근 <okok102928 at fusiondata.co.kr>
Date: Sunday, October 28, 2018 at 11:40 PM
To: Andreas Dilger <adilger at whamcloud.com>
Cc: "lustre-discuss at lists.lustre.org" <lustre-discuss at lists.lustre.org>
Subject: Re: [lustre-discuss] dd oflag=direct error (512 byte Direct I/O)


The software I use is Red Hat Virtualization. When using a POSIX-compatible FS, it seems to perform direct I/O with a block size of 256512 bytes.



If I can't resolve the issue with my storage configuration, I will contact Red Hat.



Your answer was very helpful.

Thank you.

________________________________


From: Andreas Dilger <adilger at whamcloud.com>

To: 김형근 <okok102928 at fusiondata.co.kr>

Cc: lustre-discuss at lists.lustre.org <lustre-discuss at lists.lustre.org>

Date: 2018-10-25 16:47:58

Subject: Re: [lustre-discuss] dd oflag=direct error (512 byte Direct I/O)


On Oct 25, 2018, at 15:05, 김형근 wrote:
>
> Hi.
> It's a pleasure to meet you, the lustre specialists.
> (I do not speak English well ... Thank you for your understanding!)

Your english is better than my Korean. :-)

> I used the dd command on a Lustre mount point (using the oflag=direct option).
>
> ------------------------------------------------------------
> dd if=/dev/zero of=/mnt/testfile oflag=direct bs=512 count=1
> ------------------------------------------------------------
>
> I need direct I/O with a 512-byte block size.
> This is a required check in the software I use.

What software is it? Is it possible to change the application to use
4096-byte alignment?

> But unfortunately, if the direct option is present,
> bs must be a multiple of 4K (4096) to work (e.g. 8K, 12K, 256K, 1M, 8M).
> If you enter a value such as 512 or 4095, it does not work. The error message is as follows.
>
> 'dd: error writing [filename]: Invalid argument'
>
> My test systems are all up to date (RHEL, Lustre server and client).
> I have used both ldiskfs and ZFS as backing file systems. The result is the same.
>
>
> My question is simply two.
>
> 1. Why does direct I/O work only with block sizes that are multiples of 4K?

The client PAGE_SIZE on an x86 system is 4096 bytes. The Lustre client
cannot cache data in units smaller than PAGE_SIZE, so the current
implementation limits O_DIRECT reads and writes to multiples of PAGE_SIZE.
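So an application that wants O_DIRECT to work on a Lustre client has to round its I/O sizes up to PAGE_SIZE multiples and write from page-aligned memory. A minimal sketch of that padding arithmetic (the helper names and the truncate-after-write trick are mine, not anything Lustre provides):

```python
import mmap
import os

PAGE = 4096  # x86 PAGE_SIZE, the granularity the client can cache

def round_up_to_page(n):
    """Round n up to the next multiple of PAGE."""
    return (n + PAGE - 1) // PAGE * PAGE

def direct_write_padded(path, data):
    """Hypothetical helper: pad data out to a PAGE multiple, write it
    with O_DIRECT from page-aligned memory, then truncate the file
    back to the real length so the zero padding is not visible."""
    padded = round_up_to_page(len(data))
    buf = mmap.mmap(-1, padded)      # anonymous mmap: page-aligned, zero-filled
    buf[:len(data)] = data
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_DIRECT, 0o644)
    try:
        os.pwrite(fd, buf, 0)
        os.ftruncate(fd, len(data))  # drop the padding bytes
    finally:
        os.close(fd)
```

This is also why the dd test above fails with bs=512 but succeeds with bs=4096 or any other 4K multiple.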

I think the same would happen if you tried to use O_DIRECT on a drive with
native 4096-byte sectors (https://en.wikipedia.org/wiki/Advanced_Format#4K_native).

> 2. Can I change the settings of the server and client to enable 512-byte direct I/O?

This would not be possible without changing the Lustre client code.
I don't know how easy that would be to do while still ensuring that
the 512-byte writes are handled correctly.

So far we have not had other requests to change this limitation, so
it is not a high priority to change on our side, especially since
applications will have to deal with 4096-byte sectors in any case.

Cheers, Andreas
---
Andreas Dilger
Principal Lustre Architect
Whamcloud
