[Lustre-discuss] short writes
Kevin Van Maren
kevin.van.maren at oracle.com
Thu Jul 8 15:48:41 PDT 2010
John Hammond wrote:
> On 07/08/2010 08:53 AM, Kevin Van Maren wrote:
>> Hi David,
>>
>> I've also seen short writes on local file systems -- can't even count
>> the number of times I've modified codes to use wrappers that handle
>> short reads/writes. Not at all surprised you see them when suspending
>> the app.
>>
>> http://www.opengroup.org/onlinepubs/000095399/functions/write.html
>> "If write() is interrupted by a signal after it successfully writes some
>> data, it shall return the number of bytes written."
>> Similar language exists for read as well. I always thought libc should
>> handle the retry for you by default, but I didn't write the spec.
>>
>> Signals are relatively rare, and the window is a bit smaller for a local
>> file system, which may be why they haven't seen it or properly dealt
>> with it yet.
>
> It also says "The issue of which files or file types are interruptible
> is considered an implementation design issue. This is often affected
> primarily by hardware and reliability issues."
>
> For Linux, the signal(7) manpage indicates that read(2), readv(2),
> write(2), writev(2), and ioctl(2) calls on "slow" devices should
> return -EINTR when interrupted by a signal, and goes on to say that
> "slow" devices are ones "where the I/O call may block for an
> indefinite time, for example, a terminal, pipe, or socket. (A disk is
> not a slow device according to this definition.)"
How about a network file system waiting for server failover (especially
if it is not automatic)?
> Nowhere does it say something really helpfully clear like "Writing to
> a regular file shall suspend the calling process until such time
> as..." But, I interpret this to mean that operations on regular files
> are not interruptible, and should not return -EINTR. Moreover, I
> understand that this is the consensus among those unlucky enough to care.
>
> On the other hand, there are some explicitly specified situations
> which will result in short writes to a regular file, like file size
> limits.
With NFS, "hard,intr" is the most sane configuration. For Lustre,
operations (should) become interruptible after the initial timeout
period has passed.
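
For reference, here is a minimal sketch of the kind of wrapper mentioned
above: it keeps calling write(2) until every byte is written or a real error
occurs. The name write_all is hypothetical (it is not part of libc or Lustre),
and treating a zero-byte return as EIO is an assumption made only to keep the
loop from spinning.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * write_all: hypothetical helper, not a libc or Lustre function.
 * Retries write(2) after short writes and after EINTR.
 * Returns 0 once all `count` bytes are written, or -1 with errno set.
 */
static int write_all(int fd, const void *buf, size_t count)
{
    const char *p = buf;

    while (count > 0) {
        ssize_t n = write(fd, p, count);

        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before any data moved: retry */
            return -1;          /* real error: errno is set by write(2) */
        }
        if (n == 0) {
            errno = EIO;        /* assumption: no progress is treated as an error */
            return -1;
        }
        p += n;                 /* short write: advance past what was written */
        count -= (size_t)n;
    }
    return 0;
}
```

A caller would use write_all(fd, buf, len) in place of a bare write() and
check only for -1. A read-side wrapper looks the same, except that a return of
0 from read(2) means end of file rather than an error.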
Kevin