[Lustre-discuss] short writes

Kevin Van Maren kevin.van.maren at oracle.com
Thu Jul 8 15:48:41 PDT 2010


John Hammond wrote:
> On 07/08/2010 08:53 AM, Kevin Van Maren wrote:
>> Hi David,
>>
>> I've also seen short writes on local file systems -- can't even count
>> the number of times I've modified codes to use wrappers that handle
>> short reads/writes.  Not at all surprised you see them when suspending
>> the app.
>>
>> http://www.opengroup.org/onlinepubs/000095399/functions/write.html
>> "If write() is interrupted by a signal after it successfully writes some
>> data, it shall return the number of bytes written."
>> Similar language exists for read as well.  I always thought libc should
>> handle the retry for you by default, but I didn't write the spec.
>>
>> Signals are relatively rare, and the window is a bit smaller for a local
>> file system, which may be why they haven't seen it/properly dealt with
>> it yet.
>
> It also says "The issue of which files or file types are interruptible 
> is considered an implementation design issue. This is often affected 
> primarily by hardware and reliability issues."
>
> For Linux, the signal(7) manpage indicates that read(2), readv(2), 
> write(2), writev(2), and ioctl(2) calls on "slow" devices should 
> return -EINTR when interrupted by a signal, and goes on to say that 
> "slow" devices are ones "where the I/O call may block for an 
> indefinite time, for example, a terminal, pipe, or socket.  (A disk is 
> not a slow device according to this definition.)"

How about a network file system waiting for server failover (especially 
if failover is not automatic)?  That wait can be indefinite too.

> Nowhere does it say something really helpfully clear like "Writing to 
> a regular file shall suspend the calling process until such time 
> as..." But, I interpret this to mean that operations on regular files 
> are not interruptible, and should not return -EINTR.  Moreover, I 
> understand that this is the consensus among those unlucky enough to care.
>
> On the other hand, there are some explicitly specified situations 
> which will result in short writes to a regular file, like file size 
> limits.

With NFS, mounting with "hard,intr" is the most sane configuration.  For
Lustre, operations (should) become interruptible after the initial
timeout period has passed.
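
For reference, here is a minimal sketch of the kind of wrapper mentioned
above for handling short writes.  It is in Python for brevity and assumes
a blocking file descriptor; write_all is a hypothetical helper name, not
part of libc or Lustre.  Python 3.5 and later already retry on EINTR
internally, so only the short-write case needs an explicit loop here.

```python
import os


def write_all(fd, data):
    """Write all of `data` to `fd`, looping until every byte is accepted.

    Hypothetical helper for illustration; assumes `fd` is a blocking
    descriptor and `data` is a bytes-like object.
    """
    view = memoryview(data)
    total = 0
    while total < len(view):
        # os.write() may accept fewer bytes than requested (a short
        # write); it returns the number of bytes actually written.
        n = os.write(fd, view[total:])
        if n == 0:
            # Avoid spinning forever if no progress can be made.
            raise OSError("write returned 0 bytes")
        total += n
    return total
```

A caller would use write_all(fd, buf) wherever it previously called
os.write(fd, buf) and assumed the whole buffer went out.  The same loop
structure applies to reads, with the difference that a return of zero
bytes there means end of file rather than an error.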

Kevin



