[Lustre-discuss] short writes

Brian J. Murrell Brian.Murrell at oracle.com
Thu Jul 8 07:05:09 PDT 2010


On Thu, 2010-07-08 at 23:25 +1000, David Singleton wrote: 
> The POSIX standard pretty clearly allows short writes to occur (number of
> bytes written less than requested in a successful call to write)

Yes, I believe that is true.  The write(2) manpage also indicates that a
write of less than the requested bytes is possible.

> but it's
> not something you see very often

Not with local disk, no.

> and I don't think many users/applications
> expect it to occur when writing to disk-based files.

Indeed.  Sadly many applications don't expect a lot of conditions that
are possible and allowed but historically not seen.

> We manage jobs using
> simple SIGSTOP/SIGCONT based suspend/resume and occasionally jobs will flag
> a short write immediately after a SIGCONT.

Interesting.

> The application incorrectly
> treats this as an error and aborts.

Yes, if an application is treating a short write as an error, I do
believe that that is an application defect.

> Adding code to complete the write
> appears to fix the problem (as you'd hope).

No surprise.  :-)
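
For illustration, a minimal sketch in C of what such "complete the write"
code might look like. The helper name write_all is hypothetical and is not
taken from the application under discussion. Retrying on EINTR and treating
a zero-byte return as EIO are assumptions of this sketch, not something
stated in the thread.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): keep calling write(2) until
 * all `count` bytes have been written or a real error occurs.
 * Returns 0 on success, -1 on failure with errno set.
 */
static int write_all(int fd, const void *buf, size_t count)
{
    const char *p = buf;
    size_t remaining = count;

    while (remaining > 0) {
        ssize_t n = write(fd, p, remaining);

        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before any data was written: retry */
            return -1;          /* genuine error, errno set by write(2) */
        }
        if (n == 0) {
            /* Assumption of this sketch: no progress is reported as EIO
             * rather than looping forever. */
            errno = EIO;
            return -1;
        }

        /* n may be smaller than remaining: a short write, not an error.
         * Advance past the bytes already written and try again. */
        p += n;
        remaining -= (size_t)n;
    }
    return 0;
}
```

A caller would replace a bare write(fd, buf, len) with write_all(fd, buf, len)
and treat only a -1 return as a failure.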

> Now we are at the stage of
> "debating" with the application developers whether it's their problem or
> Lustre's.

Well, I'm not sure how much of a debate there is there.  If POSIX is
clearly allowing this and given that Lustre is (mostly, if not
completely) POSIX compliant, it should be clear that this is not a
defect in Lustre and that the application(s) should be fixed to handle
what is legal behaviour.

> Is this considered normal Lustre behaviour?

Hopefully one of the developers with more intimate knowledge can
answer definitively about the particular conditions in Lustre that can
cause short writes.  But if POSIX allows it and we have a need to take
advantage of it, it would not be at all surprising that we do.

After all, from an application's point of view, if there were some
situation in the filesystem in which it simply cannot perform the
complete write, but could write a smaller portion, with the possibility
that another write will also succeed, wouldn't that be better than
an outright EIO?

b.
