[lustre-devel] [PATCH] lustre: check copy_from_iter/copy_to_iter return code

James Simmons jsimmons at infradead.org
Sat Jul 15 07:40:29 PDT 2017

On Fri, 14 Jul 2017, Al Viro wrote:

> On Thu, Jul 13, 2017 at 10:57:59PM +0200, Arnd Bergmann wrote:
> > Thanks for testing it!
> > 
> > That means we did not copy any data and the kernel continues with
> > an uninitialized buffer, right? The problem may be the definition of
> > 
> > struct kib_immediate_msg {
> >         struct lnet_hdr ibim_hdr;        /* portals header */
> >         char         ibim_payload[0]; /* piggy-backed payload */
> > } WIRE_ATTR;
> > 
> > The check that Al added will try to ensure that we don't write
> > beyond the size of the ibim_payload[] array, which unfortunately
> > is defined as a zero-byte array, so I can see why it will now
> > fail. However, it's already broken in mainline now, with or without
> > my patch.
> > 
> > Are you able to come up with a fix that avoids the warning in
> > 'allmodconfig' and makes the function do something reasonable
> > again?

Yes, I'm testing a fix right now which I will merge with the original
patch. Greg, this patch will need to be sent to Linus as well so the
kernel release isn't broken for users.

> Might make sense to try and use valid C99 for "array of indefinite
> size as the last member", i.e.
> struct kib_immediate_msg {
>          struct lnet_hdr ibim_hdr;        /* portals header */
>          char         ibim_payload[]; /* piggy-backed payload */
> } WIRE_ATTR;
> 	Zero-sized array as the last member is gcc hack predating that;
> looks like gcc gets confused into deciding that it knows the distance
> from the end of object...

I did some profiling and found gcc was doing the right thing. That
should be updated to a C99 flexible array in a later patch.
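
For reference, a minimal user-space sketch of the C99 flexible array form
suggested above. It is only an illustration, not the planned patch:
struct demo_hdr, struct demo_immediate_msg and demo_msg_alloc() are
hypothetical stand-ins for struct lnet_hdr, struct kib_immediate_msg and
the real allocation path, and the WIRE_ATTR packing attribute is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct lnet_hdr; the real header is larger. */
struct demo_hdr {
	unsigned int type;
	unsigned int payload_len;
};

/* C99 flexible array member: the payload length is not part of the type. */
struct demo_immediate_msg {
	struct demo_hdr hdr;	/* header */
	char payload[];		/* piggy-backed payload */
};

/* Allocate a message with room for exactly nob payload bytes. */
static struct demo_immediate_msg *demo_msg_alloc(const void *data, size_t nob)
{
	struct demo_immediate_msg *msg;

	/* sizeof(*msg) covers the header only; the payload is added on top. */
	msg = malloc(sizeof(*msg) + nob);
	if (!msg)
		return NULL;
	msg->hdr.type = 0;
	msg->hdr.payload_len = (unsigned int)nob;
	if (nob)
		memcpy(msg->payload, data, nob);
	return msg;
}
```

With this form sizeof() reports only the header, so the number of valid
payload bytes has to be tracked separately by the caller.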

> 	Said that, are we really guaranteed the IBLND_MSG_SIZE bytes
> in there?

This is what the real bug was. In the current code we are telling
copy_from_iter and copy_to_iter that the number of bytes is always
IBLND_MSG_SIZE. Arnd thought this was always the size, so in his
patch he was comparing the return value of copy_[from|to]_iter against
IBLND_MSG_SIZE. This nearly always failed since variable sized messages
are being created. The zero size I initially saw was from doing pings.
When I later tested by pushing I/O, packets of other sizes were
observed, but none of them were IBLND_MSG_SIZE in size so they failed to
transmit. As soon as I'm done testing I will send a patch.
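
To illustrate the difference between the two checks, here is a rough
user-space sketch. It is not the actual fix: struct demo_iter,
demo_copy_from_iter(), demo_fill_fixed(), demo_fill_nob() and
DEMO_MSG_SIZE are all hypothetical stand-ins. The mock copy only imitates
the behaviour that matters here, namely that it copies at most the
requested number of bytes and returns how many it actually copied.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MSG_SIZE 4096	/* hypothetical stand-in for IBLND_MSG_SIZE */

/* Hypothetical user-space stand-in for an iov_iter: one flat buffer. */
struct demo_iter {
	const char *data;
	size_t remaining;
};

/* Copies at most 'bytes' from the iterator and returns the count copied. */
static size_t demo_copy_from_iter(void *dst, size_t bytes, struct demo_iter *it)
{
	size_t n = bytes < it->remaining ? bytes : it->remaining;

	memcpy(dst, it->data, n);
	it->data += n;
	it->remaining -= n;
	return n;
}

/* Problem pattern: always ask for the maximum and require all of it. */
static int demo_fill_fixed(char *payload, struct demo_iter *it)
{
	size_t copied = demo_copy_from_iter(payload, DEMO_MSG_SIZE, it);

	return copied == DEMO_MSG_SIZE ? 0 : -1;
}

/* Ask for the real payload length (nob) and check against that. */
static int demo_fill_nob(char *payload, size_t nob, struct demo_iter *it)
{
	if (nob > DEMO_MSG_SIZE)
		return -1;	/* would not fit in the message buffer */
	return demo_copy_from_iter(payload, nob, it) == nob ? 0 : -1;
}
```

In this sketch a zero-length ping or a 5-byte payload fails
demo_fill_fixed() but passes demo_fill_nob(), which matches the failures
described above.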
