[Lustre-devel] o2iblnd bug ?
Zhen.Liang at Sun.COM
Thu Jul 1 14:27:33 PDT 2010
Nic Henke wrote:
> There looks to be a bug in the o2iblnd (and maybe other LNDs...) in
> When tx_lntmsg has a reply allocated (lnet_create_reply_msg) for a
> GET_REQ, we are committed to lnet_finalize that no matter the status of
> the RDMA. However, kiblnd_tx_done will call lnet_finalize() with the
> 'error' status on both the request (lntmsg) and the allocated reply.
> This could lead to the upper layer receiving a REPLY event for a message
> it has already nuked due to the EIO on the original request.
I think lnet_create_reply_msg() has already taken an extra reference on the MD
(lnet_create_reply_msg() -> lnet_commit_md()), so the upper-layer message
shouldn't be nuked before the last event (unlinked).
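
To make the reference argument concrete, here is a toy model in C. It is
only a sketch and not the real LNet code: struct toy_md and the toy_*
functions are made-up names, and the only assumption taken from LNet is
that each committed message holds one reference on the MD and that the
unlinked event is delivered when the last reference is dropped.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for an LNet memory descriptor (MD); not the real struct. */
struct toy_md {
    int  refcount;  /* messages currently committed to this MD */
    bool unlinked;  /* set once the final (unlinked) event has fired */
};

/* Models a message taking a reference on the MD, as the reply is said
 * to do via lnet_create_reply_msg() -> lnet_commit_md(). */
static void toy_commit_md(struct toy_md *md)
{
    md->refcount++;
}

/* Models finalizing one message: drop its reference and deliver the
 * unlinked event only when the last reference goes away.
 * Returns true if this call produced the unlinked event. */
static bool toy_finalize(struct toy_md *md)
{
    assert(md->refcount > 0);
    if (--md->refcount == 0) {
        md->unlinked = true;
        return true;
    }
    return false;
}

/* Scenario from this thread: the GET request and its pre-allocated
 * reply are finalized one after the other.
 * Returns 1 if the MD is unlinked only by the second finalize. */
static int toy_get_scenario(void)
{
    struct toy_md md = { 0, false };
    bool first, second;

    toy_commit_md(&md);          /* the GET request */
    toy_commit_md(&md);          /* the pre-allocated reply */
    first  = toy_finalize(&md);  /* request finalized */
    second = toy_finalize(&md);  /* reply finalized */
    return !first && second && md.unlinked;
}
```

In this model the status passed to each finalize does not matter for the
lifetime of the MD; what matters is that the reply still holds its
reference when the request is finalized.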
> In the ptllnd and qswlnd, they seem to handle this properly. They will
> complete the request with rc=0, then complete the reply with rc=-EIO.
> So - is this really a bug or just inconsequential differences ?
> This looks to be present in HEAD, as well as b1_8 and friends.
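
For comparison, the two completion patterns described above can be written
down as a small C sketch. This is an illustration of the difference only,
not code from any LND: the toy_* names and the event log are hypothetical,
and the statuses are the ones given in the quoted text.

```c
#include <assert.h>
#include <errno.h>

enum toy_kind { TOY_REQUEST, TOY_REPLY };

/* One finalize call as seen by the upper layer. */
struct toy_event {
    enum toy_kind kind;
    int           status;
};

/* Records the (at most two) finalize calls for one GET. */
struct toy_log {
    struct toy_event ev[2];
    int              n;
};

/* Hypothetical stand-in for lnet_finalize(): just records the call. */
static void toy_finalize(struct toy_log *log, enum toy_kind kind, int status)
{
    assert(log->n < 2);
    log->ev[log->n].kind = kind;
    log->ev[log->n].status = status;
    log->n++;
}

/* Pattern described for o2iblnd: request and reply are both finalized
 * with the same error status. */
static void toy_done_same_status(struct toy_log *log, int rc)
{
    toy_finalize(log, TOY_REQUEST, rc);
    toy_finalize(log, TOY_REPLY, rc);
}

/* Pattern described for qswlnd: the request completes with rc=0 and
 * only the reply carries the failure. */
static void toy_done_split_status(struct toy_log *log, int rc)
{
    toy_finalize(log, TOY_REQUEST, 0);
    toy_finalize(log, TOY_REPLY, rc);
}
```

Calling either function with rc = -EIO yields two events in the same order
(request, then reply); the only difference is the status reported on the
request.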