<div dir="ltr"><div class="" style="margin:0px;padding:0px;color:rgb(51,51,51);font-family:Arial,sans-serif;font-size:14px;line-height:20px;background-color:rgb(224,240,255)"><div class="" style="margin:0px;padding:0px;line-height:1.5"><a class="" rel="scheremencev" id="commentauthor_1154612_verbose" href="https://jira.xyratex.com/secure/ViewProfile.jspa?name=scheremencev" style="color:rgb(59,115,175);text-decoration:none;padding:2px 0px 2px 19px;background-repeat:no-repeat">Sergey Cheremensev</a> comment - </div><div class="" style="margin:0px;padding:0px;line-height:1.5"><br></div></div><div class="" style="margin:10px 0px 0px;padding:0px;color:rgb(51,51,51);font-family:Arial,sans-serif;font-size:14px;line-height:20px;background-color:rgb(224,240,255)"><div class="" style="margin:9px 0px;padding:0px;border:1px solid rgb(204,204,204);font-size:12px;line-height:1.33333;font-family:monospace;border-radius:3px;background:rgb(245,245,245)"><div class="" style="margin:0px;padding:9px 12px"><pre style="margin-top:0px;margin-bottom:0px;padding:0px;max-height:30em;overflow:auto;word-wrap:normal"><span style="line-height:16px;white-space:normal">[49672.067906] mlx5_ib:mlx5_0:calc_sq_size:485:(pid 8297): wqe_size 192</span><br></pre><pre style="margin-top:0px;margin-bottom:0px;padding:0px;max-height:30em;overflow:auto;word-wrap:normal">[49672.067908] mlx5_ib:mlx5_0:calc_sq_size:507:(pid 8297): wqe count(65536) exceeds limits(16384)
[49672.067910] mlx5_ib:mlx5_0:create_kernel_qp:1051:(pid 8297): err -12
</pre></div></div><p style="margin:10px 0px 0px;padding:0px">According to the log above, mlx5 has an internal limit of 16384 on the wqe count:</p><div class="" style="margin:9px 0px;padding:0px;border:1px solid rgb(204,204,204);font-size:12px;line-height:1.33333;font-family:monospace;border-radius:3px;background:rgb(245,245,245)"><div class="" style="margin:0px;padding:9px 12px"><pre style="margin-top:0px;margin-bottom:0px;padding:0px;max-height:30em;overflow:auto;word-wrap:normal">wq_size = roundup_pow_of_two(attr->cap.max_send_wr * wqe_size);
qp->sq.wqe_cnt = wq_size / MLX5_SEND_WQE_BB;
if (qp->sq.wqe_cnt > (1 << MLX5_CAP_GEN(dev->mdev, log_max_qp_sz))) {
	mlx5_ib_dbg(dev, "wqe count(%d) exceeds limits(%d)\n",
		    qp->sq.wqe_cnt,
		    1 << MLX5_CAP_GEN(dev->mdev, log_max_qp_sz));
	return -ENOMEM;
}
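
/*
 * Illustration only, not part of the mlx5 source: a minimal sketch of the
 * arithmetic in the check above, applied to the values from the log.
 * Assumptions: MLX5_SEND_WQE_BB is 64 bytes, and the device limit
 * (1 << log_max_qp_sz) is 16384, as the log reports. The helper names
 * are hypothetical.
 */
static unsigned long example_roundup_pow_of_two(unsigned long n)
{
	unsigned long p = 1;

	/* Smallest power of two that is not below n. */
	while (p < n)
		p *= 2;
	return p;
}

/* Number of 64-byte send WQE basic blocks used for max_send_wr requests. */
static unsigned long example_sq_wqe_cnt(unsigned long max_send_wr,
					unsigned long wqe_size)
{
	return example_roundup_pow_of_two(max_send_wr * wqe_size) / 64;
}

/*
 * With wqe_size 192, a wqe count of 65536 corresponds to a max_send_wr
 * between 10923 and 21845. Under the same assumptions, the largest
 * max_send_wr that stays within 16384 is 5461:
 *   5461 * 192 = 1048512, rounded up to 1048576 = 16384 * 64
 *   5462 * 192 = 1048704, rounded up to 2097152 = 32768 * 64
 */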
</pre></div></div><p style="margin:10px 0px 0px;padding:0px">So the -12 (ENOMEM) in the "Can't create QP" message does not indicate any shortage of free memory in the system.<br>IMO a better error code here would be -EINVAL.<br>It seems that peer_credits==16 is the maximum value mlx5 supports.</p></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Mon, May 9, 2016 at 5:53 PM, James Simmons <span dir="ltr"><<a href="mailto:jsimmons@infradead.org" target="_blank">jsimmons@infradead.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">From: Dmitry Eremin <<a href="mailto:dmitry.eremin@intel.com">dmitry.eremin@intel.com</a>><br>
<br>
Decrease cap.max_send_wr until it is accepted by rdma_create_qp()<br>
<br>
Signed-off-by: Dmitry Eremin <<a href="mailto:dmitry.eremin@intel.com">dmitry.eremin@intel.com</a>><br>
Intel-bug-id: <a href="https://jira.hpdd.intel.com/browse/LU-7124" rel="noreferrer" target="_blank">https://jira.hpdd.intel.com/browse/LU-7124</a><br>
Reviewed-on: <a href="http://review.whamcloud.com/18347" rel="noreferrer" target="_blank">http://review.whamcloud.com/18347</a><br>
Reviewed-by: Olaf Weber <<a href="mailto:olaf@sgi.com">olaf@sgi.com</a>><br>
Reviewed-by: Doug Oucharek <<a href="mailto:doug.s.oucharek@intel.com">doug.s.oucharek@intel.com</a>><br>
Reviewed-by: Oleg Drokin <<a href="mailto:oleg.drokin@intel.com">oleg.drokin@intel.com</a>><br>
Signed-off-by: James Simmons <<a href="mailto:jsimmons@infradead.org">jsimmons@infradead.org</a>><br>
---<br>
.../staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c | 11 ++++++++++-<br>
1 files changed, 10 insertions(+), 1 deletions(-)<br>
<br>
diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c<br>
index d99b4fa..bc179a2 100644<br>
--- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c<br>
+++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c<br>
@@ -768,7 +768,12 @@ kib_conn_t *kiblnd_create_conn(kib_peer_t *peer, struct rdma_cm_id *cmid,<br>
<br>
conn->ibc_sched = sched;<br>
<br>
- rc = rdma_create_qp(cmid, conn->ibc_hdev->ibh_pd, init_qp_attr);<br>
+ do {<br>
+ rc = rdma_create_qp(cmid, conn->ibc_hdev->ibh_pd, init_qp_attr);<br>
+ if (!rc || init_qp_attr->cap.max_send_wr < 16)<br>
+ break;<br>
+ } while (rc);<br>
+<br>
if (rc) {<br>
CERROR("Can't create QP: %d, send_wr: %d, recv_wr: %d\n",<br>
rc, init_qp_attr->cap.max_send_wr,<br>
@@ -776,6 +781,10 @@ kib_conn_t *kiblnd_create_conn(kib_peer_t *peer, struct rdma_cm_id *cmid,<br>
goto failed_2;<br>
}<br>
<br>
+ if (init_qp_attr->cap.max_send_wr != IBLND_SEND_WRS(conn))<br>
+ CDEBUG(D_NET, "original send wr %d, created with %d\n",<br>
+ IBLND_SEND_WRS(conn), init_qp_attr->cap.max_send_wr);<br>
+<br>
LIBCFS_FREE(init_qp_attr, sizeof(*init_qp_attr));<br>
<br>
/* 1 ref for caller and each rxmsg */<br>
<span class="HOEnZb"><font color="#888888">--<br>
1.7.1<br>
<br>
</font></span></blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature"><div dir="ltr">Alexey Lyashkov <strong>·</strong> Technical lead for a Morpheus team<br>
Seagate Technology, LLC<br>
<a href="http://www.seagate.com" target="_blank">www.seagate.com</a><br><div><a href="http://www.lustre.org" target="_blank">www.lustre.org</a></div></div></div>
</div>