[Lustre-discuss] Lustre-126.96.36.199 over o2ib gives Input/Output error while executing lctl ping
vipul at chelsio.com
Fri Feb 26 00:56:13 PST 2010
This was very helpful. Thanks a lot for your response.
From: He.Huang at Sun.COM [mailto:He.Huang at Sun.COM]
Sent: 26 February 2010 02:27
To: Vipul Pandya
Cc: lustre-discuss at lists.lustre.org
Subject: Re: [Lustre-discuss] Lustre-188.8.131.52 over o2ib gives
Input/Output error while executing lctl ping
On Mon, Feb 22, 2010 at 03:22:52AM -0800, Vipul Pandya wrote:
> Hello Issac,
> I lowered the map_on_demand value to 16 and now it works fine.
> However, I had one concern: would lowering the map_on_demand
> value impact the performance of Lustre?
For iWARP, you probably have no alternative. I remember that there is
a restriction somewhere in the iWARP stack that limits the size of SQs
(which was why the rdma_create_qp errors happened), and lowering
map_on_demand is the only way to reduce the Lustre SQ length.
For InfiniBand, lowering map_on_demand essentially reduces the number of
RDMA WQEs needed for each Lustre bulk data movement, at the cost of at
most one memory registration/deregistration per bulk transfer; without
map_on_demand the o2iblnd uses a static MR, so there is no memory
registration cost. There could be a point in the number of frags of the
bulk buffer where the cost of handling RDMA WQEs (the number of which
usually equals the number of frags) exceeds the cost of MR, and that is
what you should set map_on_demand to. However, since both costs are
mostly determined by the HCA hardware/firmware implementation, there is
no one good setting for all interconnects, and you can only find it by
testing. The LNet selftest is a useful tool for running such tests.
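
To make the break-even idea concrete, here is a rough sketch in Python. It is only a toy cost model, not anything taken from o2iblnd: the function name, the per-WQE and per-registration costs, and the 256-frag upper bound are all assumptions for illustration, and the real costs have to come from measurements on your own HCA.

```python
def break_even_frags(wqe_cost_us, mr_cost_us, max_frags=256):
    """Return the smallest frag count at which posting one WQE per frag
    costs more than one registration/deregistration plus a single WQE.

    Both costs are hypothetical inputs (microseconds) that you would
    measure yourself; max_frags is an assumed upper bound on frags per
    bulk transfer. Returns None if the static-MR path stays cheaper
    over the whole range.
    """
    for frags in range(1, max_frags + 1):
        static_cost = frags * wqe_cost_us        # static MR: one WQE per frag
        mapped_cost = mr_cost_us + wqe_cost_us   # mapped: one MR op + one WQE
        if static_cost > mapped_cost:
            return frags
    return None


if __name__ == "__main__":
    # Made-up numbers purely to show the arithmetic, not measurements.
    print(break_even_frags(wqe_cost_us=0.5, mr_cost_us=8.0))
```

With these made-up inputs the crossover lands at 18 frags; with real hardware the number could be very different, which is why testing on the actual interconnect is the only reliable way to pick a value.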
Hope this helps,