<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=ISO-8859-1">
</head>
<body bgcolor="#FFFFFF" text="#000000">
Hi,<br>
<div class="moz-forward-container"> <br>
I am trying to build Lustre 2.5.0 against
MLNX_OFED_LINUX-2.2-1.0.1-rhel6.4-x86_64 on CentOS 6.4 with kernel
version 2.6.32-358.<br>
However, I am not able to configure LNet properly. I used the
settings suggested in the Lustre 2.x manual, but I cannot bring the
network up using lctl.<br>
<br>
Details:<br>
<br>
I have two server machines, one for the MGS+MDT and the second for the OSS, and
one client machine. I want to set up InfiniBand on all of these
machines.<br>
I ran the steps below successfully on all three machines:<br>
1. Run the mlnxofedinstall script:<br>
<font face="Courier New"># ./mlnxofedinstall -vvv
--add-kernel-support --without-32bit --without-fw-update --hpc</font><br>
2. Restart the openibd service:<br>
<font face="Courier New"># /etc/init.d/openibd restart</font><br>
3. Configure the ib0 interface.<br>
4. Configure Lustre with o2ib:<br>
<font face="Courier New">#</font> <font face="Courier New">./configure
--with-linux=Path_to_linux-2.6.32-358.18.1.el6
--with-o2ib=/usr/src/ofa_kernel/default/</font> <br>
<br>
5. Build the Lustre RPMs:<br>
<font face="Courier New"> # make rpms</font><br>
This gave me a compilation error.<br>
I searched online for this error and found a bug filed for
it: <a moz-do-not-send="true" class="moz-txt-link-freetext"
href="https://jira.hpdd.intel.com/browse/LU-4266">https://jira.hpdd.intel.com/browse/LU-4266</a><br>
The patch below, taken from that ticket, solved the problem, and I could
then build the Lustre RPMs:<br>
<a class="moz-txt-link-freetext"
href="http://review.whamcloud.com/#/c/8451/1">http://review.whamcloud.com/#/c/8451/1</a><br>
<br>
Now I first want to set up InfiniBand for the MGS and MDT on a
single machine, which also has an Ethernet IP address. Then I want to format
and mount the MGS and MDT.<br>
So I installed the Lustre RPMs built above and then added the line below
to <span style="color: rgb(0, 0, 0); font-family: monospace;
font-size: medium; font-style: normal; font-variant: normal;
font-weight: normal; letter-spacing: normal; line-height:
normal; orphans: auto; text-align: start; text-indent: 0px;
text-transform: none; white-space: normal; widows: auto;
word-spacing: 0px; -webkit-text-stroke-width: 0px; display:
inline !important; float: none;">/etc/modprobe.d/lustre.conf</span><span
style="color: rgb(0, 0, 0); font-family: monospace; font-size:
medium; font-style: normal; font-variant: normal; font-weight:
normal; letter-spacing: normal; line-height: normal; orphans:
auto; text-align: start; text-indent: 0px; text-transform: none;
white-space: normal; widows: auto; word-spacing: 0px;
-webkit-text-stroke-width: 0px; display: inline !important;
float: none;"><br>
</span>options lnet networks=o2ib(ib0)<br>
<br>
Then I rebooted the machine to unload all Lustre-related modules,
including lnet, and ran the<font face="Courier New"> modprobe lnet</font>
command to load it with the above parameters. I then ran <font face="Courier
New">lctl network up</font>, which gives me the error below:<br>
LNET configure error 100: Network is down<br>
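<br>
One sanity check that may help at this point is to confirm that every interface named in the networks option actually exists and to see its link state. The following is only a minimal sketch, not something taken from the Lustre manual: the helper names lnet_ifaces and check_ifaces are made up for illustration, and it assumes one interface per network, as in the lines used in this mail.<br>
<pre>
```shell
#!/bin/sh
# Hypothetical helper: print the interface names found in an LNet
# "networks" string such as "tcp0(eth0),o2ib(ib0)", one per line.
# Assumes a single interface inside each pair of parentheses.
lnet_ifaces() {
    printf '%s\n' "$1" | tr ',' '\n' | sed -n 's/^[^(]*(\([^)]*\)).*$/\1/p'
}

# Hypothetical helper: for each interface in the networks string, report
# whether it exists in sysfs and, if so, its operstate.
check_ifaces() {
    for ifc in $(lnet_ifaces "$1"); do
        if [ -d "/sys/class/net/$ifc" ]; then
            echo "$ifc: present, state $(cat "/sys/class/net/$ifc/operstate")"
        else
            echo "$ifc: missing"
        fi
    done
}
```
</pre>
For example, <font face="Courier New">check_ifaces "o2ib(ib0)"</font> reports whether ib0 is present on the node before lnet is loaded.<br>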
<br>
I searched online and found the discussion below about the same error:<br>
<a class="moz-txt-link-freetext" href="http://lists.lustre.org/pipermail/lustre-discuss/2010-June/013510.html">http://lists.lustre.org/pipermail/lustre-discuss/2010-June/013510.html</a><br>
<br>
Following the suggestion in that mail, I tried the line below in <span
style="color: rgb(0, 0, 0); font-family: monospace; font-size:
medium; font-style: normal; font-variant: normal; font-weight:
normal; letter-spacing: normal; line-height: normal; orphans:
auto; text-align: start; text-indent: 0px; text-transform: none;
white-space: normal; widows: auto; word-spacing: 0px;
-webkit-text-stroke-width: 0px; display: inline !important;
float: none;">/etc/modprobe.d/lustre.conf</span>. In the line
below, I substituted the InfiniBand IP address for IB_IP.<br>
options lnet <b>networks=o2ib(ib0)</b> routes="tcp0 IB_IP@o2ib"<br>
With this line, the command hangs for around 2 to 3 minutes and then fails with the error:
Write failed: Broken pipe. The same happens with "options lnet <b>networks=o2ib(ib0)</b>".<br>
But if I set options lnet <b>networks=tcp0(eth0),o2ib(ib0)</b>
routes="tcp1 IB_IP@o2ib", then it gives LNET configure error 100:
Network is down.<br>
<br>
It seems that with networks=o2ib(ib0) I get the error Write
failed: Broken pipe.<br>
Am I missing anything in the steps above? How can I
resolve this error? <br>
<br>
Thanks,<br>
Aayush. <br>
</div>
</body>
</html>