[Lustre-discuss] [HPDD-discuss] lustre lnet infiniband config

aayush agrawal aayush.agrawal at calsoftinc.com
Tue Sep 30 07:46:34 PDT 2014


Hi Parinay,

Yes, I see ib0 in the output of ifconfig -a.
I also tried with options lnet networks=o2ib0(ib0), but no luck.
While loading lnet I do see errors in /var/log/messages:

kernel: LNet: HW CPU cores: 32, npartitions: 4
alg: No test for crc32 (crc32-table)
kernel: alg: No test for adler32 (adler32-zlib)
kernel: alg: No test for crc32 (crc32-pclmul)
kernel: padlock: VIA PadLock Hash Engine not detected.
modprobe: FATAL: Error inserting padlock_sha (/lib/modules/2.6.32_358/kernel/drivers/crypto/padlock-sha.ko): No such device

But as per the link below, this should not be a problem?
https://jira.hpdd.intel.com/browse/LU-1599

modprobe lnet completes successfully, but after running "lctl network
up" I see "Write failed: Broken pipe" and my session gets logged out
from the server.
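
To rule out a malformed module option before running lctl, one can check
which networks= value and which interfaces lnet would read from the
modprobe configuration. The following is only a minimal sketch: the helper
names lnet_networks and lnet_interfaces are hypothetical (they are not part
of Lustre), the default path is the /etc/modprobe.d/lustre.conf file used
in this thread, and it assumes one interface per network inside the
parentheses.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting the lnet module options; not part of Lustre.

# Print the networks= value from the "options lnet" line of a modprobe file.
# Defaults to /etc/modprobe.d/lustre.conf when no file is given.
lnet_networks() {
    conf=${1:-/etc/modprobe.d/lustre.conf}
    sed -n 's/^options[[:space:]][[:space:]]*lnet[[:space:]].*networks=\([^[:space:]]*\).*/\1/p' "$conf"
}

# Print each interface named in parentheses, one per line.
# Assumes one interface per network, e.g. "tcp0(eth0),o2ib(ib0)".
lnet_interfaces() {
    printf '%s\n' "$1" | tr ',' '\n' | sed -n 's/.*(\(.*\)).*/\1/p'
}

# Example use on a live system (reports interfaces the kernel does not know):
#   for i in $(lnet_interfaces "$(lnet_networks)"); do
#       [ -d "/sys/class/net/$i" ] || echo "missing interface: $i"
#   done
```

If the printed value contains stray characters, or an interface is reported
as missing, the options line is worth fixing before retrying lctl.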

Thanks,
Aayush.

On 9/30/2014 7:21 PM, Parinay Kondekar wrote:
> - What is the output of 'ifconfig -a'? Do you see ib0 there?
> Mentioning 'options lnet networks=o2ib0(ib0)' should be enough.
> - Anything in syslog?
>
> HTH
>
> On Tue, Sep 30, 2014 at 6:03 PM, aayush agrawal
> <aayush.agrawal at calsoftinc.com> wrote:
>
>     Hi,
>
>     I am trying to build lustre 2.5.0 against
>     MLNX_OFED_LINUX-2.2-1.0.1-rhel6.4-x86_64 on CentOS 6.4 with kernel
>     version 2.6.32-358.
>     However, I am not able to get the lnet configuration right. I used
>     the settings suggested in the lustre 2.x manual, but I cannot bring
>     the network up using lctl.
>
>     Details:
>
>     I have two server machines, one for mgs+mdt and a second for oss,
>     plus one client machine. I want to set up Infiniband on all of them.
>     I could run the steps below successfully on all three machines:
>     1. Run script mlnxofedinstall
>     # ./mlnxofedinstall  -vvv --add-kernel-support --without-32bit
>     --without-fw-update --hpc
>     2. Restart openibd service
>     #  /etc/init.d/openibd restart
>     3. Configure ib0 interface.
>     4. Configure lustre with o2ib
>     # ./configure --with-linux=Path_to_linux-2.6.32-358.18.1.el6
>     --with-o2ib=/usr/src/ofa_kernel/default/
>
>     5. Make lustre rpms:
>     # make rpms
>     This gave me a compilation error.
>     I looked online for this error and found a bug registered for
>     it: https://jira.hpdd.intel.com/browse/LU-4266
>     The patch below, from the above link, solved the problem, so I
>     could build the lustre rpms:
>     http://review.whamcloud.com/#/c/8451/1
>
>     Now, I first want to do the Infiniband setup for mgs and mdt on a
>     single machine, which also has an Ethernet IP. Then I want to format
>     and mount mgs and mdt.
>     So I installed the lustre rpms created above and then added the
>     line below to /etc/modprobe.d/lustre.conf:
>     options lnet networks=o2ib(ib0)
>
>     Then I rebooted the machine to remove all lustre-related modules,
>     including lnet, ran the modprobe lnet command to apply the above
>     parameters, and then ran lctl network up, which gives me the error
>     below:
>     LNET configure error 100: Network is down
>
>     I looked online and found the discussion below on the same error:
>     http://lists.lustre.org/pipermail/lustre-discuss/2010-June/013510.html
>
>     As per the suggestion in the above mail, I tried the line below in
>     /etc/modprobe.d/lustre.conf, where IB_IP stands for the Infiniband
>     IP:
>     options lnet networks=o2ib(ib0) routes="tcp0 IB_IP@o2ib"
>     This command hangs for around 2 to 3 minutes and then gives the
>     error: Write failed: Broken pipe. The same happens with
>     "options lnet networks=o2ib(ib0)".
>     But if I set: options lnet networks=tcp0(eth0),o2ib(ib0)
>     routes="tcp1 IB_IP@o2ib" then it gives LNET configure error 100:
>     Network is down.
>
>     It seems that for networks=o2ib(ib0) I am getting the error Write
>     failed: Broken pipe.
>     Am I missing anything in the steps above? How do I resolve this
>     error?
>
>     Thanks,
>     Aayush.
>
