[Lustre-discuss] Lustre-1.8.1.1 over o2ib gives Input/Output error while executing lctl ping

rishi pathak mailmaverick666 at gmail.com
Sun Feb 14 22:53:23 PST 2010


Hello Vipul,


On Fri, Feb 12, 2010 at 7:23 PM, Vipul Pandya <vipul at chelsio.com> wrote:

>  Hi All,
>
>
>
> I am trying to run Lustre over iWARP. For this I have compiled
> Lustre-1.8.1.1 with linux-2.6.18-128.7.1 source and OFED-1.5 source.
>
> I have installed all the required rpms for lustre.
>
>
>
> After this I booted into the Lustre-patched kernel and set the following
> option in /etc/modprobe.conf for lnet to work with o2ib
>
> #> cat /etc/modprobe.conf
>
> options lnet networks="o2ib0(eth2)"
>
I am not familiar with Lustre over an iWARP interconnect, but is eth2 the
device associated with IP over iWARP?
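
One way to check this is the "Listener bound to" line in the dmesg output
you quoted further down. It appears to name the interface, IP address, port
and RDMA device that ko2iblnd bound to. Below is a minimal sketch that pulls
those fields out. It assumes the line has the form iface:ip:port:device, as
it looks in your output; parse_listener is only a hypothetical helper name,
not part of Lustre.

```shell
# Hypothetical helper: split the "Listener bound to" line logged by ko2iblnd
# into its four colon-separated fields. The field meanings are an assumption
# read off the quoted log line, not taken from Lustre documentation.
parse_listener() {
    sed -n 's/^.*Listener bound to \([^:]*\):\([^:]*\):\([^:]*\):\(.*\)$/iface=\1 ip=\2 port=\3 rdma_dev=\4/p'
}

# Example using the line from the quoted dmesg output.
# On a live node this would be: dmesg | parse_listener
echo 'Lustre: Listener bound to eth2:102.88.88.188:987:cxgb3_0' | parse_listener
```

If the interface printed there is eth2 and the device is cxgb3_0, then the
networks="o2ib0(eth2)" option is at least pointing at the RDMA-capable port.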

>
>
> I loaded our RDMA adapter modules and the lnet and ko2iblnd modules as
> follows:
>
> #> modprobe cxgb3
>
> #> modprobe iw_cxgb3
>
> #> modprobe rdma_ucm
>
> #> modprobe lnet
>
> #> modprobe ko2iblnd
>
>
>
> I was able to load all the modules successfully.
>
>
>
> Then I assigned the ip address to eth2 interface and brought it up
>
> #> ifconfig eth2 102.88.88.188/24 up
>
> #> ifconfig
>
> eth0      Link encap:Ethernet  HWaddr 00:30:48:C7:8F:8E
>           inet addr:10.193.184.188  Bcast:10.193.187.255  Mask:255.255.252.0
>           inet6 addr: fe80::230:48ff:fec7:8f8e/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>           RX packets:13224 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:797 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:1000
>           RX bytes:1523344 (1.4 MiB)  TX bytes:203205 (198.4 KiB)
>           Memory:dea20000-dea40000
>
> eth2      Link encap:Ethernet  HWaddr 00:07:43:05:07:35
>           inet addr:102.88.88.188  Bcast:102.88.88.255  Mask:255.255.255.0
>           inet6 addr: fe80::207:43ff:fe05:735/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>           RX packets:153 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:47 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:1000
>           RX bytes:22537 (22.0 KiB)  TX bytes:8500 (8.3 KiB)
>           Interrupt:185 Memory:de801000-de801fff
>
> lo        Link encap:Local Loopback
>           inet addr:127.0.0.1  Mask:255.0.0.0
>           inet6 addr: ::1/128 Scope:Host
>           UP LOOPBACK RUNNING  MTU:16436  Metric:1
>           RX packets:1607 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:1607 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:0
>           RX bytes:3196948 (3.0 MiB)  TX bytes:3196948 (3.0 MiB)
>
>
>
> After this I tried to bring the lnet network up as follows:
>
> #> lctl network up
>
> LNET configured
>
>
>
> The above command produced the following error in dmesg
>
> #> dmesg
>
> Lustre: Listener bound to eth2:102.88.88.188:987:cxgb3_0
>
> Lustre: Register global MR array, MR size: 0xffffffff, array size: 2
>
> fmr_pool: Device cxgb3_0 does not support FMRs
>
> LustreError: 4134:0:(o2iblnd.c:1393:kiblnd_create_fmr_pool()) Failed to
> create FMR pool: -38
>
> Lustre: Added LNI 102.88.88.188@o2ib [8/64/0/0]
>
>
>
> I repeated the same procedure on the other Lustre node and got the same
> result.
>
> Then I tried lctl ping between the two Lustre nodes, which gave me the
> following error:
>
>
>
> #> lctl ping 102.88.88.184@o2ib
>
> failed to ping 102.88.88.184@o2ib: Input/output error
>
>
>
> dmesg showed the following error:
>
> #> dmesg
>
> LustreError: 2453:0:(o2iblnd.c:801:kiblnd_create_conn()) Can't create QP:
> -12, send_wr: 2056, recv_wr: 18
>
> Lustre: 2453:0:(o2iblnd_cb.c:1953:kiblnd_peer_connect_failed()) Deleting
> messages for 102.88.88.184@o2ib: connection failed
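
For reference, the negative numbers in these dmesg lines look like negated
Linux errno values, which is how kernel code usually reports failures. Here
is a small sketch that maps the ones appearing in this thread to their names.
errno_name is a hypothetical helper, and the table covers only the three
codes seen here, not the full errno list.

```shell
# Hypothetical helper: translate a (possibly negated) Linux errno value into
# its symbolic name and message. Only the codes seen in this thread are
# listed; anything else is reported as not being in the table.
errno_name() {
    code="${1#-}"    # drop a leading minus sign, if present
    case "$code" in
        5)  echo "EIO: Input/output error" ;;
        12) echo "ENOMEM: Cannot allocate memory" ;;
        38) echo "ENOSYS: Function not implemented" ;;
        *)  echo "errno $code: not in this table" ;;
    esac
}

errno_name -38   # code from "Failed to create FMR pool: -38"
errno_name -12   # code from "Can't create QP: -12"
```

Read that way, -38 (ENOSYS) agrees with the "does not support FMRs" message,
and -12 (ENOMEM) is what the QP creation returned. The "Input/output error"
from lctl ping is the message text for EIO. This only decodes the codes; it
does not by itself say which resource the QP creation ran out of.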
>
>
>
> I found one thread that provides a patch to support FMR in o2ib, but
> I don’t think this patch is applicable to lustre-1.8.1.1.
>
> http://lists.lustre.org/pipermail/lustre-discuss/2008-February/006502.html
>
>
>
> Can anyone please guide me on this.
>
>
>
> Thank you very much in advance.
>
> Vipul
>
>
>
> _______________________________________________
> Lustre-discuss mailing list
> Lustre-discuss at lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-discuss
>
>


-- 
Regards--
Rishi Pathak
National PARAM Supercomputing Facility
Center for Development of Advanced Computing(C-DAC)
Pune University Campus,Ganesh Khind Road
Pune-Maharastra