[Lustre-discuss] Lustre voltaire configuration

Aielli Roberto r.aielli at cineca.it
Tue Oct 6 03:05:31 PDT 2009


Hi Dennis,
I want to test both configurations: first IPoIB, then RDMA.
I reloaded the modules after the modification, but some further
configuration probably still needs to be done.
   
    Thanks, Roberto

Dennis Nelson wrote:
> On 10/5/09 7:58 AM, "Aielli Roberto" <r.aielli at cineca.it> wrote:
>
>   
>> Hi,
>> I'm trying to configure Lustre 1.8.1 with a Voltaire InfiniBand network.
>> On the MGS, MDS and OSSs I have two interfaces: eth1 and ib0. I've
>> successfully completed a test using eth1, mounting a filesystem
>> on a client node. Now I want to do the same thing over Voltaire InfiniBand
>> (ib0) by modifying modprobe.conf on both servers and clients with the line:
>>
>> options lnet networks=tcp(ib0)
>>     
>
> This would use IPoIB, not native InfiniBand.  Is that what you want?  To do
> native InfiniBand, assuming you are using OFED, you need to specify:
>
> options lnet networks=o2ib(ib0)
>
> Did you unload/reload the Lustre modules after changing modprobe.conf.local?
> If not, it would not recognize the changes in modprobe.conf.local.
>
>   
>> When I try to mount the FS on the client node nothing happens, and I find
>> the following errors in the syslog:
>>
>> Oct  5 13:08:12 xc264 kernel: Lustre:
>> 5468:0:(linux-tcpip.c:688:libcfs_sock_connect()) Error -101 connecting
>> 0.0.0.0/1023 -> 172.31.1.25/988
>>
>> Oct  5 13:08:12 xc264 kernel: Lustre:
>> 5468:0:(acceptor.c:95:lnet_connect_console_error()) Connection to
>> 172.31.1.25@tcp at host 172.31.1.25 was unreachable: the network or that node
>> may be down, or Lustre may be misconfigured.
>>
>> Oct  5 13:08:12 xc264 kernel: Lustre:
>> 5468:0:(socklnd_cb.c:421:ksocknal_txlist_done()) Deleting packet type 1 len
>> 368 172.31.65.24@tcp->172.31.1.25@tcp
>>
>> Oct  5 13:08:17 xc264 kernel: Lustre:
>> 5474:0:(client.c:1383:ptlrpc_expire_one_request()) @@@ Request
>> x1315690795499541 sent from lustre-MDT0000-mdc-ffff81021f97e400 to NID
>> 172.31.1.25@tcp 5s ago has timed out (limit 5s).
>>
>> Oct  5 13:08:17 xc264 kernel:   req@ffff81021828fc00 x1315690795499541/t0
>> o38->lustre-MDT0000_UUID@172.31.1.25@tcp:12/10 lens 368/584 e 0 to 1 dl
>> 1254740897 ref 1 fl Rpc:N/0/0 rc 0/0
>>
>>
>> The main problem shows in the first line. The MGS ib0 address is
>> 172.31.65.25, but as you can see the client always tries to connect to the
>> eth1 address (172.31.1.25), even after shutting down eth1.
>>
>> With vib(ib0) in modprobe.conf the problem is different. I changed the
>> options line to:
>>
>> options lnet networks=vib(ib0)
>>
>>
>> but in the syslog I've found:
>>
>> Oct  5 12:33:19 xc264 kernel: LustreError:
>> 4864:0:(api-ni.c:1043:lnet_startup_lndnis()) Can't load LND vib, module
>> kviblnd
>>
>>
>> Are there other configurations for IB that I forgot to make/modify?
>>
>> Thanks in advance for your help
>
>
>  
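
The -101 in the first syslog line quoted above is a negated Linux errno value. A minimal Python sketch, assuming the usual Linux errno numbering (the small table and the helper name decode_connect_error are illustrative and not part of Lustre), decodes it and shows which peer the client was actually dialing:

```python
import re

# Log line and MGS ib0 address copied from the quoted message.
LOG_LINE = (
    "5468:0:(linux-tcpip.c:688:libcfs_sock_connect()) Error -101 connecting "
    "0.0.0.0/1023 -> 172.31.1.25/988"
)
MGS_IB0 = "172.31.65.25"

# A few Linux errno values relevant to connection failures (assumed Linux numbering).
LINUX_ERRNO = {
    101: "ENETUNREACH",
    110: "ETIMEDOUT",
    111: "ECONNREFUSED",
    113: "EHOSTUNREACH",
}


def decode_connect_error(line):
    """Return (errno name, peer address, peer port) from a libcfs_sock_connect line."""
    m = re.search(r"Error -(\d+) connecting \S+ -> ([\d.]+)/(\d+)", line)
    if m is None:
        raise ValueError("not a libcfs_sock_connect error line")
    name = LINUX_ERRNO.get(int(m.group(1)), "UNKNOWN")
    return name, m.group(2), int(m.group(3))


name, peer, port = decode_connect_error(LOG_LINE)
print(f"{name} while connecting to {peer}:{port}")
print(f"peer is the MGS ib0 address: {peer == MGS_IB0}")
```

ENETUNREACH ("Network is unreachable") toward 172.31.1.25 matches the observation in the message: the client is still being pointed at the eth1 address on the tcp network and not at the ib0 address 172.31.65.25.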


