[lustre-discuss] Problem mounting over infiniband

Chris Hunter chuntera at gmail.com
Thu Apr 28 13:30:50 PDT 2016


Hi Jon,

This is probably a result of LU-6735, where special tuning parameters were 
added for truescale IB adapters. The problem is that these parameters are 
incompatible with mixed (truescale+mellanox) IB networks.

Lustre 2.8 includes a script, "ko2iblnd-probe", that probes the IB adapter 
interface and applies tuning parameters depending on the IB adapter model.

One solution is to use the o2iblnd defaults. This can be accomplished by 
commenting out the line that calls the script in the ko2iblnd modprobe 
config, i.e. in file /etc/modprobe.d/ko2iblnd.conf comment out the line:

install ko2iblnd /usr/sbin/ko2iblnd-probe
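
As a rough sketch of that edit (not something taken from the Lustre 
packages), the line could be commented out with sed. The function name 
comment_out_probe is made up for illustration, and it assumes the install 
line appears exactly as shown above; a backup copy is kept with a .bak 
suffix.

```shell
#!/bin/sh
# Hypothetical helper: comment out the ko2iblnd-probe install line in the
# given modprobe config file, leaving a backup at <file>.bak.
# Assumes the line matches the form quoted above exactly.
comment_out_probe() {
    conf="$1"
    sed -i.bak 's|^install ko2iblnd /usr/sbin/ko2iblnd-probe|#&|' "$conf"
}

# Example (run as root on the affected node):
#   comment_out_probe /etc/modprobe.d/ko2iblnd.conf
```

The change only takes effect the next time the ko2iblnd module is loaded, 
so the lustre/lnet modules need to be unloaded and reloaded afterwards.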

If you want to pursue tuning options, you may wish to look at LUDOC-267 
where I listed the (few) presentations about truescale and lustre. Also 
LU-3222 is a good reference for mellanox tuning parameters.
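
Before tuning anything it may help to compare the values actually in effect 
on the client and the server, since the errors come from the two ends 
disagreeing. A minimal sketch, assuming ko2iblnd exposes its parameters 
under /sys/module/ko2iblnd/parameters the way kernel modules usually do 
(the function name show_lnd_params is made up for illustration):

```shell
#!/bin/sh
# Hypothetical helper: print name=value for each module parameter file in
# a directory. Defaults to the usual sysfs location for ko2iblnd, which is
# an assumption; the module must be loaded for that directory to exist.
show_lnd_params() {
    dir="${1:-/sys/module/ko2iblnd/parameters}"
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        printf '%s=%s\n' "$(basename "$f")" "$(cat "$f")"
    done
}

# Example: run on both client and server and compare the output
#   show_lnd_params | grep -E 'peer_credits|concurrent_sends|map_on_demand'
```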


regards,
chris hunter
chuntera at gmail.com


> Hi,
>
> I have brought up a test system using
>
> 2.8.0-3.10.0_327.3.1.el7.x86_64_g96792ba
>
> I can mount the system over tcp, but when I try to do so over infiniband
> I get errors of the type:
>
> Can't accept conn from 10.0.51.1 at o2ib, queue depth too large: 128 (<=8
> wanted)
>
> Can't accept conn from 10.0.51.1 at o2ib (version 12): max_frags 32
> incompatible without FMR pool (256 wanted)
>
> After searching I suspected it had something to do with the fact that we
> have mellanox (mlx4_ib) on the server and qlogic on the client (ib_qib).
>
> Also found a possible solution, by putting
>
> options ko2iblnd peer_credits=124 concurrent_sends=62 map_on_demand=256
>
> However, there are a bunch of options to ko2iblnd, and to me it is not
> obvious which values to choose. Is there a specific strategy one should
> follow?
>
> Regards,
>
> /jon


