[lustre-discuss] ko2iblnd.conf
Daniel Szkola
dszkola at fnal.gov
Thu Apr 11 11:02:01 PDT 2024
On the server node(s):
options ko2iblnd-opa peer_credits=32 peer_credits_hiw=16 credits=1024 concurrent_sends=64 ntx=2048 map_on_demand=256 fmr_pool_size=2048 fmr_flush_trigger=512 fmr_cache=1 conns_per_peer=4
On clients:
options ko2iblnd peer_credits=128 peer_credits_hiw=64 credits=1024 concurrent_sends=256 ntx=2048 map_on_demand=32 fmr_pool_size=2048 fmr_flush_trigger=512 fmr_cache=1 conns_per_peer=4
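To make the mismatch explicit, here is a minimal sketch that parses the two option lines above and lists the parameters whose values differ. The helper names (`parse_options`, `diff_params`) are made up for illustration and are not part of any Lustre tooling.

```python
# Sketch: compare two modprobe "options" lines and report differing parameters.
# parse_options and diff_params are hypothetical helper names, not Lustre tools.

def parse_options(line):
    """Split 'options <module> key=value ...' into (module, {key: value})."""
    tokens = line.split()
    if len(tokens) < 2 or tokens[0] != "options":
        raise ValueError("not a modprobe options line: %r" % line)
    module = tokens[1]
    params = {}
    for token in tokens[2:]:
        key, _, value = token.partition("=")
        params[key] = value
    return module, params


def diff_params(a, b):
    """Return {name: (value_in_a, value_in_b)} for every parameter that differs."""
    names = sorted(set(a) | set(b))
    return {n: (a.get(n), b.get(n)) for n in names if a.get(n) != b.get(n)}


# The two lines exactly as posted above.
SERVER = ("options ko2iblnd-opa peer_credits=32 peer_credits_hiw=16 credits=1024 "
          "concurrent_sends=64 ntx=2048 map_on_demand=256 fmr_pool_size=2048 "
          "fmr_flush_trigger=512 fmr_cache=1 conns_per_peer=4")
CLIENT = ("options ko2iblnd peer_credits=128 peer_credits_hiw=64 credits=1024 "
          "concurrent_sends=256 ntx=2048 map_on_demand=32 fmr_pool_size=2048 "
          "fmr_flush_trigger=512 fmr_cache=1 conns_per_peer=4")

server_module, server_params = parse_options(SERVER)
client_module, client_params = parse_options(CLIENT)
mismatches = diff_params(server_params, client_params)

print("module names: server=%s client=%s" % (server_module, client_module))
for name, (server_value, client_value) in mismatches.items():
    print("%s: server=%s client=%s" % (name, server_value, client_value))
```

With these two lines, four parameters differ: concurrent_sends, map_on_demand, peer_credits and peer_credits_hiw. The module name on the options line differs as well (ko2iblnd-opa on the servers, ko2iblnd on the clients).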
My concern isn’t so much the mismatch itself, which I know is a problem, but rather which values we should settle on with a recent Lustre build. I also see ko2iblnd-opa in the server config. Since the server is actually loading ko2iblnd, does that mean the defaults are being used?
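One way to check whether the configured values or the defaults are in effect is to read the parameters of the loaded module back from sysfs. The sketch below assumes the module exposes its parameters under /sys/module/ko2iblnd/parameters, which is the usual sysfs layout for kernel modules but has not been verified for every parameter listed here. The function names and the `expected` dictionary are illustrative only.

```python
# Sketch: compare expected module parameters with what the loaded module reports.
# Assumption: parameters are readable under /sys/module/ko2iblnd/parameters.
from pathlib import Path


def read_module_params(param_dir):
    """Return {parameter_name: value} for every readable file in param_dir."""
    params = {}
    for entry in sorted(Path(param_dir).iterdir()):
        if not entry.is_file():
            continue
        try:
            params[entry.name] = entry.read_text().strip()
        except OSError:
            # Some parameters may not be readable by the current user; skip them.
            continue
    return params


def find_unapplied(param_dir, expected):
    """Return {name: (expected, loaded)} where the loaded value differs.

    A loaded value of None means the parameter was not found in param_dir.
    """
    loaded = read_module_params(param_dir)
    return {
        name: (want, loaded.get(name))
        for name, want in expected.items()
        if loaded.get(name) != want
    }


if __name__ == "__main__":
    sysfs_dir = Path("/sys/module/ko2iblnd/parameters")
    # Values taken from the server options line above (subset, for illustration).
    expected = {"peer_credits": "32", "peer_credits_hiw": "16", "credits": "1024"}
    if sysfs_dir.is_dir():
        for name, (want, got) in find_unapplied(sysfs_dir, expected).items():
            print("%s: configured=%s loaded=%s" % (name, want, got))
    else:
        print("ko2iblnd does not appear to be loaded on this host")
```

If this prints any lines on a server, the values from the options line were not the ones the module picked up, which would be consistent with defaults being used.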
What made me look was that we were seeing lots of:
LNetError: 2961324:0:(o2iblnd_cb.c:2612:kiblnd_passive_connect()) Can't accept conn from xxx.xxx.xxx.xxx at o2ib2, queue depth too large: 42 (<=32 wanted)
--
Dan Szkola
FNAL
> On Apr 11, 2024, at 12:36 PM, Andreas Dilger <adilger at whamcloud.com> wrote:
>
> On Apr 11, 2024, at 09:56, Daniel Szkola via lustre-discuss <lustre-discuss at lists.lustre.org> wrote:
>>
>> Hello all,
>>
>> I recently discovered some mismatches in our /etc/modprobe.d/ko2iblnd.conf files between our clients and servers.
>>
>> Is it now recommended to keep the defaults on this module and run without a config file or are there recommended numbers for lustre-2.15.X?
>>
>> The only thing I’ve seen that provides any guidance is the Lustre wiki and an HP/Cray doc:
>>
>> https://www.hpe.com/psnow/resources/ebooks/a00113867en_us_v2/Lustre_Server_Recommended_Tuning_Parameters_4.x.html
>>
>> Does anyone have any sage advice on what ko2iblnd.conf (and possibly ko2iblnd-opa.conf and hfi1.conf as well) should contain on modern systems?
>
> It would be useful to know what specific settings are mismatched. Definitely some of them need to be consistent between peers, others depend on your system.
>
> Cheers, Andreas
> --
> Andreas Dilger
> Lustre Principal Architect
> Whamcloud