[lustre-discuss] weird issue w. lnet routers

Jeff Johnson jeff.johnson at aeoncomputing.com
Tue Nov 28 17:40:25 PST 2017


John,

I can't speak to Fragella's tuning making things worse but...

Have you run iperf3 and lnet_selftest from your Ethernet clients to each of
the LNet routers to establish your top-end throughput? It'd be good to
determine whether you have an Ethernet problem or an LNet problem.
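
For the iperf3 half, here is a rough sketch of how the numbers could be
collected. It is an illustration, not something taken from your setup: the
router hostnames are placeholders, and it assumes an iperf3 server is
already listening on each router. It converts the receiver-side rate from
iperf3's JSON report into GB/s so it can be compared directly with the
Lustre write figures.

```python
import json
import subprocess


def iperf3_report(host: str, seconds: int = 10, streams: int = 4) -> dict:
    """Run an iperf3 client against `host` and return its parsed JSON report.

    Assumes `iperf3 -s` is already running on `host`.
    """
    cmd = ["iperf3", "-c", host, "-t", str(seconds), "-P", str(streams), "-J"]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return json.loads(result.stdout)


def received_gbytes_per_sec(report: dict) -> float:
    """Convert the receiver-side rate of an iperf3 TCP report to GB/s."""
    bits_per_second = report["end"]["sum_received"]["bits_per_second"]
    return bits_per_second / 8 / 1e9


def report_routers(routers: list[str]) -> None:
    """Print the measured client-to-router rate for each router."""
    for router in routers:
        rate = received_gbytes_per_sec(iperf3_report(router))
        print(f"{router}: {rate:.2f} GB/s")
```

Calling report_routers(["lnet-router1", "lnet-router2"]) from a client (the
names are hypothetical) gives a per-router Ethernet ceiling. A single 100GbE
link tops out at 12.5 GB/s before protocol overhead, so if iperf3 lands well
below that, the Ethernet side deserves a look before LNet does.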

Also, are you running RDMA over Ethernet? If not, interrupts on the receive
end can be vexing.

--Jeff

On Tue, Nov 28, 2017 at 17:21 John Casu <john at chiraldynamics.com> wrote:

> just built a system w. LNet routers that bridge InfiniBand & 100GbE, using
> CentOS built-in InfiniBand support
> servers are InfiniBand, clients are 100GbE (ConnectX-4 cards)
>
> my direct write performance from clients over InfiniBand is around 15GB/s
>
> When I introduced the lnet routers, performance dropped to 10GB/s
>
> Thought the problem was an MTU of 1500, but when I changed the MTUs to 9000
> performance dropped to 3GB/s.
>
> When I tuned according to John Fragella's LUG slides, things went even
> slower (1.5GB/s write)
>
> does anyone have any ideas on what I'm doing wrong??
>
> thanks,
> -john c.
-- 
------------------------------
Jeff Johnson
Co-Founder
Aeon Computing

jeff.johnson at aeoncomputing.com
www.aeoncomputing.com
t: 858-412-3810 x1001   f: 858-412-3845
m: 619-204-9061

4170 Morena Boulevard, Suite D - San Diego, CA 92117

High-Performance Computing / Lustre Filesystems / Scale-out Storage