[Lustre-discuss] lnet router tuning

Eric Barton eeb at whamcloud.com
Mon Sep 13 07:55:34 PDT 2010


Michael,

 

I think this script keeps only 1 BRW READ in flight at a time, so I would
expect the routed throughput to be approaching half of the direct throughput.
Can you try "--concurrency 8" to simulate the number of I/Os a real client
would keep in flight?
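
A rough way to see where "about half" comes from is sketched below. It assumes a purely serialized store-and-forward model: each message fully crosses one link, then the other, with nothing else in flight. That model is an assumption of this sketch, not something stated in the reply. The 1200 and 950 MB/s inputs are the IB and 10GE back-to-back rates reported further down in this thread, and `routed_estimate` is a hypothetical helper name.

```shell
#!/bin/sh
# Serialized store-and-forward estimate (assumed model): the time per byte
# is the sum of the per-byte times on both links, so the combined rate is
# 1 / (1/a + 1/b).
routed_estimate() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.0f\n", 1 / (1/a + 1/b) }'
}

# IB and 10GE back-to-back rates from this thread, in MB/s.
routed_estimate 1200 950
```

Under that assumption the estimate comes out at about 530 MB/s, which is in the range of the 520-550 MB/s measured through the router.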

 

Cheers,
                   Eric 

 

 From: Michael Kluge [mailto:michael.kluge at tu-dresden.de] 
Sent: 13 September 2010 10:35 PM
To: Eric Barton
Cc: 'Lustre Diskussionsliste'
Subject: Re: [Lustre-discuss] lnet router tuning

 

Hi Eric,

 

basically right now I have one IB node, one 10GE node and one router node that has both types of network interfaces.

 

I've got a small lnet test script on the router node that does the work:

export LST_SESSION=$$
lst new_session rw
lst add_group readers 192.168.10.8@tcp
lst add_group writers 10.148.0.94@o2ib
lst add_batch bulk_rw
lst add_test --batch bulk_rw --from writers --to readers brw read check=simple size=1M
lst run bulk_rw
lst stat writers & sleep 30; kill $!
lst end_session
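
A variant of the script above that keeps several bulk reads in flight could look like the sketch below. It uses the `--concurrency` option of `lst add_test`, with the value 8 taken from the reply at the top of this thread. The `brw_read_test` function, the `LST` override variable, and the command-line handling are hypothetical conveniences added for illustration and are not part of lst.

```shell
#!/bin/sh
# LST defaults to the real lst binary. Setting LST="echo lst" gives a dry run
# that only prints the commands (a convenience of this sketch, not an lst feature).
LST=${LST:-lst}

brw_read_test() {
    concurrency=$1   # number of I/Os kept in flight
    seconds=$2       # how long to sample "lst stat"

    export LST_SESSION=$$
    $LST new_session rw
    $LST add_group readers 192.168.10.8@tcp
    $LST add_group writers 10.148.0.94@o2ib
    $LST add_batch bulk_rw
    $LST add_test --batch bulk_rw --concurrency "$concurrency" \
        --from writers --to readers brw read check=simple size=1M
    $LST run bulk_rw
    # Sample statistics in the background, then stop the sampler.
    $LST stat writers & sleep "$seconds"; kill $! 2>/dev/null || true
    $LST end_session
}

# Run only when called with two arguments, e.g. "sh brw.sh 8 30".
if [ "$#" -eq 2 ]; then
    brw_read_test "$1" "$2"
fi
```

Running it with concurrency 1 and then 8 against the same nodes should show whether the number of requests in flight is what limits the routed throughput.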

 

Is there a way to find out how many messages are in flight? I remember there being an "RPCs in flight" tunable, but that belongs to the OSC
layer, which is not involved in my case (I think).
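
One place to look is sketched below. It assumes a Lustre 1.8-era node where LNet exposes its state under /proc/sys/lnet; that location is an assumption here and later releases moved these files. As far as I know, the peers and nis files list per-peer and per-interface credits, and buffers lists the router buffer pools, so credit counts that keep dropping suggest messages are queuing. `show_lnet_state` is a hypothetical helper name.

```shell
#!/bin/sh
# Print LNet state files if they are present.
# The default directory is assumed for Lustre 1.8 and may differ elsewhere.
show_lnet_state() {
    dir=${1:-/proc/sys/lnet}
    for f in peers nis buffers; do
        if [ -r "$dir/$f" ]; then
            echo "== $dir/$f"
            cat "$dir/$f"
        else
            echo "not readable: $dir/$f"
        fi
    done
}

show_lnet_state
```

Running this on the router while the lst batch is active, and again when idle, gives a before and after view of the credit and buffer counters.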

 

 

Michael

 

 

 

On 13.09.2010 at 03:08, Eric Barton wrote:





 

Michael,

 

 

How are you generating load and measuring the throughput? I’m particularly interested in the number
of nodes on each side of the router and how many messages you have in flight between each one.

 

 

Cheers,
                   Eric

 

 

 

 

From: lustre-discuss-bounces at lists.lustre.org [mailto:lustre-discuss-bounces at lists.lustre.org] On Behalf Of Michael Kluge
Sent: 11 September 2010 12:56 AM
To: Michael Kluge
Cc: Lustre Diskussionsliste
Subject: Re: [Lustre-discuss] lnet router tuning

 

And here are my params:

 

root@doss05:/home/tests/lnet# for F in /sys/module/lnet/parameters/* ; do echo -n "$F: "; cat $F ; done
/sys/module/lnet/parameters/accept: secure
/sys/module/lnet/parameters/accept_backlog: 127
/sys/module/lnet/parameters/accept_port: 988
/sys/module/lnet/parameters/accept_timeout: 5
/sys/module/lnet/parameters/auto_down: 1
/sys/module/lnet/parameters/avoid_asym_router_failure: 0
/sys/module/lnet/parameters/check_routers_before_use: 0
/sys/module/lnet/parameters/config_on_load: 0
/sys/module/lnet/parameters/dead_router_check_interval: 0
/sys/module/lnet/parameters/forwarding: enabled
/sys/module/lnet/parameters/ip2nets: 
/sys/module/lnet/parameters/large_router_buffers: 512
/sys/module/lnet/parameters/live_router_check_interval: 0
/sys/module/lnet/parameters/local_nid_dist_zero: 1
/sys/module/lnet/parameters/networks: tcp0(eth2),o2ib(ib1)
/sys/module/lnet/parameters/peer_buffer_credits: 0
/sys/module/lnet/parameters/portals_compatibility: none
/sys/module/lnet/parameters/router_ping_timeout: 50
/sys/module/lnet/parameters/routes: 
/sys/module/lnet/parameters/small_router_buffers: 8192
/sys/module/lnet/parameters/tiny_router_buffers: 1024

 

I have not used ip2nets to configure routing, but put explicit routing statements into the modprobe.d/ files. Is that OK?
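
For reference, explicit routes in modprobe.d for a three-node setup like this one would typically look like the sketch below. The router line mirrors the networks and forwarding values shown in the parameter dump above. ROUTER_IB_ADDR and ROUTER_ETH_ADDR are placeholders, since the router's addresses are not given in this thread, and the end-node networks settings without an interface name are an assumption.

```
# IB-only node (10.148.0.94@o2ib): reach tcp0 through the router's o2ib NID
# ROUTER_IB_ADDR is a placeholder, not a value from this thread
options lnet networks="o2ib" routes="tcp0 ROUTER_IB_ADDR@o2ib"

# 10GE-only node (192.168.10.8@tcp): reach o2ib through the router's tcp0 NID
# ROUTER_ETH_ADDR is a placeholder, not a value from this thread
options lnet networks="tcp0" routes="o2ib ROUTER_ETH_ADDR@tcp0"

# Router: both networks, forwarding enabled, no routes entry of its own
options lnet networks="tcp0(eth2),o2ib(ib1)" forwarding="enabled"
```

With this layout an empty routes value on the router itself, as in the dump above, is what one would expect, because both networks are directly attached there.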

 

 

Michael

 

 

On 10.09.2010 at 17:48, Michael Kluge wrote:






OK, IB back to back is at 1.2 GB/s, 10GE back to back is at 950 MB/s, and with the additional lnet router in between I see 550 MB/s. Time for lnet tuning?

 

Michael






Hi Andreas,

 

On 10.09.2010 at 16:35, Andreas Dilger wrote:






On 2010-09-10, at 08:23, Michael Kluge wrote:




I have a Lustre 1.8.3 setup where I'd like to run some lnet router performance tests with routing between DDR IB<->10GE networks.
Currently I have three nodes: one with DDR IB, one with 10GE, and one with both that does the routing. A first short lnet test shows
520-550 MB/s.

 

Does anyone have an idea which of the lnet module parameters are worth tuning to get this number a bit closer to 1 GB/s?


I would start by testing the performance on just the 10GigE side, and then separately on the IB side, to verify you are getting the
expected performance from the components before trying them both together.  Often it is necessary to tune the ethernet send/receive
buffers.
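
Before changing anything, it can help to record the current socket buffer limits on the 10GE nodes. The sketch below only reads the standard Linux sysctl files under /proc/sys and changes nothing. `show_sock_buffers` is a hypothetical helper name, and which of these limits actually matter for the LNet socket driver on a given node is not something this thread establishes.

```shell
#!/bin/sh
# Print the kernel's socket buffer limits (read-only).
# The optional argument overrides the sysctl root, which is useful for testing.
show_sock_buffers() {
    root=${1:-/proc/sys}
    for key in net/core/rmem_max net/core/wmem_max \
               net/ipv4/tcp_rmem net/ipv4/tcp_wmem; do
        if [ -r "$root/$key" ]; then
            printf '%s = %s\n' "$key" "$(cat "$root/$key")"
        else
            printf '%s = (not available)\n' "$key"
        fi
    done
}

show_sock_buffers
```

Comparing this output on both ends of the 10GE link is a cheap first check that the two nodes are configured alike.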

 

Ethernet back to back is at 950 MB/s. I have not looked at IB back to back yet.

 

 

Michael


-- 

Michael Kluge, M.Sc.

Technische Universität Dresden
Center for Information Services and
High Performance Computing (ZIH)
D-01062 Dresden
Germany

Contact:
Willersbau, Room WIL A 208
Phone:  (+49) 351 463-34217
Fax:    (+49) 351 463-37773
e-mail: michael.kluge at tu-dresden.de
WWW:    http://www.tu-dresden.de/zih

 

_______________________________________________
Lustre-discuss mailing list
Lustre-discuss at lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss

 



 
