[Lustre-discuss] lnet router tuning

Michael Kluge michael.kluge at tu-dresden.de
Mon Sep 13 06:35:23 PDT 2010


Hi Eric,

Basically, right now I have one IB node, one 10GE node and one router node that has both types of network interface.

I've got a small lnet test script on the router node that does the work:
export LST_SESSION=$$
lst new_session rw
lst add_group readers 192.168.10.8@tcp
lst add_group writers 10.148.0.94@o2ib
lst add_batch bulk_rw
lst add_test --batch bulk_rw --from writers --to readers brw read check=simple size=1M
lst run bulk_rw
lst stat writers & sleep 30; kill $!
lst end_session

Is there a way to figure out the number of messages in flight? I remember there being an "RPCs in flight" tunable, but that belongs to the OSC layer, which is not involved in my case (I think).
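
On the messages-in-flight question, one knob that may apply here is the --concurrency option of "lst add_test", which sets how many requests a test client keeps active at one time. The sketch below is the script from above with that option added. The value 8 is an arbitrary assumption, and the run helper with its DRY_RUN switch is hypothetical, added only so the commands can be printed without touching LNet.

```shell
#!/bin/sh
# Sketch: the brw test from above with an explicit request concurrency.
# Assumptions: CONCURRENCY=8 is an arbitrary starting value; run() and
# DRY_RUN are hypothetical helpers, not part of lst.
DRY_RUN=${DRY_RUN:-1}          # 1 = only print the commands, 0 = execute them
CONCURRENCY=${CONCURRENCY:-8}  # requests each test client keeps active

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

lnet_brw_test() {
    export LST_SESSION=$$
    run lst new_session rw
    run lst add_group readers 192.168.10.8@tcp
    run lst add_group writers 10.148.0.94@o2ib
    run lst add_batch bulk_rw
    run lst add_test --batch bulk_rw --concurrency "$CONCURRENCY" \
        --from writers --to readers brw read check=simple size=1M
    run lst run bulk_rw
    if [ "$DRY_RUN" != 1 ]; then
        # Print statistics for 30 seconds, then stop the reporter.
        lst stat writers &
        sleep 30
        kill $!
    fi
    run lst end_session
}
```

Calling lnet_brw_test as is only prints the lst commands; setting DRY_RUN=0 first would execute them. Repeating the run with a few different CONCURRENCY values should show whether the routed throughput depends on the number of outstanding requests.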


Michael



On 13.09.2010 at 03:08, Eric Barton wrote:

>  
> Michael,
>  
>  
> How are you generating load and measuring the throughput? I’m particularly interested in the number
> of nodes on each side of the router and how many messages you have in flight between each one.
>  
>  
> Cheers,
>                    Eric
>  
>  
>  
>  
> From: lustre-discuss-bounces at lists.lustre.org [mailto:lustre-discuss-bounces at lists.lustre.org] On Behalf Of Michael Kluge
> Sent: 11 September 2010 12:56 AM
> To: Michael Kluge
> Cc: Lustre Diskussionsliste
> Subject: Re: [Lustre-discuss] lnet router tuning
>  
> And here are my params:
>  
> root@doss05:/home/tests/lnet# for F in /sys/module/lnet/parameters/* ; do echo -n "$F: "; cat "$F" ; done
> /sys/module/lnet/parameters/accept: secure
> /sys/module/lnet/parameters/accept_backlog: 127
> /sys/module/lnet/parameters/accept_port: 988
> /sys/module/lnet/parameters/accept_timeout: 5
> /sys/module/lnet/parameters/auto_down: 1
> /sys/module/lnet/parameters/avoid_asym_router_failure: 0
> /sys/module/lnet/parameters/check_routers_before_use: 0
> /sys/module/lnet/parameters/config_on_load: 0
> /sys/module/lnet/parameters/dead_router_check_interval: 0
> /sys/module/lnet/parameters/forwarding: enabled
> /sys/module/lnet/parameters/ip2nets: 
> /sys/module/lnet/parameters/large_router_buffers: 512
> /sys/module/lnet/parameters/live_router_check_interval: 0
> /sys/module/lnet/parameters/local_nid_dist_zero: 1
> /sys/module/lnet/parameters/networks: tcp0(eth2),o2ib(ib1)
> /sys/module/lnet/parameters/peer_buffer_credits: 0
> /sys/module/lnet/parameters/portals_compatibility: none
> /sys/module/lnet/parameters/router_ping_timeout: 50
> /sys/module/lnet/parameters/routes: 
> /sys/module/lnet/parameters/small_router_buffers: 8192
> /sys/module/lnet/parameters/tiny_router_buffers: 1024
>  
> I have not used ip2nets to configure routing, but put explicit routing statements into the modprobe.d/ files. Is that OK?
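
For reference, explicit routing statements of this kind typically take the shape printed by the sketch below: each non-router node names its local network and a route to the remote network via the router's NID on the local side. The client interface names and router NIDs used here (ib0, eth0, 10.148.0.1@o2ib, 192.168.10.1@tcp0) are placeholders, not values from this setup, and lnet_opts is a hypothetical helper.

```shell
# Print an lnet options line for a modprobe.d file on a non-router node.
# $1 = local network spec, $2 = remote network, $3 = router NID on the local network
lnet_opts() {
    printf 'options lnet networks="%s" routes="%s %s"\n' "$1" "$2" "$3"
}

# IB-side node: reach tcp0 through the router's o2ib NID (placeholder address).
lnet_opts 'o2ib(ib0)' tcp0 10.148.0.1@o2ib
# 10GE-side node: reach o2ib through the router's tcp0 NID (placeholder address).
lnet_opts 'tcp0(eth0)' o2ib 192.168.10.1@tcp0
```

The router itself needs no routes entry in this picture, only both networks and forwarding enabled, which matches the empty routes parameter in the listing above.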
>  
>  
> Michael
>  
>  
> On 10.09.2010 at 17:48, Michael Kluge wrote:
> 
> 
> OK, IB back to back is at 1.2 GB/s, 10GE back to back is at 950 MB/s, and with the additional lnet router in between I see 550 MB/s. Time for lnet tuning?
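
To put these numbers side by side, the routed rate can be expressed as a share of each back-to-back rate. A small sketch using the rates quoted above; the helper name pct_of is made up for illustration.

```shell
# Share of a reference rate, in percent with one decimal place.
pct_of() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", 100 * a / b }'
}

pct_of 550 950    # routed rate vs. 10GE back to back
pct_of 550 1200   # routed rate vs. IB back to back (1.2 GB/s taken as 1200 MB/s)
```

This prints 57.9 and 45.8, so the routed path reaches a little under 60% of what the slower of the two links delivers on its own.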
>  
> Michael
> 
> 
> Hi Andreas,
>  
> On 10.09.2010 at 16:35, Andreas Dilger wrote:
> 
> 
> On 2010-09-10, at 08:23, Michael Kluge wrote:
> 
> I have a Lustre 1.8.3 setup where I'd like to run some lnet router performance tests with routing between DDR IB<->10GE networks. Currently I have three nodes: one with DDR IB, one with 10GE, and one with both that does the routing. A first short lnet test shows 520-550 MB/s.
>  
> Does anyone have an idea which of the lnet module variables are worth playing with to get this number a bit closer to 1 GB/s?
> 
> I would start by testing the performance on just the 10GigE side, and then separately on the IB side, to verify you are getting the expected performance from the components before trying them both together.  Often it is necessary to tune the ethernet send/receive buffers.
>  
> Ethernet back to back is at 950 MB/s. I have not looked at IB back to back yet.
>  
>  
> Michael
> 
> -- 
> 
> Michael Kluge, M.Sc.
> 
> Technische Universität Dresden
> Center for Information Services and
> High Performance Computing (ZIH)
> D-01062 Dresden
> Germany
> 
> Contact:
> Willersbau, Room WIL A 208
> Phone:  (+49) 351 463-34217
> Fax:    (+49) 351 463-37773
> e-mail: michael.kluge at tu-dresden.de
> WWW:    http://www.tu-dresden.de/zih
>  
> _______________________________________________
> Lustre-discuss mailing list
> Lustre-discuss at lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-discuss
>  
> 


-- 

Michael Kluge, M.Sc.

Technische Universität Dresden
Center for Information Services and
High Performance Computing (ZIH)
D-01062 Dresden
Germany

Contact:
Willersbau, Room WIL A 208
Phone:  (+49) 351 463-34217
Fax:    (+49) 351 463-37773
e-mail: michael.kluge at tu-dresden.de
WWW:    http://www.tu-dresden.de/zih
