[lustre-discuss] LNET Self-test

Jon Tegner tegner at foi.se
Mon Feb 6 04:40:05 PST 2017


Hi,

I used the following script:

#!/bin/bash
export LST_SESSION=$$
lst new_session read/write
lst add_group servers 10.0.12.12@o2ib
lst add_group readers 10.0.12.11@o2ib
lst add_group writers 10.0.12.11@o2ib
lst add_batch bulk_rw
lst add_test --batch bulk_rw --concurrency 12 --from readers --to servers \
brw read check=simple size=1M
lst add_test --batch bulk_rw --concurrency 12 --from writers --to servers \
brw write check=simple size=1M
# start running
lst run bulk_rw
# display server stats for 30 seconds
lst stat servers & sleep 30; kill $!
# tear down
lst end_session

and tried concurrency values of 0, 2, 4, 8, 12 and 16. The results are at

http://renget.se/lnetBandwidth.png
and
http://renget.se/lnetRates.png
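
A minimal sketch of how such a concurrency sweep could be scripted, reusing the lst commands from the script above with NIDs written in the usual address@network form. The DRY_RUN switch, the run and run_one helpers, the single "clients" group and the list of values are illustrative assumptions, not part of lst or of the original script. By default the commands are only printed; setting DRY_RUN=0 would execute them.

```shell
#!/bin/bash
# Sketch only: repeat the read/write selftest for several concurrency values.
# DRY_RUN is a hypothetical knob; 1 (default) prints commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"
SERVER_NID="10.0.12.12@o2ib"   # NIDs as used in the original script
CLIENT_NID="10.0.12.11@o2ib"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Set up, run and tear down one selftest session at the given concurrency.
run_one() {
    local conc="$1"
    run lst new_session read/write
    run lst add_group servers "$SERVER_NID"
    run lst add_group clients "$CLIENT_NID"
    run lst add_batch bulk_rw
    run lst add_test --batch bulk_rw --concurrency "$conc" --from clients --to servers brw read check=simple size=1M
    run lst add_test --batch bulk_rw --concurrency "$conc" --from clients --to servers brw write check=simple size=1M
    run lst run bulk_rw
    if [ "$DRY_RUN" = "1" ]; then
        echo "lst stat servers   # sampled for 30 s"
    else
        # display server stats for 30 seconds, as in the original script
        lst stat servers & sleep 30; kill $!
    fi
    run lst end_session
}

export LST_SESSION=$$
# Nonzero concurrency values from the post.
for conc in 2 4 8 12 16; do
    run_one "$conc"
done
```

Keeping one session per concurrency value means each measurement starts from a clean state, at the cost of repeating the group setup.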

The bandwidth plot shows a maximum of just below 2800 MB/s. Since
"readers" and "writers" are the same node in this case, I also ran a few
tests with the line

lst add_test --batch bulk_rw --concurrency 12 --from writers --to servers \
brw write check=simple size=1M

removed from the script, which resulted in a bandwidth of around 3600 MB/s.

I also ran tests using mpitests-osu_bw from openmpi, and in that case I
measured a bandwidth of about 3900 MB/s.
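
To put the three figures side by side, a small sketch that expresses each LNet selftest result as a percentage of the osu_bw result. The numbers are the rounded values quoted in this message; the variable names and the pct helper are only illustrative.

```shell
#!/bin/bash
# Bandwidths in MB/s, rounded values as quoted in this message.
LST_RW=2800      # read and write tests in the same batch ("just below 2800")
LST_READ=3600    # write test removed from the batch
OSU_BW=3900      # mpitests-osu_bw

# Integer percentage of $1 relative to $2 (rounded down).
pct() { echo $(( $1 * 100 / $2 )); }

echo "lst read+write vs osu_bw: $(pct "$LST_RW" "$OSU_BW")%"    # prints 71%
echo "lst read only  vs osu_bw: $(pct "$LST_READ" "$OSU_BW")%"  # prints 92%
```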

Given the openmpi bandwidth, should I be satisfied with the numbers
obtained by LNet selftest? Is there a way to modify the test so that the
result gets closer to what openmpi gives? And what can be said about
the "Rates of servers (RPC/s)": are they "good" or "bad", and what
should I compare them with?

Thanks!

/jon

On 02/05/2017 08:55 PM, Jeff Johnson wrote:
> Without seeing your entire command it is hard to say for sure, but I would make sure your concurrency option is set to 8 for starters.
>
> --Jeff
>
> Sent from my iPhone
>
>> On Feb 5, 2017, at 11:30, Jon Tegner <tegner at foi.se> wrote:
>>
>> Hi,
>>
>> I'm trying to use LNet selftest to evaluate network performance on a test setup (only two machines). Using, e.g., iperf or NetPIPE, I've managed to demonstrate the bandwidth of the underlying 10 Gbit/s network (typically you reach the expected bandwidth as the packet size increases).
>>
>> How can I do the same using lnet selftest (i.e., verifying the bandwidth of the underlying hardware)? My initial thought was to increase the I/O size, but it seems the maximum size one can use is "--size=1M".
>>
>> Thanks,
>>
>> /jon
>> _______________________________________________
>> lustre-discuss mailing list
>> lustre-discuss at lists.lustre.org
>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


