[lustre-discuss] lustre causing dropped packets

Raj rajgautam at gmail.com
Tue Dec 5 11:12:48 PST 2017


Brian,
I would check the following:
- The MTU size must be the same across all nodes (servers and clients).
- peer_credits and credits must be the same across all nodes.
- /proc/sys/lnet/peers will show whether you are constantly seeing negative
credits.
- Buffer overflow counters on the switches, if the switch provides them. If
the buffers are too small to handle the I/O stream, you may want to reduce
credits.
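
As a rough sketch of the credits check above, something like the following
Python could flag peers whose credit counters have gone negative. The column
layout (including the two "min" columns, renamed here to rtr_min and tx_min)
is an assumption about how /proc/sys/lnet/peers prints on your Lustre version,
and the function names and sample NIDs are hypothetical.

```python
# Sketch only: assumes /proc/sys/lnet/peers has a header row such as
#   nid refs state last max rtr min tx min queue
# Verify the layout on your own nodes before relying on this.

CREDIT_COLUMNS = ("rtr_min", "tx", "tx_min")

# Hypothetical sample output, used only for illustration.
SAMPLE = """\
nid                      refs state  last   max   rtr   min    tx   min queue
10.0.0.1@tcp                1    NA    -1     8     8     8     8   -12     0
10.0.0.2@tcp                1    NA    -1     8     8     8     8     6     0
"""


def parse_lnet_peers(text):
    """Turn the peers table into a list of dicts keyed by column name."""
    rows = [line.split() for line in text.splitlines() if line.strip()]
    if not rows:
        return []
    header, body = rows[0], rows[1:]
    names = []
    for i, col in enumerate(header):
        # The header repeats "min"; prefix it with the preceding column name.
        if col == "min" and i > 0:
            names.append(header[i - 1] + "_min")
        else:
            names.append(col)
    return [dict(zip(names, row)) for row in body]


def peers_with_negative_credits(text):
    """Return (nid, {column: value}) for peers with any negative credit column."""
    flagged = []
    for peer in parse_lnet_peers(text):
        negatives = {}
        for col in CREDIT_COLUMNS:
            value = peer.get(col)
            if value is not None and value.lstrip("-").isdigit() and int(value) < 0:
                negatives[col] = int(value)
        if negatives:
            flagged.append((peer["nid"], negatives))
    return flagged


def read_peers(path="/proc/sys/lnet/peers"):
    """Read the live peers table; the path is the one mentioned above."""
    with open(path) as handle:
        return handle.read()
```

On a live node you would call peers_with_negative_credits(read_peers()) a few
times while the write load is running; a peer that keeps showing up suggests
the credits are being exhausted.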

-Raj


On Tue, Dec 5, 2017 at 11:56 AM Brian Andrus <toomuchit at gmail.com> wrote:

> Shawn,
>
> Flow control is configured and these connections are all on the same 40g
> subnet and all directly connected to the same switch.
>
> I'm a little new to lnet_selftest, but when I run it 1:1, I do see the
> dropped packets go up pretty significantly on the client node. The node I
> set as the server does not drop any packets.
>
> Brian Andrus
>
> On 12/5/2017 9:20 AM, Shawn Hall wrote:
>
> Hi Brian,
>
> Do you have flow control configured on all ports that are on the network
> path? Lustre has a tendency to cause packet losses in ways that performance
> testing tools don’t because of the N to 1 packet flows, so flow control is
> often necessary. Lnet_selftest should replicate this behavior.
>
> Is there a point in the network path where the link bandwidth changes
> (e.g. 40 GbE down to 10 GbE, or 2x40 GbE down to 1x40 GbE)? That will
> commonly be the biggest point of loss if flow control isn’t doing its job.
>
> Shawn
>
> On 12/5/17, 11:49 AM, "lustre-discuss on behalf of jongwoohan at naver.com"
> <lustre-discuss-bounces at lists.lustre.org on behalf of jongwoohan at naver.com>
> wrote:
>
> Did you test your connection with iperf and iperf3 in TCP bandwidth mode? If
> so, those tools will not reveal packet drops.
>
> Try checking the responsiveness of your backend block devices with benchmark
> tools like vdbench or bonnie++. Sometimes a bad block device causes incorrect
> data transfer.
>
> -----Original Message-----
> From: "Brian Andrus" <toomuchit at gmail.com>
> To: "lustre-discuss at lists.lustre.org" <lustre-discuss at lists.lustre.org>
> Cc:
> Sent: 2017-12-06 (Wed) 01:38:04
> Subject: [lustre-discuss] lustre causing dropped packets
>
> All,
>
> I have a small setup I am testing (1 MGS, 2 OSS) that is connected via
> 40G ethernet.
>
> I notice that anything I run that writes to the Lustre filesystem causes
> dropped packets. Reads do not seem to cause this. I have also tested the
> network (iperf, iperf3, general traffic) with no dropped packets.
>
> Is there something with writes that can cause dropped packets?
>
>
> Brian Andrus
>
> _______________________________________________
> lustre-discuss mailing list
> lustre-discuss at lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org