[Lustre-discuss] [Lustre-devel] Meaning of LND/neterrors ?

Alexey Lyashkov alexey.lyashkov at clusterstor.com
Wed Sep 22 11:20:57 PDT 2010


Hi Aurelien,

You can see that message in two cases:
1) A low-level network error. That is bad, because the client will reconnect and resend its requests after the error,
which adds extra load to the service nodes.

2) A service node (MDS, OSS) was restarted or hung; in that case the transfer is aborted.
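
To illustrate the first case, here is a toy sketch in Python of why resends add load. It is not Lustre code: ToyServer, NetworkError and send_with_resend are hypothetical names made up for this example, and the real client reconnects before it resends, which the sketch only notes in a comment.

```python
# Toy model only: none of these names are Lustre APIs.
class NetworkError(Exception):
    """Stands in for a low-level LND failure such as "RDMA failed"."""


class ToyServer:
    def __init__(self, fail_first):
        self.fail_first = fail_first  # number of initial attempts that fail
        self.received = 0             # attempts seen by the toy server

    def handle(self, request):
        self.received += 1
        if self.received <= self.fail_first:
            raise NetworkError("transfer aborted")
        return "reply to " + request


def send_with_resend(server, request, max_attempts=5):
    """Resend the request after each network error, up to max_attempts."""
    for attempt in range(1, max_attempts + 1):
        try:
            return server.handle(request), attempt
        except NetworkError:
            # A real client would reconnect here before resending.
            continue
    raise NetworkError("gave up after %d attempts" % max_attempts)


server = ToyServer(fail_first=2)
reply, attempts = send_with_resend(server, "read block 7")
print(reply, attempts, server.received)
```

In this toy run the request is not lost, but it takes three attempts to complete it, and each extra attempt is extra work for the service node.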
 

On Sep 22, 2010, at 19:20, Aurelien Degremont wrote:

> Hello
> 
> I've noticed that Lustre network errors, especially LND errors, are considered maskable errors.
> That means that on a production node, where the debug mask is 0, those specific errors won't be displayed if they happen.
> 
> Does that mean that they are harmless?
> Do upper layers resend their RPC/packet if LNDs report an error?
> 
> When, in my case, o2iblnd says something like "RDMA failed" (neterror), is it a big issue? Were some RPCs lost or not?
> 
> Thanks in advance
> 
> -- 
> Aurelien Degremont
> _______________________________________________
> Lustre-devel mailing list
> Lustre-devel at lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-devel



