[Lustre-discuss] controlling which eth interface lustre uses

Thu Oct 21 07:04:49 PDT 2010

On Oct 21, 2010, at 9:51 AM, Brock Palen wrote:

> On Oct 21, 2010, at 9:48 AM, Joe Landman wrote:
>
>> On 10/21/2010 09:37 AM, Brock Palen wrote:
>>> We recently added a new OSS. It has one 1Gb interface and one 10Gb
>>> interface.
>>>
>>> The 10Gb interface is eth4 (10.164.0.166). The 1Gb interface is eth0
>>> (10.164.0.10).
>>
>> They look like they are on the same subnet if you are using /24 ...
>
> You are correct
>
> Both interfaces are on the same subnet:
>
> [root@oss4-gb ~]# route
> Kernel IP routing table
> Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
> 10.164.0.0      *               255.255.248.0   U     0      0        0 eth0
> 10.164.0.0      *               255.255.248.0   U     0      0        0 eth4
> 169.254.0.0     *               255.255.0.0     U     0      0        0 eth4
> default         10.164.0.1      0.0.0.0         UG    0      0        0 eth0
>
> Is there no way to keep the Lustre service off the 1Gb interface?

We struggle with this as well and have not found a way to enforce it.
You would expect Lustre to honor the NID for incoming *and* outgoing
traffic, but apparently the standard Linux routing table determines the
outbound path and LNET is out of the picture. Thus, you end up having
to assign separate subnets, shut down your eth0 interface (in this
case), or use static routes to fine-tune the routing decisions (where
possible).
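
As a rough illustration of why the routing table cannot tell the two
interfaces apart, here is a minimal Python sketch using the standard
ipaddress module. The interface addresses and netmask are the ones
quoted above; the client address and the helper name same_subnet are
made up for the example. It only models the prefix arithmetic, not the
kernel's actual route selection.

```python
import ipaddress

# Addresses and netmask as quoted in the route output above.
NETMASK = "255.255.248.0"
eth0 = ipaddress.ip_interface(f"10.164.0.10/{NETMASK}")   # 1Gb
eth4 = ipaddress.ip_interface(f"10.164.0.166/{NETMASK}")  # 10Gb


def same_subnet(a, b):
    """Return True when both interfaces resolve to the same network."""
    return a.network == b.network


# Both interfaces map to one network, so both routes have the same
# prefix length and neither is more specific than the other.
print(eth0.network, eth4.network, same_subnet(eth0, eth4))

# Hypothetical client address (an assumption, not from the thread).
client = ipaddress.ip_address("10.164.3.25")

# A static host route is a /32, which is more specific than the shared
# subnet route and can therefore be pinned to a chosen interface.
host_route = ipaddress.ip_network(f"{client}/32")
print(client in eth0.network, host_route.prefixlen > eth0.network.prefixlen)
```

On a real node the corresponding host route would be added with
something along the lines of "ip route add 10.164.3.25/32 dev eth4";
treat that as an untested sketch rather than a recommended
configuration.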

We wish the outbound decision could be made on the basis of the *NID*,
but that might be too intrusive with regard to the Linux kernel's
network stack, so I can somewhat understand why it does not work that
way. Still, it is counter-intuitive to go through all the trouble of
having the LNET layer and assigning NIDs only to have them disregarded
for outbound traffic.

Perhaps there is a way around this that we don't know about.

Regards,

Charlie Taylor
UF HPC Center