[Lustre-discuss] Using Infiniband QoS with Lustre 1.8.5

Isaac Huang he.h.huang at oracle.com
Tue Feb 8 13:41:23 PST 2011


On Tue, Feb 08, 2011 at 05:44:35PM +0100, Ramiro Alba wrote:
> Hi everybody,
> 
> We have a 128-node (8 cores/node) 4x DDR IB cluster with 2:1
> oversubscription, and I use the IB network for:
> 
> - OpenMPI
> - Lustre
> - Admin (may change in future)
> 
> I'm very interested in using IB QoS, as in the near future I'm
> deploying AMD processors with 24 cores/node, so I want to put a
> barrier on traffic so that no traffic (especially OpenMPI) is starved
> by other traffic (especially Lustre I/O). So I read all the documentation I could

My own experience was that Lustre traffic often fell victim to
aggressive MPI behavior, especially during collective communications.

> ----- /etc/opensm/qos-policy.conf --------------------
> 
> # SL assignment to flows. GUIDs are port GUIDs.
> qos-ulps
>     default                                                      :0  # default SL (OpenMPI)
>     any, target-port-guid 0x0002c90200279295                     :1  # SL for Lustre MDT
>     any, target-port-guid 0x0002c9020029fda9,0x0002c90200285ed5  :2  # SL for Lustre OSTs
>     ipoib                                                        :3  # SL for administration
> end-qos-ulps

My understanding is that the SL is determined only once for each
connected QP (which is what Lustre uses), at connection establishment.
The configuration above looks to me like it would catch connections
from clients to servers but not the other way around. Servers do
connect to clients, though that's not the usual case. Moreover, Lustre
QPs are persistent, so you might end up with quite a few Lustre QPs
stuck on the default SL. I've never done any IB QoS configuration
myself, but it'd be good to double-check that the config above
catches all connections.
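
For example, one way to be thorough (untested on my part) would be to
also list the client HCA port GUIDs as target ports, so that
connections initiated by the servers land on a non-default SL as well.
As far as I know the simplified QoS policy accepts comma-separated GUID
lists and ranges; the client range below is just a placeholder:

----- sketch of /etc/opensm/qos-policy.conf (client GUIDs are placeholders) -----

qos-ulps
    default                                                      :0  # default SL (OpenMPI)
    # client -> server connections, matched on the server port GUIDs
    any, target-port-guid 0x0002c90200279295                     :1  # Lustre MDT
    any, target-port-guid 0x0002c9020029fda9,0x0002c90200285ed5  :2  # Lustre OSTs
    # server -> client connections, matched on the client port GUIDs
    # (replace the range below with your clients' actual port GUIDs)
    any, target-port-guid 0x0002c90200300001-0x0002c90200300080  :2
    ipoib                                                        :3  # administration
end-qos-ulps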

If the servers run more than just Lustre, it's possible to single out
the Lustre traffic by its ServiceID. And if the servers serve more than
one Lustre file system, you can divide the traffic further by assigning
each file system a different PKey. But that's probably beyond your
concerns.
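
In case it's useful, here's a rough sketch (again, untested by me).
Matching on the service ID should catch Lustre connections in both
directions, because both clients and servers listen on the same
o2iblnd port. If I remember correctly the default port is 987, which
in the RDMA CM TCP port space would give service ID
0x00000000010603db -- please verify that number before relying on it.
The PKeys below are made-up placeholders:

----- sketch of /etc/opensm/qos-policy.conf (service-id/pkey matching) -----

qos-ulps
    default                                  :0  # default SL (OpenMPI)
    # all Lustre o2iblnd connections, both directions
    # (0x00000000010603db = RDMA CM TCP port space, port 987 -- verify)
    any, service-id 0x00000000010603db       :1
    # or, one partition per file system (PKeys are placeholders):
    # any, pkey 0x8011                       :1  # file system A
    # any, pkey 0x8012                       :2  # file system B
    ipoib                                    :3  # administration
end-qos-ulps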

Cheers,
Isaac


