[Lustre-discuss] Using Infiniband QoS with Lustre 1.8.5

Ramiro Alba raq at cttc.upc.edu
Wed Feb 9 00:13:50 PST 2011


On Tue, 2011-02-08 at 14:41 -0700, Isaac Huang wrote:
> On Tue, Feb 08, 2011 at 05:44:35PM +0100, Ramiro Alba wrote:
> > Hi everybody,
> > 
> > We have a 128-node (8 cores/node) 4x DDR IB cluster with 2:1
> > oversubscription, and I use the IB network for:
> > 
> > - OpenMPI
> > - Lustre
> > - Admin (may change in future)
> > 
> > I'm very interested in using IB QoS, as in the near future I'll be
> > deploying AMD processors with 24 cores/node, so I want to put a
> > barrier on traffic so that no traffic class (especially OpenMPI) is
> > starved by another (especially Lustre I/O). So I read all the
> > documentation I could
> 
> My own experience was that Lustre traffic often fell victim to
> aggressive MPI behavior, especially during collective communications.
> 
> > ----- /etc/opensm/qos-policy.conf --------------------
> > 
> > 
> > # SL assignment to flows. GUIDs are port GUIDs.
> > qos-ulps
> >     default                                    :0    # default SL (OpenMPI)
> >     any, target-port-guid 0x0002c90200279295  :1    # SL for Lustre MDT
> >     any, target-port-guid 0x0002c9020029fda9,0x0002c90200285ed5  :2    # SL for Lustre OSTs
> >     ipoib                                      :3    # SL for Administration
> > end-qos-ulps
> 
> My understanding is that the SL is determined only once for each
> connected QP, which is what Lustre uses, during connection
> establishment. The configuration above seemed to me to catch
> connections from clients to servers but not the other way around.
> Servers do connect to clients, though that's not the usual case.
> Moreover, Lustre QPs are persistent, so you might end up with quite a
> few Lustre QPs on the default SL. I've never done any IB QoS
> configuration, but it'd be good to double check that the config above
> does catch all connections.

OK, but the question is whether the traffic that falls through to the
default SL is significant enough to matter. What do you think?
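
In case it helps, one thing I could try (an untested sketch, based on
my reading of the OpenSM QoS policy documentation, and putting the MDT
and OSTs on a single SL for simplicity) is the advanced policy format,
where a port group plus match rules should catch path records with a
Lustre server on either end, whichever side initiates the connection:

----- /etc/opensm/qos-policy.conf (advanced format, sketch) -----
port-groups
    port-group
        name: LustreServers
        # MDT and OST port GUIDs from the qos-ulps section above
        port-guid: 0x0002c90200279295, 0x0002c9020029fda9, 0x0002c90200285ed5
    end-port-group
end-port-groups

qos-match-rules
    # Traffic towards a Lustre server (client-initiated connections)
    qos-match-rule
        destination: LustreServers
        sl: 1
    end-qos-match-rule
    # Traffic from a Lustre server (server-initiated connections)
    qos-match-rule
        source: LustreServers
        sl: 1
    end-qos-match-rule
end-qos-match-rules
-----------------------------------------------------------------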
> 
> If the servers run more than just Lustre, it's possible to
> distinguish Lustre traffic from other ULPs by the Lustre ServiceID.

Yes, I saw this possibility on the Lustre mailing list:

http://lists.lustre.org/pipermail/lustre-discuss/2009-May/010563.html

but it is said to have a drawback:

..........................................................................
The next step is to tell OpenSM to assign an SL to this service-id.
Here is an extract of our "QoS policy file":
qos-ulps
    default                                                     : 0
    any, service-id=0x.....: 3
end-qos-ulps

The major drawback of this solution is that the modification we made in 
the ofa-kernel is not OpenFabrics Alliance compliant, because the 
portspace list is defined in the IB standard.
...........................................................................
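
For what it's worth, if o2iblnd uses the standard RDMA CM TCP port
space (0x0106) and its default port 987 (0x03db), then by the usual
RDMA CM encoding (port space in bits 16-31, port number in bits 0-15)
the service ID would work out to 0x00000000010603db. That is my own
guess, worth verifying against the kernel sources, but it would give
something like:

qos-ulps
    default                              : 0
    any, service-id 0x00000000010603db  : 1    # Lustre o2iblnd, port 987 (assumed)
end-qos-ulps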

> If servers serve more than one Lustre file system, you can divide the
> traffic further by assigning each file system a different PKey. But
> it's probably beyond your concerns.

That's not my case at the moment.
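
Still, for future reference, my understanding is that this would mean
defining one partition per file system in /etc/opensm/partitions.conf
and then mapping each PKey to an SL, along these lines (hypothetical
partition names and PKeys):

----- /etc/opensm/partitions.conf (sketch) -----
Default=0x7fff, ipoib : ALL=full;
lustre1=0x8001 : ALL=full;
lustre2=0x8002 : ALL=full;
------------------------------------------------

qos-ulps
    default            : 0
    any, pkey 0x8001   : 1    # first file system
    any, pkey 0x8002   : 2    # second file system
end-qos-ulps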

What do you think about the 'weights' policy I suggested in my
configuration?
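
For reference, what I mean by 'weights' is the SL-to-VL mapping and VL
arbitration settings in opensm.conf, something like the following
(illustrative values only, not my actual configuration):

# /etc/opensm/opensm.conf (QoS options, sketch)
qos TRUE
qos_max_vls 8
qos_high_limit 4
# Map SL0-3 to VL0-3 (pattern repeated for SL4-15)
qos_sl2vl 0,1,2,3,0,1,2,3,0,1,2,3,0,1,2,3
# OpenMPI (VL0) gets the largest weight in the high-priority table;
# Lustre (VL1, VL2) and administration (VL3) get progressively smaller
# shares under contention
qos_vlarb_high 0:192
qos_vlarb_low 1:128,2:64,3:32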

Thanks for your answer
Kind Regards

-- 
Ramiro Alba

Centre Tecnològic de Transferència de Calor
http://www.cttc.upc.edu


Escola Tècnica Superior d'Enginyeries
Industrial i Aeronàutica de Terrassa
Colom 11, E-08222, Terrassa, Barcelona, Spain
Tel: (+34) 93 739 86 46






