[lustre-discuss] LNET Conf Advise and Rearchitecting

Paul Edmon pedmon at cfa.harvard.edu
Thu Apr 4 08:49:12 PDT 2019


I was hoping to get some advice on how to write a valid lnet.conf and 
also on how we should rearchitect our current LNET layout.  I tried 
following the Lustre docs for lnet.conf, but they were not helpful, and I 
was not able to set up an LNET router programmatically.  You can see my 
attempt to do so in Puppet here:

https://github.com/fasrc/puppet-lnet

I'm pretty sure I am missing something but I don't know what.

As for our current architecture, it is as follows.  Right now we have two 
data centers separated by 100 km, each with Lustre filesystems and its 
own IB island.  To complicate matters, we will have a third IB island 
coming online soon as well, so what we set up should be extensible.  I 
want to code this in Puppet so I can easily lay down new lnet.conf files 
and spin up new LNET layers.  Here are the systems in each place, along 
with their Lustre versions.

Boston

boslfs: 5PB Lustre IEEL Filesystem, Lustre 2.5.34, IB only export, 
routed via boslnet[01-02] as o2ib1

boslnet[01,02]: Lustre 2.5.34, bridges boslfs IB to our 10 GbE Ethernet 
network

Holyoke

holylfs: 5PB Lustre IEEL Filesystem, Lustre 2.5.34, IB only export, 
routed via holylnet[01-02] as o2ib0

holylfs02: 5PB Lustre Filesystem, Lustre 2.10.4, IB only export, routed 
via holylnet[03-04] as o2ib2

holylfs03: 3PB Lustre Filesystem, Lustre 2.10.6, IB only export, routed 
via holylnet[01-02] as o2ib0

scratchlfs: 2PB, DDN Exascaler, Lustre 2.10.5, IB only export, routed 
via holylnet[01-02] as o2ib0

holylnet[01-04]: Lustre 2.5.34, bridges FDR IB to GbE ethernet network

panlfs2, panlfs3, kuanglfs: Various Lustre 2.5.3, exported via IB and 
GbE; these do not use LNET routers and are mounted via o2ib or tcp0 
(depending on whether they are GbE or IB connected)
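
To make the routed part of this inventory easier to feed into something like Puppet, it can be written down as data. The following is only a minimal sketch in Python, not a working lnet.conf generator: the names ROUTED_FILESYSTEMS, expand_hosts and filesystems_by_router are made up for illustration, and the only facts it encodes are the router ranges and LNET network names listed above.

```python
import re
from collections import defaultdict

# Inventory transcribed from the list above:
# filesystem -> (router host range, LNET network it is routed as).
# The dual-homed filesystems (panlfs2, panlfs3, kuanglfs) use no routers
# and are left out.
ROUTED_FILESYSTEMS = {
    "boslfs": ("boslnet[01-02]", "o2ib1"),
    "holylfs": ("holylnet[01-02]", "o2ib0"),
    "holylfs02": ("holylnet[03-04]", "o2ib2"),
    "holylfs03": ("holylnet[01-02]", "o2ib0"),
    "scratchlfs": ("holylnet[01-02]", "o2ib0"),
}


def expand_hosts(pattern):
    """Expand 'name[01-03]' or 'name[01,02]' into a list of hostnames."""
    match = re.fullmatch(r"([^\[\]]+)\[([^\[\]]+)\]", pattern)
    if match is None:
        return [pattern]  # no bracket range, so treat it as a single host
    prefix, body = match.groups()
    hosts = []
    for part in body.split(","):
        if "-" in part:
            start, end = part.split("-")
            width = len(start)  # preserve zero padding such as '01'
            for i in range(int(start), int(end) + 1):
                hosts.append(f"{prefix}{i:0{width}d}")
        else:
            hosts.append(prefix + part)
    return hosts


def filesystems_by_router(inventory):
    """Map each router hostname to the (filesystem, network) pairs it serves."""
    served = defaultdict(list)
    for fs, (routers, net) in inventory.items():
        for host in expand_hosts(routers):
            served[host].append((fs, net))
    return dict(served)


if __name__ == "__main__":
    for host, entries in sorted(filesystems_by_router(ROUTED_FILESYSTEMS).items()):
        print(host, entries)
```

Grouping by router host shows at a glance that holylnet[01-02] carry three filesystems on o2ib0 while holylnet[03-04] carry only holylfs02 on o2ib2, which is the split a shared router block would replace.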

All these systems are exported to the compute nodes, which live in both 
datacenters, both on and off the IB fabric.  The compute nodes run 
Lustre 2.10.3.  As noted, we will soon have an independent EDR/HDR IB 
fabric at Holyoke that we want to bridge with the FDR fabric using an 
LNET router as well.

What I would like to do is bridge the Boston network from IB to GbE and 
then back to IB (right now it just does IB to GbE), make all the Lustre 
hosts at Holyoke that aren't dual-homed use the same block of LNET 
routers, which we can expand easily and programmatically, and finally lay 
the groundwork for the LNET bridging from FDR to EDR fabrics.  It would 
also be good to use Lustre 2.10.x for the routers, or whatever the latest 
version is, so we can use lnet.conf.  I tried this on my own, but I 
couldn't get a conf that worked even though I thought I had one, and I 
wasn't sure what I was doing wrong.

If you like, I can share our current configs, but I'm happy to throw 
them out and start from scratch to do it right.  I'm also happy to send 
a diagram if that would be helpful.

Thanks for your help in advance!

-Paul Edmon-
