[Lustre-discuss] IB storage as an OST target

David Dillow dave at thedillows.org
Wed Mar 30 09:00:04 PDT 2011


On Wed, 2011-03-30 at 10:44 -0400, chas williams - CONTRACTOR wrote:
> On Tue, 29 Mar 2011 09:23:59 -0400
> Jason Hill <hilljj at ornl.gov> wrote:
> 
> > On Mon, Mar 28, 2011 at 09:36:43AM -0400, Jason Hill wrote:
> > storage as well. I would definitely suggest putting your LNET and SRP 
> > connections on different physical HCAs to keep the traffic at least isolated 
> > on the OSS side. 
> 
> i doubt this matters as much as it once did.  pci/pci-x wasn't capable
> of reading/writing at the same time due to its bus nature (even though
> pci-x was point-to-point to the bridge chip).  pci-express (and
> infiniband) can both read and write at the same time.  so you can
> stream in data from srp and stream it out via lnet at the same time on
> the same port.

While it is true both are full duplex, there are also setup messages
flowing in both directions to set up the large transfers. In the past,
we've certainly seen problems at scale with small messages getting
blocked behind large bulk traffic on LNET. It would be interesting to
see how much self-interference is generated when running storage over
the same HCA as LNET, versus having them on separate HCAs -- especially
when we're maxing out HCA capacity at peak demand. That experiment could
give some good guidance as to whether or not this is actually something
to worry about.
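
One way to boil that experiment down to a number, as a rough sketch only:
collect small-message latency samples once with SRP and LNET sharing an
HCA and once with them on separate HCAs, then compare a high percentile
of each. The function names and the choice of the 99th percentile are
assumptions for illustration, not anything we have run in production.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of latency samples."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # Nearest-rank method: smallest value with at least pct% of samples at or below it.
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]


def interference_ratio(shared_hca, separate_hcas, pct=99):
    """Ratio of small-message latency at the given percentile.

    shared_hca:    latencies measured with SRP and LNET on the same HCA
    separate_hcas: latencies measured with SRP and LNET on different HCAs
    A result near 1.0 suggests little self-interference; larger values
    suggest small messages are queuing behind bulk traffic.
    """
    return percentile(shared_hca, pct) / percentile(separate_hcas, pct)
```

Feed it latencies from whatever small-message benchmark you trust, taken
while the bulk traffic is running at peak, and use the same units for
both runs.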

> a bigger concern might be the number of luns behind a single port on
> your storage controller.  most people have more ost's/oss's than ports
> on the storage controllers.

We've found that on the IB storage systems we have in production -- as
well as under test -- we can easily saturate the controller with 4
OSSes. Each OSS is driving 7 OSTs -- 5 are needed to saturate bandwidth
-- and this has worked pretty well. It's not clear that having more
OSSes than needed brings a win, other than perhaps having more memory
available for a read cache. Adding more memory to the existing OSSes can
also achieve that, to a certain economic bound, and at larger scales it
isn't completely clear that the cache is a win -- we tend to blow
through it.
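
For anyone sizing their own setup, the arithmetic behind those numbers
can be sketched as below. It assumes the "5" above means OSTs per OSS
needed to saturate that OSS's bandwidth, which is one reading of the
sentence; the function names are made up for illustration.

```python
def osts_behind_controller(oss_count, osts_per_oss):
    """Total OSTs (LUNs) served by one storage controller."""
    return oss_count * osts_per_oss


def ost_headroom_per_oss(osts_per_oss, osts_to_saturate):
    """OSTs per OSS beyond the number needed to saturate bandwidth."""
    return osts_per_oss - osts_to_saturate


# Figures from this thread: 4 OSSes per controller, 7 OSTs per OSS,
# 5 OSTs needed to saturate (assumed to be per OSS).
total_osts = osts_behind_controller(4, 7)
headroom = ost_headroom_per_oss(7, 5)
print(total_osts, headroom)
```

Swap in your own OSS and OST counts; the point is only to see how many
LUNs end up behind one controller and how much slack each OSS carries.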

YMMV, of course.
-- 
Dave Dillow
National Center for Computational Science
Oak Ridge National Laboratory
(865) 241-6602 office



